WO2024107837A1 - Semantic heat map for robot object search - Google Patents

Semantic heat map for robot object search

Info

Publication number
WO2024107837A1
Authority
WO
WIPO (PCT)
Prior art keywords
environment
heat map
object class
robotic device
area
Application number
PCT/US2023/079815
Other languages
French (fr)
Inventor
Ignacio Pablo Mellado BATALLER
Original Assignee
Google Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Google Llc
Publication of WO2024107837A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/70 Labelling scene content, e.g. deriving syntactic or semantic representations

Definitions

  • Robotic devices may be used for applications involving material handling, transportation, welding, assembly, and dispensing, among others.
  • the manner in which these robotic systems operate is becoming more intelligent, efficient, and intuitive.
  • As robotic systems become increasingly prevalent in numerous aspects of modern life, it is desirable for robotic systems to be efficient. Therefore, a demand for efficient robotic systems has helped open up a field of innovation in actuators, movement, sensing techniques, as well as component design and assembly.
  • Example embodiments involve a semantic heat map determination method for object searches.
  • a robotic device may collect sensor data and determine object detections identifying object classes based on the objects that appear in the sensor data. Based on the object detections, the robotic device may determine a heat map of a particular object class that may identify areas in the environment where objects of the particular object class appear more frequently than other areas. The robotic device may navigate in the environment based on the heat map.
  • a method includes receiving, from at least one sensor on at least one robotic device, data representative of a plurality of locations in an environment.
  • the method also includes determining a plurality of object detections for the data representative of the environment. Each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment.
  • the method further includes determining a heat map of a particular object class based on the plurality of object detections. Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment.
  • a robotic device includes at least one sensor and a control system.
  • the control system is configured to receive, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment.
  • the control system is also configured to determine a plurality of object detections for the data representative of the environment, where each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment.
  • the control system is further configured to determine a heat map of a particular object class based on the plurality of object detections.
  • Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment.
  • the control system is additionally configured to cause the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
  • a non-transitory computer readable medium which includes program instructions executable by at least one processor to cause the at least one processor to perform functions.
  • the functions include receiving, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment.
  • the functions further include determining a plurality of object detections for the data representative of the environment. Each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment.
  • the functions also include determining a heat map of a particular object class based on the plurality of object detections. Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment.
  • the functions also include causing the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
  • in a further embodiment, a system includes means for receiving, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment.
  • the system also includes means for determining a plurality of object detections for the data representative of the environment, where each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment.
  • the system further includes means for determining a heat map of a particular object class based on the plurality of object detections, where each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment.
  • the system additionally includes means for causing the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
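  • The method, device, computer-readable-medium, and system embodiments above all follow the same data flow. As a rough illustration only (not part of the disclosure), the Python sketch below shows one way detections collected from sensor data could be reduced to a per-class heat map whose highest-measurement cell then serves as a navigation target; the names Detection, build_heat_map, and pick_target_cell are hypothetical.

      # Illustrative sketch only; names and the grid representation are assumptions.
      from collections import defaultdict
      from dataclasses import dataclass

      @dataclass
      class Detection:
          object_class: str   # e.g. "plate"
          cell: tuple         # (row, col) grid cell where the object was observed

      def build_heat_map(detections, object_class):
          """Fraction of this class's detections that fell into each cell."""
          counts, total = defaultdict(int), 0
          for d in detections:
              if d.object_class == object_class:
                  counts[d.cell] += 1
                  total += 1
          return {cell: n / total for cell, n in counts.items()} if total else {}

      def pick_target_cell(heat_map):
          """Cell where the object class was detected most frequently."""
          return max(heat_map, key=heat_map.get) if heat_map else None

      detections = (
          [Detection("plate", (2, 0))] * 3
          + [Detection("plate", (0, 1))] * 2
          + [Detection("plate", (1, 3))] * 4
      )
      heat_map = build_heat_map(detections, "plate")   # {(2, 0): 3/9, (0, 1): 2/9, (1, 3): 4/9}
      target = pick_target_cell(heat_map)              # (1, 3); the robot would navigate toward it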
  • Figure 1 illustrates a configuration of a robotic system, in accordance with example embodiments.
  • Figure 2 illustrates a mobile robot, in accordance with example embodiments.
  • Figure 3 illustrates an exploded view of a mobile robot, in accordance with example embodiments.
  • Figure 4 illustrates a robotic arm, in accordance with example embodiments.
  • Figure 5 is a diagram illustrating training and inference phases of a machine learning model, in accordance with example embodiments.
  • Figure 6 is a block diagram of a method, in accordance with example embodiments.
  • Figure 7 depicts an environment, in accordance with example embodiments.
  • Figure 8 depicts data representative of an environment, in accordance with example embodiments.
  • Figure 9 depicts a heat map of an environment, in accordance with example embodiments.
  • Figure 10 depicts a path based on a heat map of an environment, in accordance with example embodiments.
  • Example methods, devices, and systems are described herein. It should be understood that the words “example” and “exemplary” are used herein to mean “serving as an example, instance, or illustration.” Any embodiment or feature described herein as being an “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or features unless indicated as such. Other embodiments can be utilized, and other changes can be made, without departing from the scope of the subject matter presented herein. Thus, the example embodiments described herein are not meant to be limiting. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations.
  • any enumeration of elements, blocks, or steps in this specification or the claims is for purposes of clarity. Thus, such enumeration should not be interpreted to require or imply that these elements, blocks, or steps adhere to a particular arrangement or are carried out in a particular order.
  • a robotic device or a fleet of robotic devices may be tasked with performing various tasks in an environment, such as vacuuming, clearing dishes, removing spills, organizing packages, among other missions.
  • the robotic device may include one or more LIDAR and/or camera sensors, which the robotic device may use to determine a type of one or more objects in the environment.
  • the robotic device may determine whether an object is relevant to its task and react accordingly. For example, the robotic device may determine that an image captured by a camera on the robotic device likely contains a package. In response, the robotic device may move the package to another, perhaps more centralized and convenient location. As another example, the robotic device may determine that an image captured by a camera on the robotic device likely contains dishes. In response, the robotic device may move the dishes to the sink.
  • a given environment may include one or more of these robotic devices, each carrying out one or more tasks.
  • the environment may be a large office building with many rooms and containing a robotic device tasked with moving packages, another robotic device tasked with clearing tables, and another robotic device tasked with vacuuming.
  • it may be important that the robotic device is navigating effectively in the environment. Issues may arise where the robotic devices do not have adequate information to navigate effectively in the environment.
  • Each robotic device may have a general map of the building in which they are operating, but each robotic device may have little context for where the relevant objects are located. For example, a robotic device tasked with moving packages may not have information indicating where the packages are and a robotic device tasked with clearing tables may not have information indicating where the tables with dishes to be cleared are.
  • the robotic device may collect data of various locations in the environment and analyze the data for object detections indicating the presence of an object of a particular object class at the location in the environment.
  • the robotic device may take an image at a location in the environment and determine whether the image contains an object of a particular object class.
  • the image may be of a plate on a table, and the robotic device may determine that the location at or of which the image was taken contains a plate.
  • the robotic device may repeat this process for additional locations in the environment, storing each object detection and the associated location.
  • the robotic device may determine a heat map, where each cell in the heat map represents an area in the environment and corresponds to a measurement of how frequently the object class was detected in that area in the environment and how confident the model is that the object belongs to each class. For example, the robotic device may determine that, for a given detected object, the confidence value associated with a classification of apple is 0.7, the confidence value associated with a classification of orange is 0.2, and the confidence value associated with a classification of banana is 0.1. The robotic device may then add each one of these scores to its corresponding class heat map for each cell that the point cloud of the object touches.
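  • As a minimal sketch of the accumulation just described (numpy assumed; the grid size, class list, and function name are illustrative and not from the disclosure), each detected object's per-class confidence scores are added into the corresponding class heat maps for every cell its point cloud touches:

      import numpy as np

      GRID = (20, 20)                                    # assumed cell grid over the environment
      CLASSES = ["apple", "orange", "banana"]
      heat_maps = {c: np.zeros(GRID) for c in CLASSES}   # one accumulator per object class

      def add_detection(confidences, touched_cells):
          """confidences: dict mapping class -> confidence score for one detected object.
          touched_cells: grid cells overlapped by the object's point cloud."""
          for cls, score in confidences.items():
              for row, col in touched_cells:
                  heat_maps[cls][row, col] += score

      # A detected object spanning two cells, classified as mostly "apple":
      add_detection({"apple": 0.7, "orange": 0.2, "banana": 0.1}, [(4, 5), (4, 6)])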
  • the robotic device may use the heat map to determine a path through which to navigate.
  • the robotic device may analyze the heat map for target areas where object detections of a particular object class occur more frequently, in order to determine which locations to navigate to first so that it can navigate efficiently in the environment.
  • the robotic device may determine a target location as being associated with a cell in the heat map, where the cell in the heat map is associated with a measurement indicating that object detections at that location occur more frequently than any other location represented in the heat map.
  • the robotic device may determine a path from its location to the target location, and the robotic device may navigate along the determined path.
  • the robotic device may determine a plurality of target locations.
  • Each target location may be associated with a cell in the heat map, and the cell may be associated with a measurement indicating that objects of the particular object class are detected more frequently than other cells.
  • the robotic device may determine a path that includes each of the target locations, and the robotic device may navigate along the determined path.
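  • One possible realization of this target selection, assuming a simple greedy nearest-neighbor ordering that the disclosure does not specify, is sketched below: the top-scoring cells are taken as target locations and visited nearest-first from the robot's current position.

      import math

      def plan_path(heat_map, start, num_targets=3):
          """heat_map: dict mapping (row, col) cell -> measurement; start: the robot's current cell.
          Returns the highest-measurement cells ordered by a greedy nearest-neighbor heuristic."""
          targets = sorted(heat_map, key=heat_map.get, reverse=True)[:num_targets]
          path, current = [], start
          while targets:
              nearest = min(targets, key=lambda c: math.dist(current, c))
              path.append(nearest)
              targets.remove(nearest)
              current = nearest
          return path

      # plan_path({(2, 0): 3/9, (0, 1): 2/9, (1, 3): 1/9, (5, 5): 1/9}, start=(0, 0))
      # -> [(0, 1), (2, 0), (1, 3)]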
  • one or more additional robotic devices may be operating in the environment, and the additional robotic devices may also collect data and determine object detections from the data representative of the environment.
  • the robotic devices may send the data to a server device, and the server device may determine the object detections.
  • each robotic device may analyze the data that it collects to determine object detections and send the determined object detections to a server device to determine an aggregate heat map.
  • the server device may send the requested heat map to the robotic device and the robotic device may use the heat map to determine a path through which to navigate.
  • the robotic device may request a path through which to navigate in order to complete a task, and the server device may determine and send the requested path to the robotic device, causing the robotic device to navigate towards one or more target areas in the environment.
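  • For example, a server-side aggregation of detections reported by several robotic devices might look like the following sketch (a hypothetical API; the disclosure does not specify any particular protocol), which merges per-robot detection reports into one aggregate heat map that can be returned on request:

      from collections import defaultdict

      class HeatMapServer:
          """Aggregates object detections reported by multiple robotic devices."""
          def __init__(self):
              # counts[object_class][cell] = number of detections reported so far
              self.counts = defaultdict(lambda: defaultdict(int))

          def report_detections(self, robot_id, detections):
              # detections: iterable of (object_class, cell) pairs from one robot
              for object_class, cell in detections:
                  self.counts[object_class][cell] += 1

          def get_heat_map(self, object_class):
              per_cell = self.counts[object_class]
              total = sum(per_cell.values())
              return {cell: n / total for cell, n in per_cell.items()} if total else {}

      server = HeatMapServer()
      server.report_detections("robot-1", [("plate", (2, 0)), ("plate", (2, 0))])
      server.report_detections("robot-2", [("plate", (0, 1))])
      plate_map = server.get_heat_map("plate")   # {(2, 0): 2/3, (0, 1): 1/3}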
  • the heat map may be associated with a particular day part.
  • robotic devices operating in an environment may collect data in the morning and in the afternoon.
  • a computing system may determine a heat map associated with the data collected in the morning and a heat map associated with data collected in the afternoon.
  • a robotic device may base navigation on a heat map associated with the day part during which it is navigating.
  • a robotic device operating in the environment during the afternoon may base navigation on a heat map associated with the afternoon. Determining different heat maps for various day parts may be useful given that object locations may be based on the time of day. For example, if the robotic device is tasked with clearing cups off of tables, the majority of cups may be located on desks in the mornings and in the cafeteria area around lunch time.
  • a computing device may determine and/or update a heat map periodically. For example, the computing system may receive data and/or object detections every evening, and the computing system may determine one or more new heat maps every evening based on the data and/or object detections. Additionally and/or alternatively, the computing system may update one or more existing heat maps with the data and/or object detections received every evening.
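  • A minimal sketch of keeping separate heat maps per day part and refreshing them on such a schedule, using assumed names and a simple morning/afternoon split, could be:

      from collections import defaultdict
      from datetime import datetime

      # One set of per-class, per-cell counts for each day part.
      day_part_counts = {
          "morning": defaultdict(lambda: defaultdict(int)),
          "afternoon": defaultdict(lambda: defaultdict(int)),
      }

      def day_part(timestamp: datetime) -> str:
          return "morning" if timestamp.hour < 12 else "afternoon"

      def nightly_update(detections):
          """Run every evening on the day's detections: (timestamp, object_class, cell) tuples."""
          for timestamp, object_class, cell in detections:
              day_part_counts[day_part(timestamp)][object_class][cell] += 1

      def heat_map_for(object_class, part):
          per_cell = day_part_counts[part][object_class]
          total = sum(per_cell.values())
          return {cell: n / total for cell, n in per_cell.items()} if total else {}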
  • FIG. 1 illustrates an example configuration of a robotic system that may be used in connection with the implementations described herein.
  • Robotic system 100 may be configured to operate autonomously, semi-autonomously, or using directions provided by user(s).
  • Robotic system 100 may be implemented in various forms, such as a robotic arm, industrial robot, or some other arrangement. Some example implementations involve a robotic system 100 engineered to be low cost at scale and designed to support a variety of tasks.
  • Robotic system 100 may be designed to be capable of operating around people.
  • Robotic system 100 may also be optimized for machine learning.
  • robotic system 100 may also be referred to as a robot, robotic device, or mobile robot, among other designations.
  • robotic system 100 may include processor(s) 102, data storage 104, and controller(s) 108, which together may be part of control system 118.
  • Robotic system 100 may also include sensor(s) 112, power source(s) 114, mechanical components 110, and electrical components 116. Nonetheless, robotic system 100 is shown for illustrative purposes, and may include more or fewer components.
  • the various components of robotic system 100 may be connected in any manner, including wired or wireless connections. Further, in some examples, components of robotic system 100 may be distributed among multiple physical entities rather than a single physical entity. Other example illustrations of robotic system 100 may exist as well.
  • Processor(s) 102 may operate as one or more general-purpose hardware processors or special purpose hardware processors (e.g., digital signal processors, application specific integrated circuits, etc.). Processor(s) 102 may be configured to execute computer-readable program instructions 106, and manipulate data 107, both of which are stored in data storage 104. Processor(s) 102 may also directly or indirectly interact with other components of robotic system 100, such as sensor(s) 112, power source(s) 114, mechanical components 110, or electrical components 116.
  • Data storage 104 may be one or more types of hardware memory.
  • data storage 104 may include or take the form of one or more computer-readable storage media that can be read or accessed by processor(s) 102.
  • the one or more computer-readable storage media can include volatile or non-volatile storage components, such as optical, magnetic, organic, or another type of memory or storage, which can be integrated in whole or in part with processor(s) 102.
  • data storage 104 can be a single physical device.
  • data storage 104 can be implemented using two or more physical devices, which may communicate with one another via wired or wireless communication.
  • data storage 104 may include the computer-readable program instructions 106 and data 107.
  • Data 107 may be any type of data, such as configuration data, sensor data, or diagnostic data, among other possibilities.
  • Controller 108 may include one or more electrical circuits, units of digital logic, computer chips, or microprocessors that are configured to, perhaps among other tasks, interface between any combination of mechanical components 110, sensor(s) 112, power source(s) 114, electrical components 116, control system 118, or a user of robotic system 100.
  • controller 108 may be a purpose-built embedded device for performing specific operations with one or more subsystems of the robotic system 100.
  • Control system 118 may monitor and physically change the operating conditions of robotic system 100. In doing so, control system 118 may serve as a link between portions of robotic system 100, such as between mechanical components 110 or electrical components 116. In some instances, control system 118 may serve as an interface between robotic system 100 and another computing device. Further, control system 118 may serve as an interface between robotic system 100 and a user. In some instances, control system 118 may include various components for communicating with robotic system 100, including a joystick, buttons, or ports, etc. The example interfaces and communications noted above may be implemented via a wired or wireless connection, or both. Control system 118 may perform other operations for robotic system 100 as well.
  • control system 118 may communicate with other systems of robotic system 100 via wired or wireless connections, and may further be configured to communicate with one or more users of the robot.
  • control system 118 may receive an input (e.g., from a user or from another robot) indicating an instruction to perform a requested task, such as to pick up and move an object from one location to another location. Based on this input, control system 118 may perform operations to cause the robotic system 100 to make a sequence of movements to perform the requested task.
  • a control system may receive an input indicating an instruction to move to a requested location.
  • control system 118 (perhaps with the assistance of other components or systems) may determine a direction and speed to move robotic system 100 through an environment en route to the requested location.
  • Operations of control system 118 may be carried out by processor(s) 102. Alternatively, these operations may be carried out by controller(s) 108, or a combination of processor(s) 102 and controller(s) 108. In some implementations, control system 118 may partially or wholly reside on a device other than robotic system 100, and therefore may at least in part control robotic system 100 remotely.
  • Mechanical components 110 represent hardware of robotic system 100 that may enable robotic system 100 to perform physical operations.
  • robotic system 100 may include one or more physical members, such as an arm, an end effector, a head, a neck, a torso, a base, and wheels.
  • the physical members or other parts of robotic system 100 may further include actuators arranged to move the physical members in relation to one another.
  • Robotic system 100 may also include one or more structured bodies for housing control system 118 or other components, and may further include other types of mechanical components.
  • the particular mechanical components 110 used in a given robot may vary based on the design of the robot, and may also be based on the operations or tasks the robot may be configured to perform.
  • mechanical components 110 may include one or more removable components.
  • Robotic system 100 may be configured to add or remove such removable components, which may involve assistance from a user or another robot.
  • robotic system 100 may be configured with removable end effectors or digits that can be replaced or changed as needed or desired.
  • robotic system 100 may include one or more removable or replaceable battery units, control systems, power systems, bumpers, or sensors. Other types of removable components may be included within some implementations.
  • Robotic system 100 may include sensor(s) 112 arranged to sense aspects of robotic system 100.
  • Sensor(s) 112 may include one or more force sensors, torque sensors, velocity sensors, acceleration sensors, position sensors, proximity sensors, motion sensors, location sensors, load sensors, temperature sensors, touch sensors, depth sensors, ultrasonic range sensors, infrared sensors, object sensors, or cameras, among other possibilities.
  • robotic system 100 may be configured to receive sensor data from sensors that are physically separated from the robot (e.g., sensors that are positioned on other robots or located within the environment in which the robot is operating).
  • Sensor(s) 112 may provide sensor data to processor(s) 102 (perhaps by way of data 107) to allow for interaction of robotic system 100 with its environment, as well as monitoring of the operation of robotic system 100.
  • the sensor data may be used in evaluation of various factors for activation, movement, and deactivation of mechanical components 110 and electrical components 116 by control system 118.
  • sensor(s) 112 may capture data corresponding to the terrain of the environment or location of nearby objects, which may assist with environment recognition and navigation.
  • sensor(s) 112 may include RADAR (e.g., for long-range object detection, distance determination, or speed determination), LIDAR (e.g., for short-range object detection, distance determination, or speed determination), SONAR (e.g., for underwater object detection, distance determination, or speed determination), VICON® (e.g., for motion capture), one or more cameras (e.g., stereoscopic cameras for 3D vision), a global positioning system (GPS) transceiver, or other sensors for capturing information of the environment in which robotic system 100 is operating.
  • Sensor(s) 112 may monitor the environment in real time, and detect obstacles, elements of the terrain, weather conditions, temperature, or other aspects of the environment.
  • sensor(s) 112 may capture data corresponding to one or more characteristics of a target or identified object, such as a size, shape, profile, structure, or orientation of the object.
  • robotic system 100 may include sensor(s) 112 configured to receive information indicative of the state of robotic system 100, including sensor(s) 112 that may monitor the state of the various components of robotic system 100.
  • Sensor(s) 112 may measure activity of systems of robotic system 100 and receive information based on the operation of the various features of robotic system 100, such as the operation of an extendable arm, an end effector, or other mechanical or electrical features of robotic system 100.
  • the data provided by sensor(s) 112 may enable control system 118 to determine errors in operation as well as monitor overall operation of components of robotic system 100.
  • robotic system 100 may use force/torque sensors to measure load on various components of robotic system 100.
  • robotic system 100 may include one or more force/torque sensors on an arm or end effector to measure the load on the actuators that move one or more members of the arm or end effector.
  • the robotic system 100 may include a force/torque sensor at or near the wrist or end effector, but not at or near other joints of a robotic arm.
  • robotic system 100 may use one or more position sensors to sense the position of the actuators of the robotic system. For instance, such position sensors may sense states of extension, retraction, positioning, or rotation of the actuators on an arm or end effector.
  • sensor(s) 112 may include one or more velocity or acceleration sensors.
  • sensor(s) 112 may include an inertial measurement unit (IMU).
  • the IMU may sense velocity and acceleration in the world frame, with respect to the gravity vector. The velocity and acceleration sensed by the IMU may then be translated to that of robotic system 100 based on the location of the IMU in robotic system 100 and the kinematics of robotic system 100.
  • Robotic system 100 may include other types of sensors not explicitly discussed herein. Additionally or alternatively, the robotic system may use particular sensors for purposes not enumerated herein.
  • Robotic system 100 may also include one or more power source(s) 114 configured to supply power to various components of robotic system 100.
  • robotic system 100 may include a hydraulic system, electrical system, batteries, or other types of power systems.
  • robotic system 100 may include one or more batteries configured to provide charge to components of robotic system 100.
  • Some of mechanical components 110 or electrical components 116 may each connect to a different power source, may be powered by the same power source, or be powered by multiple power sources.
  • robotic system 100 may include a hydraulic system configured to provide power to mechanical components 110 using fluid power. Components of robotic system 100 may operate based on hydraulic fluid being transmitted throughout the hydraulic system to various hydraulic motors and hydraulic cylinders, for example. The hydraulic system may transfer hydraulic power by way of pressurized hydraulic fluid through tubes, flexible hoses, or other links between components of robotic system 100. Power source(s) 114 may charge using various types of charging, such as wired connections to an outside power source, wireless charging, combustion, or other examples.
  • Electrical components 116 may include various mechanisms capable of processing, transferring, or providing electrical charge or electric signals.
  • electrical components 116 may include electrical wires, circuitry, or wireless communication transmitters and receivers to enable operations of robotic system 100. Electrical components 116 may interwork with mechanical components 110 to enable robotic system 100 to perform various operations. Electrical components 116 may be configured to provide power from power source(s) 114 to the various mechanical components 110, for example. Further, robotic system 100 may include electric motors. Other examples of electrical components 116 may exist as well.
  • Robotic system 100 may include a body, which may connect to or house appendages and components of the robotic system.
  • the structure of the body may vary within examples and may further depend on particular operations that a given robot may have been designed to perform.
  • a robot developed to carry heavy loads may have a wide body that enables placement of the load.
  • a robot designed to operate in tight spaces may have a relatively tall, narrow body.
  • the body or the other components may be developed using various types of materials, such as metals or plastics.
  • a robot may have a body with a different structure or made of various types of materials.
  • the body or the other components may include or carry sensor(s) 112. These sensors may be positioned in various locations on the robotic system 100, such as on a body, a head, a neck, a base, a torso, an arm, or an end effector, among other examples.
  • Robotic system 100 may be configured to carry a load, such as a type of cargo that is to be transported.
  • the load may be placed by the robotic system 100 into a bin or other container attached to the robotic system 100.
  • the load may also represent external batteries or other types of power sources (e.g., solar panels) that the robotic system 100 may utilize. Carrying the load represents one example use for which the robotic system 100 may be configured, but the robotic system 100 may be configured to perform other operations as well.
  • robotic system 100 may include various types of appendages, wheels, end effectors, gripping devices and so on.
  • robotic system 100 may include a mobile base with wheels, treads, or some other form of locomotion.
  • robotic system 100 may include a robotic arm or some other form of robotic manipulator.
  • the base may be considered as one of mechanical components 110 and may include wheels, powered by one or more actuators, which allow for mobility of a robotic arm in addition to the rest of the body.
  • FIG. 2 illustrates a mobile robot, in accordance with example embodiments.
  • Figure 3 illustrates an exploded view of the mobile robot, in accordance with example embodiments.
  • a robot 200 may include a mobile base 202, a midsection 204, an arm 206, an end-of-arm system (EOAS) 208, a mast 210, a perception housing 212, and a perception suite 214.
  • the robot 200 may also include a compute box 216 stored within mobile base 202.
  • the mobile base 202 includes two drive wheels positioned at a front end of the robot 200 in order to provide locomotion to robot 200.
  • the mobile base 202 also includes additional casters (not shown) to facilitate motion of the mobile base 202 over a ground surface.
  • the mobile base 202 may have a modular architecture that allows compute box 216 to be easily removed. Compute box 216 may serve as a removable control system for robot 200 (rather than a mechanically integrated control system). After removing external shells, the compute box 216 can be easily removed and/or replaced.
  • the mobile base 202 may also be designed to allow for additional modularity. For example, the mobile base 202 may also be designed so that a power system, a battery, and/or external bumpers can all be easily removed and/or replaced.
  • the midsection 204 may be attached to the mobile base 202 at a front end of the mobile base 202.
  • the midsection 204 includes a mounting column which is fixed to the mobile base 202.
  • the midsection 204 additionally includes a rotational joint for arm 206. More specifically, the midsection 204 includes the first two degrees of freedom for arm 206 (a shoulder yaw J0 joint and a shoulder pitch J1 joint).
  • the mounting column and the shoulder yaw J0 joint may form a portion of a stacked tower at the front of mobile base 202.
  • the mounting column and the shoulder yaw J0 joint may be coaxial.
  • the length of the mounting column of midsection 204 may be chosen to provide the arm 206 with sufficient height to perform manipulation tasks at commonly encountered height levels (e.g., coffee table top and counter top levels).
  • the length of the mounting column of midsection 204 may also allow the shoulder pitch J1 joint to rotate the arm 206 over the mobile base 202 without contacting the mobile base 202.
  • the arm 206 may be a 7DOF robotic arm when connected to the midsection 204. As noted, the first two DOFs of the arm 206 may be included in the midsection 204. The remaining five DOFs may be included in a standalone section of the arm 206 as illustrated in Figures 2 and 3.
  • the arm 206 may be made up of plastic monolithic link structures. Inside the arm 206 may be housed standalone actuator modules, local motor drivers, and thru bore cabling.
  • the EOAS 208 may be an end effector at the end of arm 206. EOAS 208 may allow the robot 200 to manipulate objects in the environment. As shown in Figures 2 and 3, EOAS 208 may be a gripper, such as an underactuated pinch gripper.
  • the gripper may include one or more contact sensors such as force/torque sensors and/or non-contact sensors such as one or more cameras to facilitate object detection and gripper control.
  • EOAS 208 may also be a different type of gripper such as a suction gripper or a different type of tool such as a drill or a brush.
  • EOAS 208 may also be swappable or include swappable components such as gripper digits.
  • the mast 210 may be a relatively long, narrow component between the shoulder yaw J0 joint for arm 206 and perception housing 212.
  • the mast 210 may be part of the stacked tower at the front of mobile base 202.
  • the mast 210 may be fixed relative to the mobile base 202.
  • the mast 210 may be coaxial with the midsection 204.
  • the length of the mast 210 may facilitate perception by perception suite 214 of objects being manipulated by EOAS 208.
  • the mast 210 may have a length such that when the shoulder pitch J1 joint is rotated vertical up, a topmost point of a bicep of the arm 206 is approximately aligned with a top of the mast 210. The length of the mast 210 may then be sufficient to prevent a collision between the perception housing 212 and the arm 206 when the shoulder pitch J1 joint is rotated vertical up.
  • the mast 210 may include a 3D lidar sensor configured to collect depth information about the environment.
  • the 3D lidar sensor may be coupled to a carved-out portion of the mast 210 and fixed at a downward angle.
  • the lidar position may be optimized for localization, navigation, and for front cliff detection.
  • the perception housing 212 may include at least one sensor making up perception suite 214.
  • the perception housing 212 may be connected to a pan/tilt control to allow for reorienting of the perception housing 212 (e.g., to view objects being manipulated by EOAS 208).
  • the perception housing 212 may be a part of the stacked tower fixed to the mobile base 202. A rear portion of the perception housing 212 may be coaxial with the mast 210.
  • the perception suite 214 may include a suite of sensors configured to collect sensor data representative of the environment of the robot 200.
  • the perception suite 214 may include an infrared (IR)-assisted stereo depth sensor.
  • the perception suite 214 may additionally include a wide-angled red-green-blue (RGB) camera for human-robot interaction and context information.
  • the perception suite 214 may additionally include a high resolution RGB camera for object classification.
  • a face light ring surrounding the perception suite 214 may also be included for improved human-robot interaction and scene illumination.
  • the perception suite 214 may also include a projector configured to project images and/or video into the environment.
  • FIG. 4 illustrates a robotic arm, in accordance with example embodiments.
  • the robotic arm includes 7 DOFs: a shoulder yaw J0 joint, a shoulder pitch J1 joint, a bicep roll J2 joint, an elbow pitch J3 joint, a forearm roll J4 joint, a wrist pitch J5 joint, and a wrist roll J6 joint.
  • Each of the joints may be coupled to one or more actuators.
  • the actuators coupled to the joints may be operable to cause movement of links down the kinematic chain (as well as any end effector attached to the robot arm).
  • the shoulder yaw J0 joint allows the robot arm to rotate toward the front and toward the back of the robot.
  • One beneficial use of this motion is to allow the robot to pick up an object in front of the robot and quickly place the object on the rear section of the robot (as well as the reverse motion).
  • Another beneficial use of this motion is to quickly move the robot arm from a stowed configuration behind the robot to an active position in front of the robot (as well as the reverse motion).
  • the shoulder pitch J1 joint allows the robot to lift the robot arm (e.g., so that the bicep is up to perception suite level on the robot) and to lower the robot arm (e.g., so that the bicep is just above the mobile base).
  • This motion is beneficial to allow the robot to efficiently perform manipulation operations (e.g., top grasps and side grasps) at different target height levels in the environment.
  • the shoulder pitch J1 joint may be rotated to a vertical up position to allow the robot to easily manipulate objects on a table in the environment.
  • the shoulder pitch J1 joint may be rotated to a vertical down position to allow the robot to easily manipulate objects on a ground surface in the environment.
  • the bicep roll J2 joint allows the robot to rotate the bicep to move the elbow and forearm relative to the bicep. This motion may be particularly beneficial for facilitating a clear view of the EOAS by the robot’s perception suite.
  • the robot may kick out the elbow and forearm to improve line of sight to an object held in a gripper of the robot.
  • alternating pitch and roll joints (a shoulder pitch J1 joint, a bicep roll J2 joint, an elbow pitch J3 joint, a forearm roll J4 joint, a wrist pitch J5 joint, and a wrist roll J6 joint) are provided to improve the manipulability of the robotic arm.
  • the axes of the wrist pitch J5 joint, the wrist roll J6 joint, and the forearm roll J4 joint are intersecting for reduced arm motion to reorient objects.
  • the wrist roll J6 joint is provided instead of two pitch joints in the wrist in order to improve object rotation.
  • a robotic arm such as the one illustrated in Figure 4 may be capable of operating in a teach mode.
  • teach mode may be an operating mode of the robotic arm that allows a user to physically interact with and guide the robotic arm towards carrying out and recording various movements.
  • an external force is applied (e.g., by the user) to the robotic arm based on a teaching input that is intended to teach the robot regarding how to carry out a specific task.
  • the robotic arm may thus obtain data regarding how to carry out the specific task based on instructions and guidance from the user.
  • Such data may relate to a plurality of configurations of mechanical components, joint position data, velocity data, acceleration data, torque data, force data, and power data, among other possibilities.
  • the user may grasp onto the EOAS or wrist in some examples or onto any part of robotic arm in other examples, and provide an external force by physically moving robotic arm.
  • the user may guide the robotic arm towards grasping onto an object and then moving the object from a first location to a second location.
  • the robot may obtain and record data related to the movement such that the robotic arm may be configured to independently carry out the task at a future time during independent operation (e.g., when the robotic arm operates independently outside of teach mode).
  • external forces may also be applied by other entities in the physical workspace such as by other objects, machines, or robotic systems, among other possibilities.
  • Figure 5 shows diagram 500 illustrating a training phase 502 and an inference phase 504 of trained machine learning model(s) 532, in accordance with example embodiments.
  • Some machine learning techniques involve training one or more machine learning algorithms on an input set of training data to recognize patterns in the training data and provide output inferences and/or predictions about (patterns in) the training data.
  • the resulting trained machine learning algorithm can be referred to as a trained machine learning model.
  • Figure 5 shows training phase 502 where one or more machine learning algorithms 520 are being trained on training data 510 to become trained machine learning model(s) 532.
  • trained machine learning model(s) 532 can receive input data 530 and one or more inference/prediction requests 540 (perhaps as part of input data 530) and responsively provide as an output one or more inferences and/or prediction(s) 550.
  • trained machine learning model(s) 532 can include one or more models of one or more machine learning algorithms 520.
  • Machine learning algorithm(s) 520 may include, but are not limited to: an artificial neural network (e.g., a herein-described convolutional neural network, a recurrent neural network, a Bayesian network, a hidden Markov model, a Markov decision process, a logistic regression function, a support vector machine, a suitable statistical machine learning algorithm, and/or a heuristic machine learning system).
  • Machine learning algorithm(s) 520 may be supervised or unsupervised, and may implement any suitable combination of online and offline learning.
  • machine learning algorithm(s) 520 and/or trained machine learning model(s) 532 can be accelerated using on-device coprocessors, such as graphic processing units (GPUs), tensor processing units (TPUs), digital signal processors (DSPs), and/or application specific integrated circuits (ASICs).
  • on-device coprocessors can be used to speed up machine learning algorithm(s) 520 and/or trained machine learning model(s) 532.
  • trained machine learning model(s) 532 can be trained, reside and execute to provide inferences on a particular computing device, and/or otherwise can make inferences for the particular computing device.
  • machine learning algorithm(s) 520 can be trained by providing at least training data 510 as training input using unsupervised, supervised, semi-supervised, and/or reinforcement learning techniques.
  • Unsupervised learning involves providing a portion (or all) of training data 510 to machine learning algorithm(s) 520 and machine learning algorithm(s) 520 determining one or more output inferences based on the provided portion (or all) of training data 510.
  • Supervised learning involves providing a portion of training data 510 to machine learning algorithm(s) 520, with machine learning algorithm(s) 520 determining one or more output inferences based on the provided portion of training data 510, and the machine learning model may be refined based on correct results associated with training data 510.
  • supervised learning of machine learning algorithm(s) 520 can be governed by a set of rules and/or a set of labels for the training input, and the set of rules and/or set of labels may be used to correct inferences of machine learning algorithm(s) 520.
  • Semi-supervised learning involves having correct results for part, but not all, of training data 510.
  • in semi-supervised learning, supervised learning is used for the portion of training data 510 having correct results, and unsupervised learning is used for the portion of training data 510 not having correct results.
  • Reinforcement learning involves machine learning algorithm(s) 520 receiving a reward signal regarding a prior inference, where the reward signal can be a numerical value.
  • machine learning algorithm(s) 520 can output an inference and receive a reward signal in response, where machine learning algorithm(s) 520 are configured to try to maximize the numerical value of the reward signal.
  • reinforcement learning also utilizes a value function that provides a numerical value representing an expected total of the numerical values provided by the reward signal over time.
  • machine learning algorithm(s) 520 and/or trained machine learning model(s) 532 can be trained using other machine learning techniques, including but not limited to, incremental learning and curriculum learning.
  • machine learning algorithm(s) 520 and/or trained machine learning model(s) 532 can use transfer learning techniques.
  • transfer learning techniques can involve trained machine learning model(s) 532 being pre-trained on one set of data and additionally trained using training data 510.
  • machine learning algorithm(s) 520 can be pre-trained on data from one or more computing devices and a resulting trained machine learning model provided to computing device CD1, where CD1 is intended to execute the trained machine learning model during inference phase 504. Then, during training phase 502, the pre-trained machine learning model can be additionally trained using training data 510, where training data 510 can be derived from kernel and non-kernel data of computing device CD1.
  • This further training of the machine learning algorithm(s) 520 and/or the pre-trained machine learning model using training data 510 of CD1's data can be performed using either supervised or unsupervised learning.
  • training phase 502 can be completed.
  • the trained resulting machine learning model can be utilized as at least one of trained machine learning model(s) 532.
  • trained machine learning model(s) 532 can be provided to a computing device, if not already on the computing device.
  • Inference phase 504 can begin after trained machine learning model(s) 532 are provided to computing device CD1.
  • trained machine learning model(s) 532 can receive input data 530 and generate and output one or more corresponding inferences and/or prediction(s) 550 about input data 530.
  • input data 530 can be used as an input to trained machine learning model(s) 532 for providing corresponding inference(s) and/or prediction(s) 550 to kernel components and non-kernel components.
  • trained machine learning model(s) 532 can generate inference(s) and/or prediction(s) 550 in response to one or more inference/prediction requests 540.
  • trained machine learning model(s) 532 can be executed by a portion of other software.
  • trained machine learning model(s) 532 can be executed by an inference or prediction daemon to be readily available to provide inferences and/or predictions upon request.
  • Input data 530 can include data from computing device CD1 executing trained machine learning model(s) 532 and/or input data from one or more computing devices other than CD1. Input data 530 can include training data described herein. Other types of input data are possible as well.
  • Inference(s) and/or prediction(s) 550 can include task outputs, numerical values, and/or other output data produced by trained machine learning model(s) 532 operating on input data 530 (and training data 510).
  • trained machine learning model(s) 532 can use output inference(s) and/or prediction(s) 550 as input feedback 560.
  • Trained machine learning model(s) 532 can also rely on past inferences as inputs for generating new inferences.
  • the trained version of the neural network can be an example of trained machine learning model(s) 532.
  • an example of the one or more inference/prediction request(s) 540 can be a request to predict a classification for an input training example and a corresponding example of inferences and/or prediction(s) 550 can be a predicted classification output.
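  • As a concrete but hypothetical instance of the two phases for this classification use case (the disclosure does not name any particular model, library, or feature representation), a minimal scikit-learn sketch might look like:

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      # Training phase (502): fit a classifier on labeled feature vectors (placeholders here).
      X_train = np.random.rand(100, 16)           # stand-in image features
      y_train = np.random.randint(0, 3, 100)      # stand-in class labels, e.g. plate/mug/bottle
      model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

      # Inference phase (504): an inference/prediction request returns a predicted classification
      # and per-class confidence scores that could feed the per-class heat maps described above.
      x_new = np.random.rand(1, 16)
      predicted_class = model.predict(x_new)[0]
      confidences = model.predict_proba(x_new)[0]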
  • FIG. 6 is a block diagram of method 600, in accordance with example embodiments. Blocks 602, 604, and 606 may collectively be referred to as method 600.
  • method 600 of Figure 6 may be carried out by a control system, such as control system 118 of robotic system 100.
  • method 600 of Figure 6 may be carried out by a computing device or a server device remote from the robotic device.
  • method 600 may be carried out by one or more processors, such as processor(s) 102, executing program instructions, such as program instructions 106, stored in a data storage, such as data storage 104. Execution of method 600 may involve a robotic device, such as the robotic device illustrated and described with respect to Figures 1-4.
  • execution of method 600 may involve a computing device or a server device remote from the robotic device and robotic system 100.
  • Other robotic devices may also be used in the performance of method 600.
  • some or all of the blocks of method 600 may be performed by a control system remote from the robotic device.
  • different blocks of method 600 may be performed by different control systems, located on and/or remote from a robotic device.
  • each block of the block diagram may represent a module, a segment, or a portion of program code, which includes one or more instructions executable by one or more processors for implementing specific logical functions or steps in the process.
  • each block may represent circuitry that is wired to perform the specific logical functions in the process.
  • Alternative implementations are included within the scope of the example implementations of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrent or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art.
  • method 600 includes receiving, from at least one sensor on at least one robotic device, data representative of a plurality of locations in an environment.
  • Figure 7 depicts environment 700 in which robotic device 702 may be navigating.
  • Environment 700 may be a building and may include additional robotic devices that may also carry out the methods described herein.
  • Robotic device 702 may be a robotic device illustrated and described with respect to Figures 1-4, and may include at least one sensor, which may be a LIDAR sensor or a camera.
  • Robotic device 702 may be carrying out various tasks in environment 700, including, for example, clearing and/or cleaning tables, moving delivered packages, vacuuming specific areas of the ground, among other examples.
  • robotic device 702 may determine a navigation path through which to navigate.
  • robotic device 702 or another robotic device may first determine areas of interest where it may check for tables to be cleared, areas where packages may have been delivered, ground areas to be vacuumed, etc., and the robotic device 702 may then use those determined areas of interest to navigate through the environment and complete its tasks. The present method may facilitate such a determination.
  • robotic device 702 or another robotic device may collect data to determine target areas where it may be useful to navigate through for the purpose of completing one or more tasks.
  • robotic device 702 may capture an image of environment 700, e.g., of table 712 and objects 710, perhaps while passively navigating through the environment or while carrying out various tasks.
  • Robotic device 702 may also capture data at other areas of the environment.
  • robotic device 702 may capture data of objects 720, objects 730, objects 740, and/or objects 750.
  • Robotic device 702 may use one or more LIDAR sensors and/or one or more cameras to collect data of or at various locations in the environment.
  • method 600 includes determining a plurality of object detections for the data representative of the environment, where each object detection may identify an object class corresponding to an object detected at one of the plurality of locations in the environment.
  • the object class may represent one or more objects that are stationary for at least a period of time, including, for example, dishes, packages, spills, among other examples.
  • robotic device 702 may analyze the data captured in environment 700 to determine whether the data includes a plate, a mug, or some other object classification that may be of interest. Robotic device 702 may carry out this classification of objects while capturing data of environment 700 and/or after a particular period of time of navigating and collecting data in environment 700 (e.g., after 24 hours, a week, etc.).
  • Figure 8 depicts data collected of an environment, in accordance with example embodiments.
  • robotic device 702 of environment 700 may collect image 800 and/or image 850 while navigating in environment 700.
  • Images 800 and 850 may depict various objects in the environment.
  • image 800 may depict table 812, plate 814, and bottle 816, while image 850 may depict table 812.
  • a computing system may analyze images 800 and 850 to determine which objects are present in each image.
  • the computing system may use an object segmentation model to segment the image into various portions and classify each portion of the image into various object classes. For example, the computing system may segment image 800 into various portions and determine object classifications including table 812, bottle 816, and/or plate 814. And the computing system may segment image 850 into various portions and classify the various portions of the image to include table 812. Additionally or alternatively, the computing device may use a pre-trained machine learning model to classify objects that appear in images 800 and 850. The computing system may likewise determine that image 800 includes a bottle, plate, and table, and that image 850 includes table 812.
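  • One hypothetical way to obtain such per-image object classifications, assuming a COCO-pretrained detector from torchvision stands in for the segmentation/classification model (the disclosure does not name a specific model), is sketched below:

      import torch
      from torchvision.io import read_image
      from torchvision.models.detection import fasterrcnn_resnet50_fpn
      from torchvision.transforms.functional import convert_image_dtype

      model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

      def detect_objects(image_path, score_threshold=0.5):
          """Return (class_index, score) pairs for confident detections in one image."""
          image = convert_image_dtype(read_image(image_path), torch.float)
          with torch.no_grad():
              output = model([image])[0]            # dict with "boxes", "labels", "scores"
          keep = output["scores"] > score_threshold
          return list(zip(output["labels"][keep].tolist(),
                          output["scores"][keep].tolist()))

      # e.g. detect_objects("image_800.png") (hypothetical file) might return class indices
      # whose COCO labels correspond to "dining table" and "bottle", plus confidence scores.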
  • the computing system may store the various object classifications in a database. For example, the computing system may determine that data collected at a location includes object classifications corresponding to a table, bottle, and/or plate, and the computing system may store information indicating that that particular location contains objects corresponding to a table, bottle, and plate. In some examples, the computing system may store the data in the form of a 3-dimensional matrix, such that each two-dimensional slice of the matrix corresponds to an object classification and each cell within the two-dimensional slice of the matrix corresponds to a location and whether an object of the object classification is located in an area corresponding to the cell.
  • the computing system may store the data for each object classification in the form of a three-dimensional matrix, such that each cell within the three-dimensional matrix corresponds to a location, a height, and whether an object of the object classification is located in an area corresponding to the cell.
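  • A sketch of the two matrix layouts just described (grid and height-bin sizes are assumed; numpy), one keeping a two-dimensional slice per object class and the other adding a height dimension per class:

      import numpy as np

      CLASSES = ["table", "bottle", "plate"]
      ROWS, COLS, HEIGHT_BINS = 20, 20, 5

      # Layout 1: a 3-dimensional matrix whose 2-D slices are per-class occupancy over (row, col).
      class_slices = np.zeros((len(CLASSES), ROWS, COLS), dtype=bool)
      class_slices[CLASSES.index("plate"), 2, 0] = True        # a plate detected at cell (2, 0)

      # Layout 2: one 3-dimensional matrix per class over (row, col, height bin).
      per_class_volumes = {c: np.zeros((ROWS, COLS, HEIGHT_BINS), dtype=bool) for c in CLASSES}
      per_class_volumes["plate"][2, 0, 2] = True               # plate at cell (2, 0), mid height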
  • the computing system may determine and/or store object classifications based on a mission that a robotic device is or will be carrying out. For example, if the robotic device is carrying out a mission to collect used dishes from tables, the object classifications may include dishes, trays, mugs, among other objects. However, the object classifications may not include and/or the robotic device may not store unrelated information, including, for example, areas where packages were present. Additionally or alternatively, the computing system may determine and store object classifications for additional missions that the robotic device or another robotic device may carry out.
  • method 600 includes determining a heat map of a particular object class based on the plurality of object detections, where each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment.
  • Figure 9 depicts heat map 900 of an environment, in accordance with example embodiments. Heat map 900 as depicted by Figure 9 may be overlaid on a map of environment 700. Heat map 900 may include one or more cells, where each of the cells in heat map 900 may correspond to an area in the environment for which the robotic device determined an object detection or did not determine an object detection.
  • Heat map 900 includes cells 910, 920, 930, and 940, and each cell in heat map 900 may be associated with a measurement of how frequently a particular object class was detected at the area in the environment.
  • heat map 900 may include cells with various patterns, such that each pattern represents a different measurement of how frequently the particular object class was detected at the area in the environment and/or how confident the classifier is that the object belongs to a certain class.
  • cells 922, 930, and 940, as displayed with a striped pattern, may each be associated with a measurement of 1/9; cell 910, as displayed with a grid pattern, may be associated with a measurement of 2/9; and cell 920, as displayed with a fine grid pattern, may be associated with a measurement of 3/9.
  • the measurement may be based on a number of object detections of the object class at the area in the environment and a total number of object detections of the object class in the environment.
  • the measurement may represent a probability of observing the object class in the area represented by the cell, assuming that future occurrences of the object class will be in approximately the same locations as previous occurrences of the object class.
  • the measurement may be a ratio of the number of object detections of the object class at the area in the environment and the total number of object detections of the object class in the environment.
  • measurements for each cell in the heat map may be calculated using an equation of the following form: p(Z = z | c) = n(z, c) / N(c), where n(z, c) is the number of object detections of object class c at the area in the environment corresponding to cell z, and N(c) is the total number of object detections of object class c in the environment.
  • heat map 900 may be associated with nine total object detections, and cell 920 of heat map 900 may be associated with a measurement of 3/9, as three detections of plates were detected at the location corresponding to cell 920.
  • cell 922 may be associated with a measurement of 1/9, as one plate was detected at the location corresponding to cell 922.
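  • A minimal sketch of how such per-cell measurements could be computed from raw counts is shown below; the grid size and counts are illustrative assumptions arranged so that the values match the 3/9 and 1/9 measurements above.

```python
import numpy as np

# Illustrative per-cell detection counts for one object class (e.g., plates).
plate_counts = np.zeros((4, 4))
plate_counts[1, 1] = 3   # e.g., cell 920: three plate detections
plate_counts[1, 2] = 1   # e.g., cell 922: one plate detection
plate_counts[2, 0] = 2   # e.g., cell 910: two plate detections
plate_counts[3, 3] = 1   # remaining detections elsewhere bring the total to nine
plate_counts[0, 3] = 1
plate_counts[2, 3] = 1

total = plate_counts.sum()       # total detections of the class in the environment (9)
heat_map = plate_counts / total  # each cell holds n(cell) / N(total)
print(heat_map[1, 1])            # 3/9 ≈ 0.333
```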
  • these measurements may be collected over time and the measurement in the associated cell may be associated with how many times an object of the given class appeared in the cell. For example, plates may have appeared in cell 920 three times over the period of a day and eleven total plate detections in the environment may have been observed over the course of the same period. Therefore, cell 920 may be associated with a measurement of 3/11.
  • the computing device may update heat map 900 as additional object detections are collected. For example, the computing device may collect additional data at the location indicated by cell 920 and 922, and the computing device may determine that an additional plate was detected at the location represented by cell 920 but no additional plate was detected at the location represented by cell 922. The computing device may subsequently update the heat map such that cell 920 is associated with a measurement of 4/10 and that cell 922 is associated with a measurement of 1/10. The computing device may similarly update the measurements of the other cells in the heat map. Updating measurements of the heat map may be performed periodically (e.g., every day, after every morning and after every afternoon, every week, etc.) or as data is received.
  • the computing device may determine measurements of heat map 900 based on how confident the computing device is that the detected object belongs to the class. For example, the computing device may determine that 0.7 is the confidence measure that a plate was detected at the location represented by cell 920, and the computing device may associate the corresponding cell in heat map 900 with that confidence measure. At the location represented by cell 920, the computing device may also classify the detected object as an orange with a 0.2 confidence measurement and as a banana with a 0.1 confidence measurement. Further, the computing device may assign these measurements to each cell with which the detected object comes into contact.
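  • One possible (assumed) way to implement this confidence-weighted accumulation is sketched below: rather than adding one count per detection, every class heat map receives the classifier's confidence for each cell that the detected object touches. The class names and grid size are illustrative.

```python
import numpy as np

GRID = (4, 4)
counts_by_class = {name: np.zeros(GRID) for name in ("plate", "orange", "banana")}

def accumulate(class_scores, touched_cells):
    """class_scores: classifier confidences for one detected object,
    e.g. {"plate": 0.7, "orange": 0.2, "banana": 0.1}.
    touched_cells: grid cells the object's point cloud overlaps."""
    for class_name, score in class_scores.items():
        for (row, col) in touched_cells:
            counts_by_class[class_name][row, col] += score

# One detected object at the cell corresponding to cell 920.
accumulate({"plate": 0.7, "orange": 0.2, "banana": 0.1}, [(1, 1)])
```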
  • the computing device determining heat map 900 may store the measurement of the number of object detections of the object class at the area in the environment separately from the total number of object detections of the object class in the environment. For example, the computing device may store only the measurements of the number of object detections of the object class at the area in the environment and determine the total number of object detections of the object class at the time that the heat map is requested. In this manner, the computing system may facilitate the determination of heat maps, since the computing system may simply increase the count of the number of object detections at the location at which the object class was detected, rather than updating every measurement in the heat map.
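  • The separate storage of per-cell counts and the class total can be sketched as follows; only the touched cell is updated when a new detection arrives, and the ratio is formed when the heat map is actually requested. The class name and grid size are illustrative.

```python
import numpy as np

class ClassHeatMap:
    """Stores raw per-cell counts; probabilities are computed only on request."""

    def __init__(self, rows, cols):
        self.counts = np.zeros((rows, cols))

    def add_detection(self, row, col, weight=1.0):
        # Only the touched cell is updated, so each new detection is O(1).
        self.counts[row, col] += weight

    def as_probabilities(self):
        total = self.counts.sum()  # total detections of the class, derived on demand
        return self.counts / total if total > 0 else self.counts

plates = ClassHeatMap(4, 4)
plates.add_detection(1, 1)  # e.g., a plate observed at the area of cell 920
plates.add_detection(1, 2)  # e.g., a plate observed at the area of cell 922
print(plates.as_probabilities()[1, 1])  # 0.5 with only these two detections
```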
  • the computing device may determine a resolution of heat map 900 (e.g., the size of the cells in the heat map) based on a quantity of object detections and a quantity of locations in the environment. For example, the computing device may determine a heat map with a higher resolution (e.g., smaller cells) when more data and/or a higher quantity of object detections have been detected in the environment in a higher quantity of locations in the environment. And the computing device may determine a heat map with a lower resolution (e.g., larger cells) when less data has been collected of the environment in a lower quantity of locations in the environment.
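  • A simple heuristic for tying cell size to the amount of collected data might look like the sketch below; the thresholds and cell sizes are assumptions, not values taken from this disclosure.

```python
def choose_cell_size_m(num_detections, num_locations,
                       min_cell_m=0.25, max_cell_m=2.0):
    """Hypothetical heuristic: more detections over more locations -> smaller cells."""
    volume = num_detections * num_locations
    if volume > 10_000:
        return min_cell_m
    if volume > 1_000:
        return 0.5
    if volume > 100:
        return 1.0
    return max_cell_m
```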
  • the computing device may determine heat maps associated with various day parts, week parts, month parts, and/or other granularity that is a multiple of the time between heat map calculations.
  • the computing device may determine a heat map for a particular object class associated with object detections from data collected in the morning and the computing device may determine another heat map for the particular object class associated with object detections from data collected in the afternoon.
  • a robotic device may determine navigation based on the time of day. For example, if the robotic device is navigating in the morning, the robotic device may base navigation paths on the heat map associated with object detections collected in the morning. If the robotic device is navigating in the afternoon, the robotic device may base navigation paths on the heat map associated with object detections collected in the afternoon. Basing navigation on the time of day may result in more efficient navigation, as the presence of objects may depend on the time of day (e.g., package delivery may occur at a specific time of day, dishes and bottles may be left over from lunch around noon, etc.).
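  • A sketch of selecting a heat map by day part is shown below; the day-part boundaries and dictionary layout are illustrative assumptions.

```python
import numpy as np
from datetime import datetime

# Hypothetical per-day-part heat maps for one object class (placeholder arrays).
heat_maps_by_day_part = {
    "morning": np.zeros((4, 4)),
    "afternoon": np.zeros((4, 4)),
}

def heat_map_for_now(now=None):
    """Pick the heat map matching the current day part (simple split at noon)."""
    hour = (now or datetime.now()).hour
    return heat_maps_by_day_part["morning" if hour < 12 else "afternoon"]
```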
  • method 600 includes causing a robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
  • Figure 10 depicts path 1050 based on heat map 1000 of the environment.
  • Heat map 1000 may include cells 1010, 1020, 1022, 1030, and 1040, each of which may represent a measurement of how frequently an object class is detected at each respective location in the environment.
  • cells 1010 and 1020 may represent areas at which an object class was detected more frequently than areas represented by cells 1022, 1030, and 1040.
  • the object class may be dishes and robotic device 1002 may be carrying out a mission to clear dishes, which may be left over from a lunch break.
  • a computing device, perhaps of robotic device 1002, may determine path 1050 based on heat map 1000 such that robotic device 1002 may check for dishes more efficiently.
  • the computing device may determine an efficient path from the location of robotic device 1002 to a location where the object class was frequently detected in the past.
  • the computing device may determine target areas including the target area based on a plurality of respective measurements of how frequently the particular object class was detected at the target areas in the environment. Based on these target areas, the computing device may determine a path that includes one or more of the target areas, and the computing device may cause the robotic device to traverse the determined path.
  • the computing device may determine that the target areas are the two areas where dishes were observed most frequently, namely the areas represented by cells 1020 and 1010, whereas the areas represented by cells 1022, 1030, and 1040, where dishes were observed less frequently, are not target areas.
  • the computing device may thus determine a path from robotic device 1002 to the area represented by cell 1020 to the area represented by cell 1010, as shown by path 1050. And the computing device may cause robotic device 1002 to traverse through this path, perhaps to more efficiently determine areas where objects of the object class may be located.
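  • A non-limiting sketch of turning a heat map into a path is shown below: the cells with the highest measurements are chosen as target areas and visited greedily, nearest first. Straight-line cell distance stands in for real path cost; in practice the robot's planner would also account for obstacles.

```python
import numpy as np

def plan_path(heat_map, robot_cell, num_targets=2):
    """Visit the num_targets cells with the highest measurements, nearest first."""
    top = np.argsort(heat_map, axis=None)[::-1][:num_targets]
    targets = [tuple(int(v) for v in np.unravel_index(i, heat_map.shape)) for i in top]
    path, current = [], robot_cell
    while targets:
        nxt = min(targets, key=lambda t: np.hypot(t[0] - current[0], t[1] - current[1]))
        path.append(nxt)
        targets.remove(nxt)
        current = nxt
    return path

hm = np.zeros((4, 4))
hm[0, 1], hm[2, 2] = 0.4, 0.3            # e.g., areas like cells 1010 and 1020
print(plan_path(hm, robot_cell=(3, 0)))  # the nearer high-value area is visited first
```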
  • the computing device may also determine target areas based on how frequently a plurality of object classes was detected at the target area in the environment. For example, dishes and bottles may both need to be cleared from table tops. The computing device may thus determine target areas based on how frequently dishes and bottles were detected at each location in the environment, where dishes and bottles may each be associated with a heat map. In particular, the computing device may combine measurements from a heat map corresponding to dishes and measurements from a heat map corresponding to bottles to determine target areas.
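  • Combining the per-class maps could be as simple as a (possibly weighted) sum before target selection, as in this short sketch; the equal default weights are an assumption.

```python
def combine(heat_maps, weights=None):
    """Element-wise weighted sum of per-class heat maps (e.g., dishes and bottles)."""
    weights = weights or [1.0] * len(heat_maps)
    return sum(w * hm for w, hm in zip(weights, heat_maps))

# combined = combine([dish_heat_map, bottle_heat_map])  # then pick target areas from combined
```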
  • method 600 may be carried out by one or more computing devices. For example, determining the plurality of object detections and the heat map may be carried out by a server device.
  • One or more robotic devices operating in the same environment may collect data corresponding to the environment and send the data to the server device, and the server device may analyze the data to determine object detections and aggregate the data into a heat map. Additionally and/or alternatively, the one or more robotic devices may each collect and analyze the data and transmit the data to a server device.
  • the server device may aggregate the transmitted object detections.
  • a robotic device (perhaps one of the robotic devices that was collecting the data) may request a heat map of an object class, and the server device may transmit the heat map of the requested object class.
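  • The division of labor between robots and a server device might look like the following sketch, with robots reporting detections and later requesting the aggregated heat map; the class and method names are hypothetical and no particular transport protocol is implied.

```python
from collections import defaultdict
import numpy as np

class HeatMapServer:
    """Hypothetical aggregator: robots report detections and request heat maps."""

    def __init__(self, rows=4, cols=4):
        self._counts = defaultdict(lambda: np.zeros((rows, cols)))

    def report_detection(self, robot_id, class_name, cell, weight=1.0):
        # Detections from every robot operating in the environment share one grid.
        self._counts[class_name][cell] += weight

    def get_heat_map(self, class_name):
        grid = self._counts[class_name]
        total = grid.sum()
        return grid / total if total > 0 else grid

server = HeatMapServer()
server.report_detection("robot-1", "plate", (1, 1))
server.report_detection("robot-2", "plate", (1, 2))
plate_heat_map = server.get_heat_map("plate")  # a robot requests the plate heat map
```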
  • the measurement of how frequently the particular object class was detected at the area in the environment is based on a number of object detections of the particular object class at the area in the environment and a total number of object detections of the particular object class in the heat map.
  • the measurement of how frequently the particular object class was detected at the area in the environment is a ratio of the number of object detections of the particular object class at the area in the environment to the total number of object detections of the particular object class in the heat map.
  • the number of object detections of the particular object class at the area in the environment and the total number of object detections of the particular object class in the heat map are stored separately.
  • the data comprises a plurality of images, where determining a plurality of object detections is based on applying a pre-trained machine learning model to the plurality of images.
  • method 600 may further include determining a plurality of heat maps of a plurality of object classes, including the heat map of the particular object class, where causing the robotic device to navigate towards the target area in the environment corresponding to the cell in the heat map is based on an additional measurement of how frequently the plurality of object classes was detected at the target area in the environment.
  • method 600 may further include receiving, from the at least one sensor on the at least one robotic device, additional data representative of the environment, determining a plurality of additional object detections for the additional data, and updating the heat map of the particular object class based on the plurality of additional object detections.
  • determining the plurality of object detections for the data and determining the heat map of the particular object class are carried out by a server device.
  • determining the heat map of the particular object class includes transmitting the plurality of object detections for the data to a server device and receiving, from the server device, the heat map of the particular object class.
  • determining the heat map of the particular object class further includes updating, by a server device, an existing heat map of one or more particular object classes including the particular object class.
  • the measurement corresponding to each cell in the heat map represents a probability of observing the particular object class in the area represented by the cell.
  • a size of the area represented by each cell in the heat map is based on a quantity of the plurality of object detections and a quantity of the plurality of locations in the environment.
  • the object class is based on one or more tasks that the robotic device is carrying out.
  • the at least one sensor on the at least one robotic device comprises a LIDAR sensor or a camera.
  • the object class represents one or more objects that are stationary for at least a period of time.
  • method 600 further includes updating the heat map of the particular object class periodically with one or more object detections from the at least one robotic device.
  • determining the heat map of the particular class comprises determining a plurality of heat maps for the particular class during a plurality of day parts, where the plurality of heat maps includes a particular heat map associated with a day part, where causing the robotic device to navigate towards the target area in the environment during the day part is based on the particular heat map associated with the day part.
  • one or more cells of the heat map correspond to an area in the environment for which the at least one robotic device did not determine an object detection.
  • the environment is a building, where each of the at least one robotic device is operating in the building.
  • each of the at least one robotic device is operating in the same environment and has access to the heat map of the particular object class.
  • causing the robotic device to navigate towards the target area comprises determining a plurality of target areas including the target area based on a plurality of respective measurements of how frequently the particular object class was detected at the plurality of target areas in the environment, determining a path including one or more of the target areas, and causing the robotic device to traverse the determined path.
  • a block that represents a processing of information may correspond to circuitry that can be configured to perform the specific logical functions of a herein-described method or technique.
  • a block that represents a processing of information may correspond to a module, a segment, or a portion of program code (including related data).
  • the program code may include one or more instructions executable by a processor for implementing specific logical functions or actions in the method or technique.
  • the program code or related data may be stored on any type of computer readable medium such as a storage device including a disk or hard drive or other storage medium.
  • the computer readable medium may also include non-transitory computer readable media such as computer-readable media that stores data for short periods of time like register memory, processor cache, and random access memory (RAM).
  • the computer readable media may also include non-transitory computer readable media that stores program code or data for longer periods of time, such as secondary or persistent long term storage, like read only memory (ROM), optical or magnetic disks, compact-disc read only memory (CD-ROM), for example.
  • the computer readable media may also be any other volatile or non-volatile storage systems.
  • a computer readable medium may be considered a computer readable storage medium, for example, or a tangible storage device.
  • a block that represents one or more information transmissions may correspond to information transmissions between software or hardware modules in the same physical device. However, other information transmissions may be between software modules or hardware modules in different physical devices.

Abstract

A method includes receiving data representative of a plurality of locations in an environment and determining a plurality of object detections for the data representative of the environment. Each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment. The method further includes determining a heat map of a particular object class based on the plurality of object detections. Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment. The method additionally includes causing a robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.

Description

Semantic Heat Map for Robot Object Search
CROSS-REFERENCE TO RELATED DISCLOSURE
[0001] This application claims priority to U.S. Provisional Patent Application No. 63/383,812, filed on November 15, 2022, which is incorporated herein by reference in its entirety.
BACKGROUND
[0002] As technology advances, various types of robotic devices are being created for performing a variety of functions that may assist users. Robotic devices may be used for applications involving material handling, transportation, welding, assembly, and dispensing, among others. Over time, the manner in which these robotic systems operate is becoming more intelligent, efficient, and intuitive. As robotic systems become increasingly prevalent in numerous aspects of modern life, it is desirable for robotic systems to be efficient. Therefore, a demand for efficient robotic systems has helped open up a field of innovation in actuators, movement, sensing techniques, as well as component design and assembly.
SUMMARY
[0003] Example embodiments involve a semantic heat map determination method for object searches. A robotic device may collect sensor data and determine object detections identifying object classes based on the objects that appear in the sensor data. Based on the object detections, the robotic device may determine a heat map of a particular object class that may identify areas in the environment where objects of the particular object class appear more frequently than other areas. The robotic device may navigate in the environment based on the heat map.
[0004] In an embodiment, a method includes receiving, from at least one sensor on at least one robotic device, data representative of a plurality of locations in an environment. The method also includes determining a plurality of object detections for the data representative of the environment. Each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment. The method further includes determining a heat map of a particular object class based on the plurality of object detections. Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment. The method additionally includes causing a robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment. [0005] In another embodiment, a robotic device includes at least one sensor and a control system. The control system is configured to receive, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment. The control system is also configured to determine a plurality of object detections for the data representative of the environment, where each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment. The control system is further configured to determine a heat map of a particular object class based on the plurality of object detections. Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment. The control system is additionally configured to cause the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
[0006] In a further embodiment, a non-transitory computer readable medium is provided which includes program instructions executable by at least one processor to cause the at least one processor to perform functions. The functions include receiving, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment. The functions further include determining a plurality of object detections for the data representative of the environment. Each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment. The functions also include determining a heat map of a particular object class based on the plurality of object detections. Each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment. The functions also include causing the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
[0007] In a further embodiment, a system is provided that includes means for receiving, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment. The system also includes means for determining a plurality of object detections for the data representative of the environment, where each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment. The system further includes means for determining a heat map of a particular object class based on the plurality of object detections, where each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment. The system additionally includes means for causing the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment. [0008] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the figures and the following detailed description and the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] Figure 1 illustrates a configuration of a robotic system, in accordance with example embodiments.
[0010] Figure 2 illustrates a mobile robot, in accordance with example embodiments.
[0011] Figure 3 illustrates an exploded view of a mobile robot, in accordance with example embodiments.
[0012] Figure 4 illustrates a robotic arm, in accordance with example embodiments.
[0013] Figure 5 is a diagram illustrating training and inference phases of a machine learning model, in accordance with example embodiments.
[0014] Figure 6 is a block diagram of a method, in accordance with example embodiments.
[0015] Figure 7 depicts an environment, in accordance with example embodiments.
[0016] Figure 8 depicts data representative of an environment, in accordance with example embodiments.
[0017] Figure 9 depicts a heat map of an environment, in accordance with example embodiments.
[0018] Figure 10 depicts a path based on a heat map of an environment, in accordance with example embodiments.
DETAILED DESCRIPTION
[0019] Example methods, devices, and systems are described herein. It should be understood that the words “example” and “exemplary” are used herein to mean “serving as an example, instance, or illustration.” Any embodiment or feature described herein as being an “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or features unless indicated as such. Other embodiments can be utilized, and other changes can be made, without departing from the scope of the subject matter presented herein. [0020] Thus, the example embodiments described herein are not meant to be limiting. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations.
[0021] Throughout this description, the articles “a” or “an” are used to introduce elements of the example embodiments. Any reference to “a” or “an” refers to “at least one,” and any reference to “the” refers to “the at least one,” unless otherwise specified, or unless the context clearly dictates otherwise. The intent of using the conjunction “or” within a described list of at least two terms is to indicate any of the listed terms or any combination of the listed terms.
[0022] The use of ordinal numbers such as “first,” “second,” “third” and so on is to distinguish respective elements rather than to denote a particular order of those elements. For purpose of this description, the terms “multiple” and “a plurality of” refer to “two or more” or “more than one.”
[0023] Further, unless context suggests otherwise, the features illustrated in each of the figures may be used in combination with one another. Thus, the figures should be generally viewed as component aspects of one or more overall embodiments, with the understanding that not all illustrated features are necessary for each embodiment. In the figures, similar symbols typically identify similar components, unless context dictates otherwise. Further, unless otherwise noted, figures are not drawn to scale and are used for illustrative purposes only. Moreover, the figures are representational only and not all components are shown. For example, additional structural or restraining components might not be shown.
[0024] Additionally, any enumeration of elements, blocks, or steps in this specification or the claims is for purposes of clarity. Thus, such enumeration should not be interpreted to require or imply that these elements, blocks, or steps adhere to a particular arrangement or are carried out in a particular order.
I. Overview
[0025] A robotic device or a fleet of robotic devices may be tasked with performing various tasks in an environment, such as vacuuming, clearing dishes, removing spills, organizing packages, among other missions. To perform these tasks, the robotic device may include one or more LIDAR and/or camera sensors, which the robotic device may use to determine a type of one or more objects in the environment. In particular, the robotic device may determine whether an object is relevant to its task and react accordingly. For example, the robotic device may determine that an image captured by a camera on the robotic device likely contains a package. In response, the robotic device may move the package to another, perhaps more centralized and convenient location. As another example, the robotic device may determine that an image captured by a camera on the robotic device likely contains dishes. In response, the robotic device may move the dishes to the sink.
[0026] A given environment may include one or more of these robotic devices, each carrying out one or more tasks. For example, the environment may be a large office building with many rooms and containing a robotic device tasked with moving packages, another robotic device tasked with clearing tables, and another robotic device tasked with vacuuming. In order for each robotic device to efficiently perform its assigned task, it may be important that the robotic device is navigating effectively in the environment. Issues may arise where the robotic devices do not have adequate information to navigate effectively in the environment. Each robotic device may have a general map of the building in which they are operating, but each robotic device may have little context for where the relevant objects are located. For example, a robotic device tasked with moving packages may not have information indicating where the packages are and a robotic device tasked with clearing tables may not have information indicating where the tables with dishes to be cleared are.
[0027] Provided herein are methods that may facilitate the determination of efficient navigation paths. As a robotic device operates in an environment, the robotic device may collect data of various locations in the environment and analyze the data for object detections indicating the presence of an object of a particular object class at the location in the environment. In particular, the robotic device may take an image at a location in the environment and determine whether the image contains an object of a particular object class. For example, the image may be of a plate on a table, and the robotic device may determine that the location at which or of which the image was taken contains a plate. The robotic device may repeat this process for additional locations in the environment, storing each object detection and the associated location.
[0028] From these object detections and the associated locations, the robotic device may determine a heat map, where each cell in the heat map represents an area in the environment and corresponds to a measurement of how frequently the object class was detected in that area in the environment and how confident the model is that the object belongs to each class. For example, the robotic device may determine that, for a given detected object, the confidence value associated with a classification of apple is 0.7, the confidence value associated with a classification of orange is 0.2, and the confidence value associated with a classification of banana is 0.1. The robotic device may then add each one of these scores to its corresponding class heat map for each cell that the point cloud of the object touches. If the robotic device detected a plate on the table such that the plate is associated with a confidence score of 0.3, where the plate touches three cells, then a cell representing the location of the table may be associated with a measurement of 0.3 and the total count of the heatmap may be 0.3*3 = 0.9. The measurement at a cell where a plate was detected and the total measurement of the heatmap may be stored separately but may be combined when determining where to navigate the robotic device. For example, assuming no other plates are detected in the environment, the actual measurement at the cell may be 0.3/0.9 = 1/3.
[0029] Later, the robotic device may use the heat map to determine a path through which to navigate. In particular, after having determined the heat map, the robotic device may analyze the heat map for target areas where object detections of a particular object class occur more frequently to determine which locations the robotic device may first navigate to so that the robotic device may efficiently navigate in the environment. For example, the robotic device may determine a target location as being associated with a cell in the heat map, where the cell in the heat map is associated with a measurement indicating that object detections at that location occur more frequently than any other location represented in the heat map. The robotic device may determine a path from its location to the target location, and the robotic device may navigate along the determined path. As another example, the robotic device may determine a plurality of target locations. Each target location may be associated with a cell in the heat map, and the cell may be associated with a measurement indicating that objects of the particular object class are detected more frequently than other cells. The robotic device may determine a path that includes each of the target locations, and the robotic device may navigate along the determined path.
[0030] In some examples, one or more additional robotic devices may be operating in the environment, and the additional robotic devices may also collect data and determine object detections from the data representative of the environment. The robotic devices may send the data to a server device, and the server device may determine the object detections. Additionally or alternatively, each robotic device may analyze the data that it collects to determine object detections and send the determined object detections to a server device to determine an aggregate heat map. When a robotic device operating in the environment sends a request for a heat map, perhaps a heat map associated with a particular object class, the server device may send the requested heat map to the robotic device and the robotic device may use the heat map to determine a path through which to navigate. Additionally and/or alternatively, the robotic device may request for a path through which to navigate in order to complete a task, and the server device may determine and send the requested path to the robotic device, causing the robotic device to navigate towards one or more target areas in the environment.
[0031] Further, the heat map may be associated with a particular day part. For example, robotic devices operating in an environment may collect data in the morning and in the afternoon. A computing system may determine a heat map associated with the data collected in the morning and a heat map associated with data collected in the afternoon. During navigation in the environment, a robotic device may base navigation on a heat map associated with the day part during which it is navigating. For example, a robotic device operating in the environment during the afternoon may base navigation on a heat map associated with the afternoon. Determining different heat maps for various day parts may be useful given that object locations may be based on the time of day. For example, if the robotic device is tasked with clearing cups off of tables, the majority of cups may be located on desks in the mornings and in the cafeteria area around lunch time.
[0032] A computing device may determine and/or update a heat map periodically. For example, the computing system may receive data and/or object detections every evening, and the computing system may determine one or more new heat maps every evening based on the data and/or object detections. Additionally and/or alternatively, the computing system may update one or more existing heat maps with the data and/or object detections received every evening.
II. Example Robotic Systems
[0033] Figure 1 illustrates an example configuration of a robotic system that may be used in connection with the implementations described herein. Robotic system 100 may be configured to operate autonomously, semi-autonomously, or using directions provided by user(s). Robotic system 100 may be implemented in various forms, such as a robotic arm, industrial robot, or some other arrangement. Some example implementations involve a robotic system 100 engineered to be low cost at scale and designed to support a variety of tasks. Robotic system 100 may be designed to be capable of operating around people. Robotic system 100 may also be optimized for machine learning. Throughout this description, robotic system 100 may also be referred to as a robot, robotic device, or mobile robot, among other designations.
[0034] As shown in Figure 1, robotic system 100 may include processor(s) 102, data storage 104, and controller(s) 108, which together may be part of control system 118. Robotic system 100 may also include sensor(s) 112, power source(s) 114, mechanical components 110, and electrical components 116. Nonetheless, robotic system 100 is shown for illustrative purposes, and may include more or fewer components. The various components of robotic system 100 may be connected in any manner, including wired or wireless connections. Further, in some examples, components of robotic system 100 may be distributed among multiple physical entities rather than a single physical entity. Other example illustrations of robotic system 100 may exist as well.
[0035] Processor(s) 102 may operate as one or more general-purpose hardware processors or special purpose hardware processors (e.g., digital signal processors, application specific integrated circuits, etc.). Processor(s) 102 may be configured to execute computer-readable program instructions 106, and manipulate data 107, both of which are stored in data storage 104. Processor(s) 102 may also directly or indirectly interact with other components of robotic system 100, such as sensor(s) 112, power source(s) 114, mechanical components 110, or electrical components 116.
[0036] Data storage 104 may be one or more types of hardware memory. For example, data storage 104 may include or take the form of one or more computer-readable storage media that can be read or accessed by processor(s) 102. The one or more computer-readable storage media can include volatile or non-volatile storage components, such as optical, magnetic, organic, or another type of memory or storage, which can be integrated in whole or in part with processor(s) 102. In some implementations, data storage 104 can be a single physical device. In other implementations, data storage 104 can be implemented using two or more physical devices, which may communicate with one another via wired or wireless communication. As noted previously, data storage 104 may include the computer-readable program instructions 106 and data 107. Data 107 may be any type of data, such as configuration data, sensor data, or diagnostic data, among other possibilities.
[0037] Controller 108 may include one or more electrical circuits, units of digital logic, computer chips, or microprocessors that are configured to (perhaps among other tasks), interface between any combination of mechanical components 110, sensor(s) 112, power source(s) 114, electrical components 116, control system 118, or a user of robotic system 100. In some implementations, controller 108 may be a purpose-built embedded device for performing specific operations with one or more subsystems of the robotic system 100.
[0038] Control system 118 may monitor and physically change the operating conditions of robotic system 100. In doing so, control system 118 may serve as a link between portions of robotic system 100, such as between mechanical components 110 or electrical components 116. In some instances, control system 118 may serve as an interface between robotic system 100 and another computing device. Further, control system 118 may serve as an interface between robotic system 100 and a user. In some instances, control system 118 may include various components for communicating with robotic system 100, including a joystick, buttons, or ports, etc. The example interfaces and communications noted above may be implemented via a wired or wireless connection, or both. Control system 118 may perform other operations for robotic system 100 as well.
[0039] During operation, control system 118 may communicate with other systems of robotic system 100 via wired or wireless connections, and may further be configured to communicate with one or more users of the robot. As one possible illustration, control system 118 may receive an input (e.g., from a user or from another robot) indicating an instruction to perform a requested task, such as to pick up and move an object from one location to another location. Based on this input, control system 118 may perform operations to cause the robotic system 100 to make a sequence of movements to perform the requested task. As another illustration, a control system may receive an input indicating an instruction to move to a requested location. In response, control system 118 (perhaps with the assistance of other components or systems) may determine a direction and speed to move robotic system 100 through an environment en route to the requested location.
[0040] Operations of control system 118 may be carried out by processor(s) 102. Alternatively, these operations may be carried out by controller(s) 108, or a combination of processor(s) 102 and controller(s) 108. In some implementations, control system 118 may partially or wholly reside on a device other than robotic system 100, and therefore may at least in part control robotic system 100 remotely.
[0041] Mechanical components 110 represent hardware of robotic system 100 that may enable robotic system 100 to perform physical operations. As a few examples, robotic system 100 may include one or more physical members, such as an arm, an end effector, a head, a neck, a torso, a base, and wheels. The physical members or other parts of robotic system 100 may further include actuators arranged to move the physical members in relation to one another. Robotic system 100 may also include one or more structured bodies for housing control system 118 or other components, and may further include other types of mechanical components. The particular mechanical components 110 used in a given robot may vary based on the design of the robot, and may also be based on the operations or tasks the robot may be configured to perform.
[0042] In some examples, mechanical components 110 may include one or more removable components. Robotic system 100 may be configured to add or remove such removable components, which may involve assistance from a user or another robot. For example, robotic system 100 may be configured with removable end effectors or digits that can be replaced or changed as needed or desired. In some implementations, robotic system 100 may include one or more removable or replaceable battery units, control systems, power systems, bumpers, or sensors. Other types of removable components may be included within some implementations. [0043] Robotic system 100 may include sensor(s) 112 arranged to sense aspects of robotic system 100. Sensor(s) 112 may include one or more force sensors, torque sensors, velocity sensors, acceleration sensors, position sensors, proximity sensors, motion sensors, location sensors, load sensors, temperature sensors, touch sensors, depth sensors, ultrasonic range sensors, infrared sensors, object sensors, or cameras, among other possibilities. Within some examples, robotic system 100 may be configured to receive sensor data from sensors that are physically separated from the robot (e.g., sensors that are positioned on other robots or located within the environment in which the robot is operating).
[0044] Sensor(s) 112 may provide sensor data to processor(s) 102 (perhaps by way of data 107) to allow for interaction of robotic system 100 with its environment, as well as monitoring of the operation of robotic system 100. The sensor data may be used in evaluation of various factors for activation, movement, and deactivation of mechanical components 110 and electrical components 116 by control system 118. For example, sensor(s) 112 may capture data corresponding to the terrain of the environment or location of nearby objects, which may assist with environment recognition and navigation.
[0045] In some examples, sensor(s) 112 may include RADAR (e.g., for long-range object detection, distance determination, or speed determination), LIDAR (e.g., for short-range object detection, distance determination, or speed determination), SONAR (e.g., for underwater object detection, distance determination, or speed determination), VICON® (e.g., for motion capture), one or more cameras (e.g., stereoscopic cameras for 3D vision), a global positioning system (GPS) transceiver, or other sensors for capturing information of the environment in which robotic system 100 is operating. Sensor(s) 112 may monitor the environment in real time, and detect obstacles, elements of the terrain, weather conditions, temperature, or other aspects of the environment. In another example, sensor(s) 112 may capture data corresponding to one or more characteristics of a target or identified object, such as a size, shape, profile, structure, or orientation of the object.
[0046] Further, robotic system 100 may include sensor(s) 112 configured to receive information indicative of the state of robotic system 100, including sensor(s) 112 that may monitor the state of the various components of robotic system 100. Sensor(s) 112 may measure activity of systems of robotic system 100 and receive information based on the operation of the various features of robotic system 100, such as the operation of an extendable arm, an end effector, or other mechanical or electrical features of robotic system 100. The data provided by sensor(s) 112 may enable control system 118 to determine errors in operation as well as monitor overall operation of components of robotic system 100.
[0047] As an example, robotic system 100 may use force/torque sensors to measure load on various components of robotic system 100. In some implementations, robotic system 100 may include one or more force/torque sensors on an arm or end effector to measure the load on the actuators that move one or more members of the arm or end effector. In some examples, the robotic system 100 may include a force/torque sensor at or near the wrist or end effector, but not at or near other joints of a robotic arm. In further examples, robotic system 100 may use one or more position sensors to sense the position of the actuators of the robotic system. For instance, such position sensors may sense states of extension, retraction, positioning, or rotation of the actuators on an arm or end effector.
[0048] As another example, sensor(s) 112 may include one or more velocity or acceleration sensors. For instance, sensor(s) 112 may include an inertial measurement unit (IMU). The IMU may sense velocity and acceleration in the world frame, with respect to the gravity vector. The velocity and acceleration sensed by the IMU may then be translated to that of robotic system 100 based on the location of the IMU in robotic system 100 and the kinematics of robotic system 100.
[0049] Robotic system 100 may include other types of sensors not explicitly discussed herein. Additionally or alternatively, the robotic system may use particular sensors for purposes not enumerated herein.
[0050] Robotic system 100 may also include one or more power source(s) 114 configured to supply power to various components of robotic system 100. Among other possible power systems, robotic system 100 may include a hydraulic system, electrical system, batteries, or other types of power systems. As an example illustration, robotic system 100 may include one or more batteries configured to provide charge to components of robotic system 100. Some of mechanical components 110 or electrical components 116 may each connect to a different power source, may be powered by the same power source, or be powered by multiple power sources.
[0051] Any type of power source may be used to power robotic system 100, such as electrical power or a gasoline engine. Additionally or alternatively, robotic system 100 may include a hydraulic system configured to provide power to mechanical components 110 using fluid power. Components of robotic system 100 may operate based on hydraulic fluid being transmitted throughout the hydraulic system to various hydraulic motors and hydraulic cylinders, for example. The hydraulic system may transfer hydraulic power by way of pressurized hydraulic fluid through tubes, flexible hoses, or other links between components of robotic system 100. Power source(s) 114 may charge using various types of charging, such as wired connections to an outside power source, wireless charging, combustion, or other examples.
[0052] Electrical components 116 may include various mechanisms capable of processing, transferring, or providing electrical charge or electric signals. Among possible examples, electrical components 116 may include electrical wires, circuitry, or wireless communication transmitters and receivers to enable operations of robotic system 100. Electrical components 116 may interwork with mechanical components 110 to enable robotic system 100 to perform various operations. Electrical components 116 may be configured to provide power from power source(s) 114 to the various mechanical components 110, for example. Further, robotic system 100 may include electric motors. Other examples of electrical components 116 may exist as well.
[0053] Robotic system 100 may include a body, which may connect to or house appendages and components of the robotic system. As such, the structure of the body may vary within examples and may further depend on particular operations that a given robot may have been designed to perform. For example, a robot developed to carry heavy loads may have a wide body that enables placement of the load. Similarly, a robot designed to operate in tight spaces may have a relatively tall, narrow body. Further, the body or the other components may be developed using various types of materials, such as metals or plastics. Within other examples, a robot may have a body with a different structure or made of various types of materials.
[0054] The body or the other components may include or carry sensor(s) 112. These sensors may be positioned in various locations on the robotic system 100, such as on a body, a head, a neck, a base, a torso, an arm, or an end effector, among other examples.
[0055] Robotic system 100 may be configured to carry a load, such as a type of cargo that is to be transported. In some examples, the load may be placed by the robotic system 100 into a bin or other container attached to the robotic system 100. The load may also represent external batteries or other types of power sources (e.g., solar panels) that the robotic system 100 may utilize. Carrying the load represents one example use for which the robotic system 100 may be configured, but the robotic system 100 may be configured to perform other operations as well. [0056] As noted above, robotic system 100 may include various types of appendages, wheels, end effectors, gripping devices and so on. In some examples, robotic system 100 may include a mobile base with wheels, treads, or some other form of locomotion. Additionally, robotic system 100 may include a robotic arm or some other form of robotic manipulator. In the case of a mobile base, the base may be considered as one of mechanical components 110 and may include wheels, powered by one or more of actuators, which allow for mobility of a robotic arm in addition to the rest of the body.
[0057] Figure 2 illustrates a mobile robot, in accordance with example embodiments. Figure 3 illustrates an exploded view of the mobile robot, in accordance with example embodiments. More specifically, a robot 200 may include a mobile base 202, a midsection 204, an arm 206, an end-of-arm system (EOAS) 208, a mast 210, a perception housing 212, and a perception suite 214. The robot 200 may also include a compute box 216 stored within mobile base 202.
[0058] The mobile base 202 includes two drive wheels positioned at a front end of the robot 200 in order to provide locomotion to robot 200. The mobile base 202 also includes additional casters (not shown) to facilitate motion of the mobile base 202 over a ground surface. The mobile base 202 may have a modular architecture that allows compute box 216 to be easily removed. Compute box 216 may serve as a removable control system for robot 200 (rather than a mechanically integrated control system). After removing external shells, the compute box 216 can be easily removed and/or replaced. The mobile base 202 may also be designed to allow for additional modularity. For example, the mobile base 202 may also be designed so that a power system, a battery, and/or external bumpers can all be easily removed and/or replaced.
[0059] The midsection 204 may be attached to the mobile base 202 at a front end of the mobile base 202. The midsection 204 includes a mounting column which is fixed to the mobile base 202. The midsection 204 additionally includes a rotational joint for arm 206. More specifically, the midsection 204 includes the first two degrees of freedom for arm 206 (a shoulder yaw J0 joint and a shoulder pitch J1 joint). The mounting column and the shoulder yaw J0 joint may form a portion of a stacked tower at the front of mobile base 202. The mounting column and the shoulder yaw J0 joint may be coaxial. The length of the mounting column of midsection 204 may be chosen to provide the arm 206 with sufficient height to perform manipulation tasks at commonly encountered height levels (e.g., coffee table top and counter top levels). The length of the mounting column of midsection 204 may also allow the shoulder pitch J1 joint to rotate the arm 206 over the mobile base 202 without contacting the mobile base 202.
[0060] The arm 206 may be a 7DOF robotic arm when connected to the midsection 204. As noted, the first two DOFs of the arm 206 may be included in the midsection 204. The remaining five DOFs may be included in a standalone section of the arm 206 as illustrated in Figures 2 and 3. The arm 206 may be made up of plastic monolithic link structures. Inside the arm 206 may be housed standalone actuator modules, local motor drivers, and thru bore cabling. [0061] The EOAS 208 may be an end effector at the end of arm 206. EOAS 208 may allow the robot 200 to manipulate objects in the environment. As shown in Figures 2 and 3, EOAS 208 may be a gripper, such as an underactuated pinch gripper. The gripper may include one or more contact sensors such as force/torque sensors and/or non-contact sensors such as one or more cameras to facilitate object detection and gripper control. EOAS 208 may also be a different type of gripper such as a suction gripper or a different type of tool such as a drill or a brush. EOAS 208 may also be swappable or include swappable components such as gripper digits.
[0062] The mast 210 may be a relatively long, narrow component between the shoulder yaw J0 joint for arm 206 and perception housing 212. The mast 210 may be part of the stacked tower at the front of mobile base 202. The mast 210 may be fixed relative to the mobile base 202. The mast 210 may be coaxial with the midsection 204. The length of the mast 210 may facilitate perception by perception suite 214 of objects being manipulated by EOAS 208. The mast 210 may have a length such that when the shoulder pitch J1 joint is rotated vertical up, a topmost point of a bicep of the arm 206 is approximately aligned with a top of the mast 210. The length of the mast 210 may then be sufficient to prevent a collision between the perception housing 212 and the arm 206 when the shoulder pitch J1 joint is rotated vertical up.
[0063] As shown in Figures 2 and 3, the mast 210 may include a 3D lidar sensor configured to collect depth information about the environment. The 3D lidar sensor may be coupled to a carved-out portion of the mast 210 and fixed at a downward angle. The lidar position may be optimized for localization, navigation, and for front cliff detection.
[0064] The perception housing 212 may include at least one sensor making up perception suite 214. The perception housing 212 may be connected to a pan/tilt control to allow for reorienting of the perception housing 212 (e.g., to view objects being manipulated by EOAS 208). The perception housing 212 may be a part of the stacked tower fixed to the mobile base 202. A rear portion of the perception housing 212 may be coaxial with the mast 210.
[0065] The perception suite 214 may include a suite of sensors configured to collect sensor data representative of the environment of the robot 200. The perception suite 214 may include an infrared (IR)-assisted stereo depth sensor. The perception suite 214 may additionally include a wide-angle red-green-blue (RGB) camera for human-robot interaction and context information. The perception suite 214 may additionally include a high-resolution RGB camera for object classification. A face light ring surrounding the perception suite 214 may also be included for improved human-robot interaction and scene illumination. In some examples, the perception suite 214 may also include a projector configured to project images and/or video into the environment.
[0066] Figure 4 illustrates a robotic arm, in accordance with example embodiments. The robotic arm includes 7 DOFs: a shoulder yaw J0 joint, a shoulder pitch J1 joint, a bicep roll J2 joint, an elbow pitch J3 joint, a forearm roll J4 joint, a wrist pitch J5 joint, and a wrist roll J6 joint. Each of the joints may be coupled to one or more actuators. The actuators coupled to the joints may be operable to cause movement of links down the kinematic chain (as well as any end effector attached to the robot arm).
[0067] The shoulder yaw JO joint allows the robot arm to rotate toward the front and toward the back of the robot. One beneficial use of this motion is to allow the robot to pick up an object in front of the robot and quickly place the object on the rear section of the robot (as well as the reverse motion). Another beneficial use of this motion is to quickly move the robot arm from a stowed configuration behind the robot to an active position in front of the robot (as well as the reverse motion).
[0068] The shoulder pitch J1 joint allows the robot to lift the robot arm (e.g., so that the bicep is up to perception suite level on the robot) and to lower the robot arm (e.g., so that the bicep is just above the mobile base). This motion is beneficial to allow the robot to efficiently perform manipulation operations (e.g., top grasps and side grasps) at different target height levels in the environment. For instance, the shoulder pitch J1 joint may be rotated to a vertical up position to allow the robot to easily manipulate objects on a table in the environment. The shoulder pitch J1 joint may be rotated to a vertical down position to allow the robot to easily manipulate objects on a ground surface in the environment.
[0069] The bicep roll J2 joint allows the robot to rotate the bicep to move the elbow and forearm relative to the bicep. This motion may be particularly beneficial for facilitating a clear view of the EOAS by the robot’s perception suite. By rotating the bicep roll J2 joint, the robot may kick out the elbow and forearm to improve line of sight to an object held in a gripper of the robot.
[0070] Moving down the kinematic chain, alternating pitch and roll joints (a shoulder pitch J1 joint, a bicep roll J2 joint, an elbow pitch J3 joint, a forearm roll J4 joint, a wrist pitch J5 joint, and a wrist roll J6 joint) are provided to improve the manipulability of the robotic arm. The axes of the wrist pitch J5 joint, the wrist roll J6 joint, and the forearm roll J4 joint are intersecting for reduced arm motion to reorient objects. The wrist roll J6 joint is provided instead of two pitch joints in the wrist in order to improve object rotation.

[0071] In some examples, a robotic arm such as the one illustrated in Figure 4 may be capable of operating in a teach mode. In particular, teach mode may be an operating mode of the robotic arm that allows a user to physically interact with and guide the robotic arm towards carrying out and recording various movements. In a teaching mode, an external force is applied (e.g., by the user) to the robotic arm based on a teaching input that is intended to teach the robot regarding how to carry out a specific task. The robotic arm may thus obtain data regarding how to carry out the specific task based on instructions and guidance from the user. Such data may relate to a plurality of configurations of mechanical components, joint position data, velocity data, acceleration data, torque data, force data, and power data, among other possibilities.
[0072] During teach mode the user may grasp onto the EOAS or wrist in some examples or onto any part of robotic arm in other examples, and provide an external force by physically moving robotic arm. In particular, the user may guide the robotic arm towards grasping onto an object and then moving the object from a first location to a second location. As the user guides the robotic arm during teach mode, the robot may obtain and record data related to the movement such that the robotic arm may be configured to independently carry out the task at a future time during independent operation (e.g., when the robotic arm operates independently outside of teach mode). In some examples, external forces may also be applied by other entities in the physical workspace such as by other objects, machines, or robotic systems, among other possibilities.
[0073] Figure 5 shows diagram 500 illustrating a training phase 502 and an inference phase 504 of trained machine learning model(s) 532, in accordance with example embodiments. Some machine learning techniques involve training one or more machine learning algorithms on an input set of training data to recognize patterns in the training data and provide output inferences and/or predictions about (patterns in the) training data. The resulting trained machine learning algorithm can be referred to as a trained machine learning model. For example, Figure 5 shows training phase 502 where one or more machine learning algorithms 520 are being trained on training data 510 to become trained machine learning model(s) 532. Then, during inference phase 504, trained machine learning model(s) 532 can receive input data 530 and one or more inference/prediction requests 540 (perhaps as part of input data 530) and responsively provide as an output one or more inferences and/or prediction(s) 550.
[0074] As such, trained machine learning model(s) 532 can include one or more models of one or more machine learning algorithms 520. Machine learning algorithm(s) 520 may include, but are not limited to: an artificial neural network (e.g., a herein-described convolutional neural network, a recurrent neural network, a Bayesian network, a hidden Markov model, a Markov decision process, a logistic regression function, a support vector machine, a suitable statistical machine learning algorithm, and/or a heuristic machine learning system). Machine learning algorithm(s) 520 may be supervised or unsupervised, and may implement any suitable combination of online and offline learning.
[0075] In some examples, machine learning algorithm(s) 520 and/or trained machine learning model(s) 532 can be accelerated using on-device coprocessors, such as graphic processing units (GPUs), tensor processing units (TPUs), digital signal processors (DSPs), and/or application specific integrated circuits (ASICs). Such on-device coprocessors can be used to speed up machine learning algorithm(s) 520 and/or trained machine learning model(s) 532. In some examples, trained machine learning model(s) 532 can be trained, reside and execute to provide inferences on a particular computing device, and/or otherwise can make inferences for the particular computing device.
[0076] During training phase 502, machine learning algorithm(s) 520 can be trained by providing at least training data 510 as training input using unsupervised, supervised, semi-supervised, and/or reinforcement learning techniques. Unsupervised learning involves providing a portion (or all) of training data 510 to machine learning algorithm(s) 520 and machine learning algorithm(s) 520 determining one or more output inferences based on the provided portion (or all) of training data 510. Supervised learning involves providing a portion of training data 510 to machine learning algorithm(s) 520, with machine learning algorithm(s) 520 determining one or more output inferences based on the provided portion of training data 510, and the machine learning model may be refined based on correct results associated with training data 510. In some examples, supervised learning of machine learning algorithm(s) 520 can be governed by a set of rules and/or a set of labels for the training input, and the set of rules and/or set of labels may be used to correct inferences of machine learning algorithm(s) 520.
[0077] Semi-supervised learning involves having correct results for part, but not all, of training data 510. During semi-supervised learning, supervised learning is used for a portion of training data 510 having correct results, and unsupervised learning is used for a portion of training data 510 not having correct results. Reinforcement learning involves machine learning algorithm(s) 520 receiving a reward signal regarding a prior inference, where the reward signal can be a numerical value. During reinforcement learning, machine learning algorithm(s) 520 can output an inference and receive a reward signal in response, where machine learning algorithm(s) 520 are configured to try to maximize the numerical value of the reward signal. In some examples, reinforcement learning also utilizes a value function that provides a numerical value representing an expected total of the numerical values provided by the reward signal over time. In some examples, machine learning algorithm(s) 520 and/or trained machine learning model(s) 532 can be trained using other machine learning techniques, including but not limited to, incremental learning and curriculum learning.
[0078] In some examples, machine learning algorithm(s) 520 and/or trained machine learning model(s) 532 can use transfer learning techniques. For example, transfer learning techniques can involve trained machine learning model(s) 532 being pre-trained on one set of data and additionally trained using training data 510. More particularly, machine learning algorithm(s) 520 can be pre-trained on data from one or more computing devices and a resulting trained machine learning model provided to computing device CD1, where CD1 is intended to execute the trained machine learning model during inference phase 504. Then, during training phase 502, the pre-trained machine learning model can be additionally trained using training data 510, where training data 510 can be derived from kernel and non-kernel data of computing device CD1. This further training of the machine learning algorithm(s) 520 and/or the pre-trained machine learning model using training data 510 of CD1's data can be performed using either supervised or unsupervised learning. Once machine learning algorithm(s) 520 and/or the pre-trained machine learning model has been trained on at least training data 510, training phase 502 can be completed. The resulting trained machine learning model can be utilized as at least one of trained machine learning model(s) 532.
[0079] In particular, once training phase 502 has been completed, trained machine learning model(s) 532 can be provided to a computing device, if not already on the computing device. Inference phase 504 can begin after trained machine learning model(s) 532 are provided to computing device CD1.
[0080] During inference phase 504, trained machine learning model(s) 532 can receive input data 530 and generate and output one or more corresponding inferences and/or prediction(s) 550 about input data 530. As such, input data 530 can be used as an input to trained machine learning model(s) 532 for providing corresponding inference(s) and/or prediction(s) 550 to kernel components and non-kernel components. For example, trained machine learning model(s) 532 can generate inference(s) and/or prediction(s) 550 in response to one or more inference/prediction requests 540. In some examples, trained machine learning model(s) 532 can be executed by a portion of other software. For example, trained machine learning model(s) 532 can be executed by an inference or prediction daemon to be readily available to provide inferences and/or predictions upon request. Input data 530 can include data from computing device CD1 executing trained machine learning model(s) 532 and/or input data from one or more computing devices other than CD1.

[0081] Input data 530 can include training data described herein. Other types of input data are possible as well.
[0082] Inference(s) and/or prediction(s) 550 can include task outputs, numerical values, and/or other output data produced by trained machine learning model(s) 532 operating on input data 530 (and training data 510). In some examples, trained machine learning model(s) 532 can use output inference(s) and/or prediction(s) 550 as input feedback 560. Trained machine learning model(s) 532 can also rely on past inferences as inputs for generating new inferences.
[0083] After training, the trained version of the neural network can be an example of trained machine learning model(s) 532. In this approach, an example of the one or more inference / prediction request(s) 540 can be a request to predict a classification for an input training example and a corresponding example of inferences and/or prediction(s) 550 can be a predicted classification output.
[0084] Figure 6 is a block diagram of method 600, in accordance with example embodiments. Blocks 602, 604, 606, and 608 may collectively be referred to as method 600. In some examples, method 600 of Figure 6 may be carried out by a control system, such as control system 118 of robotic system 100. In further examples, method 600 of Figure 6 may be carried out by a computing device or a server device remote from the robotic device. In still further examples, method 600 may be carried out by one or more processors, such as processor(s) 102, executing program instructions, such as program instructions 106, stored in a data storage, such as data storage 104. Execution of method 600 may involve a robotic device, such as the robotic device illustrated and described with respect to Figures 1-4. Further, execution of method 600 may involve a computing device or a server device remote from the robotic device and robotic system 100. Other robotic devices may also be used in the performance of method 600. In further examples, some or all of the blocks of method 600 may be performed by a control system remote from the robotic device. In yet further examples, different blocks of method 600 may be performed by different control systems, located on and/or remote from a robotic device.

[0085] Those skilled in the art will understand that the block diagram of Figure 6 illustrates functionality and operation of certain implementations of the present disclosure. In this regard, each block of the block diagram may represent a module, a segment, or a portion of program code, which includes one or more instructions executable by one or more processors for implementing specific logical functions or steps in the process. The program code may be stored on any type of computer readable medium, for example, such as a storage device including a disk or hard drive.

[0086] In addition, each block may represent circuitry that is wired to perform the specific logical functions in the process. Alternative implementations are included within the scope of the example implementations of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art.
[0087] At block 602, method 600 includes receiving, from at least one sensor on at least one robotic device, data representative of a plurality of locations in an environment. For example, Figure 7 depicts environment 700 in which robotic device 702 may be navigating. Environment 700 may be a building and may include additional robotic devices that may also carry out the methods described herein. Robotic device 702 may be a robotic device illustrated and described with respect to Figures 1-4, and may include at least one sensor, which may be a LIDAR sensor or a camera. Robotic device 702 may be carrying out various tasks in environment 700, including, for example, clearing and/or cleaning tables, moving delivered packages, vacuuming specific areas of the ground, among other examples.
[0088] When executing one or more of these tasks, robotic device 702 may determine a navigation path through which to navigate. In some examples, robotic device 702 or another robotic device may first determine areas of interest where it may check for tables to be cleared, areas where packages may have been delivered, ground areas to be vacuumed, etc., and the robotic device 702 may then use those determined areas of interest to navigate through the environment and complete its tasks. The present method may facilitate such a determination.
[0089] In some examples, robotic device 702 or another robotic device may collect data to determine target areas through which it may be useful to navigate for the purpose of completing one or more tasks. For example, robotic device 702 may capture an image of environment 700, e.g., of table 712 and objects 710, perhaps while passively navigating through the environment or while carrying out various tasks. Robotic device 702 may also capture data at other areas of the environment. For example, robotic device 702 may capture data of objects 720, objects 730, objects 740, and/or objects 750. Robotic device 702 may use one or more LIDAR sensors and/or one or more cameras to collect data of or at various locations in the environment.
[0090] Referring back to Figure 6, at block 604, method 600 includes determining a plurality of object detections for the data representative of the environment, where each object detection may identify an object class corresponding to an object detected at one of the plurality of locations in the environment. The object class may represent one or more objects that are stationary for at least a period of time, including, for example, dishes, packages, spills, among other examples. For example, robotic device 702 may analyze the data captured in environment 700 to determine whether the data includes a plate, a mug, or some other object classification that may be of interest. Robotic device 702 may carry out this classification of objects while capturing data of environment 700 and/or after a particular period of time of navigating and collecting data in environment 700 (e.g., after 24 hours, a week, etc.).
[0091] Figure 8 depicts data collected of an environment, in accordance with example embodiments. For example, robotic device 702 of environment 700 may collect image 800 and/or image 850 while navigating in environment 700. Images 800 and 850 may depict various objects in the environment. For example, image 800 may depict table 812, plate 814, and bottle 816, while image 850 may depict table 812.
[0092] A computing system, perhaps robotic device 702, may analyze images 800 and 850 to determine which objects are present in each image. In some examples, the computing system may use an object segmentation model to segment the image into various portions and classify each portion of the image into various object classes. For example, the computing system may segment image 800 into various portions and determine object classifications including table 812, bottle 816, and/or plate 814. And the computing system may segment image 850 into various portions and classify the various portions of the image to include table 812. Additionally or alternatively, the computing device may use a pre-trained machine learning model to classify objects that appear in images 800 and 850. The computing system may likewise determine that image 800 includes a bottle, plate, and table, and that image 850 includes table 812.
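To make the classification step concrete, the following is a minimal sketch, not part of the disclosure, of how captured images might be turned into object detections carrying a class and a map location. The Detection container, the generic detector callable, and the image_to_world projection helper are hypothetical names assumed for illustration only.

from dataclasses import dataclass

@dataclass
class Detection:
    object_class: str     # e.g., "plate", "bottle", "table"
    location: tuple       # (x, y) position in map coordinates
    score: float          # classifier confidence in [0, 1]

def detect_objects(image, robot_pose, detector, image_to_world, min_score=0.5):
    # Run an assumed pre-trained detector on one image and return detections.
    detections = []
    for label, box, score in detector(image):      # assumed detector output format
        if score < min_score:                      # drop low-confidence detections
            continue
        x, y = image_to_world(box, robot_pose)     # project the detection into the map
        detections.append(Detection(label, (x, y), score))
    return detections

Any pre-trained segmentation or detection model could stand in for the detector callable here; only the (label, box, score) interface is assumed.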
[0093] The computing system may store the various object classifications in a database. For example, the computing system may determine that data collected at a location includes object classifications corresponding to a table, bottle, and/or plate, and the computing system may store information indicating that the particular location contains objects corresponding to a table, bottle, and plate. In some examples, the computing system may store the data in the form of a 3-dimensional matrix, such that each two-dimensional slice of the matrix corresponds to an object classification and each cell within the two-dimensional slice of the matrix corresponds to a location and whether an object of the object classification is located in an area corresponding to the cell. In further examples, the computing system may store the data for each object classification in the form of a three-dimensional matrix, such that each cell within the three-dimensional matrix corresponds to a location, a height, and whether an object of the object classification is located in an area corresponding to the cell.

[0094] In some examples, the computing system may determine and/or store object classifications based on a mission that a robotic device is or will be carrying out. For example, if the robotic device is carrying out a mission to collect used dishes from tables, the object classifications may include dishes, trays, mugs, among other objects. However, the object classifications may not include and/or the robotic device may not store unrelated information, including, for example, areas where packages were present. Additionally or alternatively, the computing system may determine and store object classifications for additional missions that the robotic device or another robotic device may carry out.
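The storage layout described in paragraph [0093] can be sketched as one count grid per object class, stacked into a 3-dimensional matrix. The class below is an illustrative data structure with assumed names and an assumed cell size, not the claimed implementation.

import numpy as np

class DetectionStore:
    # Assumed layout: counts[class_index, row, col] = detections of that class in that cell.

    def __init__(self, classes, rows, cols, cell_size=0.5):
        self.class_index = {c: i for i, c in enumerate(classes)}
        self.counts = np.zeros((len(classes), rows, cols), dtype=np.int64)
        self.cell_size = cell_size   # meters per cell, an assumed resolution

    def cell_of(self, location):
        x, y = location
        return int(y // self.cell_size), int(x // self.cell_size)

    def add(self, object_class, location):
        r, c = self.cell_of(location)
        self.counts[self.class_index[object_class], r, c] += 1

A mission-specific deployment, as in paragraph [0094], would simply construct the store with only the classes relevant to the mission (e.g., dishes, trays, and mugs).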
[0095] Referring back to Figure 6, at block 606, method 600 includes determining a heat map of a particular object class based on the plurality of object detections, where each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment. Figure 9 depicts heat map 900 of an environment, in accordance with example embodiments. Heat map 900 as depicted by Figure 9 may be overlaid on a map of environment 700. Heat map 900 may include one or more cells, where each of the cells in heat map 900 may correspond to an area in the environment for which the robotic device determined an object detection or did not determine an object detection.
[0096] Heat map 900 includes cells 910, 920, 922, 930, and 940, and each cell in heat map 900 may be associated with a measurement of how frequently a particular object class was detected at the area in the environment. As shown, heat map 900 may include cells with various patterns, such that each pattern represents a different measurement of how frequently the particular object class was detected at the area in the environment and/or how confident the classifier is that the object belongs to a certain class. For example, cells 922, 930, and 940, as displayed with a striped pattern, may be associated with a measurement of 1/9, cell 910 as displayed with a grid pattern may be associated with a measurement of 2/9, and cell 920 as displayed with a fine grid pattern may be associated with a measurement of 3/9. The measurement may be based on a number of object detections of the object class at the area in the environment and a total number of object detections of the object class in the environment. In practice, the measurement may represent a probability of observing the object class in the area represented by the cell, assuming that future occurrences of the object class will be in approximately the same locations as previous occurrences of the object class.
[0097] In particular, the measurement may be a ratio of the number of object detections of the object class at the area in the environment and the total number of object detections of the object class in the environment. This measurement may represent the probability of observing an object of the object class at the location represented by the cell, which may be represented as

p(l|c) = p(c|l) * p(l) / p(c),

where p(l|c) represents the probability of observing an object at location l, given that its class is c, p(c|l) represents the probability of observing class c given location l, p(l) represents the probability of observing an object at location l, and p(c) represents the probability of observing class c. In practice, measurements for each cell in the heat map may be calculated using the following equation:

p(l|c) = ((N_lc / N_l) * (N_l / N)) / (N_c / N) = N_lc / N_c,

where N_lc represents the number of observations of class c at location l, N_c represents the number of observations of class c, N_l represents the number of observations at location l, and N represents the number of all observations.
[0098] For example, heat map 900 may be associated with nine total object detections, and cell 920 of heat map 900 may be associated with a measurement of 3/9, as three detections of plates were detected at the location corresponding to cell 920. As another example, cell 922 may be associated with a measurement of 1/9, as one plate was detected at the location corresponding to cell 922. In some examples, these measurements may be collected over time and the measurement in the associated cell may be associated with how many times an object of the given class appeared in the cell. For example, plates may have appeared in cell 920 three times over the period of a day and eleven total plate detections in the environment may have been observed over the course of the same period. Therefore, cell 920 may be associated with a measurement of 3/11.
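Expressed as code, the per-cell measurement of paragraph [0097] reduces to dividing each cell's count by the class total. The helper below is an assumed illustration operating on a single class's two-dimensional count grid.

import numpy as np

def heat_map_from_counts(class_counts):
    # class_counts[r, c] = N_lc, the detections of one class in cell (r, c).
    total = class_counts.sum()                            # N_c: all detections of the class
    if total == 0:
        return np.zeros(class_counts.shape, dtype=float)  # nothing observed yet
    return class_counts.astype(float) / total             # each cell holds N_lc / N_c

With nine plate detections in total and three of them in one cell, that cell's entry is 3/9, matching the example above.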
[0099] The computing device may update heat map 900 as additional object detections are collected. For example, the computing device may collect additional data at the locations indicated by cells 920 and 922, and the computing device may determine that an additional plate was detected at the location represented by cell 920 but no additional plate was detected at the location represented by cell 922. The computing device may subsequently update the heat map such that cell 920 is associated with a measurement of 4/10 and that cell 922 is associated with a measurement of 1/10. The computing device may similarly update the measurements of the other cells in the heat map. Updating measurements of the heat map may be performed periodically (e.g., every day, after every morning and after every afternoon, every week, etc.) or as data is received.
[00100] In some examples, the computing device may determine measurements of heat map 900 based on how confident the computing device is that the detected object belongs to the class. For example, the computing device may determine that 0.7 is the confidence measure that a plate was detected at the location represented by cell 920, and the computing device may associate the corresponding cell in heat map 900 with that confidence measure. At the location represented by cell 920, the computing device may also classify the detected object as an orange with a 0.2 confidence measurement and as a banana with a 0.1 confidence measurement. Further, the computing device may assign these measurements to each cell with which the detected object comes into contact. For example, the computing device may detect a portion of the plate to be in a cell adjacent to cell 920, and the computing device may update the cell adjacent to cell 920 to also be associated with a measurement of 0.7. Therefore, both of these cells (cell 920 and the cell adjacent to cell 920) may be associated with an actual measurement of 0.7/1.4 = 1/2, assuming that no other plates were detected in the environment.
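A confidence-weighted variant, as described in this paragraph, might accumulate the classifier's confidence in every cell the detected object overlaps instead of a unit count. The short sketch below assumes a float grid per class and hypothetical helper names.

def add_weighted_detection(confidence_grid, touched_cells, confidence):
    # confidence_grid: 2D float array of accumulated confidence for one class.
    # touched_cells: every (row, col) the detected object overlaps.
    for r, c in touched_cells:
        confidence_grid[r, c] += confidence   # each overlapped cell is credited with the full confidence

Normalizing such a grid by its sum reproduces the 0.7/1.4 = 1/2 measurement from the example, assuming no other detections of the class.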
[0101] In some examples, the computing device determining heat map 900 may store the measurement of the number of object detections of the object class at the area in the environment separately from the total number of object detections of the object class in the environment. For example, the computing device may store only the measurements of the number of object detections of the object class at the area in the environment and determine the total number of object detections of the object class at the time that the heat map is requested. In this manner, the computing system may facilitate the determination of heat maps, since the computing system may simply increase the count of the number of object detections at the location at which the object class was detected, rather than updating every measurement in the heat map.
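One way to realize the separate storage of paragraph [0101] is to persist only per-cell counts and derive the class total at request time. The class below is a hedged sketch of that bookkeeping, not the disclosed implementation.

import numpy as np

class LazyHeatMap:
    # Stores raw per-cell counts; ratios are computed only when a heat map is requested.

    def __init__(self, rows, cols):
        self.cell_counts = np.zeros((rows, cols), dtype=np.int64)

    def record(self, row, col):
        self.cell_counts[row, col] += 1          # a single counter update per detection

    def measurements(self):
        total = self.cell_counts.sum()           # class total derived on demand
        if total == 0:
            return np.zeros(self.cell_counts.shape, dtype=float)
        return self.cell_counts / total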
[0102] The computing device may determine a resolution of heat map 900 (e.g., the size of the cells in the heat map) based on a quantity of object detections and a quantity of locations in the environment. For example, the computing device may determine a heat map with a higher resolution (e.g., smaller cells) when more data and/or a higher quantity of object detections have been detected in the environment in a higher quantity of locations in the environment. And the computing device may determine a heat map with a lower resolution (e.g., larger cells) when less data has been collected of the environment in a lower quantity of locations in the environment.
[0103] In some examples, the computing device may determine heat maps associated with various day parts, week parts, month parts, and/or other granularity that is a multiple of the time between heat map calculations. In some examples, the heat map of a given environment between time 0 and time T may be represented as heatmap(0, T) = heatmap(0, T-1) + heatmap(T-1, T). The computing device may carry out this computation periodically, perhaps hourly, daily, weekly, monthly, etc. The computing device may store the sequence of all the heatmaps that are computed in this manner, thereby allowing any heatmap between T1 and T2 to be extracted as heatmap(T1, T2) = heatmap(0, T2) - heatmap(0, T1) when T2 - T1 = k*P, where P is the period of heatmap recalculation, and k = 0, 1, 2, ...
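The periodic bookkeeping of paragraph [0103] can be illustrated with cumulative count grids snapshotted every period P; any window aligned to P then falls out as a difference of two snapshots. The class below is only a sketch under that assumption (heat maps kept as additive counts).

import numpy as np

class PeriodicHeatMaps:
    # Keeps heatmap(0, k*P) snapshots so heatmap(T1, T2) = heatmap(0, T2) - heatmap(0, T1).

    def __init__(self, rows, cols):
        self.cumulative = np.zeros((rows, cols), dtype=np.int64)
        self.snapshots = {0: self.cumulative.copy()}   # period index k -> heatmap(0, k*P)

    def record(self, row, col):
        self.cumulative[row, col] += 1

    def close_period(self, k):
        self.snapshots[k] = self.cumulative.copy()

    def window(self, k1, k2):
        return self.snapshots[k2] - self.snapshots[k1]

A morning heat map, for instance, would be the difference between the snapshot taken at noon and the one taken at the start of the day.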
[0104] For example, the computing device may determine a heat map for a particular object class associated with object detections from data collected in the morning and the computing device may determine another heat map for the particular object class associated with object detections from data collected in the afternoon. With these heat maps associated with various day parts, a robotic device may determine navigation based on the time of day. For example, if the robotic device is navigating in the morning, the robotic device may base navigation paths on the heat map associated with object detections collected in the morning. If the robotic device is navigating in the afternoon, the robotic device may base navigation paths on the heat map associated with object detections collected in the afternoon. Basing navigation on time of day may result in more efficient navigation, as the presence of objects may depend on the time of day (e.g., package delivery may occur at a specific time of day, dishes and bottles may be leftover from lunch around noon, etc.).
[0105] Referring back to Figure 6, at block 608, method 600 includes causing a robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment. Figure 10 depicts path 1050 based on heat map 1000 of the environment. Heat map 1000 may include cells 1010, 1020, 1022, 1030, and 1040, each of which may represent a measurement of how frequently an object class is detected at each respective location in the environment. For example, cells 1010 and 1020 may represent areas at which an object class was detected more frequently than areas represented by cells 1022, 1030, and 1040.
[0106] In some examples, the object class may be dishes and robotic device 1002 may be carrying out a mission to clear dishes, which may be leftover from a lunch break. A computing device, perhaps of robotic device 1002, may determine path 1050 based on heat map 1000 such that robotic device 1002 may check for dishes more efficiently. In particular, the computing device may determine an efficient path from the location of robotic device 1002 to a location where the object class was frequently detected in the past. The computing device may determine target areas including the target area based on a plurality of respective measurements of how frequently the particular object class was detected at the target areas in the environment. Based on these target areas, the computing device may determine a path that includes one or more of the target areas, and the computing device may cause the robotic device to traverse the determined path.

[0107] For example, the computing device may determine that the target areas are the two areas where dishes were observed most frequently, namely the areas represented by cells 1020 and 1010, whereas the areas represented by cells 1022, 1030, and 1040, where dishes were observed less frequently, are not target areas. The computing device may thus determine a path from robotic device 1002 to the area represented by cell 1020 to the area represented by cell 1010, as shown by path 1050. And the computing device may cause robotic device 1002 to traverse through this path, perhaps to more efficiently determine areas where objects of the object class may be located.
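As an illustration of the path determination in paragraphs [0106]-[0107], the sketch below picks the cells with the highest measurements as target areas and visits them nearest-first from the robot's current position. Both the number of targets and the greedy ordering are assumptions for illustration, not requirements of the disclosure.

import numpy as np

def plan_search_path(heat_map, cell_centers, robot_xy, num_targets=2):
    # heat_map: 2D measurements; cell_centers: (rows, cols, 2) map coordinates per cell.
    flat = heat_map.ravel()
    top = np.argsort(flat)[::-1][:num_targets]             # highest-measurement cells
    targets = [cell_centers.reshape(-1, 2)[i] for i in top]
    path, here = [], np.asarray(robot_xy, dtype=float)
    while targets:                                          # visit the nearest remaining target first
        nearest = min(range(len(targets)),
                      key=lambda i: np.linalg.norm(targets[i] - here))
        here = targets.pop(nearest)
        path.append(tuple(here))
    return path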
[0108] The computing device may also determine target areas based on how frequently a plurality of object classes was detected at the target area in the environment. For example, dishes and bottles may both need to be cleared from table tops. The computing device may thus determine target areas based on how frequently dishes and bottles were detected at each location in the environment, where dishes and bottles may each be associated with a heat map. In particular, the computing device may combine measurements from a heat map corresponding to dishes and measurements from a heat map corresponding to bottles to determine target areas.
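Combining heat maps for several object classes, as described above, can be as simple as a weighted sum per cell. The helper below is an assumed illustration; the equal default weights are an assumption rather than part of the disclosure.

import numpy as np

def combine_heat_maps(heat_maps, weights=None):
    # heat_maps: list of same-shaped 2D arrays, one per object class (e.g., dishes and bottles).
    grids = [np.asarray(h, dtype=float) for h in heat_maps]
    if weights is None:
        weights = [1.0] * len(grids)
    combined = np.zeros_like(grids[0])
    for grid, weight in zip(grids, weights):
        combined += weight * grid
    return combined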
[0109] In some examples, method 600 may be carried out by one or more computing devices. For example, determining the plurality of object detections and the heat map may be carried out by a server device. One or more robotic devices operating in the same environment may collect data corresponding to the environment and send the data to the server device, and the server device may analyze the data to determine object detections and aggregate the data into a heat map. Additionally and/or alternatively, the one or more robotic devices may each collect and analyze the data and transmit the data to a server device. The server device may aggregate the transmitted object detections. A robotic device (perhaps one of the robotic devices that was collecting the data) may request a heat map of an object class, and the server device may transmit the heat map of the requested object class.
[0110] In some examples, the measurement of how frequently the particular object class was detected at the area in the environment is based on a number of object detections of the particular object class at the area in the environment and a total number of object detections of the particular object class in the heat map.
[0111] In some examples, the measurement of how frequently the particular object class was detected at the area in the environment is a ratio of the number of object detections of the particular object class at the area in the environment to the total number of object detections of the particular object class in the heat map.

[0112] In some examples, the number of object detections of the particular object class at the area in the environment and the total number of object detections of the particular object class in the heat map are stored separately.
[0113] In some examples, the data comprises a plurality of images, where determining a plurality of object detections is based on applying a pre-trained machine learning model to the plurality of images.
[0114] In some examples, method 600 may further include determining a plurality of heat maps of a plurality of object classes, including the heat map of the particular object class, where causing the robotic device to navigate towards the target area in the environment corresponding to the cell in the heat map is based on an additional measurement of how frequently the plurality of object classes was detected at the target area in the environment.
[0115] In some examples, method 600 may further include receiving, from the at least one sensor on the at least one robotic device, additional data representative of the environment, determining a plurality of additional object detections for the additional data, and updating the heat map of the particular object class based on the plurality of additional object detections.
[0116] In some examples, determining the plurality of object detections for the data and determining the heat map of the particular object class are carried out by a server device.
[0117] In some examples, determining the heat map of the particular object class includes transmitting the plurality of object detections for the data to a server device and receiving, from the server device, the heat map of the particular object class.
[0118] In some examples, determining the heat map of the particular object class further includes updating, by a server device, an existing heat map of one or more particular object classes including the particular object class.
[0119] In some examples, the measurement corresponding to each cell in the heat map represents a probability of observing the particular object class in the area represented by the cell.
[0120] In some examples, a size of the area represented by each cell in the heat map is based on a quantity of the plurality of object detections and a quantity of the plurality of locations in the environment.
[0121] In some examples, the object class is based on one or more tasks that the robotic device is carrying out.
[0122] In some examples, the at least one sensor on the at least one robotic device comprises a LIDAR sensor or a camera.

[0123] In some examples, the object class represents one or more objects that are stationary for at least a period of time.
[0124] In some examples, method 600 further includes updating the heat map of the particular object class periodically with one or more object detections from the at least one robotic device.

[0125] In some examples, determining the heat map of the particular class comprises determining a plurality of heat maps for the particular class during a plurality of day parts, where the plurality of heat maps includes a particular heat map associated with a day part, where causing the robotic device to navigate towards the target area in the environment during the day part is based on the particular heat map associated with the day part.
[0126] In some examples, one or more cells of the heat map correspond to an area in the environment for which the at least one robotic device did not determine an object detection.
[0127] In some examples, the environment is a building, where each of the at least one robotic device is operating in the building.
[0128] In some examples, each of the at least one robotic device is operating in the same environment and has access to the heat map of the particular object class.
[0129] In some examples, causing the robotic device to navigate towards the target area comprises determining a plurality of target areas including the target area based on a plurality of respective measurements of how frequently the particular object class was detected at the plurality of target areas in the environment, determining a path including one or more of the target areas, and causing the robotic device to traverse the determined path.
III. Conclusion
[0130] The present disclosure is not to be limited in terms of the particular embodiments described in this application, which are intended as illustrations of various aspects. Many modifications and variations can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. Functionally equivalent methods and apparatuses within the scope of the disclosure, in addition to those enumerated herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the appended claims.
[0131] The above detailed description describes various features and functions of the disclosed systems, devices, and methods with reference to the accompanying figures. In the figures, similar symbols typically identify similar components, unless context dictates otherwise. The example embodiments described herein and in the figures are not meant to be limiting. Other embodiments can be utilized, and other changes can be made, without departing from the spirit or scope of the subject matter presented herein. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein.
[0132] A block that represents a processing of information may correspond to circuitry that can be configured to perform the specific logical functions of a herein-described method or technique. Alternatively or additionally, a block that represents a processing of information may correspond to a module, a segment, or a portion of program code (including related data). The program code may include one or more instructions executable by a processor for implementing specific logical functions or actions in the method or technique. The program code or related data may be stored on any type of computer readable medium such as a storage device including a disk or hard drive or other storage medium.
[0133] The computer readable medium may also include non-transitory computer readable media such as computer-readable media that stores data for short periods of time like register memory, processor cache, and random access memory (RAM). The computer readable media may also include non-transitory computer readable media that stores program code or data for longer periods of time, such as secondary or persistent long term storage, like read only memory (ROM), optical or magnetic disks, compact-disc read only memory (CD-ROM), for example. The computer readable media may also be any other volatile or non-volatile storage systems. A computer readable medium may be considered a computer readable storage medium, for example, or a tangible storage device.
[0134] Moreover, a block that represents one or more information transmissions may correspond to information transmissions between software or hardware modules in the same physical device. However, other information transmissions may be between software modules or hardware modules in different physical devices.
[0135] The particular arrangements shown in the figures should not be viewed as limiting. It should be understood that other embodiments can include more or less of each element shown in a given figure. Further, some of the illustrated elements can be combined or omitted. Yet further, an example embodiment can include elements that are not illustrated in the figures.
[0136] While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope being indicated by the following claims.

Claims

CLAIMS

What is claimed is:
1. A method comprising: receiving, from at least one sensor on at least one robotic device, data representative of a plurality of locations in an environment; determining a plurality of object detections for the data representative of the environment, wherein each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment; determining a heat map of a particular object class based on the plurality of object detections, wherein each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment; and causing a robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
2. The method of claim 1, wherein the measurement of how frequently the particular object class was detected at the area in the environment is based on a number of object detections of the particular object class at the area in the environment and a total number of object detections of the particular object class in the heat map.
3. The method of claim 2, wherein the measurement of how frequently the particular object class was detected at the area in the environment is a ratio of the number of object detections of the particular object class at the area in the environment to the total number of object detections of the particular object class in the heat map.
4. The method of claim 2, wherein the number of object detections of the particular object class at the area in the environment and the total number of object detections of the particular object class in the heat map are stored separately.
5. The method of claim 1, wherein the data comprises a plurality of images, wherein determining a plurality of object detections is based on applying a pre-trained machine learning model to the plurality of images.
6. The method of claim 1, further comprising: determining a plurality of heat maps of a plurality of object classes, including the heat map of the particular object class, wherein causing the robotic device to navigate towards the target area in the environment corresponding to the cell in the heat map is based on an additional measurement of how frequently the plurality of object classes was detected at the target area in the environment.
7. The method of claim 1, further comprising: receiving, from the at least one sensor on the at least one robotic device, additional data representative of the environment; determining a plurality of additional object detections for the additional data; and updating the heat map of the particular object class based on the plurality of additional object detections.
8. The method of claim 1, wherein determining the plurality of object detections for the data and determining the heat map of the particular object class are carried out by a server device.
9. The method of claim 1, wherein determining the heat map of the particular object class comprises: transmitting the plurality of object detections for the data to a server device; and receiving, from the server device, the heat map of the particular object class.
10. The method of claim 1, wherein determining the heat map of the particular object class further comprises: updating, by a server device, an existing heat map of one or more particular object classes including the particular object class.
11. The method of claim 1, wherein the measurement corresponding to each cell in the heat map represents a probability of observing the particular object class in the area represented by the cell.
12. The method of claim 1, wherein a size of the area represented by each cell in the heat map is based on a quantity of the plurality of object detections and a quantity of the plurality of locations in the environment.
13. The method of claim 1, wherein each cell in the heat map is also associated with a probability that the particular object class was detected at the area in the environment.
14. The method of claim 1, wherein the object class represents one or more objects that are stationary for at least a period of time.
15. The method of claim 1, further comprising: updating the heat map of the particular object class periodically with one or more object detections from the at least one robotic device.
16. The method of claim 1, wherein determining the heat map of the particular class comprises determining a plurality of heat maps for the particular class during a plurality of day parts, wherein the plurality of heat maps includes a particular heat map associated with a day part, wherein causing the robotic device to navigate towards the target area in the environment during the day part is based on the particular heat map associated with the day part.
17. The method of claim 1, wherein one or more cells of the heat map correspond to an area in the environment for which the at least one robotic device did not determine an object detection.
18. The method of claim 1, wherein causing the robotic device to navigate towards the target area comprises: determining a plurality of target areas including the target area based on a plurality of respective measurements of how frequently the particular object class was detected at the plurality of target areas in the environment; determining a path including one or more of the target areas; and causing the robotic device to traverse the determined path.
19. A robotic device comprising: at least one sensor; and a control system configured to: receive, from the at least one sensor on the robotic device, data representative of a plurality of locations in an environment; determine a plurality of object detections for the data representative of the environment, wherein each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment; determine a heat map of a particular object class based on the plurality of object detections, wherein each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment; and cause the robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
20. A non-transitory computer readable medium comprising program instructions executable by at least one processor to cause the at least one processor to perform functions comprising: receiving, from at least one sensor on at least one robotic device, data representative of a plurality of locations in an environment; determining a plurality of object detections for the data representative of the environment, wherein each object detection identifies an object class corresponding to an object detected at one of the plurality of locations in the environment; determining a heat map of a particular object class based on the plurality of object detections, wherein each cell in the heat map corresponds to an area in the environment and is associated with a measurement of how frequently the particular object class was detected at the area in the environment; and causing a robotic device to navigate towards a target area in the environment corresponding to a cell in the heat map based on a respective measurement of how frequently the particular object class was detected at the target area in the environment.
PCT/US2023/079815 2022-11-15 2023-11-15 Semantic heat map for robot object search WO2024107837A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263383812P 2022-11-15 2022-11-15
US63/383,812 2022-11-15

Publications (1)

Publication Number Publication Date
WO2024107837A1 true WO2024107837A1 (en) 2024-05-23

Family

ID=89223964

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/079815 WO2024107837A1 (en) 2022-11-15 2023-11-15 Semantic heat map for robot object search

Country Status (1)

Country Link
WO (1) WO2024107837A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9829333B1 (en) * 2016-09-13 2017-11-28 Amazon Technologies, Inc. Robotic traffic density based guidance

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9829333B1 (en) * 2016-09-13 2017-11-28 Amazon Technologies, Inc. Robotic traffic density based guidance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SIVANANTHAM VINU ET AL: "Adaptive Floor Cleaning Strategy by Human Density Surveillance Mapping with a Reconfigurable Multi-Purpose Service Robot", SENSORS, vol. 21, no. 9, 23 April 2021 (2021-04-23), CH, pages 2965, XP093122161, ISSN: 1424-8220, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8122887/pdf/sensors-21-02965.pdf> DOI: 10.3390/s21092965 *
VU LE ANH ET AL: "Social Density Monitoring Toward Selective Cleaning by Human Support Robot With 3D Based Perception System", IEEE ACCESS, IEEE, USA, vol. 9, 10 March 2021 (2021-03-10), pages 41407 - 41416, XP011844909, DOI: 10.1109/ACCESS.2021.3065125 *

Similar Documents

Publication Publication Date Title
CN114728417B (en) Method and apparatus for autonomous object learning by remote operator triggered robots
US12070859B2 (en) Robot base position planning
US20230247015A1 (en) Pixelwise Filterable Depth Maps for Robots
US11945106B2 (en) Shared dense network with robot task-specific heads
US11097414B1 (en) Monitoring of surface touch points for precision cleaning
US11766783B2 (en) Object association using machine learning models
US11745353B2 (en) Recovering material properties with active illumination and camera on a robot manipulator
EP4095486A1 (en) Systems and methods for navigating a robot using semantic mapping
US11436869B1 (en) Engagement detection and attention estimation for human-robot interaction
US20230084774A1 (en) Learning from Demonstration for Determining Robot Perception Motion
US11407117B1 (en) Robot centered augmented reality system
US20220168909A1 (en) Fusing a Static Large Field of View and High Fidelity Moveable Sensors for a Robot Platform
EP3842888A1 (en) Pixelwise filterable depth maps for robots
WO2024107837A1 (en) Semantic heat map for robot object search
US12090672B2 (en) Joint training of a narrow field of view sensor with a global map for broader context
US20240202969A1 (en) Depth-Based 3D Human Pose Detection and Tracking
US12085942B1 (en) Preventing regressions in navigation determinations using logged trajectories
WO2024137503A1 (en) Learning an ego state model through perceptual boosting

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23825327

Country of ref document: EP

Kind code of ref document: A1