US11348280B2 - Method and computer readable medium for pose estimation - Google Patents
- Publication number: US11348280B2
- Authority: US (United States)
- Prior art keywords: pose, confidence, boundary features, image, generating
- Legal status: Active, expires (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL › G06T7/00—Image analysis
  - G06T7/0002—Inspection of images, e.g. flaw detection
  - G06T7/50—Depth or shape recovery › G06T7/55—Depth or shape recovery from multiple images
  - G06T7/70—Determining position or orientation of objects or cameras
  - G06T7/73—Determining position or orientation of objects or cameras using feature-based methods › G06T7/75—Determining position or orientation of objects or cameras using feature-based methods involving models
  - G06T7/77—Determining position or orientation of objects or cameras using statistical methods
- G06T2207/00—Indexing scheme for image analysis or image enhancement › G06T2207/30—Subject of image; Context of image processing
  - G06T2207/30108—Industrial image inspection
  - G06T2207/30164—Workpiece; Machine component
Definitions
- the present disclosure relates to a method and computer readable medium for pose estimation.
- ODPE Object Detection and Pose Estimation
- depth maps sometimes provide a robust and reliable measurement to estimate the pose of an object
- this approach is limited when depth maps have missing depth information.
- depth maps often have missing depth information near high curvature regions such as edges and corners.
- objects with a sharp edge and flat views are frequently confused with flat surfaces that are bin boundaries.
- Edge based methods using RGB data can be negatively affected by shadows and illumination changes, as well as changes in contrast caused by differences in color of the object and a background.
- generating 2D image data from a 3D model is computationally expensive.
- An aspect of this disclosure is to refine a coarsely estimated initial pose by using both depth features and RGB sensor data.
- a first aspect of this disclosure includes a non-transitory computer readable medium that embodies instructions that cause one or more processors to perform a method of object detection, the method comprising (a) receiving an image containing an object, a first pose of the object in the image, and 3D boundary features of a model corresponding to the object; (b) computing a first pose confidence of the first pose based on the image, the 3D boundary features, and the first pose; (c) perturbing the first pose to obtain a second pose; (d) computing a second pose confidence of the second pose based on the image, the 3D boundary features, and the second pose; (e) determining if the second pose confidence is greater than the first pose confidence; and (f) outputting the second pose and the second pose confidence if the second pose confidence is greater than the first pose confidence.
- a second aspect of this disclosure further modifies the first aspect, further comprising refining the second pose if the second pose confidence is greater than the first pose confidence.
- a third aspect of this disclosure further modifies the second aspect, wherein refining the second pose includes perturbing the second pose based on how the first pose was perturbed in step (c).
- a fourth aspect of this disclosure further modifies the first aspect, wherein computing the first pose confidence in step (b) and the second pose confidence in step (d) includes projecting the 3D boundary features onto a 2D space of the image, using the pose, to obtain 2D boundary features, generating a gradient map from the image, for each of the 2D boundary features, estimating an edge score for an area on the gradient map, the area being around a location of the 2D boundary feature, and generating the confidence score based on the estimated edge score.
- a fifth aspect of this disclosure further modifies the first aspect, wherein computing the first pose confidence in step (b) and the second pose confidence in step (d) includes projecting the 3D boundary features onto a 2D space of the image, using the pose, to obtain 2D boundary features, generating a gradient map from the image, for each of the 2D boundary features, estimating an edge score for a plurality of areas having different scales on the gradient map, the plurality of areas being around a location of the 2D boundary feature, and generating the confidence score based on the estimated edge score.
- a sixth aspect of this disclosure further modifies the first aspect, wherein the pose is stochastically perturbed in step (c).
- a seventh aspect of this disclosure further modifies the first aspect, wherein the 3D boundary features include a set of 3D model contour feature points of a 3D model corresponding to the object, and the set of 3D model contour feature points are represented in a three-dimensional coordinate system; step (b) includes calculating a first set of 2D model contour points by projecting the set of 3D model contour feature points based on the first pose, the first set of 2D model contour points being represented in a two-dimensional coordinate system; and step (d) includes calculating a second set of 2D model contour points by projecting the set of 3D model contour feature points based on the second pose, the second set of 2D model contour points being represented in the two-dimensional coordinate system.
- An eighth aspect of this disclosure further modifies the seventh aspect, wherein the pose confidence is calculated based on an estimated edge hypothesis in each of steps (b) and (d).
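The perturb-and-score loop of steps (a) through (f) can be sketched as a simple hill climb. This is an illustrative skeleton, not the claimed implementation; `compute_confidence` and `perturb` are hypothetical callables standing in for steps (b)/(d) and (c).

```python
import random

def refine_pose(image, pose, boundary_features_3d, compute_confidence, perturb, iterations=100):
    """Hill-climbing refinement: keep a perturbed pose only if its confidence improves.

    compute_confidence(image, features_3d, pose) -> float   (steps (b)/(d))
    perturb(pose) -> pose                                   (step (c))
    """
    best_pose = pose
    best_conf = compute_confidence(image, boundary_features_3d, pose)      # step (b)
    for _ in range(iterations):
        candidate = perturb(best_pose)                                     # step (c)
        conf = compute_confidence(image, boundary_features_3d, candidate)  # step (d)
        if conf > best_conf:                                               # step (e)
            best_pose, best_conf = candidate, conf                         # step (f)
    return best_pose, best_conf
```

A toy 1-D confidence function is enough to exercise the loop; the real confidence is the edge-score computation described later in this document.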
- FIG. 1 is a perspective view of a robot system.
- FIG. 2 is a functional block diagram of a control system and a robot.
- FIG. 3 is a flowchart of an exemplary method of pose estimation.
- FIG. 4 is a flowchart of a method of generating template data.
- FIGS. 5A-5C are examples of steps for extracting boundary features.
- FIG. 6 is a flowchart of a method executed at runtime for pose estimation.
- FIG. 7 is an example of a projection of a set of boundary points on a captured image.
- FIG. 8 is a diagram illustrating how an edge hypothesis is calculated.
- FIG. 9 is a flowchart for refining a pose.
- FIG. 1 is a perspective view illustrating a robot controlled by a control device according to an embodiment of the invention.
- a robot system in one example of the invention includes robots 1 to 3 as illustrated in FIG. 1 .
- Each of the robots 1 to 3 is a six-axis robot including an end effector, and the different end effectors are attached to the robots 1 to 3 .
- the robot 1 is attached with an imaging unit 21 (for example, an RGB stereo- or monocular-camera and a depth sensor, or an RGB-D sensor),
- the robot 2 is attached with an illumination unit 22 like an illuminator, and the robot 3 is attached with a gripper 23 .
- the imaging unit 21 and the illumination unit 22 are referred to as an optical system.
- the robots 1 to 3 are controlled by a control device 40 .
- the control device 40 is communicably connected to the robots 1 to 3 via cables.
- a constituent element of the control device 40 may be provided in the robot 1 .
- the control device 40 may be formed of a plurality of devices (for example, a learning unit and a control unit are provided in different devices).
- the control device 40 can be communicably connected to a teaching device (not illustrated) via a cable or wireless communication.
- the teaching device may be a dedicated computer, and may be a general purpose computer in which a program for teaching the robot 1 is installed.
- the control device 40 and the teaching device may be integrally formed with each other.
- the robots 1 to 3 are single-arm robots in which various end effectors are attached to the arms, and, in the present embodiment, the configurations of the arms and axes of the robots 1 to 3 are equivalent to each other.
- in FIG. 1 , reference signs for explaining the arms and axes are added to the robot 3 .
- each of the robots 1 to 3 includes a base T, six arm members A 1 to A 6 , and six joints J 1 to J 6 .
- the base T is fixed to a work table.
- the base T and the six arm members A 1 to A 6 are connected to each other via the joints J 1 to J 6 .
- the arm members A 1 to A 6 and the end effectors are movable portions, and the movable portions are operated such that the robots 1 to 3 can perform various pieces of work.
- the joints J 2 , J 3 and J 5 are bent joints, and the joints J 1 , J 4 and J 6 are torsional joints.
- the arm member A 6 on the distal end side in the arm A is attached with a force sensor P and the end effector.
- Each of the robots 1 to 3 drives the arms of six axes such that the end effector is disposed at any position within a movable range, and can thus take any pose.
- the end effector provided in the robot 3 is the gripper 23 , and can grip a target object W.
- the end effector provided in the robot 2 is the illumination unit 22 , and can illuminate an irradiation region with light.
- the end effector provided in the robot 1 is the imaging unit 21 , and can capture an image within a visual field.
- a position which is relatively fixed with respect to the end effector of each of the robots 1 to 3 is defined as a tool center point (TCP).
- a position of the TCP is a reference position of the end effector, and a TCP coordinate system which has the TCP as the origin and is a three-dimensional orthogonal coordinate system relatively fixed with respect to the end effector is defined.
- the force sensor P is a six-axis force detector.
- the force sensor P detects magnitudes of forces which are parallel to three detection axes orthogonal to each other, and magnitudes of torques about the three detection axes, in a sensor coordinate system which is a three-dimensional orthogonal coordinate system having a point on the force sensor as the origin.
- the six-axis robot is exemplified, but various aspects of robots may be used, and aspects of the robots 1 to 3 may be different from each other. Any one or more of the joints J 1 to J 5 other than the joint J 6 may be provided with a force sensor as a force detector.
- the robot coordinate system is a three-dimensional orthogonal coordinate system defined by an x axis and a y axis orthogonal to each other on a horizontal plane, and a z axis whose positive direction is vertically upward (refer to FIG. 1 ).
- a negative direction of the z axis substantially matches the gravitational direction.
- a rotation angle about the x axis is indicated by Rx
- a rotation angle about the y axis is indicated by Ry
- a rotation angle about the z axis is indicated by Rz.
- Any position in the three-dimensional space can be expressed by positions in the x, y and z directions, and any pose in the three-dimensional space can be expressed by rotation angles in the Rx, Ry and Rz directions.
- the term pose indicates a position of an object, such as the target object W, in the x, y, z directions and an attitude of the object with respect to angles expressed in the Rx, Ry, and Rz directions.
- relationships among various coordinate systems are defined in advance, and coordinate values in the various coordinate systems can be converted into each other.
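As an illustration of this pose representation, a position (x, y, z) and rotation angles (Rx, Ry, Rz) can be composed into a 4x4 homogeneous transform. The Rz·Ry·Rx composition order below is an assumption, since the embodiment does not fix a rotation convention.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def pose_to_matrix(x, y, z, rx, ry, rz):
    """Compose a 4x4 homogeneous transform from a pose (x, y, z, Rx, Ry, Rz).
    Angles in radians; the Rz @ Ry @ Rx order is an illustrative assumption."""
    T = np.eye(4)
    T[:3, :3] = rot_z(rz) @ rot_y(ry) @ rot_x(rx)
    T[:3, 3] = [x, y, z]
    return T
```

Such a matrix lets a point be converted between coordinate systems (e.g., TCP to robot coordinates) by a single matrix multiplication.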
- the robot 1 is a general purpose robot which can perform various pieces of work through teaching, and includes motors M 1 to M 6 as actuators, and encoders E 1 to E 6 as sensors. Controlling the arms indicates controlling the motors M 1 to M 6 .
- the motors M 1 to M 6 and the encoders E 1 to E 6 are provided to respectively correspond to the joints J 1 to J 6 , and the encoders E 1 to E 6 respectively detect rotation angles of the motors M 1 to M 6 .
- the respective motors M 1 to M 6 are connected to power lines for supplying power, and each of the power lines is provided with an ammeter. Therefore, the control device 40 can measure a current supplied to each of the motors M 1 to M 6 .
- the control device 40 includes hardware resources such as a computer and various software resources stored in a storage unit 44 , and can execute a program.
- the control device 40 functions as a calculation unit 41 , a detection unit 42 , and a control unit 43 .
- the hardware resources may include a processor like a CPU, a memory like a RAM, a ROM, and the like, and may include an ASIC, and various configurations may be employed.
- the storage unit 44 is a computer readable medium such as a flash memory, a solid-state memory, or a magnetic memory.
- the detection unit 42 performs a process of detecting a target object W, and the control unit 43 drives the arms of the robots 1 to 3 .
- the detection unit 42 is connected to the imaging unit 21 and the illumination unit 22 forming an optical system 20 .
- the detection unit 42 controls the imaging unit 21 , and acquires an image captured by an imaging sensor, such as an RGB-D sensor, provided in the imaging unit 21 .
- the detection unit 42 controls the illumination unit 22 , and changes brightness of output light.
- the detection unit 42 performs a template matching process on the basis of the captured image, and performs a process of detecting a relatively coarse position (pose) of a target object W.
- the detection unit 42 performs the template matching process on the basis of the template data 44 c .
- Template data 44 c is a template for each of a plurality of poses stored in the storage unit 44 . Therefore, if a pose is correlated with an ID or the like with respect to the template data 44 c , a pose of a target object W viewed from the detection unit 42 can be specified by using the appropriate type of template data 44 c , as will be described in further detail below.
- a position at which the illumination unit 22 is disposed when a target object W is imaged is defined as a position of the illumination unit, and is included in the illumination unit parameter.
- the illumination unit 22 includes a mechanism capable of adjusting brightness, and a value of brightness of when a target object W is imaged is included in the illumination unit parameter.
- a position of the illumination unit may also be described in various methods, and, for example, a configuration in which a position of the TCP of the illumination unit 22 is described in the robot coordinate system may be employed.
- the detection unit 42 may operate the robot 1 or 2 by specifying a position of the imaging unit 21 or the illumination unit 22 on the basis of the optical parameters, but positions of when the robot 1 and the robot 2 are driven may be given by the operation parameters or the force control parameters.
- the control unit 43 includes the position control portion 43 a , a force control portion 43 b , a contact determination portion 43 c , and a servo 43 d .
- a correspondence relationship U 1 between a combination of rotation angles of the motors M 1 to M 6 and a position of the TCP in the robot coordinate system is stored in a storage medium, and a correspondence relationship U 2 between the coordinate systems is stored in a storage medium. Therefore, the control unit 43 or the calculation unit 41 can convert a vector in any coordinate system into a vector in another coordinate system on the basis of the correspondence relationship U 2 .
- control unit 43 or the calculation unit 41 may acquire acting forces to the robots 1 to 3 in the sensor coordinate system on the basis of outputs from the force sensor P, and may convert the acting forces into forces acting on positions of the TCP in the robot coordinate system.
- the control unit 43 or the calculation unit 41 may convert a target force expressed in the force control coordinate system into a target force at a position of the TCP in the robot coordinate system.
- the correspondence relationships U 1 and U 2 may be stored in the storage unit 44 .
- the storage unit 44 stores a robot program 44 b for controlling the robots 1 to 3 in addition to the parameters 44 a .
- the parameters 44 a and the robot program 44 b are generated through teaching and are stored in the storage unit 44 , but may be corrected by the calculation unit 41 .
- the robot program 44 b mainly indicates the sequence of work (an order of steps) performed by the robots 1 to 3 , and is described by a combination of predefined commands.
- the parameters 44 a are specific values which are required to realize each step, and are described as arguments of each command.
- the storage unit 44 also stores pose data 44 d for determining a pose of the object W, as will be described below in further detail.
- the parameters 44 a for controlling the robots 1 to 3 include the operation parameters and the force control parameters in addition to the optical parameters.
- the operation parameters are parameters related to operations of the robots 1 to 3 , and are parameters which are referred to during position control in the present embodiment. In other words, in the present embodiment, a series of work may be divided into a plurality of steps, and the parameters 44 a of when each step is performed are generated through teaching.
- the operation parameters include parameters indicating a start point and an end point in the plurality of steps.
- the start point and the end point may be defined in various coordinate systems, and, in the present embodiment, the start point and the end point of the TCP of a control target robot are defined in the robot coordinate system. In other words, a translation position and a rotation position are defined for each axis of the robot coordinate system.
- FIG. 3 shows an exemplary method for pose estimation according to this embodiment.
- the method for pose estimation is carried out by the control device 40 .
- the method uses an RGB image of the object W as input.
- the method outputs a refined pose and confidence for use in determining the position and attitude of the object W.
- the computer model could be, for example, a CAD model of the object W.
- the template data 44 c is generated during training based on the input of a computer model, such as a CAD model, which is provided in S 60 and input to S 100 for training, as shown in FIG. 3 .
- S 100 trains based on the computer model and then provides the data for use during pose improvement in S 200 .
- S 100 outputs information including boundary features as a result of the training in S 80 .
- S 70 images an object W and sends the image data, such as RGB image data, to the pose improvement of S 200 with an initial pose of the object W.
- S 200 refines the pose received from S 70 based on the image data received from S 70 and the trained data from S 100 .
- S 200 is an example of a runtime portion of this embodiment.
- S 100 is an example of a training portion of this embodiment. Additional details of the training process S 100 are shown in FIG. 4 .
- the control device 40 receives a computer model, for example a CAD model, which defines a shape of the object W.
- the computer model represents the shape of the object W in three dimensions (e.g., a 3D model).
- after receiving the computer model in S 101 , the control device 40 generates a rendered depth map for each of a plurality of views in S 103 , as shown for an exemplary view of the object W in FIG. 5A .
- Each view of the object W is from a unique angle.
- a multi-scale gradient map is created from the rendered depth map, and in this embodiment, the multi-scale gradient map includes edge maps at respective scales in a two-dimensional space as shown for an exemplary scale and a view of the object W in FIG. 5B .
- the corresponding depth map can be a multi-scale gradient map.
- the control device 40 learns to discriminate boundary locations BL, or edge feature locations, for each view based on the corresponding multi-scale gradient map as shown for an exemplary scale and a view of the object W in FIG. 5C .
- the boundary locations BL identify boundaries of the object W in each respective view.
- a set of boundary features BF is determined in S 107 .
- Each boundary feature BF is a point on a boundary, or an edge, location.
- the boundary features BF obtained for a view are then projected back to a three-dimensional coordinate system of the 3D model. These back-projected features having 3D coordinates may also be referred to as “3D boundary features” of the 3D model.
- the 3D boundary features are stored in a memory area together with, or associated with, their corresponding 2D coordinates (“2D boundary features”) and the corresponding view for run time use.
- a plurality of the boundary features BF are shown for an exemplary view of the object W in FIG. 5C .
- the number of boundary features BF can be arbitrary, or can be based on the resolution of the imaging sensor, the processing power available, the memory available, or other factors, as would be apparent in light of this disclosure.
- the boundary features BF are a set of model contour feature points of the computer model corresponding with the object W.
- the boundary features BF and the edge threshold for each view are examples of data stored in the parameters 44 a .
- an edge threshold for each boundary feature for each view can be calculated in S 107 .
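The training flow above (rendered depth map, multi-scale gradient maps, boundary feature extraction) can be sketched minimally as follows. A finite-difference gradient and a plain decimation pyramid are illustrative assumptions; the embodiment's actual rendering and learned discrimination are more involved.

```python
import numpy as np

def gradient_map(depth):
    """Finite-difference gradient magnitude of a rendered depth map."""
    gy, gx = np.gradient(depth.astype(float))
    return np.hypot(gx, gy)

def multi_scale_gradient_maps(depth, scales=(1, 2, 4)):
    """One edge map per scale; here a scale-k map is the gradient of a
    k-times decimated depth map (the exact pyramid scheme is an assumption)."""
    return {k: gradient_map(depth[::k, ::k]) for k in scales}

def boundary_features(depth, threshold):
    """Pick (u, v) locations whose gradient magnitude exceeds an edge threshold."""
    g = gradient_map(depth)
    vs, us = np.nonzero(g > threshold)
    return list(zip(us.tolist(), vs.tolist()))
```

The selected 2D locations would then be back-projected, using the rendered depth, into the 3D model coordinate system to form the stored 3D boundary features.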
- an image is captured by the optical system 20 , and an initial pose is calculated by the control device 40 based on the captured image.
- the detection unit 42 sequentially sets the template data 44 c for each of a plurality of poses as a processing target, and compares the template data 44 c with the captured image while changing a size of the template data 44 c .
- the detection unit 42 detects, as an image of the target object W, an image in which a difference between the template data 44 c and the image is equal to or less than a threshold value.
- the initial pose is defined in a 3D coordinate system of the camera or the rendering camera for rendering the 3D model onto the image plane of the camera.
- the detection unit 42 specifies, or derives, a pose of the target object W on the basis of the size of the appropriate template data 44 c and a relationship with a predefined coordinate system.
- a distance between the imaging unit 21 and the target object W in an optical axis direction is determined on the basis of the size of the template data 44 c
- a position of the target object W in a direction perpendicular to the optical axis is determined on the basis of the position of the target object W detected in the image.
- the detection unit 42 can specify a position of the target object W in the TCP coordinate system on the basis of a size of the template data 44 c , and a position where the template data 44 c is appropriate for the image.
- the detection unit 42 may specify a pose of the target object W in the TCP coordinate system on the basis of an ID of the appropriate template data 44 c .
- the detection unit 42 can specify a pose of the target object W in any coordinate system, for example the robot coordinate system, by using the correspondence relationship in the above coordinate system.
- the template matching process may be a process for specifying a pose of a target object W, and may employ various processes. For example, a difference between the template data 44 c and an image may be evaluated on the basis of a difference between grayscale values, and may be evaluated on the basis of a difference between features of the image (for example, gradients of the image).
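A grayscale-difference comparison of the kind described can be sketched as follows. Exhaustive sliding search and mean absolute difference are illustrative choices, not the embodiment's actual matcher.

```python
import numpy as np

def match_template(image, template, threshold):
    """Return (row, col, score) of the placement whose mean absolute grayscale
    difference is lowest and at or below the threshold, else None."""
    H, W = image.shape
    h, w = template.shape
    best = None
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            diff = np.abs(image[r:r + h, c:c + w] - template).mean()
            if diff <= threshold and (best is None or diff < best[2]):
                best = (r, c, diff)
    return best
```

In practice the difference may instead be computed on image features such as gradients, as the text notes, and the template is compared at multiple sizes to estimate distance along the optical axis.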
- the detection unit 42 performs the template matching process by referring to parameters.
- various parameters 44 a are stored in the storage unit 44 , and the parameters 44 a include parameters related to detection in the detection unit 42 .
- the optical parameters are parameters related to detection in the detection unit 42 .
- the operation parameters and the force control parameters are parameters related to control of the robots 1 to 3 .
- the optical parameters include an imaging unit parameter related to the imaging unit 21 , an illumination unit parameter related to the illumination unit 22 , and an image processing parameter related to image processing on an image of a target object W captured by the imaging unit 21 .
- the imaging unit 21 includes a mechanism capable of adjusting an exposure time and an aperture, and an exposure time and a value of the aperture for imaging a target object W are included in the imaging unit parameter.
- a position of the imaging unit may be described in various methods, and, for example, a configuration in which a position of the TCP of the imaging unit 21 is described in the robot coordinate system may be employed.
- the detection unit 42 sets an exposure time and an aperture of the imaging unit 21 by referring to the imaging unit parameter. As a result, the imaging unit 21 is brought into a state of performing imaging on the basis of the exposure time and the aperture.
- the detection unit 42 delivers a position of the illumination unit 22 to the position control portion 43 a by referring to the illumination unit parameter.
- the detection unit 42 sets brightness in the illumination unit 22 by referring to the illumination unit parameter. As a result, the illumination unit 22 is brought into a state of outputting light with the brightness.
- the detection unit 42 refers to the image processing parameter in a case where the template matching process is applied to an image captured by the imaging unit 21 .
- the image processing parameter includes an image processing order indicating a processing sequence of performing the template matching process.
- a threshold value in the template matching process is variable, and a threshold value of the current template matching is included in the image processing parameter.
- the detection unit 42 may perform various processes before comparing the template data 44 c with an image.
- a smoothing process and a sharpening process can include various processes, and the intensity of each thereof is included in the image processing parameter.
- the detection unit 42 determines an order of image processing (including whether or not the image processing is to be performed) on the basis of the image processing sequence, and performs image processing such as a smoothing process or a sharpening process in the order. In this case, the detection unit 42 performs image processing such as a smoothing process or a sharpening process according to the intensity described in the image processing parameter. In a case where comparison (comparison between the template data 44 c and the image) included in the image processing sequence is performed, the comparison is performed on the basis of a threshold value indicated by the image processing parameter.
- the template matching process roughly estimates an initial 3D pose based on an image captured by the camera, and in S 70 , provides the initial pose and the image captured by the camera to the pose improvement process of S 200 .
- the initial pose provided in S 70 is a relatively coarse pose.
- a detection algorithm is run by one or more processors in the detection unit 42 .
- the detection algorithm can perform a template matching algorithm where 2D feature templates that have been generated based on a 3D model at various poses are used to estimate a 3D pose of the object W in a captured image by minimizing reprojection errors, over a trained view range.
- FIG. 6 provides a pose improvement process S 200 according to this embodiment.
- the image captured by the optical system, the associated initial pose, and boundary features included in the parameters 44 a are received in S 201 .
- a pose confidence of the initial pose is calculated based on the 3D boundary features of the 3D model associated with a pose closest to the initial pose, and the image containing the object W captured by the optical system.
- the 3D boundary features are projected onto a 2D space, which is typically the same as the 2D coordinate system of the captured image, using the initial pose or the pose closest to the initial pose.
- the initial pose or the pose closest to the initial pose may be referred to as a first pose.
- an edge hypothesis, or edge score, is estimated for one or, preferably, more areas having different scales around a location (u, v).
- FIG. 8 illustrates example edge hypotheses at scale 4 and at scale 1 . The points that are on an actual edge will have the highest score. The score decreases as the location (u, v) moves farther from an actual edge point.
- the score for each location (u, v) of the boundary features of the 3D model can be computed as follows in S 203 .
- the 3D boundary features BF in the template T i are projected onto the image using the initial pose as shown in FIG. 7 .
- the pose closest to the initial pose may also be used to project the 3D boundary features onto the image.
- a variance map, or a gradient map is generated from the image corresponding to the initial pose.
- the edge strength g(k) of the input image at each projected boundary feature location (u, v) and a final edge score at location (u, v) are computed.
- an edge score e(u, v) for estimating a pose confidence can be calculated based on the formula:

  e(u, v) = Σ_{k=0}^{n} h(g(k)), where h(x) = 1 if x ≥ θ and h(x) = 0 otherwise

- GMA denotes a gradient map area at a given scale k.
- the gradient values at all locations within the area GMA centered at (u, v) are summed to obtain the edge strength g(k).
- e(u, v) is the edge score at pixel location (u, v).
- the variable n is the number of the areas GMA centered at (u, v) minus 1.
- the function h(x) is a step function to which the edge strength g(k) is applied as the value x, and θ is an edge threshold.
- the edge threshold θ indicates a minimum edge strength for a given area GMA centered at (u, v). Thus, when an area GMA is too far from any edge, the edge strengths g(k) are all less than the edge threshold θ, and the edge score e(u, v) is 0.
- any number of areas GMA and values of scale k can be used for calculating edge score e(u, v) at location (u, v).
- the area GMA can have any shape.
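Under one reading of the description above (square areas GMA and a hard step function h at threshold θ), the multi-scale edge score can be sketched as:

```python
import numpy as np

def edge_strength(grad_map, u, v, half):
    """g(k): sum of gradient values inside a square area GMA of half-width
    `half` centered at pixel (u, v), clipped at the image border."""
    v0, v1 = max(v - half, 0), min(v + half + 1, grad_map.shape[0])
    u0, u1 = max(u - half, 0), min(u + half + 1, grad_map.shape[1])
    return grad_map[v0:v1, u0:u1].sum()

def edge_score(grad_map, u, v, scales=(1, 2, 4), theta=1.0):
    """e(u, v) = sum over scales k of h(g(k)), with h a step function at the
    edge threshold theta. Square windows and this exact form are assumptions."""
    return sum(1 if edge_strength(grad_map, u, v, k) >= theta else 0
               for k in scales)
```

A point on an edge satisfies the threshold at every scale and scores highest; a point slightly off the edge only satisfies the larger areas; a point far from any edge scores 0, matching the behavior described above.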
- the pose confidence and the initial pose are stored as a stored pose confidence and a stored pose.
- the pose is further improved by stochastically perturbing the stored pose in S 207 to generate a perturbed pose (e.g., a second pose).
- Stochastically perturbing the stored pose includes, for example, any combination of slightly offsetting the pose, slightly rotating the pose, slightly enlarging the projected scale of the pose, and slightly reducing the projected scale of the pose. It is noted that a plurality of different perturbed poses (e.g., second poses) can be computed by iterations of step S 207 .
- a pose confidence is calculated for the perturbed pose based on the image, the perturbed pose and the 3D boundary features, and is an example of a perturbed pose confidence.
- the perturbed pose confidence is calculated in the same manner as the stored pose confidence in S 203 .
- the perturbed pose confidence is then compared to the stored pose confidence in S 211 . If the perturbed pose confidence is greater than or equal to the stored pose confidence, meaning that the perturbed pose is closer to the actual pose of the object W, the flow proceeds to refine the stored pose in S 213 . In addition, if the perturbed pose confidence is greater than or equal to the stored pose confidence, in S 211 , the perturbed pose can be an example of a pose refined to a relatively intermediate level.
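The coarse perturb-and-accept loop of S 207 to S 215 described above can be sketched as follows. The pose representation (a dict with offset, rotation, and projected scale), the Gaussian perturbation magnitudes, and the simple iteration-budget stop condition are illustrative assumptions for this sketch, not the patented implementation.

```python
import random

def perturb(pose, sigma_t=0.01, sigma_r=0.5, sigma_s=0.005):
    """Stochastically perturb a stored pose (S 207): small random offset,
    rotation, and projected-scale change. The dict keys and sigma values
    are hypothetical, chosen only for illustration."""
    return {
        'x': pose['x'] + random.gauss(0, sigma_t),
        'y': pose['y'] + random.gauss(0, sigma_t),
        'theta': pose['theta'] + random.gauss(0, sigma_r),
        'scale': pose['scale'] * (1 + random.gauss(0, sigma_s)),
    }

def coarse_refine(pose, confidence_fn, max_iters=100):
    """Keep a perturbed pose whenever its confidence is greater than or
    equal to the stored pose confidence (S 209-S 211), so the stored
    confidence never decreases."""
    stored_pose, stored_conf = pose, confidence_fn(pose)
    for _ in range(max_iters):  # stop condition sketched as an iteration budget (S 215)
        candidate = perturb(stored_pose)
        cand_conf = confidence_fn(candidate)
        if cand_conf >= stored_conf:
            stored_pose, stored_conf = candidate, cand_conf
    return stored_pose, stored_conf
```

Because candidates are accepted only when confidence does not decrease, the stored pose confidence is monotonically non-decreasing over iterations.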
- FIG. 9 shows a pose refining process of S 213 .
- in S 301, a pose difference between the stored pose and the perturbed pose is determined.
- the pose difference represents the stochastic perturbation carried out in S 207 .
- the perturbed pose and the perturbed pose confidence are stored as the stored pose and the stored pose confidence.
- the stored pose and the stored pose confidence can be overwritten with the perturbed pose and the perturbed pose confidence, respectively.
- the perturbed pose and the perturbed pose confidence can be stored and indicated as the stored pose and the stored pose confidence, respectively.
- the stored pose is perturbed based on the pose difference, and is another example of the perturbed pose.
- for example, if the pose difference indicates a shift in a first direction, the stored pose can be slightly shifted in the first direction, or slightly shifted in a second direction opposite to the first direction.
- if the pose difference indicates a rotation in a third direction, the pose can be slightly rotated in the third direction, or slightly rotated in a fourth direction opposite to the third direction.
- more generally, the pose difference can indicate any combination of slightly offsetting the pose, slightly rotating the pose, slightly enlarging the projected scale of the pose, and slightly reducing the projected scale of the pose, and the stored pose is perturbed in S 305 based on that pose difference.
- a pose confidence for the perturbed pose is calculated in S 307 as another example of a perturbed pose confidence.
- the perturbed pose confidence is calculated in the same manner as the stored pose confidence in S 203 .
- the perturbed pose confidence is then compared to the stored pose confidence in S 309. If the perturbed pose confidence is greater than or equal to the stored pose confidence, meaning that the perturbed pose is closer to the actual pose of the object W, the flow proceeds to S 311. It should be noted that a plurality of different perturbed poses (e.g., second poses) can be computed by iterations of step S 305.
- the perturbed pose and the perturbed pose confidence are stored as the stored pose and the stored pose confidence.
- the stored pose and the stored pose confidence can be overwritten with the perturbed pose and the perturbed pose confidence, respectively.
- the perturbed pose and the perturbed pose confidence can be stored and indicated as the stored pose and the stored pose confidence, respectively.
- the process continues to S 313 .
- the process proceeds to determining if a stop condition is satisfied in S 313 .
- a stop condition can be any combination of, for example, a number of iterations of S 311 , a number of consecutive times the perturbed pose confidence is less than the stored pose confidence as determined in S 309 , whether the stored perturbed pose confidence is greater than or equal to a threshold, and a total number of iterations of S 305 . If the stop condition is satisfied, the process returns to S 215 as shown in FIG. 6 . If the stop condition is not satisfied, the process returns to S 305 .
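The directional refinement loop of S 301 to S 313 described above can be sketched as follows: the last successful perturbation (the pose difference) is reapplied either in the same direction or in the opposite direction, and the result is kept when its confidence does not decrease. The pose dict representation and the iteration-budget stop condition are illustrative assumptions for this sketch.

```python
import random

def refine_with_difference(stored_pose, pose_diff, confidence_fn, max_iters=20):
    """Sketch of S 301-S 311: perturb the stored pose by the pose
    difference, either forward or in the opposite direction (S 305),
    and keep the result when its confidence is greater than or equal
    to the stored confidence (S 309-S 311). Pose dicts are hypothetical."""
    stored_conf = confidence_fn(stored_pose)
    for _ in range(max_iters):  # stop condition sketched as an iteration budget (S 313)
        sign = random.choice((1, -1))  # same direction, or the opposite direction
        candidate = {k: stored_pose[k] + sign * pose_diff[k] for k in stored_pose}
        cand_conf = confidence_fn(candidate)
        if cand_conf >= stored_conf:  # S 309: keep the pose closer to the actual pose
            stored_pose, stored_conf = candidate, cand_conf
    return stored_pose, stored_conf
```

A real stop condition could combine an iteration count with a confidence threshold or a count of consecutive rejections, as enumerated above.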
- the flow proceeds to determining if a stop condition is satisfied in S 215 .
- a stop condition can be any combination of, for example, a number of iterations of S 213 , a number of consecutive times the perturbed pose confidence is less than the stored pose confidence as determined in S 211 , whether the stored perturbed pose confidence is greater than or equal to a threshold, and a total number of iterations of S 207 . If the stop condition is satisfied, the process outputs the stored pose and the stored pose confidence in S 80 as shown in FIG. 3 . If the stop condition is not satisfied, the process returns to S 207 . In addition, when the stop condition is satisfied in S 215 , the output pose is an example of a pose refined to a relatively fine level.
- although the control device 40 is illustrated in FIG. 1 as a separate element from robots 1 to 3, the control device 40 can be a component of any combination of robots 1 to 3, or distributed among any combination of the robots 1 to 3.
- although the 3D model contour feature points are described as being extracted by the control device 40 during a training process, it would be understood in light of this disclosure that the 3D model contour feature points (e.g., boundary features BF) can be extracted by a remote computer and then transmitted to the control device 40.
- although FIG. 9 and S 301 to S 313 describe a process for refining the stored pose in S 213, alternative methods can be used to refine the stored pose as would be understood in light of this disclosure.
- S 213 in FIG. 6 and S 301 to S 313 in FIG. 9 may be omitted from the process in alternative embodiments.
- the present disclosure improves upon the related art by reducing computational costs.
- the pose of the object is refined without a significant increase in computational costs.
- Computational costs are also reduced during training, for example, by rendering a depth map (a 2.5D image) from a 3D model, which is computationally cheaper than rendering a 2D image from the 3D model.
- U.S. patent application Ser. No. 15/888,552 describes an exemplary control device, robot, and robot system upon which this disclosure can be implemented.
- the entire disclosure of U.S. patent application Ser. No. 15/888,552, filed Feb. 5, 2018 is expressly incorporated by reference herein.
- the entire disclosure of Japanese Patent Application No. 2017-019312, filed Feb. 6, 2017 is expressly incorporated by reference herein.
- although the method and computer readable medium for pose estimation are described as implemented using an exemplary control device, robot, and robot system, the method and computer readable medium for pose estimation can be implemented in alternative computing environments including a processor, memory, and an imaging device having an RGB-D image sensor.
- alternative embodiments are, by non-limiting example, a head mounted display, or a personal computer with an imaging device.
- the above-mentioned exemplary embodiments of the method and computer readable medium for pose estimation are not limited to the examples and descriptions herein, and may include additional features and modifications as would be within the ordinary skill in the art.
- the alternative or additional aspects of the exemplary embodiments may be combined as well.
- the foregoing disclosure of the exemplary embodiments has been provided for the purposes of illustration and description. This disclosure is not intended to be exhaustive or to be limited to the precise forms described above. Obviously, many modifications and variations will be apparent to skilled artisans. The embodiments were chosen and described in order to best explain principles and practical applications, thereby enabling others skilled in the art to understand this disclosure for various embodiments and with the various modifications as are suited to the particular use contemplated.
Description
Where g(k) is an edge strength which is based on gradient values in an area GMA (“gradient map area”) of a gradient map obtained from the captured (real) image, the size of the area being based on a scale k, and the location of the area being centered at location (u, v).
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/751,617 US11348280B2 (en) | 2020-01-24 | 2020-01-24 | Method and computer readable medium for pose estimation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/751,617 US11348280B2 (en) | 2020-01-24 | 2020-01-24 | Method and computer readable medium for pose estimation |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210233269A1 US20210233269A1 (en) | 2021-07-29 |
US11348280B2 true US11348280B2 (en) | 2022-05-31 |
Family
ID=76971117
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/751,617 Active 2040-08-26 US11348280B2 (en) | 2020-01-24 | 2020-01-24 | Method and computer readable medium for pose estimation |
Country Status (1)
Country | Link |
---|---|
US (1) | US11348280B2 (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040019598A1 (en) | 2002-05-17 | 2004-01-29 | Jing Huang | Binary tree for complex supervised learning |
US20160292889A1 (en) | 2015-04-02 | 2016-10-06 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method, and storage medium |
US20180222058A1 (en) | 2017-02-06 | 2018-08-09 | Seiko Epson Corporation | Control device, robot, and robot system |
US20190197196A1 (en) | 2017-12-26 | 2019-06-27 | Seiko Epson Corporation | Object detection and tracking |
US10366276B2 (en) | 2016-03-29 | 2019-07-30 | Seiko Epson Corporation | Information processing device and computer program |
US20200234466A1 (en) | 2019-01-22 | 2020-07-23 | Fyusion, Inc. | Object pose estimation in visual data |
-
2020
- 2020-01-24 US US16/751,617 patent/US11348280B2/en active Active
Non-Patent Citations (11)
Title |
---|
Brachmann et al.; "Learning 6D Object Pose Estimation using 3D Object Coordinates;" pp. 1-16; 2014. |
Bugaev, Bogdan, Anton Kryshchenko, and Roman Belov. "Combining 3D Model Contour Energy and Keypoints for Object Tracking." European Conference on Computer Vision. Springer, Cham, 2018. (Year: 2018). * |
Jan. 22, 2021 Office Action Issued in U.S. Patent Application No. 16/776,675. |
Johnson et al.; "Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes;" IEEE Transactions on Pattern Analysis and Machine Intelligence; vol. 21; No. 5; pp. 433-449; May 1999. |
Metropolis, Nicholas, et al. "Equation of state calculations by fast computing machines." The journal of chemical physics 21.6 (1953) : 1087-1092. (Year: 1953). * |
Rusu et al.; "Fast Point Feature Histograms (FPFH) for 3D Registration;" 2009 IEEE International Conference on Robotics and Automation (ICRA '09); 2009. |
Tan et al.; "6D Object Pose Estimation with Depth Images: A Seamless Approach for Robotic Interaction and Augmented Reality;" Technische Universitat München; pp. 1-4; Sep. 5, 2017. |
Tan et al.; "A Versatile Learning-based 3D Temporal Tracker: Scalable, Robust, Online;" International Conference on Computer Vision (ICCV); Santiago, Chile; pp. 693-701; Dec. 2015. |
Tan et al.; "Looking Beyond the Simple Scenarios: Combining Learners and Optimizers in 3D Temporal Tracking;" IEEE Transactions on Visualization and Computer Graphics; vol. 23; No. 11; pp. 2399-2409; Nov. 2017. |
Tan et al.; "Multi-Forest Tracker: A Chameleon in Tracking;" 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); pp. 1202-1209; 2014. |
Tombari et al.; "Unique Signatures of Histograms for Local Surface Description;" European Conference on Computer Vision; Springer; Berlin, Heidelberg; pp. 356-369; 2010. |
Also Published As
Publication number | Publication date |
---|---|
US20210233269A1 (en) | 2021-07-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1413850B1 (en) | Optical sensor for measuring position and orientation of an object in three dimensions | |
US9089971B2 (en) | Information processing apparatus, control method thereof and storage medium | |
JP5602392B2 (en) | Information processing apparatus, information processing method, and program | |
JP6271953B2 (en) | Image processing apparatus and image processing method | |
US7280687B2 (en) | Device for detecting position/orientation of object | |
US10043279B1 (en) | Robust detection and classification of body parts in a depth map | |
US11625842B2 (en) | Image processing apparatus and image processing method | |
US9914222B2 (en) | Information processing apparatus, control method thereof, and computer readable storage medium that calculate an accuracy of correspondence between a model feature and a measurement data feature and collate, based on the accuracy, a geometric model and an object in an image | |
US10740649B2 (en) | Object attitude detection device, control device, and robot system | |
JP6626338B2 (en) | Information processing apparatus, control method for information processing apparatus, and program | |
WO2022021156A1 (en) | Method and apparatus for robot to grab three-dimensional object | |
CN113269835A (en) | Industrial part pose identification method and device based on contour features and electronic equipment | |
JP2730457B2 (en) | Three-dimensional position and posture recognition method based on vision and three-dimensional position and posture recognition device based on vision | |
US11138752B2 (en) | Training a pose detection algorithm, and deriving an object pose using a trained pose detection algorithm | |
US11348280B2 (en) | Method and computer readable medium for pose estimation | |
JP2015007639A (en) | Information processing apparatus, information processing method and program | |
US10366278B2 (en) | Curvature-based face detector | |
JP2015132523A (en) | Position/attitude measurement apparatus, position/attitude measurement method, and program | |
WO2019093299A1 (en) | Position information acquisition device and robot control device provided with same | |
JP6719925B2 (en) | Information processing device, information processing method, and program | |
JPH07160881A (en) | Environment recognizing device | |
WO2023074235A1 (en) | Conveyance system | |
CN116091608B (en) | Positioning method and positioning device for underwater target, underwater equipment and storage medium | |
US20230267646A1 (en) | Method, System, And Computer Program For Recognizing Position And Attitude Of Object Imaged By Camera | |
US20230154162A1 (en) | Method For Generating Training Data Used To Learn Machine Learning Model, System, And Non-Transitory Computer-Readable Storage Medium Storing Computer Program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: EPSON CANADA LIMITED, CANADA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MISHRA, AKSHAYA;MAANI, ROUZBEH;REEL/FRAME:051609/0028 Effective date: 20191209 Owner name: SEIKO EPSON CORPORATION, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EPSON CANADA, LTD.;REEL/FRAME:051609/0092 Effective date: 20191209 |
|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |