CN116783619A - Method and device for generating point cloud histogram - Google Patents

Method and device for generating point cloud histogram
Info

  • Publication number: CN116783619A
  • Application number: CN202180061425.1A
  • Authority: CN (China)
  • Other languages: Chinese (zh)
  • Legal status: Pending
  • Inventors: D·J·迈克尔, 朱红卫, N·M·维迪雅
  • Original and current assignee: Cognex Corp
  • Application filed by Cognex Corp
  • Priority claimed from PCT/US2021/031519 (published as WO2021231265A1)

Abstract

The technology described herein relates to methods, apparatus, and computer-readable media configured to generate a point cloud histogram. A one-dimensional histogram may be generated by determining, for each 3D point of a 3D point cloud, a distance to a reference; each histogram entry then accumulates the distances that fall within the distance range associated with that entry. A two-dimensional histogram may be generated by determining, for each 3D point, an orientation having at least a first value of a first component and a second value of a second component, to produce a set of orientations. The two-dimensional histogram may be generated based on the set of orientations: each bin is associated with a range of values for the first component and a range of values for the second component, and accumulates the orientations whose first and second values fall within the respective ranges for that bin.

Description

Method and device for generating point cloud histogram
RELATED APPLICATIONS
The present application claims priority under 35 U.S.C. § 119(e) to U.S. provisional application No. 63/023,163 entitled "METHODS AND APPARATUS FOR GENERATING POINT CLOUD HISTOGRAMS" filed on May 11, 2020, and U.S. provisional application No. 63/065,456 entitled "METHODS AND APPARATUS FOR GENERATING POINT CLOUD HISTOGRAMS" filed on August 13, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The technology described herein relates generally to methods and apparatus for machine vision, including techniques for generating histograms of point cloud data.
Background
The machine vision system may include robust imaging capabilities, including three-dimensional (3D) imaging devices. For example, a 3D sensor may image a scene to generate a set of 3D points, each 3D point including an (x, y, z) position within a 3D coordinate system (e.g., where the z-axis of the coordinate system represents distance from the 3D imaging device). Such a 3D imaging device may generate a 3D point cloud comprising the set of 3D points captured during 3D imaging. However, the absolute number of 3D points in a 3D point cloud may be huge (e.g., compared to 2D data of the scene). Furthermore, a 3D point cloud may include only pure 3D data points, and thus may not include data indicating relationships between/among 3D points or other information (such as surface normal information); processing 3D points without data indicating their relationships to other points may be complex. Thus, while a 3D point cloud may provide a large amount of 3D data, performing machine vision tasks on the 3D point cloud data may be complex, time consuming, require a large amount of processing resources, and so forth.
Disclosure of Invention
In accordance with the disclosed subject matter, apparatus, systems, and methods are provided for improved machine vision techniques, particularly for providing improved machine vision techniques for summarizing point cloud data (e.g., which may be used to compare objects in point cloud data). In some embodiments, these techniques are used to generate a histogram of point cloud data. The histogram may be of various dimensions, such as a one-dimensional histogram and/or a two-dimensional histogram. A histogram may be generated using various metrics, including a reference based on the point cloud data. For example, a one-dimensional histogram may be generated based on distances of 3D points to a reference plane, to representative points (e.g., centroids) of a 3D point cloud, and so on. As another example, a two-dimensional histogram may be generated based on information determined from the 3D point cloud (such as based on surface normals, vectors, etc.).
Some aspects relate to a computerized method for generating a histogram of a three-dimensional (3D) point cloud. The method comprises the following steps: receiving data indicative of a 3D point cloud comprising a plurality of 3D points; determining a reference in spatial relationship to the 3D point cloud; determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and generating a histogram including a set of entries based on the set of distances, including, for each entry of the set of entries, adding the distances from the set of distances that are within the distance range associated with that entry.
According to some examples, the method includes: generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid has the same set of dimensions; for each voxel in the 3D voxel grid, determining whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel; for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and storing the single 3D data point in the voxel. Determining the set of distances may include determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
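A minimal sketch of the voxelization just described, assuming NumPy; `voxel_downsample` is a hypothetical helper name, and the single representative 3D data point per voxel is taken here to be the centroid of the voxel's associated points (one of the options the description mentions):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Reduce a 3D point cloud to one representative point per occupied voxel.

    points: (N, 3) array of (x, y, z) positions.
    voxel_size: edge length shared by every voxel in the grid.
    Returns an (M, 3) array with the centroid of the points in each
    occupied voxel (empty voxels contribute nothing).
    """
    points = np.asarray(points, dtype=float)
    # Integer voxel index of each point along x, y, z.
    idx = np.floor(points / voxel_size).astype(np.int64)
    # Group points that share a voxel index.
    uniq, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    counts = np.bincount(inverse)
    reps = np.zeros((len(uniq), 3))
    for dim in range(3):
        reps[:, dim] = np.bincount(inverse, weights=points[:, dim]) / counts
    return reps
```

The returned representatives can then stand in for the raw points when computing the set of distances, so each occupied voxel contributes exactly one distance to the histogram.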
According to some examples, the reference is a two-dimensional (2D) reference plane, and determining the distance of each 3D point to generate the set of distances includes determining the shortest distance of each 3D point to the reference plane.
According to some examples, the reference is a reference line, and determining the distance of each 3D point to generate the set of distances includes determining the shortest distance of each 3D point to the reference line.
According to some examples, the method includes determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid. Determining the distance of each 3D point to generate the set of distances may include determining the distance of each 3D point to the estimated centroid.
According to some examples, the method includes comparing the histogram with a second histogram generated for a second 3D point cloud to determine a measure of similarity between the 3D point cloud and the second 3D point cloud.
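The description does not prescribe a particular similarity measure for comparing two histograms; normalized histogram intersection is one common choice and could be sketched as follows (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def histogram_similarity(h1, h2):
    """Similarity in [0, 1] between two histograms over the same bins.

    Uses normalized histogram intersection: 1.0 for identical
    distributions, 0.0 for fully disjoint ones.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    # Normalize so clouds of different sizes can be compared.
    h1 = h1 / h1.sum()
    h2 = h2 / h2.sum()
    return float(np.minimum(h1, h2).sum())
```

Normalizing before intersecting makes the measure insensitive to the absolute number of 3D points in each cloud, which matters when comparing clouds captured at different resolutions.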
Some aspects relate to a computerized method for generating a histogram of a three-dimensional (3D) point cloud. The method includes receiving data indicative of a 3D point cloud comprising a plurality of 3D points. The method includes generating a set of orientations, including determining, for each 3D point in the 3D point cloud, an orientation of the 3D point, wherein the orientation includes at least a first value of a first component and a second value of a second component. The method includes generating a histogram including a set of bins based on the set of orientations, wherein each bin in the set of bins is associated with a first range of values of the first component and a second range of values of the second component; and generating the histogram includes, for each bin in the set of bins, adding the orientations from the set of orientations that have a first value and a second value within the first range of values and the second range of values, respectively, associated with that bin.
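A sketch of such two-dimensional binning, using the tilt and azimuth angles of unit surface normals as the first and second components (one pairing the examples describe; NumPy assumed, names and bin counts illustrative):

```python
import numpy as np

def orientation_histogram(normals, n_tilt=9, n_azimuth=18):
    """2D histogram of unit surface normals binned by (tilt, azimuth).

    normals: (N, 3) array of unit normal vectors (nx, ny, nz).
    Tilt is the angle from the +z axis in [0, pi]; azimuth is the
    angle of (nx, ny) in the xy-plane, shifted into [0, 2*pi).
    Returns an (n_tilt, n_azimuth) count array.
    """
    normals = np.asarray(normals, dtype=float)
    tilt = np.arccos(np.clip(normals[:, 2], -1.0, 1.0))   # [0, pi]
    azimuth = np.arctan2(normals[:, 1], normals[:, 0])    # (-pi, pi]
    azimuth = np.mod(azimuth, 2 * np.pi)                  # [0, 2*pi)
    hist, _, _ = np.histogram2d(
        tilt, azimuth,
        bins=(n_tilt, n_azimuth),
        range=((0, np.pi), (0, 2 * np.pi)),
    )
    return hist
```

The first dimension of the result is associated with the first component (tilt) and the second with the second component (azimuth), matching the two-dimensional bin arrangement described.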
According to some examples, the set of bins is arranged in two dimensions, wherein a first dimension is associated with the first component and a second dimension is associated with the second component.
According to some examples, the first component includes a tilt angle and the second component includes an azimuth angle.
According to some examples, the method further comprises: generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid has the same set of dimensions; for each voxel in the 3D voxel grid, determining whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel; for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and storing the single 3D data point in the voxel. Generating the set of orientations includes determining an orientation of the single 3D data point for each voxel in the 3D voxel grid to generate the set of orientations.
According to some examples, generating the set of orientations includes, for each 3D point in the 3D point cloud, determining an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
According to some examples, generating the set of orientations includes, for each 3D point in the 3D point cloud, determining the orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
According to some examples, the method includes comparing the histogram with a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud. Comparing the histogram with the second histogram includes determining a first set of peaks of the histogram and a second set of peaks of the second histogram, and determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.
Some embodiments relate to a non-transitory computer-readable medium comprising instructions that, when executed by one or more processors on a computing device, are operable to cause the one or more processors to perform a method of any of the techniques described herein.
Some embodiments relate to a system including a memory storing instructions and a processor configured to execute the instructions to perform a method of any of the techniques described herein.
There has thus been outlined, rather broadly, the more important features of the disclosed subject matter in order that the detailed description that follows may be better understood, and in order that the present contribution to the art may be better appreciated. There are, of course, additional features of the disclosed subject matter that will be described hereinafter and which will form the subject matter of the claims appended hereto. It is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting.
Drawings
In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating various aspects of the technology and devices described herein.
FIG. 1 illustrates an exemplary machine vision system, according to some embodiments.
FIG. 2 is a flow chart illustrating an exemplary computerized method for generating a histogram of a three-dimensional point cloud, according to some embodiments.
Fig. 3A illustrates two exemplary point-to-plane histograms according to some embodiments.
Fig. 3B illustrates two exemplary point-to-line histograms according to some embodiments.
FIG. 4 illustrates two exemplary point-to-centroid distance histograms according to some embodiments.
FIG. 5 is a flowchart of an exemplary computerized method for generating a histogram of a 3D point cloud, according to some embodiments.
Fig. 6 illustrates two exemplary normal direction histograms of point cloud data of a frustum, according to some embodiments.
Fig. 7 illustrates two exemplary normal direction histograms of point cloud data for cylindrical objects according to some embodiments.
Fig. 8 illustrates two exemplary normal direction histograms of hemispherical objects according to some embodiments.
Fig. 9 illustrates two exemplary normal direction histograms of urban landscape objects according to some embodiments.
Fig. 10 is a table of exemplary similarity scores calculated using the 2D direction histogram described in connection with fig. 6-9, according to some embodiments.
Detailed Description
The techniques described herein provide data reduction techniques that may be used to analyze 3D point cloud images. The inventors have appreciated that conventional machine vision techniques may be significantly inefficient when processing 3D point cloud data (e.g., to determine whether an object is present in the 3D point cloud data). A 3D point cloud typically includes hundreds of thousands or millions of (x, y, z) points. Accordingly, the inventors have appreciated that directly interpreting such a large number of 3D points in space can be quite challenging. For example, since a 3D point cloud includes such a large number of 3D points, and generally does not include information about spatial relationships between 3D points, attempting to interpret a pure 3D point cloud is not feasible for many machine vision applications that perform such interpretation in limited time, limited hardware resources, and the like.
In particular, it may be desirable to determine whether a 3D point cloud image captures an object. Conventional techniques solve such problems by searching for objects in a point cloud using computationally intensive techniques. For example, some methods process the set of 3D points to calculate edge features (e.g., wrinkles or tears in one or more surfaces in the field of view) indicative of abrupt changes, calculate a measure of the prevalence of such edges, and determine the presence (or absence) of an object if the prevalence exceeds a predetermined threshold. Computing such edge features requires a large amount of computation involving points in the neighborhood of each point in the 3D cloud. Additionally or alternatively, it may be desirable to compare objects detected in different 3D point cloud images. As with object detection, such an approach may require computing edge features. For example, edge features may be computed in each of the point clouds, the locations and other attributes of the edges may be compared, and finally, the comparison results may be assembled into a similarity measure for the object.
As another example, some methods of object detection, classification, and/or registration calculate a mesh of points or surface patches (surface patches) to determine object features (e.g., surface curvature). Calculating such meshes or surface patches alone can be time consuming and the technique requires performing further processing in order to determine object features and build a surface model. For example, computing surface features of a mesh or patch and building a parametric model typically requires an iterative process involving optimization, thus often limiting the use of these techniques to non-real-time applications.
The inventors have made technical improvements to machine vision techniques to address these and other inefficiencies. The techniques described herein may reduce a 3D point cloud to a 1D signal (e.g., a 1D histogram) and/or a 2D image (e.g., a 2D histogram), which may be used to easily interpret the 3D point cloud for various machine vision applications. The techniques also provide for interpreting the 1D and 2D signals, such as by comparing different 3D point clouds to determine whether they are similar. For example, a 1D or 2D histogram may be calculated for each of two different 3D point cloud images, and the histograms compared to determine whether the 3D point cloud images are likely to include the same object. Since the traditionally large amount of data in a 3D point cloud can be reduced to one or two dimensions, the techniques described herein can significantly improve performance and allow various types of machine vision tasks to interpret the 3D point cloud. Further, the resulting histogram may be represented using a small amount of data compared to conventional techniques, and thus requires only minimal memory or disk space to store.
The described histogram-based techniques may overcome various processing inefficiencies of conventional techniques in that they do not require computationally intensive steps, such as computing edge features, point meshes, and/or surface patches (and also do not require generating surface models). Rather, the techniques described herein provide for determining statistical features of an object directly from the 3D points. For example, a 1D histogram may be generated based on 3D point locations, and/or a 2D histogram may be generated based on point normal vectors. As an illustrative example, consider a manufacturing application in which a 3D sensor is mounted above a conveyor belt along which objects pass (e.g., cardboard boxes of different sizes, mailing envelopes, plastic bags, etc.). Each class of objects has different characteristics, including one or more heights above the conveyor belt, surface normal orientations, etc. Such characteristics may be captured by a 1D distance-based histogram and/or a 2D direction-based histogram to detect and classify objects in real-time as they travel down the conveyor belt. For example, given a distance histogram over an object's points, statistical metrics (e.g., median, particular percentiles, modes, etc.) can be used to quickly estimate the height of the object above the conveyor belt. Additionally or alternatively, by using histogram-derived metrics, the techniques described herein may reduce noise and/or enhance robustness of the measurement results, among other things.
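For instance, a robust height statistic such as the median can be read directly off a 1D distance histogram's cumulative counts; a minimal sketch (NumPy assumed; the function name and bin-center convention are illustrative):

```python
import numpy as np

def height_from_histogram(counts, bin_edges, percentile=50.0):
    """Estimate a robust height statistic from a 1D distance histogram.

    Reads off the given percentile (default: the median) from the
    cumulative bin counts, using each bin's center as its value.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    cumulative = np.cumsum(counts)
    target = percentile / 100.0 * cumulative[-1]
    # First bin whose cumulative count reaches the target.
    return float(centers[np.searchsorted(cumulative, target)])
```

Because the estimate comes from the bulk of the distribution rather than any single 3D point, isolated noise points and outliers have little effect on it.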
In the following description, numerous specific details are set forth, such as examples of systems and methods, and environments in which such systems and methods may operate, in order to provide a thorough understanding of the disclosed subject matter. Additionally, it will be appreciated that the examples provided below are exemplary and that other systems and methods are contemplated as being within the scope of the disclosed subject matter.
FIG. 1 illustrates an exemplary machine vision system 100, according to some embodiments. The exemplary machine vision system 100 includes a camera 102 (or other image acquisition device) and a computer 104. Although only one camera 102 is shown in fig. 1, it should be understood that multiple cameras may be used in a machine vision system (e.g., where a point cloud is merged from the point clouds of multiple cameras). The computer 104 includes one or more processors and a human-machine interface in the form of a computer display, and optionally one or more input devices (e.g., keyboard, mouse, trackball, etc.). The camera 102 includes, among other components, a lens 106 and a camera sensor element (not shown). The lens 106 has a field of view 108, and the lens 106 focuses light from the field of view 108 onto the sensor element. The sensor element generates a digital image of the camera field of view 108 and provides the image to a processor that forms part of the computer 104. As shown in the example of fig. 1, the object 112 travels along the conveyor 110 into the field of view 108 of the camera 102. While the object 112 is in the field of view 108, the camera 102 may generate one or more digital images of the object for processing, as discussed further herein. In operation, the conveyor may carry a plurality of objects. These objects may pass sequentially within the field of view 108 of the camera 102, such as during an inspection process. In this way, the camera 102 may acquire at least one image of each observed object 112.
In some embodiments, the camera 102 is a three-dimensional (3D) imaging device. As an example, the camera 102 may be a 3D sensor that scans a scene progressively, such as the DS line of laser profiler 3D displacement sensors available from Cognex Corporation, the assignee of the present application. According to some embodiments, the 3D imaging device may generate a set of (x, y, z) points (e.g., where the z-axis adds a third dimension, such as distance from the 3D imaging device). The 3D imaging device may use various 3D image generation techniques, such as shape-from-shading, stereoscopic imaging, time-of-flight techniques, projector-based techniques, and/or other 3D generation techniques. In some embodiments, the machine vision system 100 includes a two-dimensional imaging device, such as a two-dimensional (2D) CCD or CMOS imaging array. In some embodiments, the two-dimensional imaging device generates a 2D array of luminance values.
In some embodiments, the machine vision system processes 3D data from the camera 102. The 3D data received from the camera 102 may include, for example, a point cloud and/or a range image. The point cloud may comprise a set of 3D points on or near the surface of the solid object. For example, the points may be represented by their coordinates in a straight line or other coordinate system. In some embodiments, other information may optionally also be present, such as a grid or lattice structure indicating which points are neighbors on the object surface. In some embodiments, information about the surface features (including curvature, surface normal, edge, and/or color and albedo information), whether derived from sensor measurements or previously calculated, may be included in the input point cloud. In some embodiments, 2D and/or 3D data may be obtained from 2D and/or 3D sensors, from CAD or other solid models, and/or by preprocessing distance images, 2D images, and/or other images.
According to some embodiments, the set of 3D points may be part of a 3D point cloud within a user-specified region of interest, and/or include data specifying a region of interest in the 3D point cloud. For example, since a 3D point cloud may include so many points, it may be desirable to specify and/or define one or more regions of interest (e.g., to limit the space over which the techniques described herein apply).
Examples of computer 104 may include, but are not limited to, a single server computer, a series of server computers, a single personal computer, a series of personal computers, a mini computer, a mainframe computer, and/or a computing cloud. The various components of computer 104 may execute one or more operating systems, examples of which may include, but are not limited to: Microsoft Windows Server™, Novell Netware™, Redhat Linux™, Unix, and/or custom operating systems. The one or more processors of computer 104 may be configured to process operations stored in a memory connected to the one or more processors. The memory may include, but is not limited to: hard disk drives, flash drives, tape drives, optical drives, RAID arrays, random access memory (RAM), and read-only memory (ROM).
The technology described herein relates to generating a histogram of a 3D point cloud. A histogram may be generated based on geometric aspects of the 3D point cloud, such as the location of the 3D points and/or the normal direction of the points (e.g., on the surface of the object captured by the 3D point cloud), and so forth. Some embodiments relate to one-dimensional (1D) histograms. The 1D histogram may be generated relative to a reference, such as a reference plane, a line, and/or other points spatially oriented relative to the point cloud. For example, the point-to-plane distance histogram is a 1D histogram generated based on the distance from each point (or other representation, such as a voxel) in the 3D point cloud to a selected or estimated plane. For example, the point-to-line distance histogram is a 1D histogram generated based on the distance from each point (or other representation, such as a voxel) in the 3D point cloud to a selected or estimated line. As another example, the point-to-centroid distance histogram is a 1D histogram generated based on the distance from each point (or other representation, such as a voxel) in the 3D point cloud to the centroid of the 3D point cloud. Some embodiments relate to two-dimensional (2D) histograms. For example, the normal direction projection histogram is a 2D histogram of a unit normal direction of the 3D point cloud.
For various 3D point cloud based applications, the histogram may serve as a useful feature descriptor of the scene. The techniques described herein (including the descriptors and their interpretations) can be robust to changes in resolution, noise, and pose, maintaining geometric invariance. Given the reduced dimensionality and number of data points, the computational cost of the techniques may be lower than conventional techniques for comparing point clouds. The histogram may be used (e.g., directly) as input to various image processing techniques and computer vision tools, including classification, measurement, object detection, object registration, and/or deep learning, etc.
Some embodiments of the technology relate to generating 1D histograms. FIG. 2 is a flow diagram illustrating an exemplary computerized method 200 for generating a histogram of a three-dimensional (3D) point cloud, according to some embodiments. At step 202, a machine vision system (e.g., machine vision system 100 of fig. 1) receives data indicative of a 3D point cloud comprising a plurality of 3D points. At step 204, the machine vision system determines a reference (e.g., a reference plane, line, centroid, etc.) that is disposed in some spatial relationship with the 3D point cloud (e.g., selected based on the point cloud and/or calculated from the point cloud, etc.). At step 206, the machine vision system determines, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points. At step 208, the machine vision system generates a histogram based on the set of distances.
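The four steps of method 200 can be sketched end-to-end for the case of a reference plane (a hedged example assuming NumPy; signed point-to-plane distances and `np.histogram` binning are one concrete realization of steps 206 and 208, and the function name is illustrative):

```python
import numpy as np

def point_to_plane_histogram(points, plane_point, plane_normal, n_bins=32):
    """1D histogram of signed point-to-plane distances.

    points: (N, 3) cloud (step 202); the reference plane (step 204) is
    given by a point on the plane and its (not necessarily unit) normal.
    Returns (counts, bin_edges) as produced by np.histogram (step 208).
    """
    points = np.asarray(points, dtype=float)
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Step 206: signed distance of every point to the plane.
    distances = (points - np.asarray(plane_point, dtype=float)) @ n
    return np.histogram(distances, bins=n_bins)
```

Swapping the distance computation for a point-to-line or point-to-centroid distance yields the other 1D histogram variants described herein without changing the binning step.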
Referring to step 202, according to some embodiments, the 3D point cloud may be processed prior to generating the histogram. For example, the techniques may include voxelizing the 3D point cloud. The machine vision system may generate a 3D voxel grid for at least a portion of the 3D point cloud (e.g., the portion of interest for which the histogram is to be calculated), wherein each voxel in the 3D voxel grid has the same size (e.g., the same length, width, and height). The machine vision system may determine, for each voxel in the 3D voxel grid, whether the location of one or more of the plurality of 3D data points falls within the voxel to generate an associated set of 3D points for the voxel. It should be appreciated that some voxels may be empty (e.g., where no 3D point's location falls within those voxels).
According to some embodiments, a voxel may store its associated set of 3D points (e.g., for subsequent processing, such as for calculating an average and/or median, etc.). According to some embodiments, the set of 3D points may be processed prior to being stored in the voxel to reduce the number of data points stored in the voxel grid. For example, for each voxel in a 3D voxel grid, the machine vision system may determine a single 3D data point for that voxel based on an associated set of 3D data points and store the single 3D data point in that voxel (e.g., by determining the centroid of the point, the average point value, etc.). A histogram may be generated based on the voxel data. For example, assume a set of 3D points is stored into voxels such that each voxel includes zero 3D points (e.g., if no 3D points are located in the voxel) or one 3D point (e.g., if a single 3D point falls in the voxel, or if there are multiple points, a representative 3D data point is generated for the multiple points). The machine vision system may generate the set of distances by determining, for each voxel in the 3D voxel grid storing 3D data points, a distance from a single 3D data point to a reference to determine the set of distances.
Referring to steps 204 through 208, according to some embodiments, when the 3D point cloud is represented using a voxel grid at step 202, the reference of step 204 may be determined in spatial relationship to the voxel grid. For step 206, the distance from the representative point of each voxel of the grid to the reference may be measured to generate the set of distances used to generate the histogram at step 208.
Referring to step 204, various references may be used for the point cloud. According to some embodiments, the reference is a plane, such as a 2D reference plane. The reference plane may be determined according to various techniques. For example, the user may specify a 2D reference plane as input. As another example, a user may specify a region of interest of the point cloud, where the contained points may be used to extract a plane as the reference plane. The reference plane may be extracted from the region of interest using various techniques. For example, the reference plane may be extracted from the region of interest by fitting a plane to some and/or all points contained within the region of interest using a least squares method, and/or by fitting a plane with the largest number of inliers over the contained points using a random sample consensus (RANdom SAmple Consensus, RANSAC) technique, and so on.
According to some embodiments, the reference may be a line estimated based on the 3D point cloud. The reference line may be determined based on a region of interest using various techniques as described herein. For example, a reference line may be determined using a least squares method, fitting a line to some and/or all points contained within the region of interest. As another example, the RANSAC technique may be used to fit a line with the largest number of inliers over some and/or all points contained within the region of interest, thereby determining the reference line. In some embodiments, the reference line may be determined based on one or more 3D shapes (e.g., the axis of a cylinder, the intersection line of two non-parallel planes, etc.) extracted from points in the region of interest.
According to some embodiments, the reference may be a point estimated based on the 3D point cloud, such as an estimated centroid of the 3D point cloud. The machine vision system may process the 3D point cloud (e.g., 3D points and/or voxels) to determine an estimated centroid and use the estimated centroid as the reference point. While computing a centroid is one example of a technique that may be used to calculate a reference point, it should be understood that other methods may be used with the techniques described herein. In particular, the reference point used to create the histogram may be determined as desired and should not be limited to the centroid of the inspected object. For example, a reference point may be created by determining a 3D shape and extracting a centroid from the 3D shape, and/or by calculating a centroid from a subset of the 3D points, and so on.
Referring to step 206, if the reference is a plane, the machine vision system may determine the distance of each 3D point by determining the distance of each 3D point from the reference plane. The distance may be determined by calculation (e.g., the shortest distance of each point to the reference plane, and/or the distance of each point along a projection direction, etc.). Such a 1D point-to-plane histogram may represent the distribution of the distances of the points to the reference plane. By eliminating the effects of noise points and outliers, the point-to-plane histogram can be used to robustly measure the distance/height from (e.g., noisy) 3D points to the reference plane. In some embodiments, if the reference is a line, the machine vision system may determine the distance of each 3D point by determining the distance of each 3D point to the estimated line (e.g., the shortest distance to the estimated line). Such a 1D point-to-line histogram may represent an identification of surface points of the object.
Referring to step 208, in some embodiments, the histogram includes a set of entries (e.g., a one-dimensional bar graph and/or a dot line graph, etc.). For example, each entry may be associated with a distance and/or a range of distances. According to some embodiments, when computing the histogram, the histogram entry may represent a signed or unsigned distance of each element of the 3D point cloud to the reference. Generating the histogram may include: for each entry of the histogram, a distance in the set of distances is added that satisfies and/or is within the distance range associated with the entry. For example, the adding may include determining a count of the distance of each entry of the histogram, e.g., such that the distance is considered to belong to an entry of the histogram by discretizing/quantizing its value.
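The distance binning of steps 206 through 208 can be illustrated with a small sketch. This is a hedged NumPy example, assuming the reference plane is given by a point on the plane and a normal vector, and using signed shortest (perpendicular) distances; the function name and bin settings are illustrative only.

```python
import numpy as np

def point_to_plane_histogram(points, plane_point, plane_normal, bins=16, range_=None):
    """Bin the signed shortest distances from each 3D point to a reference plane."""
    n = plane_normal / np.linalg.norm(plane_normal)
    distances = (points - plane_point) @ n      # signed point-to-plane distances
    counts, edges = np.histogram(distances, bins=bins, range=range_)
    return counts, edges

# Example: two points on the z = 0 plane and two points at height 1 above it.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 0.0], [3.0, 4.0, 1.0], [5.0, 6.0, 1.0]])
counts, edges = point_to_plane_histogram(pts, np.zeros(3), np.array([0.0, 0.0, 1.0]),
                                         bins=2, range_=(-0.5, 1.5))
```

Each histogram entry is thus the count of distances quantized into that entry's distance range, as the step 208 description states.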
Referring to steps 206 through 208, in some embodiments, if the reference is a point (e.g., centroid), the machine vision system may determine the distance of each 3D point by determining the distance of each 3D point to the estimated centroid. Such a 1D point-to-centroid histogram may represent an identification of surface points of an object. The point-to-centroid histogram may be used as an efficient feature for point cloud-based object recognition and/or classification, etc.
FIG. 3A illustrates two exemplary point-to-plane histograms of urban landscape objects in accordance with some embodiments. Fig. 3A shows an exemplary histogram 300 of point-to-plane distances of a point cloud 302 to a plane 306 (which is disposed at the base of the box 304) of a cylindrical object within the box 304. The X-axis of the histogram 300 represents the distance from the reference plane 306 and the Y-axis represents the number of points having a particular distance. Fig. 3A also shows an exemplary histogram 350 of point-to-plane distances of point clouds 352 to planes 356 (which are also disposed at the base of the box 354) of three planar surfaces within the box 354. Similar to histogram 300, the X-axis of histogram 350 represents the distance from reference plane 356, and the Y-axis represents the number of points having a particular distance. The histogram distance in these examples is calculated by determining the shortest distance of each point to the reference plane, as described herein.
In some embodiments, histograms 300 and 350 may be compared (e.g., to determine whether objects captured by the 3D image are similar) as described herein. The similarity score between the two histograms 300 and 350 is 0.296643, where the calculated score value may be in the range of 0 to 1.0, where 1.0 indicates that the two compared histograms are the same and a value of 0 indicates the smallest similarity. Thus, in this example, the score 0.296643 indicates that the objects within the regions 304, 354 are dissimilar. The similarity score for this example is calculated using the histogram intersection metric by comparing the sum of the minima of the normalized frequencies over all bins between the two histograms.
FIG. 3B illustrates two exemplary point-to-line histograms of urban landscape objects in accordance with some embodiments. Fig. 3B shows an exemplary histogram 370 of the point-to-line distances of a point cloud 372 to a line 376 (the axis of the cylinder) of a cylindrical object within a box 374. Fig. 3B also shows an exemplary histogram 380 of the point-to-line distances of the point clouds 382 to the lines 386 of three planar surfaces within the box 384. The X-axis of histograms 370 and 380 represents the distance from respective reference lines 376 and 386, and the Y-axis represents the number of points having a particular shortest distance. The similarity score between the two histograms 370 and 380 is 0.191645, where the calculated score value may be in the range of 0 to 1.0, where 1.0 indicates that the two compared histograms are the same and a value of 0 indicates the smallest similarity. As in fig. 3A, the score is calculated using the sum of the minima of the normalized frequencies. Thus, in this example, the score 0.191645 indicates that the objects within the regions 374, 384 are dissimilar.
FIG. 4 illustrates two exemplary point-to-centroid distance histograms according to some embodiments. Fig. 4 shows an exemplary histogram 400 of the point-to-centroid distances of a point cloud 402 of frustum surfaces within a box 404. Fig. 4 also shows an exemplary histogram 450 of the point-to-centroid distances of the point cloud 452 of the sphere within the box 454. The centroids of the frustum and the sphere are not shown in fig. 4, since the centroids lie within the objects. The X-axis of histograms 400 and 450 represents the distance from the associated centroid, and the Y-axis represents the number of points having a particular distance. The similarity score for the two histograms 400 and 450 is 0.835731, indicating a higher similarity than between the histograms 300 and 350 of fig. 3A.
According to some embodiments, the techniques may generate 2D histograms representing 2D information (e.g., information associated with more than one of the x-direction, y-direction, and z-direction, etc.). For example, a 2D normal direction histogram may be generated to represent the distribution of the unit normal directions of points in 3D space. FIG. 5 is a diagram of an exemplary computerized method 500 for generating a histogram of a 3D point cloud, according to some embodiments. At step 502, a machine vision system receives a 3D point cloud having a plurality of 3D points. At step 504, the machine vision system generates a set of orientations. The machine vision system determines an orientation of the 3D point for each 3D point in the 3D point cloud.
In some embodiments, the orientation includes a first value of a first component (e.g., tilt angle) and a second value of a second component (e.g., azimuth angle). According to some embodiments, the tilt angle may be the angle between the direction or orientation and the positive Z-axis, ranging from 0 degrees to 180 degrees. The azimuth angle may be the angle, measured in the X-Y plane from the positive X-axis, of the direction's projection into that plane, ranging from 0 degrees to 360 degrees. The azimuth angle may be periodic, with a period of 360 degrees.
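Under these definitions, tilt and azimuth for a set of normal directions can be computed as in the following sketch (a NumPy illustration; the function name is illustrative, and angles are returned in degrees):

```python
import numpy as np

def tilt_and_azimuth(normals):
    """Tilt: angle from the +Z axis (0..180 deg). Azimuth: angle of the X-Y
    projection measured from the +X axis (0..360 deg, periodic)."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    tilt = np.degrees(np.arccos(np.clip(n[:, 2], -1.0, 1.0)))
    azimuth = np.degrees(np.arctan2(n[:, 1], n[:, 0])) % 360.0
    return tilt, azimuth

dirs = np.array([[0.0, 0.0, 1.0],     # straight up: tilt 0
                 [1.0, 0.0, 0.0],     # +X: tilt 90, azimuth 0
                 [0.0, 1.0, 0.0]])    # +Y: tilt 90, azimuth 90
tilt, azimuth = tilt_and_azimuth(dirs)
```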
At step 506, the machine vision system generates a histogram based on the set of orientations. The histogram may bin the orientations and/or values determined at step 504. In some embodiments, the histogram includes a two-dimensional set of bins of the orientations. For example, each bin may be associated with a first range of values of the first component (e.g., the bin's tilt index) and a second range of values of the second component (e.g., the bin's azimuth index). The machine vision system generates the histogram by adding, for each bin, the orientations whose first and second values fall within the first and second value ranges, respectively, associated with the bin, thereby quantizing each 3D point to a bin. As described herein, for example, adding may include determining a count of points having values within a particular component range to generate each entry of the histogram, e.g., such that a point is considered to belong to an entry of the histogram by discretizing/quantizing the point's values.
According to some embodiments, referring to steps 504 through 506, the unit normal at each point may be an ordered pair of tilt angle and azimuth angle. The normal direction histogram may be a 2D image, wherein the two dimensions are the tilt angle and the azimuth angle. Each row and each column of the 2D histogram image may correspond to an interval of the tilt angle and the azimuth angle, respectively. The number of rows and columns may be determined by the tilt angle range and azimuth angle range, respectively, and the bin size. Each value at a given pixel or interval conveys a count (e.g., frequency) of normal directions whose tilt and azimuth fall within the interval.
According to some embodiments, a set of steps may be performed for each 3D point within the region of interest to calculate a 2D histogram. For each 3D point, the machine vision system may calculate its tilt angle and azimuth, quantize each tilt angle and azimuth by interval size to determine a tilt angle interval (e.g., tilt index) and an azimuth interval (e.g., azimuth index), and increment the pixel/interval value indexed at the tilt index and azimuth index described above by 1.
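The per-point steps above (compute tilt and azimuth, quantize each by the bin size, and increment the indexed bin by 1) can be sketched as follows; this NumPy illustration assumes tilt is limited to 0-90 degrees as in the later figures and uses a 5-degree bin size, both of which are illustrative choices:

```python
import numpy as np

def normal_direction_histogram(tilt_deg, azimuth_deg, tilt_bin=5.0, az_bin=5.0,
                               tilt_range=90.0):
    """Quantize (tilt, azimuth) pairs by bin size and count them in a 2D image
    whose rows index tilt and whose columns index azimuth."""
    rows = int(np.ceil(tilt_range / tilt_bin))
    cols = int(np.ceil(360.0 / az_bin))
    hist = np.zeros((rows, cols), dtype=int)
    ti = np.minimum((np.asarray(tilt_deg) / tilt_bin).astype(int), rows - 1)
    ai = (np.asarray(azimuth_deg) % 360.0 / az_bin).astype(int) % cols
    for i, j in zip(ti, ai):
        hist[i, j] += 1          # increment the bin indexed by (tilt index, azimuth index)
    return hist

# Two normals near (tilt 0, azimuth ~10) and one at (tilt 47, azimuth 200).
hist = normal_direction_histogram([0.0, 0.0, 47.0], [10.0, 12.0, 200.0])
```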
According to some embodiments, an optional secondary image (such as a centroid image) may be determined. For example, the centroid image may be a triplet image having as many pixels or bins as there are direction histograms. The value in each bin with the secondary image index (i, j) may be a 3D centroid of the position of a point of the point cloud, wherein the normal direction is located in the bin (i, j) of the direction histogram.
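A hedged sketch of such a secondary (centroid) image: per direction-histogram bin, accumulate the 3D positions of the points whose normals fall in that bin and divide by the counts. The bin indices here are hard-coded for illustration rather than computed from actual normals:

```python
import numpy as np

rows, cols = 4, 8
centroid_image = np.zeros((rows, cols, 3))   # triplet image: one 3D value per bin
counts = np.zeros((rows, cols))

points = np.array([[0.0, 0.0, 1.0], [2.0, 0.0, 1.0], [5.0, 5.0, 5.0]])
bin_index = [(0, 3), (0, 3), (2, 1)]         # assumed (tilt bin, azimuth bin) per point

for p, (i, j) in zip(points, bin_index):
    centroid_image[i, j] += p
    counts[i, j] += 1

nonzero = counts > 0
centroid_image[nonzero] /= counts[nonzero][:, None]   # mean position per occupied bin
```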
As described herein, according to some embodiments, a 3D point cloud may be processed (such as voxelized) prior to generating a histogram. The machine vision system may generate a 3D voxel grid for at least a portion of the 3D point cloud. The machine vision system may determine, for each voxel in the 3D voxel grid, whether a location of one or more of the plurality of 3D data points falls within the voxel to generate an associated set of 3D points for the voxel.
According to some embodiments, a voxel may store a set of 3D points, a representative point or set of points, normals, or other vectors, etc. associated therewith. According to some embodiments, the machine vision system may determine a representative normal or vector of voxels. For example, for each voxel in a 3D voxel grid, the machine vision system may determine a surface normal and/or orientation of a representative 3D data point (e.g., a median point) for that voxel to generate the set of orientations. As another example, the machine vision system may determine the surface normal vector using an associated set of 3D point locations, adjacent 3D data point locations, and/or information from the 3D sensor. In some embodiments, a voxel may be set to zero if it is not associated with any 3D point. In some embodiments, the technique may include determining a representative normal or vector for each voxel based on an associated set of 3D data points. For example, the representative vector may be determined by calculating component averages, and/or by extracting an eigenvector from an accumulation matrix (e.g., formed by accumulating the outer product vv^T of each vector v with itself), etc. As a further example, the techniques may include determining a vector for each of the associated 3D points and storing a set of vectors.
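The outer-product accumulation mentioned above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the input normals of a voxel point roughly in a consistent direction, and the dominant eigenvector of the accumulated matrix is taken as the representative normal (sign-corrected using the mean input direction, since eigenvectors are defined only up to sign):

```python
import numpy as np

def representative_normal(vectors):
    """Accumulate the outer product v v^T of each (unit) vector with itself and
    take the eigenvector of the largest eigenvalue as the representative direction."""
    m = np.zeros((3, 3))
    for v in vectors:
        v = v / np.linalg.norm(v)
        m += np.outer(v, v)                 # accumulate v v^T
    w, e = np.linalg.eigh(m)                # eigenvalues in ascending order
    rep = e[:, -1]                          # dominant eigenvector
    if np.dot(rep, np.mean(vectors, axis=0)) < 0:
        rep = -rep                          # orient consistently with the inputs
    return rep

# Example: noisy normals clustered around +Z.
vecs = np.array([[0.05, 0.0, 1.0], [-0.05, 0.02, 1.0], [0.0, -0.04, 1.0]])
rep = representative_normal(vecs)
```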
Referring to computerized method 500, according to some embodiments, when a 3D point cloud is represented using a voxel grid at step 502, the orientations at step 504 are determined using the surface normals and/or orientations of the representative 3D data points for each voxel to generate a set of orientations for generating a histogram at step 506.
A 2D histogram may be generated that represents aspects of the 3D point cloud, including global aspects and/or local aspects. According to some embodiments, the histogram may be a global feature descriptor. For example, the set of orientations may be determined based on a fixed coordinate system associated with the 3D point cloud. For example, consider a point cloud of multiple scenes, all of which are represented in some fixed coordinate system (e.g., a client coordinate system). Because the direction histogram calculated from each point cloud conveys information about the desired subset and/or the entire 3D point cloud as a whole, it may be considered a global descriptor of the scene (e.g., in the form of a distribution of orientations of surface point normals).
According to some embodiments, the histogram may be a local feature descriptor. For example, the set of orientations may be determined based on a local coordinate system associated with 3D points of the 3D point cloud. For example, the histogram may be recalculated at one or more points of the 3D point cloud. Given the 3D points of the point cloud, the machine vision system may establish a local coordinate system based on the 3D points, with the origin of the local coordinate system at the points. The tilt angle and azimuth of the direction of the 3D point cloud may be calculated in a local coordinate system and used as local descriptors after being compartmentalized as described herein.
A local 3D coordinate system may be obtained at a given point of a point cloud using various techniques. According to some examples, the machine vision system may select an initial local 3D coordinate frame with its origin at the point and its Z-axis aligned with the normal direction of the point (e.g., where the X-axis and Y-axis are arbitrary). To determine the X-axis and establish the final local coordinate frame, the machine vision system may first find the point's K nearest neighbors and then use their normals to calculate a direction histogram in the initial local 3D space. The machine vision system may identify the 2D bin location of the first distinguishable highest frequency peak of the histogram (e.g., along the increasing direction of azimuth angle) and use the direction of its azimuth angle as the X-axis of the coordinate frame. The final local coordinate system of each point may depend on the geometry of the neighborhood of the point (e.g., so that it may not change with point cloud sampling and changes in object pose).
Fig. 6 illustrates two exemplary normal direction histograms of point cloud data of a frustum, according to some embodiments. Fig. 6 shows an exemplary histogram 600 of normals to a point cloud 602 of a frustum object in a first pose. Fig. 6 also shows an exemplary histogram 650 of normals to a point cloud 652 of the frustum object in the second pose. Both histograms 600, 650 show five distinct peaks (600A-600E and 650A-650E, respectively), each peak corresponding to a planar patch of a frustum object. The horizontal directions of images 600 and 650 represent azimuth intervals covering the azimuth range from 0 degrees to 360 degrees, and their vertical directions represent inclination intervals covering the inclination angle range from 0 degrees to 90 degrees. In image 600, five peaks are 600A: (91, 42), 600B: (183, 44), 600C: (275,2), 600D: (271, 47), 600E: (358, 45); in image 650, five peaks are 650A: (17, 60), 650B: (99, 25), 650C: (240, 38), 650D: (309, 67), 650E: (331, 24), wherein the first and second coordinates of the pixel location indicate azimuth and inclination angles, respectively, in degrees.
Fig. 7 illustrates two exemplary normal direction histograms of point cloud data for cylindrical objects according to some embodiments. Fig. 7 shows an exemplary histogram 700 of normals to a point cloud 702 of a cylindrical object in a first pose. Fig. 7 also shows an exemplary histogram 750 of normals to a point cloud 752 of a cylindrical object in a second (different) pose. The horizontal directions of images 700 and 750 represent azimuth intervals covering the azimuth range from 0 degrees to 360 degrees, and their vertical directions represent inclination intervals covering the inclination angle range from 0 degrees to 90 degrees.
Both histograms 700, 750 show distinct, strong ridges. As can be seen in images 702 and 752, the part is made of concentric cylindrical surfaces sharing a common axis. For each surface point of the part, its normal is (conceptually) perpendicular to the common axis and points in a direction away from the common axis. Thus, the normals of the surface points lie more or less in the same three-dimensional plane perpendicular to the common axis. For example, consider a 3D circle of unit radius perpendicular to the common axis: the start of every unit normal is at the center of the unit circle and its end is on the circle, occupying some arc of the circle.
In image 702, the part is positioned such that its axis is approximately aligned with the X-direction of the 3D coordinate space of the point cloud; when the normals of the surface points are projected into the X-Y plane, their projections lie on a line in that plane (e.g., corresponding to two opposite directions with an azimuthal difference of 180 degrees). This is reflected by the two (almost) vertical ridges 700A and 700B shown in image 700, which are separated by approximately half of the entire X dimension (i.e., 180 of the 360 degrees). As another example, when the normals of the surface points are projected onto the Z-axis of the coordinate space, the projections take tilt values from 0 up to the maximum tilt angle (< 90 degrees), depending on the field of view of the 3D sensor used to capture the point cloud.
In image 752, the part is rotated relative to its pose in image 702, and the normal of each surface point is no longer perpendicular to the X direction because the common axis of the part is not parallel to the X axis. Projecting the normals of the surface points to the X-Y plane and the Z-axis of the 3D coordinate space, respectively, is conceptually similar to projecting the points of the 3D circle they represent (for the covered arc segment) to the X-Y plane and the Z-axis, which yields the connected ridges 750A and 750B in image 750. Ridge 750A appears broken because it is inverted-U-shaped: since the azimuth direction is periodic with a period of 360 degrees, the ridge is split at the 360-degree azimuth boundary.
Fig. 8 illustrates two exemplary normal direction histograms of hemispherical objects according to some embodiments. Fig. 8 shows an exemplary histogram 800 of normals to a point cloud 802 of a hemisphere in a first pose. Fig. 8 also shows an exemplary histogram 850 of normals to a point cloud 852 of a hemisphere in the second pose. Both histograms 800, 850 show a uniform distribution in the normal direction. The horizontal directions of the histograms 800 and 850 represent azimuth bins covering azimuth angles ranging from 0 degrees to 360 degrees, and their vertical directions represent inclination bins covering inclination angles ranging from 0 degrees to 90 degrees.
In both images 800 and 850, there is some "empty" space 800A and 850A, which corresponds to a range of tilt angles onto which no normal of a sphere surface point can be projected. The "blank" portion appears near the high end of the tilt dimension. As an example, consider an arrangement in which a sphere (e.g., a ball) is imaged by a 3D sensor looking down on the ball from the top. A 3D sensor can typically capture only a specific portion of the top of the sphere surface near the sensor, and how much it can observe typically depends on the field of view (FOV) of the sensor and its distance/pose relative to the sphere. Since in this example the 2D histogram image is fixed in its tilt dimension (extending to 90 degrees), the sensor cannot capture surface portions outside the sensor FOV but still on the upper hemisphere. Thus, in this example, there are no points on the point cloud of the sphere that have a tilt angle greater than the maximum view angle of the sensor (which is less than 90 degrees due to the perspective sensing model).
Fig. 9 illustrates two exemplary normal direction histograms of urban landscape objects according to some embodiments. Fig. 9 shows an exemplary histogram 900 of normals to a point cloud 902 of an urban landscape object in a first pose. Fig. 9 also shows an exemplary histogram 950 of normals to the point cloud 952 of the urban landscape object in the second (different) pose. The horizontal directions of the histograms 900, 950 represent azimuth bins covering azimuth angle ranges from 0 degrees to 360 degrees, and their vertical directions represent inclination bins covering inclination angle ranges from 0 degrees to 90 degrees.
Both histograms 900, 950 show similar patterns in the normal direction, but with offset and rotation between them. In particular, the U-shaped pattern 900A in the histogram 900 appears as a lower left U-shaped pattern 950A in the histogram 950. In fig. 9, the point cloud is captured using the same part, but with a different pose relative to the 3D sensor. The part is composed of a plurality of sub-parts: top surface 902A (characterized by pairs of planar surfaces forming a triangular top), concentric cylindrical surface 902B (the same as shown in fig. 7), two rows of cartridge tops 902C, and base surface 902D. The histograms shown in images 900 and 950 reflect a combination of normal information for the surfaces of these sub-parts. As shown in fig. 7, the histogram of the concentric cylindrical surface is a "U" shaped ridge and has a more diffuse inclination and azimuth than the other sub-parts. For both rows of box top and base surfaces, their normal directions are similar, and they contribute a single peak in the 2D histogram image. Since the top sub-part consists of several pairs of planar patches, each patch has a small area; thus, its histogram is characterized by a limited number of peaks. By combining all histograms of these sub-parts, a total histogram of two different poses is shown in images 900 and 950, where the U-shapes 900A, 950A correspond to concentric cylindrical surfaces and the highest peaks 900B, 950B (with the greatest frequency, indicated by the darkest pixels) correspond to the normals of the two rows of box tops 902C and base surfaces 902D.
According to some embodiments, the 1D histogram and the 2D histogram may be analyzed to interpret the 3D point cloud. Techniques may include comparing two (or more) histograms to calculate a score indicating similarity between/among the histograms. According to some embodiments, a 1D histogram may be used to measure similarity between two sets of points. As described herein, each distance-based 1D histogram may provide an effective fingerprint for characterizing a 3D point cloud dataset. Different scoring methods (e.g., histogram intersection, bhattacharyya metric, normalized cross-correlation, etc.) may be used to calculate the similarity between the two 1D histograms to evaluate how similar their representation points look.
In some embodiments, the 1D histogram may be compared using one or more scoring metrics. For example, to compare two 1D histograms, the machine vision system may use one or more of four scoring metrics that are based on the intersection, dot product, bhattacharyya metrics, and normalized cross-correlation, respectively.
In some embodiments, the machine vision system determines the intersection-based score by determining, for each bin, the smaller of the two relative frequency values for that bin (one for each histogram). The machine vision system may calculate the score as the sum of these smaller values of relative frequencies in all intervals.
In some embodiments, both the score based on the dot product and the score of the Bhattacharyya metric involve calculating for each interval the product of two relative frequency values for that interval. The machine vision system may calculate a dot product score by summing the products of these intervals over all intervals. The machine vision system may calculate a Bhattacharyya score by summing the square roots of the interval products over all intervals.
In some embodiments, the machine vision system may calculate a normalized cross correlation (Normalized Cross Correlation, NCC) based score as (NCC+1)/2. The machine vision system may calculate NCC by dividing the dot product score by two factors (one factor for each histogram). For each histogram, the factor may be the root mean square value of the relative frequency distribution.
In some embodiments, each 1D histogram may be first normalized to produce a relative frequency distribution that sums to 1. A similarity score may then be calculated based on the two relative frequency distributions. Regardless of the metric used by the machine vision system, the resulting score may be in the range of 0 to 1, where the greater the score, the more similar the two histograms.
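The four 1D scoring metrics described above can be sketched in a few lines of NumPy. This is a hedged illustration, not the patented implementation; in particular, the NCC normalizing factors are read here as the L2 norms (roots of sums of squares) of the relative frequency distributions, an assumption about what the text calls the "root mean square value":

```python
import numpy as np

def similarity_scores(h1, h2):
    """Compare two 1D histograms after normalizing each to a relative
    frequency distribution summing to 1. All returned scores lie in [0, 1]."""
    p = np.asarray(h1, float); p /= p.sum()
    q = np.asarray(h2, float); q /= q.sum()
    intersection = np.minimum(p, q).sum()        # sum of per-bin minima
    dot = np.dot(p, q)                           # sum of per-bin products
    bhattacharyya = np.sqrt(p * q).sum()         # sum of square roots of products
    ncc = dot / (np.sqrt((p * p).sum()) * np.sqrt((q * q).sum()))
    return {"intersection": intersection, "dot": dot,
            "bhattacharyya": bhattacharyya, "ncc_score": (ncc + 1) / 2}

# Identical histograms: intersection, Bhattacharyya, and NCC scores are all 1.
scores = similarity_scores([4, 4, 0, 0], [4, 4, 0, 0])
```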
In some embodiments, to compare two 2D histograms, a set of bin locations for high frequencies may be identified for each histogram. These bins may represent the primary 3D direction of the histogram using frequency peaks. At each frequency peak interval, its X and Y components indicate the azimuth interval position and the inclination interval position, respectively, where the peak occurs; the Z component of which represents the frequency of the peak. In some embodiments, noise removal techniques may be applied to the histogram first to filter bins with frequencies less than a predetermined significance threshold.
In some embodiments, the machine vision system may identify frequency peaks by locating 2D blobs (e.g., connected components) in the filtered histogram, each characterized by a compact cluster of adjacent non-zero frequency bins. The location (X, Y) of a peak may be the centroid of its constituent bins (e.g., the average of the azimuth bin locations and tilt bin locations, weighted by their frequencies); the frequency value (Z) of the peak is the sum of the frequency values of its constituent bins. In some embodiments, if there are no significant high frequency bins in the histogram, the average of all directions may be designated as the dominant direction.
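The peak location and frequency computation can be sketched as follows. This NumPy illustration uses a hypothetical hand-made blob; in practice the blob's bins would come from connected-component labeling of the filtered histogram:

```python
import numpy as np

# Hypothetical blob: (azimuth_bin, tilt_bin, frequency) triples of one compact
# cluster of adjacent non-zero bins in the filtered 2D histogram.
blob = np.array([[10, 5, 2.0],
                 [11, 5, 6.0],
                 [10, 6, 2.0]])

freq = blob[:, 2]
# Peak (X, Y): frequency-weighted centroid of the constituent bin locations.
peak_xy = (blob[:, :2] * freq[:, None]).sum(axis=0) / freq.sum()
# Peak Z: sum of the constituent bins' frequency values.
peak_z = freq.sum()
```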
In some embodiments, the machine vision system may find the rotation that best aligns with the dominant direction in order to minimize the difference in 3D directions conveyed by the two sets of peak positions. The machine vision system may obtain a rigid transformation corresponding to the optimal rotation.
In some embodiments, a measure of the goodness of rotation (referred to as a rotation score for purposes of illustration) is calculated based on the distance between frequency peaks that are considered to correspond to the peaks. In some embodiments, the machine vision system first calculates an average of euclidean distances between vertices of a unit vector corresponding to the direction of the frequency peaks. The rotation score may be calculated as 1 minus the average distance.
Another example of a score that may be calculated is a frequency matching score. For each bin of the first histogram having a non-zero frequency value, the machine vision system may rotate the representative direction (e.g., taken as the center of the bin) by the transformation described above to obtain a mapped direction. The mapped direction falls in the set of directions corresponding to one of the bins of the second histogram. If multiple bins of the first histogram are mapped to the same bin of the second histogram, their frequencies are added to obtain a modified frequency. Thus, the first histogram is mapped into the space of the second histogram. For each bin, an overlap metric of the two frequencies associated with the bin (i.e., the frequency of the second histogram and the mapped frequency of the first histogram) is calculated. The overlap metric is the ratio of the smaller of the two frequencies to the larger frequency. Next, the machine vision system calculates the average of these bin overlap metrics over all bins having non-zero frequency content. Finally, the machine vision system obtains the frequency matching score by applying a piecewise quadratic sigmoid function to the average overlap metric. For example, it may be beneficial to apply a sigmoid function to push intermediate values toward 0 or 1.
In some embodiments, the overall similarity score between two 2D histograms may be calculated as the product of the rotation score and the frequency matching score. As with the score of the 1D histogram, the score may be in the range of 0 to 1, where the greater the score, the more similar the two histograms.
According to some embodiments, a 2D normal direction histogram may be used to measure the similarity between two object surfaces. In a 2D histogram of directions, column distances may reflect differences in azimuth angle linearly (modulo the 360-degree period), while row distances may reflect differences in tilt angle linearly. According to some embodiments, by matching a portion of the histogram, histograms calculated at different points of view of the scene and/or different portions of the 3D point cloud may be correlated. For example, the correspondence between at least a portion of a first set of peaks and at least a portion of a second set of peaks may be determined by matching one or more peaks of one histogram with one or more peaks of another histogram to compare the different histograms. For example, consider two point clouds acquired by imaging the same object in two different poses. The machine vision system may identify peak locations in the two histograms calculated for the two point clouds and establish correspondence between these frequency peaks. Such correspondence may allow, for example, the machine vision system to estimate the rotation of the object from one view to the next. In a similar manner, the translation between views may also be estimated from the centroid images.
According to some embodiments, a 1D histogram may be generated based on the 2D histogram. For example, by row/column summing the pixel values of a 2D image, two 1D direction histograms (e.g., related to tilt angle and azimuth angle) may be derived from the 2D histogram. Each 1D direction histogram may provide an effective feature for characterizing the surface of the object. Such a 1D histogram may be a useful indicator whenever the surface is fixed (as may be done using 3D based registration). Like distance-based 1D histograms, different scoring methods can be used to calculate the similarity between the two 1D direction histograms representing the two surfaces.
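Deriving the two 1D direction histograms by row/column summing is straightforward; below is a small sketch with a toy 2D histogram, assuming (as in the figures) that rows index tilt bins and columns index azimuth bins:

```python
import numpy as np

# Toy 2D direction histogram: rows = tilt bins, columns = azimuth bins.
hist2d = np.array([[2, 0, 1],
                   [0, 3, 0]])

tilt_hist = hist2d.sum(axis=1)      # row sums -> 1D tilt-angle histogram
azimuth_hist = hist2d.sum(axis=0)   # column sums -> 1D azimuth histogram
```

The resulting 1D histograms can then be compared with any of the 1D scoring methods described above.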
As described herein, techniques may be used to create global descriptors, such as providing a normal direction histogram of global feature descriptors. Global histograms may be used by existing image processing techniques for various applications, such as for object recognition, classification, and/or registration, among others. According to some embodiments, the highest frequency peak may be extracted from the histogram. For example, such peaks may provide useful statistical information about the surface properties of the object. Each peak may correspond to a dominant surface normal orientation in coordinate space, and its frequency may indicate the number of points whose surface normal is in that orientation. Thus, each peak may be characterized by an orientation and a frequency. Given a frequency threshold, the machine vision system may identify all peaks whose frequencies exceed the threshold. The resulting peaks may represent objects by the relationship of the number of peaks and the orientation they represent.
As described herein, the techniques may be used to create local descriptors, such as local direction histograms. According to some embodiments, local direction histograms may be used to represent a point cloud by its keypoints. For example, a keypoint of a point cloud may be characterized by its local direction histogram having peak frequencies on one or more non-zero rows (e.g., peaks not on row 0). A point may be considered a keypoint if any qualified peak exists on any non-zero row of its local direction histogram. A peak may be qualified if its corresponding frequency exceeds a threshold (e.g., a predetermined threshold). Each keypoint may be associated with a list of qualified peaks. Because the number of keypoints is typically much smaller than the number of original points, the point cloud of an object whose surface curvature varies can be reduced to its set of keypoints. Keypoints tend to occur around boundaries between surface patches having different orientations. Non-keypoints may be located on surfaces with a uniform direction (e.g., indicated by a large sum of frequencies on row 0 of the histogram), on surfaces whose normal direction changes slowly, and the like. Keypoints may play a more important role than non-keypoints in point cloud-based machine vision applications.
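The keypoint test described above can be sketched as follows, assuming the local direction histogram is stored with row 0 holding the locally dominant direction; `is_keypoint` and the patch values are illustrative assumptions:

```python
import numpy as np

def is_keypoint(local_hist, threshold):
    """A point qualifies as a keypoint when its local direction histogram has a
    qualified peak (frequency above threshold) on any non-zero row, i.e. when
    nearby normals tilt away from the locally dominant direction."""
    return bool((local_hist[1:, :] > threshold).any())

# Flat patch: all local normals agree, so all frequency sits on row 0.
flat_patch = np.zeros((5, 8))
flat_patch[0, 2] = 90.0

# Edge patch: a second surface orientation puts a qualified peak on row 3.
edge_patch = np.zeros((5, 8))
edge_patch[0, 2] = 40.0
edge_patch[3, 6] = 35.0
```

Applying the test to every point of the cloud reduces the cloud to its keypoint set, since the flat interior of each surface patch fails the test.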
According to some embodiments, local direction histograms may be used to search for objects using correspondences between two sets of keypoints. For example, given an object, keypoints may be extracted from a point cloud acquired at training time under a typical acquisition pose to form a reference model. At runtime, keypoints may be extracted from each acquired point cloud to form a runtime model from the resulting set of keypoints. By comparing the corresponding peaks of a pair of keypoints, a matching score between them can be calculated. For each correspondence of keypoints between the reference model and the runtime model, a total score may be obtained by summing the matching scores of the corresponding pairs of keypoints. Different methods, such as RANSAC, may be used to establish correspondence between the two sets of keypoints. The correspondence with the highest score may be used to determine the result.
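As an illustrative sketch (not the claimed method) of scoring keypoint correspondences, the snippet below represents each keypoint by its list of qualified peaks `(row, col, frequency)`, scores a pair of keypoints by the frequency mass of their shared bins, and keeps the best-scoring pairing; a production system would instead run a robust search such as RANSAC over full sets of correspondences.

```python
def peak_match_score(peaks_a, peaks_b):
    # Hypothetical scoring rule: sum the smaller frequency of every
    # (row, col) bin that appears in both keypoints' qualified-peak lists.
    bins_a = {(r, c): f for r, c, f in peaks_a}
    bins_b = {(r, c): f for r, c, f in peaks_b}
    return sum(min(bins_a[b], bins_b[b]) for b in bins_a.keys() & bins_b.keys())

def best_correspondence(model_keypoints, runtime_keypoints):
    # Greedy stand-in for a RANSAC-style search: try each candidate pairing
    # and keep the single correspondence with the highest matching score.
    best, best_score = None, -1.0
    for i, ka in enumerate(model_keypoints):
        for j, kb in enumerate(runtime_keypoints):
            s = peak_match_score(ka, kb)
            if s > best_score:
                best, best_score = (i, j), s
    return best, best_score

model = [[(3, 1, 40.0), (2, 5, 25.0)], [(4, 7, 60.0)]]
runtime = [[(4, 7, 55.0)], [(3, 1, 38.0), (2, 5, 30.0)]]
match, score = best_correspondence(model, runtime)
```

Here the reference model's first keypoint matches the runtime model's second keypoint, since the two share the most qualified-peak mass.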
According to some embodiments, local direction histograms may be used to search for objects by comparing models of the objects. For example, a model may be built for a point, which may include the point, an identified coordinate frame, a region of interest (ROI) around the point, and/or a local direction histogram calculated in the local coordinate frame using the points inside the ROI. The comparison can be made directly between the histograms of the two models involved (e.g., for a given point, one model built at training time and the other at run time). To improve efficiency, models can be built and searched only at the keypoints of the training-time and runtime point clouds.
Fig. 10 is a table 1000 illustrating exemplary similarity scores calculated using the 2D direction histograms described in connection with figs. 6-9, according to some embodiments. The images 652, 752, 852 and 952 from figs. 6-9 represent the fingerprints used to determine the values in the columns and rows, i.e., the histograms 650, 750, 850 and 950 discussed in connection with figs. 6-9, respectively. As shown along the diagonal, a full score of 1.0 is obtained when each histogram is compared to itself. No off-diagonal score is higher than 0.28, which indicates that no two of the histograms represent similar objects. For this example, to calculate the similarity of two 2D histograms, bins with the highest frequencies (e.g., frequency peaks exceeding a user-specified threshold) are identified for each histogram, each such bin representing a dominant 3D orientation of the surface it describes. As described herein, the highest peaks may be represented as the centroids of bins identified as belonging to the same blob in the histogram image. If both histograms contain enough frequency peaks, a fit is employed to correlate their dominant directions and find the best rotation between them, as described herein (e.g., because the histograms may have been generated from images of the object captured in different poses). A similarity score is then calculated, as described herein, by taking into account the error from the rotation estimate and the frequency similarity between the two sets of peak bins aligned by the rotation. If no sufficiently high-frequency bins are identified in either of the 2D histograms, the similarity score is instead calculated by correlating the 1D direction histograms derived for the respective tilt and azimuth angles.
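The full scoring pipeline of fig. 10 (peak extraction, rotation fitting, and the 1D fallback) is involved; as a much-simplified stand-in that still yields 1.0 along the diagonal, the sketch below uses normalized histogram intersection. This scoring rule is an assumption for illustration, not the method used to produce table 1000.

```python
import numpy as np

def similarity(hist_a, hist_b):
    """Normalized histogram intersection: 1.0 for histograms with identical
    shape, approaching 0.0 when no bin mass overlaps."""
    a = hist_a / hist_a.sum()
    b = hist_b / hist_b.sum()
    return float(np.minimum(a, b).sum())

# Two toy 2D direction histograms with mostly disjoint dominant bins.
h1 = np.array([[10.0, 0.0], [2.0, 8.0]])
h2 = np.array([[0.0, 9.0], [6.0, 5.0]])
```

Comparing each histogram with itself gives the diagonal value of 1.0, while the cross comparison gives a low off-diagonal score, mirroring the structure of table 1000.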
The embodiments discussed herein may be used in a variety of different applications, some of which may include, but are not limited to, part pick-up in vision-guided robots, three-dimensional inspection, automotive assembly, molded plastic and cast metal volume inspection, and assembly inspection. Such applications may include searching for and identifying the position and orientation of a pattern of interest within an image (e.g., to guide a robotic gripper, or to inspect an object).
The techniques operating in accordance with the principles described herein may be implemented in any suitable manner. The processes and decision blocks of the above flowcharts represent steps and actions that may be included in algorithms that perform these various processes. The algorithms resulting from these processes may be implemented as software integrated with and directing the operation of one or more single- or multi-purpose processors, may be implemented as functionally equivalent circuits, such as Digital Signal Processing (DSP) circuits or Application Specific Integrated Circuits (ASICs), or may be implemented in any other suitable manner. It will be appreciated that the flow charts included herein do not describe the syntax or operation of any particular circuit or of any particular programming language or type of programming language. Rather, the flow charts illustrate functional information that one skilled in the art could use to fabricate circuits or to implement computer software algorithms that perform the processing of a particular apparatus carrying out the types of techniques described herein. It is also to be understood that the specific order of steps and/or actions described in each flowchart is merely illustrative of algorithms that can be implemented and can be varied in implementations and embodiments of the principles described herein, unless otherwise indicated herein.
Accordingly, in some embodiments, the techniques described herein may be embodied in computer-executable instructions embodied in software, including application software, system software, firmware, middleware, embedded code, or any other suitable type of computer code. Such computer-executable instructions may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
When the techniques described herein are implemented as computer-executable instructions, these computer-executable instructions may be implemented in any suitable manner, including as a plurality of functional facilities, each providing one or more operations to complete execution of algorithms operating in accordance with these techniques. However instantiated, a "functional facility" is a structural component of a computer system that, when integrated with and executed by one or more computers, causes the one or more computers to perform a particular operational role. A functional facility may be a portion of a software element or an entire software element. For example, a functional facility may be implemented as a function of a process, as a discrete process, or as any other suitable processing unit. If the techniques described herein are implemented as multiple functional facilities, each functional facility may be implemented in its own manner; it is not necessary that all functional facilities be implemented in the same manner. Furthermore, these functional facilities may be executed in parallel and/or in series, as appropriate, and may communicate information between one another using shared memory on the one or more computers on which they execute, using a messaging protocol, or in any other suitable manner.
Generally, functional facilities include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functions of the functional facilities may be combined or distributed as desired in the system in which they operate. In some implementations, one or more functional facilities that perform the techniques herein may together form a complete software package. In alternative embodiments, these functional facilities may be adapted to interact with other unrelated functional facilities and/or processes to implement software program applications.
Some example functional facilities for performing one or more tasks have been described herein. However, it should be understood that the described functional facilities and task divisions are merely illustrative of the types of functional facilities that may implement the example techniques described herein, and that embodiments are not limited to implementation in any particular number, division, or type of functional facilities. In some implementations, all functions may be implemented in a single functional facility. It should also be appreciated that in some embodiments, some of the functional facilities described herein may be implemented together with or separate from other functional facilities (i.e., as a single unit or separate units), or some of these functional facilities may not be implemented.
In some embodiments, computer-executable instructions (when implemented as one or more functional facilities or in any other manner) that implement the techniques described herein may be encoded on one or more computer-readable media to provide functionality to the media. Computer-readable media include magnetic media such as a hard disk drive, optical media such as a Compact Disk (CD) or Digital Versatile Disk (DVD), permanent or non-permanent solid state memory (e.g., flash memory, magnetic RAM, etc.), or any other suitable storage medium. Such computer-readable media may be implemented in any suitable manner. As used herein, a "computer-readable medium" (also referred to as a "computer-readable storage medium") refers to a tangible storage medium. The tangible storage medium is non-transitory and has at least one physical structural component. In a "computer-readable medium" as used herein, at least one physical structural element has at least one physical property that may be altered in some way during the creation of a medium having embedded information, during the recording of information thereon, or during any other process in which the medium is encoded with information. For example, the magnetization state of a portion of the physical structure of the computer readable medium may change during recording.
Further, some of the techniques described above include acts of storing information (e.g., data and/or instructions) in a particular manner for use by such techniques. In some implementations of the techniques (such as implementations in which the techniques are implemented as computer-executable instructions), the information may be encoded on a computer-readable storage medium. Where specific structures are described herein as being in an advantageous format for storing such information, these structures may be used to impart a physical organization to the information when encoded on a storage medium. These advantageous structures may then provide functionality to the storage medium by affecting the operation of the one or more processors interacting with the information; for example, by increasing the efficiency of computer operations performed by one or more processors.
In some, but not all embodiments, the techniques may be implemented as computer-executable instructions that may be executed on one or more suitable computing devices operating in any suitable computer system, or the one or more computing devices (or one or more processors of the one or more computing devices) may be programmed to execute the computer-executable instructions. When the instructions are stored in a manner accessible to the computing device or processor, such as in a data store (e.g., an on-chip cache or instruction register, a computer-readable storage medium accessible via a bus, a computer-readable storage medium accessible via one or more networks and by the device/processor, etc.), the computing device or processor may be programmed to execute the instructions. The functional facility comprising these computer-executable instructions may be integrated with, and direct the operation of, a single multipurpose programmable digital computing device, a coordinated system of two or more multipurpose computing devices sharing processing capabilities in combination with performing the techniques described herein, a single computing device or coordinated system of computing devices (co-located or geographically distributed) dedicated to performing the techniques described herein, one or more Field Programmable Gate Arrays (FPGAs) for performing the techniques described herein, or any other suitable system.
The computing device may include at least one processor, a network adapter, and a computer-readable storage medium. For example, the computing device may be a desktop or laptop personal computer, a Personal Digital Assistant (PDA), a smart mobile phone, a server, or any other suitable computing device. The network adapter may be any suitable hardware and/or software that enables the computing device to communicate with any other suitable computing device via any suitable computing network, wired and/or wireless. The computing network may include wireless access points, switches, routers, gateways, and/or other network devices, as well as any suitable wired and/or wireless communication medium for exchanging data between two or more computers, including the internet. The computer readable medium may be adapted to store data to be processed by the processor and/or instructions to be executed by the processor. The processor is capable of processing data and executing instructions. The data and instructions may be stored on a computer readable storage medium.
The computing device may additionally have one or more components and peripherals, including input devices and output devices. These devices may be used, inter alia, to present a user interface. Examples of output devices that may be used to provide a user interface include printers or display screens for visual presentation of output, as well as speakers or other sound generating devices for audible presentation of output. Examples of input devices that may be used for the user interface include keyboards and pointing devices (such as mice, touch pads, and digitizing tablets). As another example, the computing device may receive input information through speech recognition or in other audible format.
Embodiments of techniques implemented in circuitry and/or computer-executable instructions have been described. It should be appreciated that some embodiments may be in the form of methods, at least one example of which has been provided. Acts performed as part of the method may be ordered in any suitable manner. Accordingly, embodiments may be constructed in which acts are performed in a different order than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in the illustrative embodiments.
The various aspects of the above-described embodiments may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and the principles described herein are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.
In the claims, the use of ordinal terms such as "first," "second," "third," etc., to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed; such terms are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for the use of the ordinal term).
Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," "having," "containing," "involving," and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
The term "exemplary" is used herein to mean serving as an example, instance, or illustration. Any examples, implementations, procedures, features, etc. described herein as exemplary should therefore be construed as illustrative examples and not as preferred or advantageous examples unless otherwise specified.
Having thus described several aspects of at least one embodiment, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the principles described herein. Accordingly, the foregoing description and drawings are by way of example only.
Various aspects are described in this disclosure, including but not limited to the following:
1. a computerized method for generating a histogram of a three-dimensional (3D) point cloud, the method comprising:
Receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
determining a reference in spatial relationship to the 3D point cloud;
determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and
generating, based on the set of distances, a histogram comprising a set of entries, including, for each entry in the set of entries, inserting the distances from the set of distances that are within a range of distances associated with the entry.
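A minimal sketch of aspect 1, taking the reference to be a single 3D point (aspects 4-7 alternatively permit a reference plane, a reference line, or an estimated centroid); `distance_histogram` and the bin parameters are illustrative assumptions:

```python
import numpy as np

def distance_histogram(points, reference_point, n_bins, bin_width):
    """Bin the distance of every 3D point to the reference to form the 1D
    histogram of aspect 1 (the reference here taken as a single 3D point)."""
    dists = np.linalg.norm(points - reference_point, axis=1)
    # Entry i collects the distances falling in [i*bin_width, (i+1)*bin_width).
    hist, _ = np.histogram(dists, bins=n_bins, range=(0.0, n_bins * bin_width))
    return hist

points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 0.0, 2.5],
                   [3.0, 0.0, 0.0]])
hist = distance_histogram(points, reference_point=np.zeros(3), n_bins=4, bin_width=1.0)
```

Swapping the distance computation (e.g., for the shortest distance to a plane or line) leaves the binning step unchanged.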
2. The method according to claim 1, further comprising:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and
storing the single 3D data point in the voxel.
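A minimal sketch of the voxel-grid reduction in aspect 2, using the voxel centroid as the single representative 3D data point (one plausible choice; the aspect does not mandate a particular reduction):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Bucket the cloud into equal-sized voxels and keep one representative
    3D point (here, the centroid) per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(int)
    buckets = {}
    for key, p in zip(map(tuple, keys), points):
        buckets.setdefault(key, []).append(p)
    return {key: np.mean(pts, axis=0) for key, pts in buckets.items()}

points = np.array([[0.1, 0.1, 0.1],
                   [0.3, 0.2, 0.4],   # shares a voxel with the point above
                   [1.6, 0.2, 0.1]])  # falls in a second voxel
voxels = voxel_downsample(points, voxel_size=1.0)
```

Downstream steps (e.g., the per-voxel distance computation of aspect 3) then operate on one point per voxel instead of the full cloud.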
3. The method of claim 2, wherein determining the set of distances comprises:
determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
4. The method according to any one of claims 1 to 3, wherein:
the reference is a two-dimensional (2D) reference plane; and
determining the distance of each 3D point to generate the set of distances comprises: determining a shortest distance of each 3D point to the reference plane.
5. The method according to any one of claims 1 to 4, wherein:
the reference is a reference line; and
determining the distance of each 3D point to generate the set of distances comprises: determining a shortest distance of each 3D point to the reference line.
6. The method of any one of claims 1 to 5, further comprising:
determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid.
7. The method of claim 6, wherein determining the distance of each 3D point to generate the set of distances comprises: determining a distance of each 3D point to the estimated centroid.
8. The method of any one of claims 1 to 7, further comprising: comparing the histogram to a second histogram generated for a second 3D point cloud to determine a measure of similarity between the 3D point cloud and the second 3D point cloud.
9. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors on a computing device, are operable to cause the one or more processors to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
determining a reference in spatial relationship to the 3D point cloud;
determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and
generating, based on the set of distances, a histogram comprising a set of entries, including, for each entry in the set of entries, inserting the distances from the set of distances that are within a range of distances associated with the entry.
10. The non-transitory computer-readable medium of claim 9, wherein the instructions are further operable to cause the one or more processors to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel;
For each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and
storing the single 3D data point in the voxel.
11. The non-transitory computer-readable medium of claim 10, wherein determining the set of distances comprises:
determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
12. The non-transitory computer readable medium of any one of claims 9 to 11, wherein:
the reference is a two-dimensional (2D) reference plane; and
determining the distance of each 3D point to generate the set of distances comprises: determining a shortest distance of each 3D point to the reference plane.
13. The non-transitory computer readable medium of any one of claims 9 to 12, wherein:
the reference is a reference line; and
determining the distance of each 3D point to generate the set of distances comprises: determining a shortest distance of each 3D point to the reference line.
14. The non-transitory computer-readable medium of any one of claims 9-13, wherein the instructions are further operable to cause the one or more processors to perform:
determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid and determining the distance of each 3D point comprises determining a distance of each 3D point to the estimated centroid.
15. A system comprising a memory storing instructions and at least one processor configured to execute the instructions to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
determining a reference in spatial relationship to the 3D point cloud;
determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and
generating, based on the set of distances, a histogram comprising a set of entries, including, for each entry in the set of entries, inserting the distances from the set of distances that are within a range of distances associated with the entry.
16. The system of claim 15, wherein the instructions are further operable to cause the at least one processor to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel;
For each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and
storing the single 3D data point in the voxel.
17. The system of claim 16, wherein determining the set of distances comprises:
determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
18. The system of any one of claims 15 to 17, wherein:
the reference is a two-dimensional (2D) reference plane; and
determining the distance of each 3D point to generate the set of distances comprises: determining a shortest distance of each 3D point to the reference plane.
19. The system of any one of claims 15 to 18, wherein:
the reference is a reference line; and
determining the distance of each 3D point to generate the set of distances comprises: determining a shortest distance of each 3D point to the reference line.
20. The system of any one of claims 15 to 19, wherein the instructions are further operable to cause the at least one processor to perform:
determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid and determining the distance of each 3D point comprises determining a distance of each 3D point to the estimated centroid.
21. A computerized method for generating a histogram of a three-dimensional (3D) point cloud, the method comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
generating a set of orientations, including determining an orientation of the 3D point for each 3D point in the 3D point cloud, wherein the orientation includes at least a first value of a first component and a second value of a second component;
generating, based on the set of orientations, a histogram comprising a set of intervals, wherein:
each interval in the set of intervals is associated with a first range of values of the first component and a second range of values of the second component; and
generating the histogram includes: for each interval in the set of intervals, adding the orientations from the set of orientations that have a first value and a second value within the first range of values and the second range of values, respectively, associated with the interval.
22. The method of claim 21, wherein the set of intervals are arranged in two dimensions, wherein a first dimension is associated with the first component and a second dimension is associated with the second component.
23. The method of any of claims 21-22, wherein the first component comprises an inclination angle and the second component comprises an azimuth angle.
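A minimal sketch of aspects 21-23, assuming the orientations are unit surface normals decomposed into tilt (inclination) and azimuth components; the function name and bin counts are illustrative:

```python
import numpy as np

def orientation_histogram(normals, tilt_bins=6, azimuth_bins=8):
    """Decompose each unit normal into a tilt angle (from +Z, 0..180 deg) and
    an azimuth angle (in the XY plane, 0..360 deg), then accumulate a 2D
    histogram whose rows span tilt ranges and whose columns span azimuth ranges."""
    tilt = np.degrees(np.arccos(np.clip(normals[:, 2], -1.0, 1.0)))
    azimuth = np.degrees(np.arctan2(normals[:, 1], normals[:, 0])) % 360.0
    hist, _, _ = np.histogram2d(
        tilt, azimuth,
        bins=(tilt_bins, azimuth_bins),
        range=((0.0, 180.0), (0.0, 360.0)),
    )
    return hist

normals = np.array([[0.0, 0.0, 1.0],                 # tilt 0  -> row 0
                    [0.0, 0.0, 1.0],
                    [0.70710678, 0.0, 0.70710678]])  # tilt 45 -> row 1
hist = orientation_histogram(normals)
```

With 6 tilt rows over 180 degrees and 8 azimuth columns over 360 degrees, each interval spans a 30-degree tilt range and a 45-degree azimuth range, matching the two value ranges called for by aspect 21.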
24. The method of any one of claims 21 to 23, further comprising:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and
storing the single 3D data point in the voxel.
25. The method of claim 24, wherein generating the set of orientations comprises:
determining, for each voxel in the 3D voxel grid, an orientation of the single 3D data point to generate the set of orientations.
26. The method of any one of claims 21 to 25, wherein generating the set of orientations comprises: determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
27. The method of any one of claims 21 to 26, wherein generating the set of orientations comprises: determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
28. The method of any of claims 21-27, further comprising comparing the histogram with a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud.
29. The method of claim 28, wherein comparing the histogram with the second histogram comprises:
determining a first set of peaks of the histogram and a second set of peaks of the second histogram;
determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.
30. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors on a computing device, are operable to cause the one or more processors to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
generating a set of orientations, including determining an orientation of the 3D point for each 3D point in the 3D point cloud, wherein the orientation includes at least a first value of a first component and a second value of a second component;
generating, based on the set of orientations, a histogram comprising a set of intervals, wherein:
each interval in the set of intervals is associated with a first range of values of the first component and a second range of values of the second component; and
generating the histogram includes: for each interval in the set of intervals, adding the orientations from the set of orientations that have a first value and a second value within the first range of values and the second range of values, respectively, associated with the interval.
31. The non-transitory computer-readable medium of claim 30, wherein the instructions are further operable to cause the one or more processors to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D data points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D data points; and
storing the single 3D data point in the voxel.
32. The non-transitory computer-readable medium of claim 31, wherein generating the set of orientations comprises:
determining, for each voxel in the 3D voxel grid, an orientation of the single 3D data point to generate the set of orientations.
33. The non-transitory computer readable medium of any one of claims 30-32, wherein generating the set of orientations comprises: determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
34. The non-transitory computer readable medium of any one of claims 30-33, wherein generating the set of orientations comprises: determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
35. The non-transitory computer-readable medium of any one of claims 30-34, wherein the instructions are further operable to cause the one or more processors to perform:
comparing the histogram with a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud, comprising:
determining a first set of peaks of the histogram and a second set of peaks of the second histogram; and
determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.
36. A system comprising a memory storing instructions and at least one processor configured to execute the instructions to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
generating a set of orientations, including determining, for each 3D point in the 3D point cloud, an orientation of the 3D point, wherein the orientation includes at least a first value of a first component and a second value of a second component;
generating, based on the set of orientations, a histogram comprising a set of bins, wherein:
each bin in the set of bins is associated with a first range of values of the first component and a second range of values of the second component; and
generating the histogram comprises, for each bin in the set of bins, adding each orientation from the set of orientations having a first value and a second value within, respectively, the first range of values and the second range of values associated with the bin.
37. The system of claim 36, wherein the instructions are further operable to cause the at least one processor to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
38. The system of any one of claims 36 to 37, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
39. The system of any one of claims 36 to 38, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
40. The system of any of claims 36-39, wherein the instructions are further operable to cause the at least one processor to perform:
comparing the histogram with a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud, comprising:
determining a first set of peaks of the histogram and a second set of peaks of the second histogram; and
determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.

Claims (40)

1. A computerized method for generating a histogram of a three-dimensional (3D) point cloud, the method comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
determining a reference in spatial relationship to the 3D point cloud;
determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and
generating, based on the set of distances, a histogram comprising a set of entries, including, for each entry in the set of entries, inserting distances from the set of distances that are within a range of distances associated with the entry.
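The distance-to-reference binning recited in claim 1 can be sketched as follows, here assuming a planar reference and NumPy; the function and parameter names are illustrative, not part of the claims:

```python
import numpy as np

def distance_histogram(points, plane_point, plane_normal, n_bins, max_dist):
    """Bin each 3D point's shortest (perpendicular) distance to a reference
    plane into a fixed set of histogram entries."""
    n = plane_normal / np.linalg.norm(plane_normal)
    # Shortest distance of each point to the plane: absolute value of the
    # signed perpendicular distance.
    dists = np.abs((points - plane_point) @ n)
    counts, edges = np.histogram(dists, bins=n_bins, range=(0.0, max_dist))
    return counts, edges

# Four points at heights 0, 1, 1, 3 above the z=0 reference plane.
pts = np.array([[0, 0, 0], [1, 2, 1], [3, 1, 1], [0, 0, 3]], dtype=float)
counts, edges = distance_histogram(pts, np.zeros(3), np.array([0.0, 0.0, 1.0]),
                                   n_bins=3, max_dist=3.0)
# → counts [1, 2, 1]: one distance in [0, 1), two in [1, 2), one in [2, 3].
```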
2. The method of claim 1, further comprising:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
3. The method of claim 2, wherein determining the set of distances comprises:
determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
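Claims 2 and 3 reduce each occupied voxel to a single representative point before computing distances. A minimal sketch, using the voxel centroid as the single 3D data point (the claims do not mandate a particular reduction; names are illustrative):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Collapse all points falling in the same cubic voxel to a single
    representative point (here, their centroid)."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    groups = {}
    for key, p in zip(map(tuple, keys), points):
        groups.setdefault(key, []).append(p)
    return np.array([np.mean(g, axis=0) for g in groups.values()])

pts = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.5, 0.0, 0.0]])
down = voxel_downsample(pts, voxel_size=1.0)
# Two voxels are occupied: the first representative point is the centroid
# of the first two input points, the second is the lone third point.
```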
4. The method according to claim 1, wherein:
the reference is a two-dimensional (2D) reference plane; and
determining the distance of each 3D point to generate the set of distances comprises determining a shortest distance of each 3D point to the reference plane.
5. The method according to claim 1, wherein:
the reference is a reference line; and
determining the distance of each 3D point to generate the set of distances comprises determining a shortest distance of each 3D point to the reference line.
6. The method of claim 1, further comprising:
determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid.
7. The method of claim 6, wherein determining the distance of each 3D point to generate the set of distances comprises determining a distance of each 3D point to the estimated centroid.
8. The method of claim 1, further comprising comparing the histogram with a second histogram generated for a second 3D point cloud to determine a measure of similarity between the 3D point cloud and the second 3D point cloud.
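Claim 8 leaves the similarity measure open. One common choice is normalized histogram intersection, sketched here as an illustration (not the patent's prescribed measure):

```python
import numpy as np

def histogram_similarity(h1, h2):
    """Normalized histogram intersection: 1.0 when the two histograms have
    identical shape after normalization, 0.0 when they are disjoint."""
    a = h1 / h1.sum()
    b = h2 / h2.sum()
    return float(np.minimum(a, b).sum())

# A histogram and a half-scale copy of it: same shape, so similarity 1.0
# regardless of the differing point counts behind the two clouds.
sim = histogram_similarity(np.array([4.0, 2.0, 2.0]), np.array([2.0, 1.0, 1.0]))
```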
9. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors on a computing device, are operable to cause the one or more processors to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
determining a reference in spatial relationship to the 3D point cloud;
determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and
generating, based on the set of distances, a histogram comprising a set of entries, including, for each entry in the set of entries, inserting distances from the set of distances that are within a range of distances associated with the entry.
10. The non-transitory computer-readable medium of claim 9, wherein the instructions are further operable to cause the one or more processors to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
11. The non-transitory computer-readable medium of claim 10, wherein determining the set of distances comprises:
determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
12. The non-transitory computer-readable medium of claim 9, wherein:
the reference is a two-dimensional (2D) reference plane; and
determining the distance of each 3D point to generate the set of distances comprises determining a shortest distance of each 3D point to the reference plane.
13. The non-transitory computer-readable medium of claim 9, wherein:
the reference is a reference line; and
determining the distance of each 3D point to generate the set of distances comprises determining a shortest distance of each 3D point to the reference line.
14. The non-transitory computer-readable medium of claim 9, wherein the instructions are further operable to cause the one or more processors to perform:
determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid and determining the distance of each 3D point comprises determining a distance of each 3D point to the estimated centroid.
15. A system comprising a memory storing instructions and at least one processor configured to execute the instructions to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
determining a reference in spatial relationship to the 3D point cloud;
determining, for each 3D point of the plurality of 3D points, a distance to the reference to generate a set of distances for the plurality of 3D points; and
generating, based on the set of distances, a histogram comprising a set of entries, including, for each entry in the set of entries, inserting distances from the set of distances that are within a range of distances associated with the entry.
16. The system of claim 15, wherein the instructions are further operable to cause the at least one processor to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
17. The system of claim 16, wherein determining the set of distances comprises:
determining, for each voxel in the 3D voxel grid, a distance from the single 3D data point to the reference to generate the set of distances.
18. The system of claim 15, wherein:
the reference is a two-dimensional (2D) reference plane; and
determining the distance of each 3D point to generate the set of distances comprises determining a shortest distance of each 3D point to the reference plane.
19. The system of claim 15, wherein:
the reference is a reference line; and
determining the distance of each 3D point to generate the set of distances comprises determining a shortest distance of each 3D point to the reference line.
20. The system of claim 15, wherein the instructions are further operable to cause the at least one processor to perform:
determining an estimated centroid of the 3D point cloud, wherein the reference is the estimated centroid and determining the distance of each 3D point comprises determining a distance of each 3D point to the estimated centroid.
21. A computerized method for generating a histogram of a three-dimensional (3D) point cloud, the method comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
generating a set of orientations, including determining, for each 3D point in the 3D point cloud, an orientation of the 3D point, wherein the orientation includes at least a first value of a first component and a second value of a second component;
generating, based on the set of orientations, a histogram comprising a set of bins, wherein:
each bin in the set of bins is associated with a first range of values of the first component and a second range of values of the second component; and
generating the histogram comprises, for each bin in the set of bins, adding each orientation from the set of orientations having a first value and a second value within, respectively, the first range of values and the second range of values associated with the bin.
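The two-component binning of claim 21 can be sketched with NumPy's 2D histogram, here using tilt and azimuth as the two components (the pairing claim 23 names); ranges and bin counts are illustrative assumptions:

```python
import numpy as np

def orientation_histogram(tilts, azimuths, tilt_bins=4, azimuth_bins=8):
    """Two-dimensional histogram over orientation components: each bin is
    associated with one range of tilt values and one range of azimuth
    values, and counts the orientations falling inside both ranges."""
    counts, tilt_edges, azimuth_edges = np.histogram2d(
        tilts, azimuths,
        bins=(tilt_bins, azimuth_bins),
        range=((0.0, np.pi), (-np.pi, np.pi)))
    return counts

# Three orientations: two nearly vertical with similar azimuths, one tilted.
tilts = np.array([0.1, 0.1, 1.6])
azimuths = np.array([0.5, 0.6, -2.0])
h = orientation_histogram(tilts, azimuths)
# The first two orientations share a bin; the third lands in a different one.
```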
22. The method of claim 21, wherein the set of bins is arranged in two dimensions, wherein a first dimension is associated with the first component and a second dimension is associated with the second component.
23. The method of claim 21, wherein the first component comprises a tilt angle and the second component comprises an azimuth angle.
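The tilt and azimuth components of claim 23 correspond to a spherical-coordinate decomposition of a direction vector; a minimal sketch (conventions assumed: tilt from +z in [0, pi], azimuth in the x-y plane in (-pi, pi]):

```python
import numpy as np

def tilt_azimuth(direction):
    """Decompose a 3D direction into a tilt angle measured from the +z axis
    and an azimuth angle measured in the x-y plane."""
    v = direction / np.linalg.norm(direction)
    tilt = np.arccos(np.clip(v[2], -1.0, 1.0))   # 0..pi from +z
    azimuth = np.arctan2(v[1], v[0])             # -pi..pi in the x-y plane
    return tilt, azimuth

# A direction along +x: tilt pi/2 (perpendicular to +z), azimuth 0.
t, a = tilt_azimuth(np.array([1.0, 0.0, 0.0]))
```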
24. The method of claim 21, further comprising:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
25. The method of claim 24, wherein generating the set of orientations comprises:
for each voxel in the 3D voxel grid, determining an orientation of the single 3D data point to generate the set of orientations.
26. The method of claim 21, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
27. The method of claim 21, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
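One common way to realize a per-point local coordinate system like claim 27's is to estimate a surface normal from each point's neighborhood via PCA: the normal is the eigenvector of the neighborhood covariance with the smallest eigenvalue. A sketch with a brute-force neighbor search (all names illustrative; the patent does not prescribe this estimator):

```python
import numpy as np

def local_normal(points, idx, k=8):
    """Estimate the orientation at points[idx] from its k nearest
    neighbors: the covariance eigenvector with the smallest eigenvalue."""
    p = points[idx]
    d = np.linalg.norm(points - p, axis=1)
    nbrs = points[np.argsort(d)[:k]]
    cov = np.cov((nbrs - nbrs.mean(axis=0)).T)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, 0]                     # smallest-eigenvalue direction

# Points scattered on the z=0 plane: the estimated normal should be ±z.
rng = np.random.default_rng(0)
plane = np.column_stack([rng.random(20), rng.random(20), np.zeros(20)])
n = local_normal(plane, idx=0)
```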
28. The method of claim 21, further comprising comparing the histogram to a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud.
29. The method of claim 28, wherein comparing the histogram with the second histogram comprises:
determining a first set of peaks of the histogram and a second set of peaks of the second histogram; and
determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.
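The peak-correspondence comparison of claim 29 can be sketched as local-maximum detection followed by greedy matching of nearby peak bins (tolerances and the matching rule are illustrative choices, not the patent's):

```python
import numpy as np

def histogram_peaks(h):
    """Indices of local maxima in a 1D histogram (strictly greater than
    both neighbors; endpoints are ignored for simplicity)."""
    return [i for i in range(1, len(h) - 1)
            if h[i] > h[i - 1] and h[i] > h[i + 1]]

def peak_correspondence(peaks_a, peaks_b, tol=1):
    """Greedily pair peaks whose bin indices differ by at most `tol`."""
    unused = list(peaks_b)
    matches = []
    for p in peaks_a:
        best = min(unused, key=lambda q: abs(q - p), default=None)
        if best is not None and abs(best - p) <= tol:
            matches.append((p, best))
            unused.remove(best)
    return matches

h1 = np.array([0, 5, 1, 0, 4, 0])
h2 = np.array([0, 4, 0, 0, 5, 1])
m = peak_correspondence(histogram_peaks(h1), histogram_peaks(h2))
# Both histograms have peaks at bins 1 and 4, so both pairs match.
```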
30. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors on a computing device, are operable to cause the one or more processors to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
generating a set of orientations, including determining, for each 3D point in the 3D point cloud, an orientation of the 3D point, wherein the orientation includes at least a first value of a first component and a second value of a second component;
generating, based on the set of orientations, a histogram comprising a set of bins, wherein:
each bin in the set of bins is associated with a first range of values of the first component and a second range of values of the second component; and
generating the histogram comprises, for each bin in the set of bins, adding each orientation from the set of orientations having a first value and a second value within, respectively, the first range of values and the second range of values associated with the bin.
31. The non-transitory computer-readable medium of claim 30, wherein the instructions are further operable to cause the one or more processors to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
32. The non-transitory computer-readable medium of claim 31, wherein generating the set of orientations comprises:
for each voxel in the 3D voxel grid, determining an orientation of the single 3D data point to generate the set of orientations.
33. The non-transitory computer-readable medium of claim 30, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
34. The non-transitory computer-readable medium of claim 30, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
35. The non-transitory computer-readable medium of claim 30, wherein the instructions are further operable to cause the one or more processors to perform:
comparing the histogram with a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud, comprising:
determining a first set of peaks of the histogram and a second set of peaks of the second histogram; and
determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.
36. A system comprising a memory storing instructions and at least one processor configured to execute the instructions to generate a histogram of a three-dimensional (3D) point cloud, comprising:
receiving data indicative of a 3D point cloud comprising a plurality of 3D points;
generating a set of orientations, including determining, for each 3D point in the 3D point cloud, an orientation of the 3D point, wherein the orientation includes at least a first value of a first component and a second value of a second component;
generating, based on the set of orientations, a histogram comprising a set of bins, wherein:
each bin in the set of bins is associated with a first range of values of the first component and a second range of values of the second component; and
generating the histogram comprises, for each bin in the set of bins, adding each orientation from the set of orientations having a first value and a second value within, respectively, the first range of values and the second range of values associated with the bin.
37. The system of claim 36, wherein the instructions are further operable to cause the at least one processor to perform:
generating a 3D voxel grid for at least a portion of the 3D point cloud, wherein each voxel in the 3D voxel grid comprises the same set of dimensions;
determining, for each voxel in the 3D voxel grid, whether one or more of the plurality of 3D points are within the voxel to generate a set of 3D points associated with the voxel;
for each voxel in the 3D voxel grid having an associated set of 3D points, determining a single 3D data point for the voxel based on the associated set of 3D points; and
storing the single 3D data point in the voxel.
38. The system of claim 36, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a fixed coordinate system associated with the 3D point cloud.
39. The system of claim 36, wherein generating the set of orientations comprises determining, for each 3D point in the 3D point cloud, an orientation of the 3D point based on a local coordinate system associated with the 3D point in the 3D point cloud.
40. The system of claim 36, wherein the instructions are further operable to cause the at least one processor to perform:
comparing the histogram with a second histogram associated with a second 3D point cloud to determine data indicative of a similarity measure between the 3D point cloud and the second 3D point cloud, comprising:
determining a first set of peaks of the histogram and a second set of peaks of the second histogram; and
determining a correspondence between at least a portion of the first set of peaks and at least a portion of the second set of peaks.
CN202180061425.1A 2020-05-11 2021-05-10 Method and device for generating point cloud histogram Pending CN116783619A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/023,163 2020-05-11
US202063065456P 2020-08-13 2020-08-13
US63/065,456 2020-08-13
PCT/US2021/031519 WO2021231265A1 (en) 2020-05-11 2021-05-10 Methods and apparatus for generating point cloud histograms

Publications (1)

Publication Number Publication Date
CN116783619A true CN116783619A (en) 2023-09-19

Family

ID=87993560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180061425.1A Pending CN116783619A (en) 2020-05-11 2021-05-10 Method and device for generating point cloud histogram

Country Status (1)

Country Link
CN (1) CN116783619A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030202692A1 (en) * 2002-04-29 2003-10-30 Pere Obrador Method and system for processing images using histograms
US20130321392A1 (en) * 2012-06-05 2013-12-05 Rudolph van der Merwe Identifying and Parameterizing Roof Types in Map Data
CN109215129A (en) * 2017-07-05 2019-01-15 中国科学院沈阳自动化研究所 A kind of method for describing local characteristic based on three-dimensional point cloud
US20190258225A1 (en) * 2017-11-17 2019-08-22 Kodak Alaris Inc. Automated 360-degree dense point object inspection
US20200132822A1 (en) * 2018-10-29 2020-04-30 Dji Technology, Inc. User interface for displaying point clouds generated by a lidar device on a uav


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FEI WANG et al.: "An Improved Point Cloud Descriptor for Vision Based Robotic Grasping System", Sensors (www.mdpi.com/journal/sensors), 14 May 2019 (2019-05-14), pages 1-15 *
RADU BOGDAN RUSU et al.: "Towards 3D Object Maps for Autonomous Household Robots", Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2 November 2007 (2007-11-02), pages 3191-3198 *
TORSTEN FIOLKA et al.: "Distinctive 3D Surface Entropy Features for Place Recognition", 2013 European Conference on Mobile Robots (ECMR), 27 September 2013 (2013-09-27), pages 204-209, XP032548177, DOI: 10.1109/ECMR.2013.6698843 *

Similar Documents

Publication Publication Date Title
CN108604301B (en) Keypoint-based point pair features for scalable automatic global registration for large RGB-D scans
JP6744847B2 (en) Selection of balanced probe sites for 3D alignment algorithms
Kasaei et al. GOOD: A global orthographic object descriptor for 3D object recognition and manipulation
JP5705147B2 (en) Representing 3D objects or objects using descriptors
JP5677798B2 (en) 3D object recognition and position and orientation determination method in 3D scene
Lloyd et al. Recognition of 3D package shapes for single camera metrology
CN111178250A (en) Object identification positioning method and device and terminal equipment
US20170308736A1 (en) Three dimensional object recognition
CN116134482A (en) Method and device for recognizing surface features in three-dimensional images
US11321953B2 (en) Method and apparatus for posture, dimension and shape measurements of objects in 3D scenes
US11468609B2 (en) Methods and apparatus for generating point cloud histograms
Kroemer et al. Point cloud completion using extrusions
Sansoni et al. Optoranger: A 3D pattern matching method for bin picking applications
Kanezaki et al. High-speed 3d object recognition using additive features in a linear subspace
Sølund et al. A large-scale 3D object recognition dataset
Kordelas et al. Viewpoint independent object recognition in cluttered scenes exploiting ray-triangle intersection and SIFT algorithms
CN116783619A (en) Method and device for generating point cloud histogram
Canelhas Truncated signed distance fields applied to robotics
CN116420167A (en) Method and apparatus for determining a volume of a 3D image
Zhou et al. Feature Fusion Information Statistics for feature matching in cluttered scenes
Rozsa et al. Exploring in partial views: Prediction of 3D shapes from partial scans
US20240054676A1 (en) Methods and apparatus for determining orientations of an object in three-dimensional data
JP2019070898A (en) Estimation program, estimation device and estimation method
Jiandong Ellipse detection based on principal component analysis
CN112991423B (en) Sorting method, device, equipment and storage medium for logistics package

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination