WO2022258947A1 - Aligning 3d datasets - Google Patents
Aligning 3D datasets
- Publication number
- WO2022258947A1 (PCT/GB2022/051325)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dataset
- point
- point cloud
- cell
- axis
- Prior art date
Classifications
- G06T7/30: Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33: Image registration using feature-based methods
- G01C11/00: Photogrammetry or videogrammetry, e.g. stereogrammetry; photographic surveying
- G06T7/66: Analysis of geometric attributes of image moments or centre of gravity
- G06T2207/10028: Range image; depth image; 3D point clouds
- G06T2207/20021: Dividing image into blocks, subimages or windows
- G06T2207/20068: Projection on vertical or horizontal image axis
Definitions
- the present disclosure relates to the rotational and/or translational alignment (or registration) of 3D datasets, for example in the field of 3D surveying, mapping, and/or imaging.
- 3D survey datasets enable the creation of computerised 3D datasets, models, and meshes for analysis, calculation, measurement and monitoring of the surveyed space. It is often the case that not all parts of a field of interest (or space or volume to be surveyed) can be included in the field of view of an imaging apparatus (or scanner) from a single position. Multiple images, scans or surveys are therefore taken at different times with the imaging apparatus at different locations and orientations, or simultaneously with multiple variously positioned and oriented scanners, and the multiple images are subsequently aligned and/or merged.
- Some techniques for determining rotational and translational alignment of the multiple 3D datasets can be labour intensive.
- Examples of the present disclosure relate to determining a centre of rotation of a 3D dataset, being a cloud of points, so that the 3D dataset may be rotated, about the determined centre of rotation, to be brought into rotational alignment with another 3D dataset. Examples of the present disclosure also relate to determining a translational amount by which a 3D dataset, moved by that translational amount in a plane, or in a direction of an axis, may be brought into translational alignment with another 3D dataset.
- Some examples relate to calculating a rotational angle and first and second translational amounts by which a 3D dataset, rotated by the angle and translated by the first and second amounts in a plane and along a perpendicular axis (perpendicular to the plane), may be brought into rotational and translational alignment with another dataset (e.g. brought into alignment in 3 dimensions).
- the 3D dataset may be a dataset of a space and may comprise a point cloud in three dimensions, with each point in the point cloud representing a reading of the space taken by an imaging apparatus.
- the examples disclosed herein are advantageous in situations where the space (represented by the 3D dataset) comprises unknown features and/or no features and/or features that do not comprise distinct elements (for example, distinct edges or smooth surfaces). Accordingly, the examples disclosed herein are advantageous when the space is a natural scene (such as a forest or beach etc.) as well as when the space comprises a human-made structure.
- an apparatus according to claim 1 and a method according to claim 23.
- a non- transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of claim 23, and a computer program which, when executed by a computing apparatus, causes the computing apparatus to execute the method of claim 23.
- Optional features are set out in the dependent claims.
- the first example relates to determining a centre of rotation for a 3D dataset about which it may be rotated to be brought into rotational alignment with another dataset.
- the at least one processor may be configured to translate the rotated dataset, or the method may comprise translating the dataset.
- the rotated 3D dataset may thereafter be translated in the direction of a plane in the common coordinate system and then further translated in an axis normal to the plane, according to the second and third examples.
- an apparatus according to claim 9 and a method according to claim 31.
- a non-transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of claim 31, and a computer program which, when executed by a computing apparatus, causes the computing apparatus to execute the method of claim 31.
- Optional features are set out in the dependent claims.
- the second example relates to rotating and translating a 3D dataset to align the 3D dataset with another 3D dataset in a plane in the common coordinate system onto which the datasets are projected.
- one dataset may be rotated in the plane about a centre, which may be the centre determined according to the first example, or which may be a different centre, for example an origin of the dataset to be rotated or point corresponding to a location of the imaging apparatus that took the readings of the first space.
- the dataset, rotated and translated in the plane according to the second example may then be aligned in an axis perpendicular to the plane according to the third example.
- the degree of match may be output and/or stored.
- determining the first angle and translational amount may comprise determining the degree of match.
- the rotated and translated dataset may be merged, or combined, with the un-aligned dataset into a merged dataset (or merged point cloud) that may be output and/or stored.
- an apparatus according to claim 45 and a method according to claim 55.
- a non-transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of claim 55, and a computer program which, when executed by a computing apparatus, causes the computing apparatus to execute the method of claim 55.
- Optional features are set out in the dependent claims.
- the third example relates to aligning a 3D dataset with another 3D dataset along an axis in the common coordinate system.
- the dataset that is aligned in the axis according to the third example may have already been aligned in the plane (according to the second example, e.g. using the first example to determine the centre of rotation).
- the translated point cloud may be merged, or combined, with the un-translated point cloud to form a merged point cloud that may be output and/or stored.
- Any of the first to third examples may therefore be performed on any dataset, for example a rotated and/or translated (in the plane) and/or a translated (in an axis) dataset, or an un-rotated and/or translated dataset.
- the first example may be used to determine a centre of rotation for any dataset, or a dataset rotated and translated in the plane according to the second example, or a dataset translated in an axis according to the third example
- the second example may be used to align, in a plane, a dataset translated in an axis according to the third example
- the third example may be used to align, in an axis, a dataset rotated and translated in a plane according to the second example, and in any of these examples the centre of rotation may be determined according to the first example.
- The terms “first”, “second”, and “third”, etc. in this disclosure are used as labels having regard to the order in which certain features are presented in a method or in a process performed by a processor. They should therefore not be regarded as limiting, nor necessarily as being used to distinguish one element from another.
- various “first” and “second” objects, e.g. a “first 3D dataset”, “second 3D dataset”, “first 2D plane”, “second 2D plane” etc., are not intended to distinguish said objects from one another and, in this regard, the first 3D dataset could, wholly or partially, comprise the second 3D dataset and the first 2D plane could comprise or be the same as the second 2D plane etc.
- these labels could refer to the same, or different, elements.
- the datasets are 3D datasets.
- the imaging apparatus may be, for example, a 3D scanner.
- Each dataset contains data representing a 3D image of a subject of interest.
- the first and/or second spaces may be scanned by an imaging apparatus, which may comprise a scanner etc.
- the first and/or second spaces may each comprise a different subset of a wider field of interest (such as a wider space) and, as stated elsewhere, the first and second spaces may overlap (comprising any degree of overlap).
- the first and/or second spaces may also comprise a volume.
- the first and/or second spaces may be scanned by the same imaging apparatus or by a different imaging apparatus.
- the spaces themselves may comprise a hollow space (such as a room etc.) or an open space (such as a field etc.) or an object (e.g. a chair etc.). In some examples one space may be contained in another. In these examples, one space may represent an object within the other space.
- a degree of match between the object (represented by one 3D dataset) and the space in which it is located (represented by the other 3D dataset) may be proportional to a probability that the object is located within the space, for in that instance the 3D dataset of the space will contain the object.
- if the object is a known object, e.g. known to be a chair, then the degree of match may be proportional to a probability that the known object, or known object type, is located within the space.
- the apparatuses and methods disclosed herein may comprise an object recognition apparatus and an object recognition method, respectively.
- the examples are analytical and obviate any requirement for artificial targets to be manually positioned and, where high accuracy is required, separately surveyed, within a space to be imaged.
- Known methods, which make use of natural or artificial targets, for imaging a large space, consisting of many scans, are set out in the ‘Appendix’ section at the end of the detailed description in this document.
- the examples herein do not require natural targets to be found manually by eye or automatically; in consequence, the examples herein can work in featureless environments, environments with random features, or environments with many types of features that could easily be confused for one another (where mismatches might be made between non-matching features).
- Spaces can be scanned by a scanner (imaging apparatus) to generate separate datasets representing said scanning without the need to maintain a consistent orientation and/or position of an imaging apparatus between the datasets, thereby speeding up the setting-up of the imaging procedure, whether or not successive scans were recorded sequentially using a static or mobile scanner, or different scanners.
- the process of aligning scans second by second can also be used to determine where the scanner is and its trajectory.
- the examples herein can be used to check the results.
- the rotational and translational alignment of 3D datasets allows subsequent highly accurate measurement of distances on the aligned dataset, scan, point cloud, model or mesh without having to go back to the site to do the measurement.
- This may, in turn, allow an area of walls to be determined (e.g. to find out how much paint is needed to paint them), or floors for flooring or ceilings.
- This may also be used to find the volume of walls to work out how many skips are needed to take away the rubble if the wall is knocked down.
- This may also allow the volume of the air in a room to be determined to work out how much heating is needed and the window area to find out how much heat is leaking out of the windows or how much light is coming in.
- the rotational and translational alignment examples described herein can also allow the determination of, for example, by how much a roof of a tunnel has sagged or by how much walls have moved or changed since a scan of the same region taken earlier (e.g. years ago, or even the same region a few days before), where work on the tunnel has been conducted since and/or material has been removed from the fabric of the tunnel.
- the alignment could also allow scans from day to day to be compared to determine, e.g., if someone has dropped something somewhere or left a truck or crate somewhere, etc.
- the examples herein may be extended to deal with more than one or more than two scans (e.g. >2 datasets) of a space and, in these examples, one of the 3D datasets for which a centre is to be determined (e.g. according to the first example) or rotated and translated in the plane (e.g. according to the second example) or translated in an axis (e.g. according to the third example) may comprise an aligned 3D dataset (e.g. a pair of overlapping datasets that can be considered a single dataset). In this way, individual pairs of overlapping scans may be aligned in turn or simultaneously in a parallel cloud processor for computational speed.
- the datasets may comprise scan datasets and/or image datasets (e.g. data generated by scanning or imaging a physical space).
- the 3D datasets can be represented by 3D point clouds and viewed as 3D images.
- the or each of the datasets may comprise data recorded at successive points in the imaging process, each point being separated from its neighbour to a degree dictated by the resolution and direction of the imaging, each point recording data about the position of the point relative to the imaging apparatus or relative to an already-scanned (in the same scan) point, and each point being generated by an imaging apparatus in response to a reading, in other words, each point in the dataset represents a reading by an imaging apparatus.
- a reading is a recording of the response received by the imaging apparatus from an interaction with the transmitted or reflected beam at the point of interrogation when the scanner is active.
- the imaging apparatus may comprise a moving camera, or two cameras with a fixed distance apart, or multiple cameras (e.g. photogrammetry).
- the imaging apparatus may illuminate the scene or use natural/room light, and/or may comprise an RGB-D camera.
- the scene can be illuminated using structured illumination.
- the imaging apparatus may comprise a thermal camera and/or may be configured to receive radiation.
- the 3D dataset acquisition unit according to any of the examples is configured to obtain a dataset, and may be configured to obtain the dataset directly from the or each imaging apparatus that took the readings of the or each space, or from another entity storing the dataset(s).
- the input(s), e.g. the or each 3D dataset, may be restricted to being a scan, or scans, of a physical space or field or volume of interest.
- examples may obtain 3D datasets of a virtual space as one or both of the first and second 3D datasets.
- 3D point clouds may be represented as mesh models in virtual space to reduce memory, storage, rotation and translation calculation time, and transmission bandwidth requirements. 3D datasets may be generated as mesh models for use in virtual space.
- the mesh models may be converted into point clouds, for example, by changing the mesh nodes to points or by interpolation and sampling, after which the methods described in this document can be used.
- the or each 3D dataset may be obtained from the same imaging apparatus or may be obtained from different imaging apparatuses.
- the imaging apparatuses are operable to take readings within a space.
- the readings are, for example, locations of emission or reflection or absorption of a wave or particle detected by the imaging apparatus.
- the readings may have been taken at any time, including time in the past.
- the imaging apparatuses are operable to interpret readings as physical features within 3D space, and to generate a data point in a point cloud corresponding to the reading.
- the point density is higher on surfaces or interfaces so the 3D space may be more strongly represented by surfaces or interfaces.
- a physical feature may be, for example, a surface or an interface between two materials, or any feature within a space such as a natural or man-made object, or any part of an indoor/outdoor space.
- the imaging apparatus used to generate the or each 3D dataset may image using different imaging techniques.
- the first imaging apparatus may be an X-ray scanner and the second imaging apparatus an MRI scanner.
- an extra variable of scale or magnification can be varied along with translations until a best degree of match is found.
- the reading can record the distance from the imaging apparatus to the point. Readings can record such values as position in x, y, z Cartesian coordinates or in cylindrical or spherical or geographic or other coordinates such as space-time.
- the reading can include date and time and person or instrument doing the recording as well as the resolution set and power of the laser used.
- the reading can record the strength of the reflected or transmitted or absorbed signal from the laser or sound wave.
- the reading may record the intensity and colour of any light or radiation or sound emitted from the point and detected by the apparatus.
- the reading may also include a property of the point and/or its neighbouring points such as the curvature of the surface and the position and orientation of a small flat plane patch fitted to the point and its neighbouring points which can be represented as a surface normal vector.
- Generalized ICP (GICP)
- the reading can record the resistivity or conductivity or capacitance or inductance or electrical complex permittivity or magnetic complex permeability of the space or the speed of travel of electromagnetic or sound waves at that point.
- the reading may record the colour of a surface or volume in r, g, b coordinates or in any of the following colour coordinates: CIELAB, CIELUV, CIExyY, CIEXYZ, CMY, CMYK, HLS, HSI, HSV, HVC, LCC, NCS, PhotoYCC, RGB, Y'CbCr, Y'IQ, YPbPr and YUV (reference: https://people.sc.fsu.edu/~jburkardt/f_src/colors/colors.html).
- the reading may record the texture or roughness of a surface, or the material of which a surface or volume is made, or determine the density of the volume of material in examples using X-rays.
- the reading may record any local movement velocity and acceleration and vector direction (for periodic or vibrational motion, or for moving objects, either so that they can be captured in 3D or excluded from the 3D capture of the rest of the scene) of the point being read over a short time scale by using a method such as Doppler. Note that when the imaging apparatus looks in one direction it may receive back more than one reading. If there is a solid opaque surface there may be one reading. See, for example, Heinzel, Johannes, and Barbara Koch. "Exploring full-waveform LiDAR parameters for tree species classification." International Journal of Applied Earth Observation and Geoinformation 13, no.
- the first space and the second space overlap.
- the extent of the overlap may be implementation dependent, and will vary according to each pair of datasets.
- the extent of the overlap may depend on the number of features in the overlap region and the overlap may be the overlap of common features.
- the overlap may be at least a partial overlap, and may be total, for example, where the scans are taken at different times or with scanners using differing technologies.
- one scanner may be aerial with a low ground resolution and another scanner may be terrestrial on a tripod with a high ground resolution.
- the overlap may be a common physical space represented in both the first and second datasets. Examples may include a pre-processing step as part of the obtaining by the 3D acquisition unit to remove from the input datasets some non-overlapping data.
- the or each 3D dataset is stored as a respective point cloud in a common coordinate system.
- the common coordinate system may comprise a workspace, e.g. a virtual workspace.
- the or each 3D dataset has a native coordinate system in which the respective clouds of points are defined. It may therefore be a prerequisite of the examples disclosed herein that the or each of the first and second 3D datasets are compatible with the common coordinate system of the storage unit.
- the native coordinate system may be relative to the position of the imaging apparatus for that dataset as the origin and orientation of the imaging apparatus.
- the or each point cloud may be differently rotated and translated when placed into the workspace.
- the determined centre of rotation, determined angle and translational amount (in the plane) and/or the determined translational amount (in an axis) may be output.
- the output may comprise the determined values themselves or may comprise a dataset, e.g. a transformed dataset, e.g. a rotated and translated dataset (second example) or a translated dataset (third example).
- the term “aligned dataset” may refer to a dataset that has been rotated and/or translated (either in a plane or in an axis) or a dataset that has been rotated and translated in a plane and/or then translated in an axis perpendicular to the plane.
- Rotational and/or translational values may be determined relative to, or in, the common coordinate system and therefore may be regarded as in “common coordinate system units.” In some examples, therefore, to rotate and/or translate one of the point clouds a conversion may be performed in which the calculated amounts are converted into “point cloud units,” and then the point cloud is aligned by the converted units.
- the output may therefore comprise a point cloud, rotated and/or translated in a plane and/or translated in an axis by the at least one processor.
- the output first point cloud may comprise an aligned point cloud, and therefore may be rotationally and/or translationally aligned with the stored second point cloud.
- the output may also include a copy of the stored second point cloud. For example, if the common coordinate system is different from the native coordinate system of the second 3D dataset, then in order to achieve two rotationally aligned datasets where physical entities represented in both datasets are co-aligned in the common coordinate system the apparatus may output both the aligned first 3D dataset and the second 3D dataset as respective clouds of points in the common coordinate system.
- the rotationally aligned datasets may be merged and output as a single image dataset file.
- by rotationally aligned datasets it may be taken to mean that physical entities represented in both datasets are co-aligned in the common coordinate system of the two datasets.
- the output (e.g. the rotated and/or translated point cloud) may be combined with another scan and/or both merged together, optionally with the degree of match; or only the degree of match may be output, optionally with the rotation and translations needed to achieve it, and the same for any other scans etc. with the highest degrees of match.
- For finding the trajectory of a mobile scanner (drone, person, vehicle, etc.), Visual SLAM and visual odometry techniques may be used.
- Visual SLAM reference: Taketomi, Takafumi, Hideaki Uchiyama, and Sei Ikeda.
- Visual odometry references: Nister, David, Oleg Naroditsky, and James Bergen. "Visual odometry." In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), vol. 1, pp. I-I. IEEE, 2004; Scaramuzza, Davide, and Friedrich Fraundorfer. "Visual odometry [tutorial]." IEEE Robotics & Automation Magazine 18, no.
- the position of the or each point cloud in the common coordinate system may also be output. If this was not previously known or output, it may be output here in the common coordinate system.
- the values or datasets may be stored, either as an alternative or in addition to being outputted. Any output may be in the form of an amount or a vector (e.g. indicating a direction of rotation or translation and/or an amount etc.). For example, a translational amount may be output as a direction of movement and an amount.
- a subset of grid cells may be removed. This subset may correspond to a boundary of the space.
- the subset to be removed may comprise those grid cells that are less than a predetermined threshold distance away from the nearest cell containing points. Following their removal, the mean position of the remaining grid cells may be determined (the 2D grid of cells having been defined by projecting the dataset onto a plane and dividing the projected image into pixels, i.e. the cells, each comprising a number of points). For example, if the space to be imaged is a room or tunnel etc., the walls of said room or tunnel may correspond to a boundary of that space, with the cells corresponding to those walls being removed in this step.
- the grid cell nearest to the determined mean may be identified, and the centre of that grid cell may be selected as the components of the centre of rotation in the axes defining the plane.
- a distance map may be created for the grid by assigning a value to each cell, the assigned value representing the distance to the nearest cell having points therein (which may be a measure of how close that cell is to the boundary of the first space). Either those grid cells having a distance value less than a predetermined threshold may be removed, or it may be determined which grid cells have distance values that are local maxima and the remaining grid cells may be removed.
- the coordinate of the centre of rotation in an axis perpendicular to the plane (thereby defining a normal to the plane) may be determined.
- the range of coordinates in the perpendicular axis may be determined and, if the range is less than a predetermined threshold, the coordinate of the centre of rotation in the perpendicular axis may be determined as the mean value of all the point coordinates on the perpendicular axis.
- the determined mean may be summed with a parameter that is proportional to the height of the imaging apparatus that took the readings of the space.
- if the range is not less than the predetermined threshold, it may be determined which points have the respective highest and lowest coordinates in the perpendicular axis; the midpoint between these two values may then be selected as the coordinate of the centre of rotation in the perpendicular axis.
- a grid of cells (as described above) contained in the plane onto which the datasets were projected may be defined for each projected dataset, if not already defined, depending on the example.
- Each cell in each respective grid may be of a predetermined size.
- Each cell in each respective grid may comprise a number of points in one or both of the respective datasets.
- a filter may be applied to the projected datasets to produce filtered datasets.
- for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, the number of points in the cell may be increased, and/or for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, the number of points in the cell may be decreased.
- only those grid cells with points from each of the point clouds within it may be selected for further processing, in other words, those grid cells constituting an overlap region between the point clouds may be chosen.
- one of the projected datasets may be iteratively rotated about the axis of rotation by a predetermined amount at each iteration until the dataset has been rotated by 360 degrees.
- a spatial correlation may be performed in the plane to determine a degree of offset between the rotated projected dataset and the un-rotated projected dataset.
- an angle and a translational amount for which a degree of match between the rotated projected dataset and the un-rotated projected dataset is the largest may be determined as the first angle and translational amount in the plane.
- the degree of offset may be determined based on the position of the highest correlation peak (e.g. the offset of the peak from the zero-shift position).
- a translational amount of 0 may be determined in the axis if the overlap region is 0 (or empty, e.g. comprises no cells for which there is overlap).
- for each cell in the overlap region, the points in the point cloud to be translated having the highest and lowest values in the axis are determined and recorded, and the points in the other point cloud having the highest and lowest values in the axis are determined and recorded.
- two ranges, one for each point cloud, are thereby determined for each cell and may be compared.
- a midpoint for each range may be determined and then a translational amount in the first axis to align the midpoints may be determined, for each cell.
- the mean of these amounts may be determined as the translation amount in the axis.
- the point in the point cloud to be translated and having the highest (or lowest) value in the axis may be determined and the point in the other point cloud having the highest (or lowest) value in the axis may be determined. Then, for each cell, a translational amount in the axis may be determined to bring the highest (or lowest) values into alignment. The mean of these may be selected as the translation amount (or degree of translation) in the axis.
- a measure of the distance between the respective ranges may be determined and those ranges for which the distance is above a predetermined threshold may be discarded. The above processes may then be performed for the remaining, un-discarded ranges.
- the projection of each point in the point cloud to be translated and the other point cloud onto the axis may be determined, forming respective histograms in the axis for each point cloud.
- a translation amount (or degree of translation) to bring the histograms into alignment may then be determined, for each cell.
- the most common translation amount may then be determined as the translation amount in the first axis.
- for each grid cell in the overlap region, the surface normal vectors may be determined for each point in the point cloud to be translated and in the other point cloud, to form sets of surface normal vectors for each point cloud.
- the projections of each set of vectors onto a plane (whose normal extends parallel to the axis) may be determined and, then, those projections may be further projected onto the first axis, forming projected histograms for each point cloud.
- an amount to bring the histograms into alignment may be determined to form a degree of match and the maximum degree of match may be determined as the translation amount in the first axis.
- the absolute value of each component of each surface normal vector may be determined along the first axis and those vectors whose absolute values are above (or below, depending on the implementation) a predetermined threshold may be discarded, and the above processes may be performed for the remaining, un-discarded vectors or remaining points corresponding to the un-discarded vectors.
- it may then be determined whether the centre of rotation lies within a minimum bounding box of the point cloud of which it is the centre. If not, the centre may be projected into the minimum bounding box and this projected centre (e.g. any of its coordinates) used as the centre of rotation.
- the or each point cloud may be filtered, e.g. prior to performing the process of any of the examples - for example prior to performing the translational alignment of the third example and/or prior to performing the alignment (rotational and translational) in the plane according to the second example.
- for each point, a surface normal vector may be determined, and the projections of these vectors onto an axis perpendicular to the plane (e.g. the plane onto which a point cloud was projected to calculate the centre of rotation, or the plane onto which point clouds were projected to calculate one of the translation amounts) may be determined to resolve the component of each vector along the first axis.
- the absolute value of each component of each vector along the first axis may be determined and those points whose vectors have absolute values below a predetermined threshold may be discarded, the un-discarded points forming the filtered point cloud.
- a line may be projected from either a centre of the point cloud, or an origin of the point cloud, or from a point corresponding to a location of the imaging apparatus that took the readings of the space corresponding to the point cloud (applicable whether the imaging apparatus was static or moving, in which latter case the location may be on a trajectory of the imaging apparatus), the line being projected to a point on the aligned point cloud. It may then be determined whether the line intersects a point in the other point cloud (e.g. a second point cloud).
- a line may be projected from each point, as long as the distance from the scanner and in the direction of the scanner, and for each point a distance, an azimuth and an elevation angle may be recorded, if the format of the data has this information.
- a line can be projected back to the scanner and if it intersects any point from the other point cloud before it gets to the scanner then this can be output. In other words, this could be done in reverse by projecting from the recorded points back in the recorded angular direction and by the recorded distance to the scanner.
- some examples herein relate to a mechanism by which a rotation and/or translation and/or further translation is determined that is required to bring two clouds of points together such that points representing the same features in the scanned space(s) are co-located in the common coordinate system. Thereafter, the two separate datasets may be merged and treated as a single dataset.
- This computation and optional subsequent merging may be performed in a computationally efficient manner described in the following paragraph without any requirement for any targets to be manually placed in the overlapping region of the imaged space, or for their positions to be surveyed, or for natural targets to be identified and their position and orientation noted, and without the need for a person to manually recognise corresponding features in the two scans and manually rotate and translate the scans to bring the corresponding features into alignment.
- a projection of a dataset onto a plane is effectively the collapsing of the dataset onto the plane.
- the projection may be along a vector normal to the plane or along a vector angled with respect to the plane or along a curve intersecting the plane.
- a projection of a dataset onto a line is effectively the collapsing of the dataset onto the line.
- the projection may be along a vector normal to the line or along a vector angled with respect to the line or along a curve intersecting the line.
- Any line of projection may be along the Cartesian co-ordinate axes, or in a geographic co-ordinate system, or in spherical or cylindrical co-ordinates. Examples using surface normal vectors may comprise normalising the vectors.
- Examples wherein a best match is determined may comprise using a mathematical correlation function.
- determining should be understood to comprise any of identifying, obtaining, calculating, measuring, choosing or selecting.
- determining a quantity comprises calculating or measuring that quantity or, indeed, obtaining that quantity (e.g. from another entity), or identifying, choosing or selecting that quantity (e.g. from a list).
- a number of entities (e.g. a grid cell comprising a number of points) should be understood to mean any number including zero, such that a grid of cells, each cell comprising a number of points, includes a grid of cells where some of those cells have no points.
- Each of the methods described herein may comprise a computer-implemented method and/or an image processing method and/or a dataset alignment method.
- the individual steps of each method may be performed by one or more processors (e.g. executing machine-readable instructions).
- Each apparatus according to each example comprises one or more processors configured to perform tasks (e.g. executing machine-readable instructions).
- each one of the tasks (e.g. record a projection, determine an amount, etc.) may be performed by the same processor or by one processor core of a processing unit such as a CPU, or a plurality of processors may perform the set of tasks, with one of the plurality performing one or more tasks and another one of the plurality performing one or more remaining tasks, etc.
- Figures 1-3 are each a schematic diagram of an example apparatus.
- Figures 4-14 are each flowcharts illustrating part of a method.
- Figure 15 is a schematic diagram of a machine-readable medium in association with a processor.
- Figure 16 is a schematic diagram of a hardware configuration.
Detailed Description
- Figures 1-3 depict an example apparatus 10, 20, 30 according to examples of this disclosure, each apparatus comprising at least one processor. It will be appreciated that any one of the apparatuses 10-30 (e.g. the at least one processor thereof) may also be configured to perform the task according to any one of the other apparatuses (e.g. before or after), such that any one of the apparatuses 10-30 is configured to perform the methods of any of the first to third examples, in any order.
- Figure 1 shows an apparatus 10 comprising a 3D dataset acquisition unit 12 configured to obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus.
- the apparatus 10 comprises a storage unit 14 that is configured to store the first 3D dataset as a first cloud of points in a coordinate system.
- the apparatus comprises at least one processor 16 that is configured to determine coordinates of a centre of rotation for the first point cloud in the coordinate system.
- the at least one processor 16 is configured to: record the projection of each point in the first point cloud onto a first 2D plane in the coordinate system, the first 2D plane being defined by first and second axes in the coordinate system, the first and second axes being perpendicular to each other, to form a first 2D dataset, each point of the first 2D dataset being a projection of a corresponding point in the first point cloud onto the first 2D plane; define, for the first 2D dataset, a 2D grid of cells contained in the first plane, each cell being of a predetermined size and each cell comprising a number of points of the first 2D dataset; determine a mean position of the first 2D dataset or a mean grid cell; select, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the 2D dataset or that is at the centre of the mean grid cell; and store or output the determined components of the centre of rotation in the first and second axes.
- the apparatus 10 of Figure 1 is configured to determine a centre of rotation of a point cloud (3D dataset). Thereafter, the apparatus 10 (or at least one processor thereof) may be configured to determine a translational amount in a plane (relative to a second 3D dataset) (as per the apparatus of Figure 2) and/or a translational amount in an axis (e.g. perpendicular to the plane) (as per the apparatus of Figure 3).
- Figure 2 shows an apparatus 20 comprising a 3D dataset acquisition unit 22 configured to obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; and to obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus.
- the apparatus 20 also comprises a storage unit 24 configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system.
- the apparatus 20 also comprises at least one processor that is configured to, for each of the first and second point clouds: record the projection of each point in the first and second point clouds onto a first 2D plane in the common coordinate system, the first 2D plane being defined by first and second axes in the coordinate system, the first and second axes being perpendicular to each other, to form, for each respective point cloud, respective 2D datasets comprising a first 2D dataset and a second 2D dataset, wherein each point of the first 2D dataset is a projection of a corresponding point in the first point cloud onto the first 2D plane, and wherein each point of the second 2D dataset is a projection of a corresponding point in the second point cloud onto the first 2D plane; and to determine: a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the first 2D plane and through a point on the first 2D plane having its components in the first and second axes as the components of a centre of rotation, and a first translational amount in the first 2D plane, by which the first 2D dataset, rotated about the axis of rotation by the first angle and translated in the first 2D plane by the first translational amount, is brought into alignment with the second 2D dataset.
- the apparatus 20 of Figure 2 is configured to determine a rotation and translation amount to align one point cloud with another in a plane.
- the angle of rotation about which the first point cloud is rotated in the plane may be as determined by the at least one processor 16 of Figure 1.
- the apparatus 20 (or at least one processor thereof) may be configured to determine a further translational amount in an axis perpendicular to the plane (as per the apparatus of Figure 3).
- Figure 3 shows an example apparatus 30 comprising a 3D dataset acquisition unit 32 configured to obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; and obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus.
- the apparatus 30 also comprises a storage unit 34 configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system.
- the apparatus 30 also comprises at least one processor 36 configured to determine a first translational amount, being an amount in a first direction in the common coordinate system by which the first point cloud, moved by the first translational amount in the first direction, aligns with the second point cloud in the first direction.
- the at least one processor 36 is configured to: record, for each of the first point cloud and the second point cloud, the projection of each point in the respective point clouds onto a first 2D plane in the common coordinate system in the first direction, the first 2D plane having a normal vector parallel to the first direction, to form respective first and second 2D datasets, each point of the respective first and second 2D datasets being a projection of a corresponding point in the respective first point cloud and the second point cloud onto the first plane; define, for each of the first and second 2D datasets, a grid of cells contained in the first plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points (which may be 0 since some cells may not contain any points) of the respective first and second datasets; determine an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the first and second 2D datasets; and, if the overlap subset contains no such cells and the overlap region is empty, determine 0 as the first translational amount.
- the apparatus 30 may be configured to determine a further rotation and translational amount in a plane whose normal vector is perpendicular to the first axis (as per the apparatus of Figure 2), in which case the determined centre of rotation may be as per the apparatus of Figure 1.
- Each of the apparatuses 10-30 may be configured to perform the method according to any of Figures 4-14, e.g. any of the blocks/steps thereof.
- each of the at least one processors of the apparatuses 10-30 may be configured to perform the method according to any of Figures 4-14, e.g. any of the blocks/steps thereof.
- Figures 4-14 will be described later.
- Examples of the hardware described with reference to Figures 1-3 (e.g. the storage unit, processor, imaging apparatus, scan acquisition unit etc.) and of various concepts such as readings, imaging a space, overlap, match etc. are set out in PCT/GB2018/050233, "Apparatus, method, and system for alignment of 3D datasets", the entire disclosure of which is incorporated by reference.
- any of the scan acquisition units 12-32 may comprise a functional component to obtain one or more datasets, e.g. from an imaging apparatus that took a scan or from another device (e.g. one storing the datasets but that did not take the scan).
- Any of the scan acquisition units 12-32 may comprise a programmed CPU and/or a GPU and/or an FPGA for real-time processing, or an array of them together with an array of memory as in a computer cloud for parallel processing, and I/O (input/output) hardware configured to receive or read the or each dataset, and volatile or non-volatile memory for storage of processing instructions and image data during processing.
- each of the apparatuses 10-30 may comprise the imaging apparatus.
- the imaging apparatus, or the apparatuses 10-30 may be to use any of the following methods to calculate or derive data: shape from focus, shape from shadow, shape from texture, shape from silhouette, shape from shading, shape from template, shape from symmetry, shape from varying illumination and viewpoint, shape from movement or optical flow, shape from interaction, shape from positional contrast, shape from defocus, virtual 3D computer generated images, in computer aided design, computer generated models or plans.
- Some methods that may be used by the imaging apparatus are: SIFT (V. Vijayan and P. Kp, 2019), SURF (H. Bay, et al., 2006), FAST, Harris corner points, ORB.
- the best and fastest may be ORB (Dong, Pengfei, et al., 2019), (R. Mur-Artal, et al., 2015), which detects keypoints using Features from Accelerated Segment Test (FAST) (R. Mur-Artal and J. D. Tardos, 2017), with the orientation and colour information stored in BRIEF descriptors (M. Calonder, et al., 2012). Recently there has been some research on finding lines (Ma, Jiayi, et al., 2019), (R. Wang, et al., 2018), (R. G. V. Gioi, et al., 2008), (Y.-H.
- the imaging apparatus may be an apparatus for generating photogrammetric scans, which use two or more adjacent or side by side or slightly spaced cameras, or two views from a single moving camera.
- the imaging apparatus may be an apparatus for projection mapping: this makes use of structured light patterns which are projected onto the 3D scene. Examples include Kinect and Hololens. The patterns projected may be lines or grids or random spots. The scene is then viewed using a camera, often with a wide-angle lens. The light projected is usually infrared so that it is not visible to the human eye. Other formats, such as unsigned shorts generated from a video stream, or meshes, may be converted into point clouds as a pre-processing step and then input to examples.
- For projection mapping, the following methods may be used: Endres, Felix, Jürgen Hess, Jürgen Sturm, Daniel Cremers, and Wolfram Burgard. "3-D mapping with an RGB-D camera." IEEE transactions on robotics 30, no.
- the data formats that are possible for the input 3D datasets include one or more from among .fls, .ptx, .pts, .ply, .zfs, .rdbx, .res, leica format, .ptl, .e57, .xyz, .pod, .wrl, .obj, .las, .laz, which are typically used for surveying applications, and DICOM .dcm, PAR/REC, .ima, NIFTI, which are typically used for medical applications, and other formats for other applications.
- Data formats may be Cartesian or in spherical angular and radial co-ordinates or in other co-ordinates.
- Suitable point cloud structures, imaging apparatuses, and data formats that are possible for the input 3D datasets are set out in WO 2018/138516 A1, the entire disclosure of which is incorporated by reference.
- the following site gives information about building data formats: https://info.vercator.com/blog/what-are-the-most-common-3d-point-cloud-file-formats-and-how-to-solve-interoperability-issues.
- the or each dataset may comprise a point cloud, or cloud of points, which may be derived directly from a laser scanner, and/or which may comprise mesh, voxel, or CAD data such as IFC which has been converted into a point cloud.
- the point cloud may have been derived from photogrammetry using one or more cameras, or from structured light projection as in RGB-D cameras (such as Kinect v1 or v2, Orbbec Astra Pro, Google Project Tango), in which the depth image has been converted into a point cloud and the RGB image projected onto it, or by LIDAR laser scanner, or by a mobile Time of Flight scanner or by a laser interferometric scanner.
- Other methods to calculate 3D point clouds include at least one of: shape from focus, shape from shadow, shape from texture, shape from silhouette, shape from shading, shape from template, shape from symmetry, shape from varying illumination and viewpoint, shape from movement or optical flow, shape from interaction, shape from positional contrast, shape from defocus, and/or virtual 3D computer generated images, in computer aided design, computer generated models or plans.
- the dataset(s) described herein may be taken by an imaging apparatus including an optical 3D scanner.
- the imaging apparatus may comprise any one of: a handheld LIDAR scanner; a static tripod based LIDAR scanner; a ground penetrating RADAR, or RADAR and a doppler RADAR; a 3D mobile LIDAR, or a 2D LIDAR from which 3D datasets are generated; an Electrical Resistivity Tomography (ERT) or electrical resistivity imaging (ERI) scanner; a CT (computerised tomography) or CAT (computerised axial tomography) scanner; a positron emission tomography (PET) scanner; an MRI (magnetic resonance imaging) scanner; a scanner that is configured to detect radioactive decay emission points; a nuclear quadrupole scanner; a 3D terahertz wave scanner; a projection mapping scanner based on structured light patterns; a photogrammetric scanner using spaced-apart 2D cameras; a 3D ultrasound scanner; a 3D seismic scanner; a 3D sonar scanner; optical interferometer; photogrammetry; projection imaging; surface 3D profiling instruments
- the or each dataset may be subject to a prerequisite condition that it/they be compatible with being stored as a point cloud in a common coordinate system (common in this context meaning common to both datasets) and, in the case of aligning two datasets, that the datasets are of spaces that are at least partially overlapping.
- the first and second datasets may be obtained from the same imaging apparatus, or different imaging apparatuses.
- the first and second 3D datasets may be transmitted, for example, from an imaging apparatus or pre-processor, and received by the 3D dataset acquisition units 12-32.
- the first and second 3D datasets may be stored in physical storage at an address accessible to the 3D dataset acquisition unit 12-32 for retrieval thereby.
- the processor may be inside the imaging apparatus.
- the 3D dataset acquisition units 12-32 may be configurable by an apparatus user, to enable specification of a storage location from which to obtain the 3D datasets, such configuration being, for example, via an interface.
- the 3D dataset acquisition unit may comprise a storage location or interface to which 3D datasets are submitted or transmitted by a user of an imaging apparatus (user here means an imaging apparatus operator or any party analysing or processing the 3D datasets produced by the imaging apparatus).
- the illustrated interconnections between the 3D dataset acquisition unit 12-32 and the storage unit 14-34 may represent the submission of the obtained first and second 3D datasets to the storage unit 14-34 by the respective 3D dataset acquisition unit 12-32.
- the first and second 3D datasets are stored as clouds of points in a common coordinate system by the storage unit 14-34.
- the obtained 3D datasets are clouds of points in three dimensions.
- the 3D dataset acquisition unit 12-32 may execute processing on the obtained 3D datasets to define the respective clouds of points in a common coordinate system (common here denotes common to the two 3D datasets). Any output by the at least one processor 16-36 for example a value, vector, or a rotated and/or translated dataset may be transmitted to the storage unit 14-34 for storage thereby.
- the storage units 14-34 are configured to store the first 3D dataset and the second 3D dataset as respective clouds of points in a common coordinate system and may comprise a volatile or non-volatile storage hardware that is accessible to the 3D dataset acquisition unit 12-32 and the at least one processor 16-36.
- the storage units 14-34 may comprise a controller or management unit to control read and write accesses to stored data.
- the common coordinate system need not be expressed in the same way for both clouds of points. However, there is a defined spatial and angular relationship between the coordinate system in which the two clouds of points are stored, so that whatever the expression of the clouds of points in storage, they are defined in a common coordinate system.
- Figure 4 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3.
- the method of Figure 4 comprises a method of determining the coordinates of a centre of rotation of a point cloud.
- a first 3D dataset is obtained.
- the first 3D dataset is a first point cloud in three dimensions, with each point in the point cloud representing a reading within the first space being taken by an imaging apparatus.
- the first 3D dataset is stored as a first cloud of points in a coordinate system.
- the method comprises, at 406, determining the coordinates of a centre of rotation for the first point cloud in the coordinate system, which comprises steps 408-416.
- the method comprises recording the projection of each point in the first point cloud onto a first 2D plane in the coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form a first 2D dataset, each point of the first 2D dataset being a projection of a corresponding point in the first point cloud onto the first 2D plane. Therefore, the point cloud is projected onto a plane.
- the plane may be regarded as a “horizontal” plane and the first and second axes defining the plane may be regarded as the x and y axes, according to one example implementation of a Cartesian coordinate system.
- the direction of projection for the point cloud onto the plane may be the direction normal or perpendicular to the plane (e.g. along the z direction) or may be along a different line, e.g. a non-straight or angled line with respect to the plane etc.
- the method comprises defining, for the first 2D dataset, a 2D grid of cells contained in the first plane, each cell being of a predetermined size and each cell comprising a number of points of the first 2D dataset (that number may include 0 since some cells may not comprise any projected points).
- a square grid of pixels is effectively placed over the points to create a sparse 2D grid in the plane, where each occupied cell of the grid contains points of the point cloud that have a "height" in the direction perpendicular to the plane (e.g. the z-direction if the plane is an (x,y) plane).
- the length and width of each square cell may be chosen as 0.25m.
- the method comprises determining a mean position of the first 2D dataset or mean grid cell. At 412 the method therefore comprises calculating the mean position of all remaining grid cells.
- the method comprises selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the 2D dataset or that is at the centre of the mean grid cell.
- the method comprises storing or outputting the determined components of the centre of rotation in the first and second axes.
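- As an illustrative, non-limiting sketch of steps 408-416, the following Python (names and array layout are hypothetical; numpy assumed) projects a point cloud onto the (x,y) plane, grids it at the 0.25 m cell size mentioned below, and returns the centre of the occupied cell nearest the mean cell position:

```python
import numpy as np

def centre_of_rotation_xy(points, cell=0.25):
    """points: (N, 3) array of x, y, z; returns the (x, y) centre of rotation."""
    xy = points[:, :2]                              # project onto the (x, y) plane
    idx = np.floor(xy / cell).astype(int)           # grid cell index of each point
    occupied = np.unique(idx, axis=0)               # sparse 2D grid of occupied cells
    centres = (occupied + 0.5) * cell               # centre of each occupied cell
    mean_pos = centres.mean(axis=0)                 # mean position of the occupied cells
    nearest = centres[np.argmin(np.linalg.norm(centres - mean_pos, axis=1))]
    return nearest                                  # centre of the nearest occupied cell
```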
- determining the centre of rotation may comprise steps 702-708, optionally where step 708 comprises steps 710-714.
- the method comprises removing a subset of the grid cells, the removed subset may correspond to a boundary of the first space (e.g. a wall).
- the subset of grid cells to be removed may comprise those that are less than a predetermined threshold distance away from their nearest cells containing points.
- the method comprises determining the mean position of the remaining grid cells.
- the method comprises selecting the grid cell that is nearest to the determined mean.
- the method comprises determining the centre of the grid cell that is nearest to the determined mean.
- the method comprises selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the centre of the grid cell that is nearest to the determined mean.
- removing 702 a subset of the grid cells comprises, at 710, creating a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either removing those grid cells having a distance value that is less than a predetermined threshold 712 or determining 714 which grid cells have distance values that are local maxima, and removing the remaining grid cells.
- a mean grid cell position could correspond to a location outside the physical space that was scanned (e.g. it could correspond to an inaccessible location beyond the walls of a curved tunnel) and so a point that is part of the original dataset closest to the mean position may be selected to ensure that the chosen point is within the dataset.
- the grid cell that is closest to the calculated mean position of the cell centre of gravity may be chosen and the centre of this pixel may be used as the centre of rotation. If the plane is an (x,y) plane then the x, y co-ordinate values of this cell are used as the sensor location’s (x,y) coordinates and therefore the (x,y) coordinates of the centre of rotation.
- Step 710 of creating the distance map may comprise applying a distance transform and may, as a result, distinguish between locations that are near to, or internal to, the centre of the scanned environment and the edges of that environment.
- the distance to the nearest pixel containing a point may be calculated in units of pixels, and the distance map may label the points or voxels furthest from the walls of the tunnel with the highest number (as being most inside) and label the points or voxels towards the walls with a lower number (as being less near the centre of the space).
- Cells at edges of the scanned regions may contain low values (e.g. 1) whereas internal cells have a larger value, with the value increasing with the distance to the edges.
- Removing grid cells may comprise removing cells that have a distance value of 1 and/or less than a predetermined threshold (e.g. less than 2 or 3 etc.) and/or that have neighbouring cells with a larger value (step 712).
- the points sufficiently inside and near the centre of the space being scanned may be determined and used for further processing.
- pixels which are local maxima may be chosen and the remaining pixels discarded, which leaves the cells located near the centre of the scanned space.
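- As a minimal sketch of the distance-map step (scipy assumed available; the two-cell threshold is illustrative only), a distance transform of the boolean occupancy grid gives low values at edge cells and higher values towards the interior, and thresholding keeps the interior cells:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def interior_cells(occupancy, threshold=2):
    """occupancy: 2D boolean grid, True where a cell contains projected points.
    Returns a boolean mask of occupied cells at least `threshold` cells from an edge."""
    dist = distance_transform_edt(occupancy)   # distance (in cells) to the nearest empty cell
    return occupancy & (dist >= threshold)     # drop boundary cells, keep interior cells
```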
- the methods as described above may comprise sparsely sampling the point cloud data to reduce the amount of data that needs to be processed in subsequent steps.
- the method may comprise applying a KDtree or Octree method to sparsely sample the data.
- the resulting sampled point cloud may comprise a more uniform spacing between points, for example 2.5cm or 5cm.
- the point cloud may be divided into 0.5 metre voxels and/or the projected 2D dataset and/or grid of cells may be divided into 0.5 metre pixels.
- the cell grid may then comprise a 2D grid of 601x601 pixels.
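- As a minimal sketch of such sparse sampling by voxelisation (numpy assumed; keeping the first point per voxel is one simple choice, and the 0.5 m voxel size follows the example above):

```python
import numpy as np

def voxel_downsample(points, voxel=0.5):
    """points: (N, 3) array; keeps one representative point per occupied voxel."""
    keys = np.floor(points / voxel).astype(int)            # voxel index of each point
    _, first = np.unique(keys, axis=0, return_index=True)  # first point in each voxel
    return points[np.sort(first)]
```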
- the centre of rotation determined by Figure 4 may also be alternatively termed a "centre of mass" or "centre of inertia" or "centre of density" or "centre of gravity" of the point cloud, and may be the mean of the point cloud. It may be determined by choosing one of the co-ordinate axes in the workspace, calculating the distance from the axis to each point in the point cloud multiplied by the strength of that point, and then summing these contributions.
- the strength of each point in the point cloud could be 1, or unity. Alternatively, points could be weighted according to the intensity of the reflected signal from them. This latter method could be used in examples where there is a lot of dust or smoke. Weak reflections could be filtered first. If the space is pixelated or voxelated, then this could be performed for every pixel or voxel and the strength may then be the number of points in that pixel or voxel.
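- As a minimal sketch of the intensity-weighted variant (numpy assumed; the weak-reflection threshold is illustrative), the weighted mean of the points gives the centre, with weak reflections filtered first:

```python
import numpy as np

def weighted_centre(points, intensities, min_intensity=0.05):
    """points: (N, 3); intensities: (N,) reflected-signal strengths."""
    keep = intensities > min_intensity            # filter weak reflections (e.g. dust, smoke)
    return np.average(points[keep], axis=0, weights=intensities[keep])
```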
- the method may proceed to determine a value of the centre of rotation in the axis perpendicular to the plane on which the dataset was projected, e.g. considering the plane as an (x,y) plane, the method may determine a coordinate of the centre of rotation in the z-axis.
- Figure 8 is a flowchart illustrating such a method 802 which may be performed by any of the apparatuses described with respect to Figures 1-3, and which may be performed in conjunction with the method of Figure 4 (e.g. in conjunction with the Figure 7 method).
- Method 802 is a method of determining the coordinate of the centre of rotation in a third axis.
- the third axis is perpendicular to the first and second axes (defining the first 2D plane) and so the third axis defines a direction normal to the first 2D plane.
- if the plane is an (x,y) plane then the third axis is the z-direction and the method 802 is a method of determining the coordinate of the centre of rotation in the z direction.
- if the points within the cell are determined (at step 804) to be contained in a single plane (yes), the method comprises, at step 806, determining the coordinate of the centre of rotation in the third axis to be the mean value of all the third axis coordinates of the points. Otherwise (no), the method comprises, at step 810, determining the point having the highest value of the coordinate in the third axis and the point having the lowest value of the coordinate in the third axis and then, at 812, determining the midpoint between the lowest and highest values. In this case, the midpoint may be selected as the coordinate of the centre of rotation in the third axis.
- Step 804 may therefore comprise determining whether all scan coordinates are contained in a single plane. If all scan coordinates are determined to be contained in a single plane then the cell may be determined to correspond to an outdoor location and/or to a part of a physical space that has no roof or ceiling. In this case, the third axis value can be taken as the mean value of all of the points (which will be approximately equal to the ground level in the cell).
- the method could optionally comprise summing the determined mean (at 806) with a parameter proportional to the height of the imaging apparatus that took the readings of the first space (e.g. a scanner height offset parameter).
- the imaging apparatus offset parameter may be 1.5 m in some examples but, of course, this may depend on the height of the imaging apparatus that took the readings of the space. If the scan coordinates within the cell are determined not to be contained in a single plane (e.g. the range of third axis values is larger than the grid cell size parameter), then the third axis value may be set as the half-way point between the lowest and highest point within that grid cell. This is approximately equal to the half-way point between a floor (or ground) and a ceiling (or upper surface) in a room or space.
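- As a minimal sketch of this third-axis choice (numpy assumed; the 0.25 m cell size and 1.5 m offset are the example values above):

```python
import numpy as np

def centre_z(cell_z, cell_size=0.25, scanner_offset=1.5):
    """cell_z: 1D array of z coordinates of the points in the chosen grid cell."""
    z_min, z_max = float(cell_z.min()), float(cell_z.max())
    if (z_max - z_min) <= cell_size:              # points effectively in one plane (no roof)
        return cell_z.mean() + scanner_offset     # approx. ground level + scanner height
    return 0.5 * (z_min + z_max)                  # midpoint between floor and ceiling
```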
- Figure 13 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3, and which may be performed in conjunction with the method of Figure 8.
- the method comprises projecting the determined centre of rotation into the minimum bounding box of the scan points.
- the projected centre of rotation may then be selected as the new coordinates of the centre of rotation.
- the Figure 13 method may be useful in the case of a scan having no roof or upper surface (such as an outdoor scan) or of a scan of a very flat environment where the elevation changes in the environment are less than the scanner height offset.
- the parameters mentioned above (0.25 m for the grid cell size and 1.5 m for the offset parameter) are exemplary only and may be implementation specific. Some examples may use these values as "base" parameters and then scale them by respective scaling factors to account for smaller or larger scale scans (e.g. aerial scans of cities or forests, etc., or a scan of a doll house etc.). Any such scaling may be performed automatically or by reading the scale of a scan and finding the maximum dimensions of the scan in the datafile input. If the scanned environment type is not known, the grid cell size parameter may be set to a larger value.
- the value should be small enough that it is possible to distinguish between internal and external grid cells (in case of a value that is too large, all cells would contain a distance value of 1, i.e., they would all be edge cells).
- the centre of rotation may be as determined according to Figure 4 or may be a different centre of rotation.
- if the scan is a static scan then there may be a known location of the sensor and a point that corresponds to this known sensor location may be selected as the centre of rotation.
- Some examples may comprise a method of selecting a centre of rotation for a point cloud, this method comprising determining whether the scan (3D dataset) contains structure information. If yes, the scan may be assumed to be static and the sensor location of (0,0,0) (i.e. the sensor location corresponding to the origin of the data) may be chosen as the centre of rotation.
- it may be determined whether the file format of the data is one that is assumed to be structured. If yes, as above, (0,0,0) may be chosen. If not, the scan may be assumed to be mobile, such that the sensor location is not fixed and therefore unknown, and, in these examples, either the sensor location may be estimated or the method of Figure 4 may be used to determine a centre of rotation. Thus, the Figure 4 example is particularly useful when the imaging apparatus was mobile and/or when the data is unstructured.
- methods may comprise pre-processing scan data. This may comprise reading a raw scan file (e.g. provided by a user). It may be determined, based on the read data and the scan file extension, whether the scan is structured or unstructured. If the data has structure information, the method may determine that the scan is static. Otherwise the method determines that the scan is mobile.
- Scans in E57 format can be either structured or unstructured.
- the method may decide which centre of rotation to choose (e.g. according to the Figure 4 method or choose the origin of the data) based on the data inside the scan after reading the scan.
- pre-processing may be performed according to a method as described above. This may comprise a point filtering operation and/or a surface normal calculation, where the determined centre is chosen as a reference point. For example, in methods comprising point distance threshold filtering, all points that are further than a predetermined threshold from the determined centre may be removed.
- the density of points reduces the further from the scanner they are recorded, so to maintain resolution, any points further than a predetermined threshold distance away from the scanner may be excluded.
- Mobile scanners may utilise a SLAM algorithm to remove distant points.
- a fixed, well-defined, location should be chosen.
- the origin (0,0,0), or scanner/sensor location may be chosen but for mobile scans, the Figure 4 process may be used.
- the Figure 4 process has the following advantages. A centre of rotation within a bounding box of the scan points may be chosen (see the method of Figure 13). The chosen centre of rotation will be approximately at the centre of the scan rather than near the edge.
- the chosen centre may represent a physical location that is accessible to the scan operator in the physical space, e.g. inside a room or corridor rather than located in a wall or “in air” or in a region that has not been scanned. It also provides a more “realistic” sensor location (e.g. from which a 360 degree or “bubble” view image may be created). It also means that when the point cloud data is placed into a virtual computer workspace (e.g. the common coordinate system), then the workspace only needs to be as long as the length of the scan for the rotation to be determined, and a higher resolution and level of detail can be retained.
- the surface normal vectors may be made consistent with respect to the sensor location (initially the vectors may be randomly flipped by 180 degrees as a result of the Principal Component Analysis (PCA) algorithm).
- the surface normal vectors should point inwards to the room, but if the sensor location is outside the room, some vectors may point inwards and some outwards.
- the origin for measurements may be the centre of azimuthal and elevational rotation and all co-ordinates are assumed to be Cartesian or can be transformed into Cartesian co-ordinates relative to this origin.
- the original co-ordinates may be spherical in azimuthal and elevational angle, but these are often transformed into other co-ordinates in the scanner itself.
- Figure 5 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3.
- the method of Figure 5 comprises a method of determining an aligned dataset, e.g. an aligned first point cloud, with respect to a second dataset, and/or may comprise a method of aligning a point cloud in a plane, and/or a method of rotating and translating a point cloud in a plane, and/or a method of determining a rotational and translational amount to align a point cloud in a plane.
- respective first and second 3D datasets are obtained.
- the first 3D dataset is of a first space at a first time and is a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus.
- the second 3D dataset is of a second space at a second time, the second space and the first space at least partially overlapping, and is a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus.
- the method comprises storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system.
- the method comprises, for each of the first and second point clouds recording the projection of each point in the first and second point clouds onto a first 2D plane in the common coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form, for each respective point cloud, respective 2D datasets comprising a first 2D dataset and a second 2D dataset, wherein each point of the first 2D dataset is a projection of a corresponding point in the first point cloud onto the first 2D plane, and wherein each point of the second 2D dataset is a projection of a corresponding point in the second point cloud onto the first 2D plane.
- the method comprises determining a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the first 2D plane and through a point on the first 2D plane having its components in the first and second axes as the components of a centre of rotation of the first point cloud in the common coordinate system; and a first translational amount for which the first 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the first 2D plane by the first translational amount, aligns with the second 2D dataset in the first 2D plane.
- the method comprises either storing or outputting (at 514) the determined first angle and first translational amount; or determining (at 516) a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first 2D plane, by an amount proportional to the first translational amount, and then storing or outputting (at 518) the determined rotated and translated first 3D dataset.
- a method comprises, at 902, defining, for the first and second 2D datasets, respective 2D grids of cells contained in the first 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective first or second 2D dataset.
- the defining a grid may be regarded as an optional step.
- the method comprises applying a filter to the respective datasets to produce respective filtered datasets.
- Any such filter may be used to derive the or each filtered dataset.
- One example filter is: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
- the Figure 5 method results in a plan-view image being created for the point clouds, and this may be calculated as a number of pixels, or an image size based on a number of pixels.
- the size of the 2D grid(s) may be calculated based on the maximum ranges of overlapping scans (e.g. the distance between the scan sensor location/determined centre of rotation (see methods above) and the farthest point) and/or the size of the grid(s) may be automatically limited to a user-defined pixel maximum. For long-range scans, each pixel corresponds to a larger distance and for short-range scans each pixel represents a shorter distance. The total image size can then never exceed some maximum value.
- the square pixels may have sides of 0.5 m.
- the pixel intensity values may be scaled according to a logarithmic scale (e.g. log10) to reduce pixel intensity variations due to changes in point density between regions (e.g. any walls far from the scanner which have fewer points on them should be as visible as any walls that are nearby).
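- As a minimal sketch of building such a plan-view image (numpy assumed; the grid extent, clamp value and log scaling follow the examples above, but the exact numbers are illustrative and the image is centred on the 2D mean for simplicity rather than the chosen centre of rotation):

```python
import numpy as np

def plan_view_image(points, cell=0.5, size=601, clamp=50):
    centre = points[:, :2].mean(axis=0)
    idx = np.floor((points[:, :2] - centre) / cell).astype(int) + size // 2
    inside = (idx >= 0).all(axis=1) & (idx < size).all(axis=1)
    img = np.zeros((size, size))
    np.add.at(img, (idx[inside, 0], idx[inside, 1]), 1.0)   # count points per cell
    img = np.minimum(img, clamp)                             # cap very dense cells
    return np.log10(1.0 + img)                               # compress density variation
```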
- any such filtering is performed when projecting datasets onto a plane or onto an axis (e.g. perpendicular to the plane).
- the surface normal filtering may effectively remove floor and/or ceiling and/or wall points, resulting in a large change in point density between vertical features and non-vertical features.
- the filtered dataset creates features in the 2D projected plan view image which can be used for alignment, even in examples where the "prior" 3D point cloud (e.g. prior to its collapse to 2D) does not have distinct or reproducibly found features or keypoints.
- overlapping plan-view images may have been created and a transformation for best overlap (or best degree of match) is calculated.
- This transformation comprises a rotation (by the first angle) and a translation (by a first amount).
- the rotation is about an axis extending through the plane in a direction perpendicular to the plane (e.g. the z direction if the plane is the (x,y) plane).
- the method comprises, to determine the first angle and first translation amount, at 906, rotating the first 2D dataset about the axis of rotation by a predetermined amount and, at 908, performing a spatial correlation in the first 2D plane to determine a degree of offset between the rotated first 2D dataset and the second 2D dataset.
- steps 906 and 908 may be repeated, or performed multiple times, and at 910 the method comprises determining, based on the rotation(s) and spatial correlation(s), depending on the number of repetitions of steps 906-908, an angle and a translational amount for which a degree of match between the first 2D dataset, rotated by the angle and translated by the translational amount, and the second 2D dataset is the largest.
- That angle and translational amount may be recorded as the first angle and first translational amount.
- step 906 is performed iteratively and step 908 is performed after each iterative (or incremental) rotation. In some examples, each incremental rotation is by 1 degree.
- steps 906 and 908 are repeated until the first dataset has been rotated 360 degrees to its starting position. Therefore, in one example, steps 906 and 908 are performed 360 times, with the amount of rotation in step 906 being 1 degree each time but, of course, the amount of rotation could be by any other amount.
- the spatial correlation (step 908) may comprise a phase correlation and may be calculated between the rotated scan (2D scan, e.g. in plan view) and the non-rotated scan (2D scan, e.g. in plan view).
- Performing the spatial correlation may comprise performing a 2D Fourier transform on both point clouds (the rotated and non-rotated 2D datasets), then multiplying those Fourier-transformed point clouds, then inverse 2D Fourier transforming the point clouds to obtain the spatial correlation.
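- As a minimal sketch of this rotate-and-correlate search over two plan-view images (numpy and scipy assumed; this sketch multiplies by the complex conjugate of one transform, which is one common way of computing the spatial correlation, and peak offsets beyond half the image size correspond to negative, wrap-around shifts):

```python
import numpy as np
from scipy.ndimage import rotate

def best_rotation_and_offset(img1, img2, step_deg=1):
    """Returns (angle, (row, col) offset, correlation peak) for the best match."""
    best = (None, None, -np.inf)
    F2_conj = np.conj(np.fft.fft2(img2))
    for angle in range(0, 360, step_deg):
        rot = rotate(img1, angle, reshape=False, order=1)     # rotate about the image centre
        corr = np.fft.ifft2(np.fft.fft2(rot) * F2_conj).real  # FFT-based spatial correlation
        peak = corr.max()
        if peak > best[2]:
            best = (angle, np.unravel_index(np.argmax(corr), corr.shape), peak)
    return best
```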
- the positions, or coordinates, in the plane (e.g. (x,y) positions if the plane is considered to be an (x,y) plane) of the local maxima of the correlation may be determined and recorded.
- the magnitudes of the local maxima themselves may also be determined and recorded.
- the ratio of the magnitude of the local maxima to the average level of the correlation may be determined in some examples and output to a user.
- the maximum correlation may also be output to a user.
- the ratio of the magnitude of the local maximum to the average level of the correlation may also be output to a user.
- the rotation and 2D spatial correlation and local maxima positions and magnitudes calculation may all be repeated after each iteration until the rotated point cloud has performed a 360-degree rotation.
- each iterative rotation is by 1 degree, but it could be another amount, e.g. 2 or 3 degrees.
- the rotation angles are not uniform.
- the iterative rotation begins at a large angle and then decreases.
- the rotations may start at a large step size (e.g. 3 degrees), increasing the sampling density around a known or newly identified "best" angle down to an angular step size (in one example, equal to arctangent (2 / image width)) to achieve an accuracy (e.g. of 1-pixel movement at the edge of the image).
- a 1000 x 1000 pixel image may have a rotation accuracy of arctangent (1/500) ≈ 0.1 degrees.
- the translation accuracy may correspond to 1 pixel, which represents some distance in metres, depending on the scan range.
- At steps 516 and 518, amounts proportional to these amounts may be calculated and the dataset itself may be rotated and translated by these proportional amounts. These steps may be utilised in examples where the angle and translational amount are in "image units", and so need to be converted to units compatible with the point clouds. So the method 500 may comprise converting the image angle and translation result to units that are compatible with the point clouds. For example, if the pair of plan-view images has a transformation result of 5 degrees rotation around the perpendicular (e.g. z-axis) and translation of 10 and -15 pixels in the two directions of the plane (e.g. x and y directions), with the metres per pixel calculated as 0.05 (see step 4 above), then the corresponding point cloud transformation would be 5 degrees rotation around z and (0.05 * 10) and (0.05 * -15) metres translations in the x and y directions.
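- As a minimal sketch of this unit conversion and of applying the result to a point cloud (numpy assumed; all names are illustrative and the rotation is taken about the previously determined centre of rotation):

```python
import numpy as np

def apply_planar_alignment(points, centre, angle_deg, shift_px, metres_per_pixel):
    """points: (N, 3); centre: (3,) centre of rotation; shift_px: (dx, dy) in pixels."""
    tx, ty = (s * metres_per_pixel for s in shift_px)        # e.g. 10 px, -15 px -> metres
    a = np.radians(angle_deg)
    R = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0,        0.0,       1.0]])             # rotation about the z axis
    rotated = (points - centre) @ R.T + centre               # rotate about the centre
    return rotated + np.array([tx, ty, 0.0])                 # translate in the plane
```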
- At step 910 (finding the greatest, or best, match), the largest of the local maxima correlations may be selected and the rotation corresponding to that local maximum correlation may be stored.
- the position/coordinates in the plane (e.g. x and y positions) of that local maximum correlation may then be stored. In some examples, this may be repeated for several of the local maxima correlations in descending order of magnitude and it may then be determined which is best by applying them to one point cloud and calculating a degree of match such as the point-to-plane Iterative Closest Point (ICP) Root Mean Square (RMS) error.
- the local maximum giving the highest degree of match or lowest point-to-plane RMS error may then be selected.
- the ratio of the magnitude of the local maxima to the average level of the correlation may be calculated.
- the first angle and translational amounts (e.g. in point cloud units) may be output or stored, and/or a rotated and translated (e.g. by the determined amounts) first 3D dataset may be determined and stored or output.
- the outcome of the Figure 5 process is a 3D point cloud, rotated and translated in a plane, brought into alignment with another 3D point cloud in the plane.
- Figure 6 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3.
- Figure 6 is a method of aligning (e.g. translating) two datasets in an axis.
- the axis may be perpendicular (e.g. orthogonal) to the plane onto which the datasets are projected and therefore, used together, in any order, the combination of Figures 5 and 6 may result in a dataset rotated and translated in a plane and translated in an axis perpendicular to the plane (e.g. rotated and aligned in three axes).
- the method of Figure 4 may be performed to determine the centre of rotation (about which the point cloud is rotated in the Figure 5 method) and, therefore, in this sense, the methods of Figures 4-6 (optionally performed with any of the methods of 7-14, alone or in combination) are fully compatible and may be performed in any order.
- the Figure 6 method comprises, at 602, obtaining first and second 3D datasets.
- the first 3D dataset is of a first space at a first time, and is a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus.
- the second 3D dataset is of a second space at a second time, the second space and the first space at least partially overlapping, and is a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus.
- the method comprises storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system.
- the method comprises determining a first translational amount, which is an amount in a first direction in the common coordinate system by which the first point cloud, moved by the first translational amount in the first direction, aligns with the second point cloud in the first direction.
- the determining, at 606, comprises steps 616-638 as will now be described.
- the method comprises recording, for each of the first point cloud and the second point cloud, the projection of each point in the respective point clouds onto a first 2D plane in the common coordinate system in the first direction, the first 2D plane having a normal vector parallel to the first direction, to form respective first and second 2D datasets, each point of the respective first and second 2D datasets being a projection of a corresponding point in the respective first point cloud and the second point cloud onto the first plane.
- the method comprises defining, for each of the first and second 2D datasets, a grid of cells contained in the first plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points of the respective first and second datasets.
- the method comprises determining an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the first and second 2D datasets.
- the third example method may then comprise blocks 622-638 as will now be described or, alternatively, may proceed to the method of Figure 11 or the method of Figure 12.
- if there is no overlap region (yes), the method comprises determining 0 for the first translational amount. Otherwise (no), the method comprises either steps 626-630 or steps 632-638.
- Step 632 comprises, for each cell in the overlap region, determining the point in the first point cloud having the largest value in the first axis and the point in the first point cloud having the lowest value in the first axis, and recording these largest and lowest values, these values defining a first range for the first point cloud, for each cell, and determining the point in the second point cloud having the largest value in the first axis and determining the point in the second point cloud having the lowest value in the first axis, and recording these largest and lowest values, these values defining a second range for the second point cloud, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the first axis for the points of the first point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the first axis for the points of the second point cloud contained in the cell.
- the method comprises determining, for each cell, a midpoint for the first range and a midpoint for the second range.
- the method comprises determining, for each cell, a translation amount in the first direction to bring the midpoint of the first range into alignment with the midpoint for the second range.
- the method comprises determining the second translation amount to be the mean value of the translation amounts.
- the method comprises determining the point in the first point cloud having the highest value in the first axis or having the lowest value in the first axis and determining the point in the second point cloud having the highest value in the first axis or having the lowest value in the first axis.
- the method comprises determining, for each cell, a translation amount in the first direction to bring the highest value for the first point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the first point cloud into alignment with the lowest value for the second point cloud.
- the method comprises determining the first translation amount to be the mean value of the translation amounts.
- the determined amount may be stored or output (at 610) and/or a translated first 3D dataset, being the first point cloud translated, in the first direction, by an amount proportional to the second translational amount may be determined (at step 612) and this may be stored or output (at 614).
- the Figure 6 method may be used on two point clouds correctly registered in a plane (e.g. according to the Figure 5 method) and the Figure 6 method determines a translational amount in the perpendicular (orthogonal) direction to bring them into alignment in that direction, and in three dimensions.
- the method relates to determining whether and, if so, in which region of a 2D plane (which may be the same plane onto which the clouds were projected in the Figure 5 method and which may be the same plane onto which the first cloud was projected in Figure 4 to determine its centre of rotation), the projections of the two scans to the plane have some overlap. This can be done by laying out a grid of pixels on the plane and projecting each point cloud onto the plane into the grid pixels while retaining their label as to which point cloud they came from. Then the method may search through the pixels to determine those which have points from both point clouds. These pixels may define the overlap region. If there is no overlap, a translation of 0 may be determined.
- separate 2D grids may be defined for each of the overlapping scans in the plane having a cell size (e.g., 0.25 x 0.25 metres).
- the grid cells can now be thought of as square columns, or pillars, along the direction of the axis, where the bottom of the column is at the local “floor” level and the top is at the “ceiling” level of the scan, at the plane locations of the columns.
- the grids for each of the overlapping scans have an equal number of "columns" (or ranges, the terms may be regarded as synonymous) (although some may be empty, or 0, if no point cloud data exists within that cell), and the columns have the same locations in the plane.
- Outlier columns may be removed by iterating over each pair of columns (of matching grid indices in each axis defining the plane) in the overlapping scans (where the column for each scan is at the same location in the plane), comparing their heights, and removing those pairs of columns whose heights differ markedly (the predetermined threshold being, e.g., if column height difference > grid cell size).
- for each remaining pair of columns, the translation along the axis that would align the centre of the one scan's column with the overlapping scan's column may be determined, and these translations may be recorded in a list of translation candidates.
- a further outlier rejection may be performed by creating a histogram of the available translation candidates and then rejecting some translations if they are higher than a predetermined threshold. This may be useful for mobile scans where any single scan may contain multiple floors above each other.
- the final translation value may then be calculated as the mean value of the remaining, un-discarded, translation candidates.
- the width of this histogram distribution about the final translation value may be determined and output.
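- As a minimal sketch of the column ("pillar") alignment just described (numpy assumed), the following keeps only cells occupied by both clouds, rejects column pairs whose heights differ by more than the cell size, and averages the remaining midpoint shifts (the histogram-based outlier rejection is omitted for brevity):

```python
import numpy as np

def z_translation(points1, points2, cell=0.25):
    """Returns the translation along z that brings points1 towards points2."""
    def column_ranges(points):
        idx = np.floor(points[:, :2] / cell).astype(int)
        cols = {}
        for key, z in zip(map(tuple, idx), points[:, 2]):
            lo, hi = cols.get(key, (z, z))
            cols[key] = (min(lo, z), max(hi, z))
        return cols

    c1, c2 = column_ranges(points1), column_ranges(points2)
    candidates = []
    for key in c1.keys() & c2.keys():                       # overlap region only
        (lo1, hi1), (lo2, hi2) = c1[key], c2[key]
        if abs((hi1 - lo1) - (hi2 - lo2)) > cell:           # outlier column pair
            continue
        candidates.append(0.5 * (lo2 + hi2) - 0.5 * (lo1 + hi1))
    return float(np.mean(candidates)) if candidates else 0.0
```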
- the method may comprise translating one point cloud into alignment with the other point cloud in the direction in which the translation was determined.
- the method of Figure 6 may be advantageous in open-air environments.
- the highest voxels may correspond to, for example, the top of trees or grass, or clouds and the lowest voxels may correspond to, for example, the soil, earth, dirt level.
- the tops of the trees or grass or clouds may be swaying in the wind and so may not form a reproducibly reliable height datum level for vertical alignment.
- the method of Figure 6 may also be used for aerial laser scanning from helicopters and drones, e.g. of rain forests or over areas where people lived in pre-history but which have now been covered in vegetation. As the area is scanned, some laser beams find their way down to the forest or vegetation floor and the lowest point of the point cloud in each vertical column of voxels can be found and joined to form a map of the topology of the forest or vegetation floor, and the methods can also be used to determine previously unknown earthworks and dwellings which cannot otherwise be recognized. See, for example: Chase, Arlen F., Diane Z. Chase, John F. Weishampel, Jason B. Drake, Ramesh L. Shrestha, K. Clint Slatton, Jaime J. Awe, and William E. Carter.
- the pond surface may form the lower or the upper level of the scan, respectively.
- the method is adaptable in that it may use ranges and/or midpoints or only the uppermost points or only lowermost points in each column (or range) of voxels for translational alignment in the perpendicular axis and therefore the method is applicable for a wide range of scans representing a wide range of physical spaces.
- the method of Figure 6 is not the only method according to which a translational amount in an axis may be determined. Further methods may be used according to Figures 10-12 which will now be described. Each of these methods may be performed by any of the apparatuses described with respect to Figures 1-3 and may be used in conjunction with the method of Figure 4 or 5. For example, any of the methods of Figures 10-12 may be used on a pair of datasets having been aligned in a plane using the method of Figure 5.
- the method comprises, for each grid cell in the overlap region determining a measure of the distance between the first and second ranges.
- the method comprises discarding those first and second ranges for which the distance between them is above a predetermined threshold, wherein the determination of the translation amounts (as per figure 6 or Figures 11 or 12) is performed for the remaining, un-discarded, ranges.
- Figure 10 represents a filter being applied to the grid cells.
- the method continues following step 620 of the method of Figure 6, at which the overlap region is defined.
- the method comprises, for each grid cell in the overlap region, recording the projection of each point in the first point cloud whose projection is contained in the grid cell onto the third axis to form a first histogram in the third axis and recording the projection of each point in the second point cloud whose projection is contained in the grid cell onto the third axis to form a second histogram in the third axis.
- the method comprises determining, for each cell, a translation amount to bring the first histogram into alignment with the second histogram in the third direction.
- the method comprises determining the second translation amount to be the most common translation amount.
- the translational amount may be stored/outputted and/or an aligned 3D dataset (being translated in the third direction by an amount proportional to the determined translational amount); and may be determined and/or stored or output.
- the projection of a point or pixel etc. onto the perpendicular axis may be regarded as a virtual rod. With the method of Figure 6 as described above, one such rod (extending in the direction of the perpendicular axis) may extend for each pixel in the 2D/plan view image.
- in the Figure 11 method, rather than creating one vertical rod for all the points in the overlap region, there is one vertical rod for each pixel in the plan view image overlap region, where only the points above each pixel are projected onto its vertical rod and form a vertical histogram.
- the bins having the most points may be identified for each pixel or the bins which have a local maximum in the number of points are kept. This may be done for each scan and the histogram for each pixel for each scan is slid over the other to find the best degree of match in the perpendicular axis.
- the degree of match can be the correlation of the two histograms and the highest correlation chosen.
- the translations across all the rods extending in the perpendicular axis above each pixel in the overlap region may be chosen and the most common translation and direction of translation may be found.
- the most common translation and direction may be chosen to be applied to the whole point cloud to bring it into alignment in the perpendicular direction, or a histogram can be plotted of the number of rods that give a particular z translation versus the z translation and all the local peaks in this histogram can be chosen to investigate further, as explained in subsequent paragraphs.
- the number of pixels in the plane image which give the same perpendicular direction translation for alignment may then be output to the user. If there are two such translations which give similar magnitudes of correlation and are almost equally common amongst the rods above each pixel, the ratio of the number of pixels having the best translation to the number of pixels having the second best translation may be calculated and output to the user.
- the method comprises, at 1202, for each grid cell in the overlap region, determining a surface normal vector for each point in the first point cloud whose projection is contained in the grid cell to form a first set of surface normal vectors and determining a surface normal vector for each point in the second point cloud whose projection is contained in the grid cell to form a second set of surface normal vectors.
- the method comprises recording a projection onto a plane parallel to the first 2D plane of the first and second sets of surface normal vectors to form respective first and second projected sets and recording a projection onto the third axis of the first and second projected sets to form first and second projected histograms.
- the surface normals may be filtered according to the method of Figure 14.
- the method comprises determining, for each cell, a translation amount to bring the first projected histogram into alignment with the second projected histogram in the third direction to form, for each cell, a degree of match.
- Step 1210 may comprise dividing the axis into bins, each bin having a sub-histogram dividing the points into azimuthal and elevational bins according to the rotational angle of the surface normal relative to the axis. Then, both the vertical histogram and the two angle sub-histograms may be matched.
- each may comprise sub-histograms, with the points in each bin being divided according to projected component magnitudes of the surface normal vectors, and both the histograms and sub-histograms may be matched.
- the method comprises determining the second translation amount to be the translation amount giving the maximum degree of match.
- the translational amount may be stored/outputted and/or an aligned 3D dataset (being translated in the third direction by an amount proportional to the determined translational amount); may be determined and/or stored or output; and/or merged with the other dataset and output and/or stored as a merged point cloud.
- Steps 1204 and 1206 are dotted and should be regarded as optional to the method of Figure 12.
- steps 1204-1206 represent a filter being applied to the surface normal vectors before the translational amount is determined.
- the surface normals are projected onto a plane and therefore keep their component magnitudes in the axes spanning the plane. Those points for which the horizontal component magnitude of their surface normal vectors is less than a predetermined threshold (e.g. 0.1) may be discarded for the translation calculation in the axis (the points are not permanently discarded, however; later, all points in the point cloud will be translated according to the determined amount). This has the effect of discarding "vertical" surfaces (e.g. in the direction of the perpendicular axis/parallel to the plane) while keeping those surfaces which are either exactly or almost "horizontal" (e.g. parallel to the plane). Then, the points in the parts of each point cloud above the overlapping region are projected onto rods extending in the perpendicular direction.
- a histogram is formed of point density bins along the rods. Then the histogram for one point cloud is moved relative to that for the other point cloud until they come into alignment and the best alignments are found by measuring the degrees of match.
- the degree of match can be, for example, a mathematical correlation and may be determined, e.g. by 1D Fourier transforming each histogram along the perpendicular direction, multiplying them, and inverse 1D Fourier transforming them to get the correlation function. Then the maximum value of this function is found. The maximum correlation may then be output. The ratio of the magnitude of the local maximum to the average level of the correlation may also be output.
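- As a minimal sketch of such a 1D correlation of two height histograms (numpy assumed; the bin size, range and use of the complex conjugate are illustrative choices, and the sign convention of the returned shift depends on which histogram is treated as the reference):

```python
import numpy as np

def best_z_shift(z1, z2, bin_size=0.05, z_range=(-50.0, 50.0)):
    """z1, z2: arrays of z coordinates; returns (shift in metres, peak-to-mean ratio)."""
    edges = np.arange(z_range[0], z_range[1] + bin_size, bin_size)
    h1, _ = np.histogram(z1, bins=edges)
    h2, _ = np.histogram(z2, bins=edges)
    corr = np.fft.ifft(np.fft.fft(h1) * np.conj(np.fft.fft(h2))).real
    shift = int(np.argmax(corr))
    if shift > len(corr) // 2:                  # wrap-around corresponds to a negative shift
        shift -= len(corr)
    return shift * bin_size, corr.max() / corr.mean()
```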
- Figure 14 is a flowchart of a method which may be performed by any of the apparatuses described with respect to Figures 1-3 and which may be performed in conjunction with any of the methods described herein. The method is a method 1402 of filtering first and second datasets to be aligned by projecting the datasets onto a plane or onto a line (or rod). At 1404 the method comprises determining, for each point in the respective first and second point clouds, a surface normal vector.
- the method comprises recording the projection of each surface normal vector onto an axis (such as the third axis) or onto a plane (such as a plane whose normal is parallel to the third axis) to resolve the component of each surface normal vector along an axis perpendicular to the plane onto which the datasets have been projected, or are to be projected (e.g., in examples where the datasets have been projected onto a plane spanned by first and second axes, the third axis), or onto that plane.
- the method comprises determining the absolute value of each component of each surface normal vector along the perpendicular axis or onto the plane, respectively.
- the method comprises discarding (albeit temporarily for the purpose of the calculation; any discarded points remain in the point clouds for later alignment/outputting) those points from their respective point clouds that have surface normal vectors whose absolute values of their components along the perpendicular axis are below a predetermined threshold.
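- As a minimal sketch of such a surface-normal filter (unit normals are assumed to have been estimated already, e.g. by PCA over local neighbourhoods; the 0.1 threshold and the choice between the perpendicular-axis component and the in-plane component follow the description above):

```python
import numpy as np

def filter_by_normal(points, normals, axis=2, threshold=0.1, use_plane=False):
    """Returns the kept points and a boolean mask; discarded points stay in the
    full cloud for later alignment and output."""
    if use_plane:
        # magnitude of the normal component lying in the plane (keeps near-vertical surfaces)
        component = np.linalg.norm(np.delete(normals, axis, axis=1), axis=1)
    else:
        # magnitude along the perpendicular axis (keeps near-horizontal surfaces)
        component = np.abs(normals[:, axis])
    keep = component >= threshold
    return points[keep], keep
```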
- a filtering process, such as the one of Figure 14, may result in an improved signal-to-noise ratio.
- the outcome of the Figure 14 method may be that only those points on walls or vertical surfaces are kept in the filtered dataset when projected onto the plane, and only floor-level and ceiling-level points when projected onto a vertical rod.
- the method of Figure 14 may comprise determining whether the surface normals point away from the determined centre of the respective point cloud and, if so, reversing their directions by 180 degrees.
- a convex hull method may be used.
- the filter of Figure 14 may, as a result, keep those points having surface normals within a certain angle (e.g. of the horizontal or vertical) by resolving the component onto the plane or the perpendicular axis; if that absolute value is below a threshold (e.g. 0.1 in some examples), the point may be discarded as it may be assumed to be on a horizontal surface.
- Filtering the dataset(s) in this way may keep points on walls of the space (e.g. tunnel walls) or on almost vertical surfaces (such as the trunks of trees). If, for example, a tunnel wall has been formed by chipping away at the surface it will have facets with surface normals in a wide range of directions, or if the trunk of a tree has deeply creviced bark it may also give surfaces with normals in a wide range of directions. So, this filtering process may lose some of the wall or vertical surface points, but it will keep most of them. If the filtering process causes too many points to be lost from the walls or vertical surfaces for good alignment, then this step may be preceded by filtering with another filter that smooths the surfaces of both point clouds without moving the surfaces, otherwise the later alignment will not be accurate.
- Figure 15 depicts, schematically, a non-transitory and machine-readable medium 1500 in association with a processor 1502.
- the medium 1500 comprises a set of executable instructions 1504 stored thereon which, when executed by the processor 1502 causes the processor 1502 to perform the method according to any of Figures 4-14 (e.g. any one or more of the steps thereof).
- Figure 16 is a block diagram of a computing device, such as a server, which may be used to implement a method of any of the examples as described above (e.g. any one or more of the steps of any method alone or in combination).
- the computing device of Figure 16 may be the hardware configuration of one or more from among the 3D image acquisition unit 12-32, the storage unit 14-34, and/or the one or more processors 36.
- the computing device of Figure 16 may be used to implement the method of any of Figures 4-14.
- the computing device comprises a device processing unit 1601 (such as a CPU if the device is a computer; alternatively the device may be a smart device such as a phone or tablet, or the processing unit may be embedded inside an imaging apparatus, such as a scanner), a memory, such as Random Access Memory (RAM) 1603, and storage, such as a hard disk, 1604.
- the computing device also includes a network interface 1607 for communication with other such computing devices of examples herein.
- An example may be composed of a network of such computing devices operating as a cloud computing cluster.
- the computing device also includes Read Only Memory 1602, one or more input mechanisms such as keyboard and mouse 1606, and a display unit such as one or more monitors 1605.
- the components are connectable to one another via a bus 1600.
- the CPU 1601 is configured to control the computing device and execute processing operations.
- the RAM 1603 stores data being read and written by the CPU 1601.
- the storage unit 1604 may be, for example, a non-volatile storage unit, and is configured to store data.
- the optional display unit 1605 displays a representation of data stored by the computing device and displays a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device.
- the input mechanisms 1606 enable a user to input data and instructions to the computing device.
- the display could be a 3D display using stereo glasses, or a helmet-mounted display such as a HoloLens, or a holographic display or an autostereoscopic display, neither of which needs special glasses.
- the network interface (network I/F) 1607 is connected to a network, such as the Internet, and is connectable to other such computing devices via the network.
- the network I/F 1607 controls data input/output from/to other apparatus via the network.
- the network I/F may provide a connection to a computing device from which the 3D datasets were obtained, and may receive guidance or instructions defining elements of the processing (for example, selecting algorithms).
- peripheral devices such as microphone, speakers, printer, power supply unit, fan, case, scanner, trackerball etc may be included in the computing device.
- Any one of the 3D image alignment apparatus 10-30 may be embodied as functionality realised by a computing device such as that illustrated in Figure 16.
- the functionality of the 3D image alignment apparatus 10-30 may be realised by a single computing device or by a plurality of computing devices functioning cooperatively or by a cloud processing network.
- Methods embodying the present invention may be carried out on, or implemented by, a computing device such as that illustrated in Figure 16.
- One or more such computing devices may be used to execute a computer program of any of the first-third examples disclosed herein.
- Computing devices embodying or used for implementing examples need not have every component illustrated in Figure 16, and may be composed of a subset of those components.
- the at least one processor may be programmed processor hardware, comprising processing instructions stored on a storage unit, a processor to execute the processing instructions, and a RAM to store information objects during the execution of the processing instructions.
- the processor could be a CPU or a GPU or FPGA or an array of them.
- the invention also provides a computer program or a computer program product or a cloud-based service for carrying out any of the methods described herein, and a computer readable medium having stored thereon a program for carrying out any of the methods described herein.
- a computer program embodying the invention may be stored on a computer-readable medium, or it could, for example, be in the form of a signal such as a downloadable data signal provided from an Internet website, or it could be in any other form.
- a computing device such as a data storage server, may embody the present invention, and may be used to implement a method of an example of the invention.
- the computing device may comprise a processor and memory.
- the computing device may also include a network interface for communication with other computing devices, for example with other computing devices of invention examples.
- the memory may include a computer readable medium, which may refer to a single medium or multiple media (e.g., a centralized or distributed database and/or associated caches and servers) configured to carry computer-executable instructions or have data structures stored thereon.
- Computer-executable instructions may include, for example, instructions and data accessible by and causing a general purpose computer, special purpose computer, or special purpose processing device (e.g., one or more processors) to perform one or more functions or operations.
- computer-readable storage medium may also include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that causes the machine to perform any one or more of the methods of the present disclosure.
- the term “computer-readable storage medium” may accordingly be taken to include, but not be limited to, solid-state memories, optical media and magnetic media.
- such computer-readable media may include non-transitory computer-readable storage media, including Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices).
- the processor may be configured to control the computing device and execute processing operations, for example executing code stored in the memory to implement the various different functions of modules described here and in the claims.
- the memory may store data being read and written by the processor.
- a processor may include one or more general-purpose processing devices such as a microprocessor, central processing unit, general processing unit, or a distributed cloud network of processors.
- the processor may include a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets or processors implementing a combination of instruction sets.
- the processor may also include one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like.
- a processor is configured to execute instructions for performing the operations and steps discussed herein.
- the display unit may display a representation of data stored by the computing device and may also display a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device.
- the input mechanisms may enable a user to input data and instructions to the computing device.
- the network interface may be connected to a network, such as the Internet, and may be connectable to other such computing devices via the network.
- the network I/F may control data input/output from/to other apparatus via the network.
- Other peripheral devices such as microphone, speakers, printer, power supply unit, fan, case, scanner, trackerball etc may be included in the computing device.
- any reference signs placed in parentheses in one or more claims shall not be construed as limiting the claims.
- the words “comprising” and “comprises,” and the like, do not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole.
- the singular reference of an element does not exclude the plural references of such elements and vice-versa.
- One or more of the examples may be implemented by means of hardware comprising several distinct elements. In a device or apparatus claim enumerating several means, several of these means may be embodied by one and the same item of hardware.
- the mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to an advantage.
Abstract
An apparatus and method are disclosed for, in a first example, determining a centre of rotation of a 3D dataset, in a second example, determining an angle and translational amount to align a first 3D dataset with a second 3D dataset in a plane, and, in a third example, determining a translational amount in an axis to align a first 3D dataset with a second 3D dataset in the axis, and for performing the three examples alone or in combination to align, in three dimensions, one dataset with another.
Description
ALIGNING 3D DATASETS
The present disclosure relates to the rotational and/or translational alignment (or registration) of 3D datasets, for example in the field of 3D surveying, mapping, and/or imaging.
Background
3D survey datasets enable the creation of computerised 3D datasets, models, and meshes, for analysis, calculation, measurement and monitoring of the surveyed space. It is often the case that all parts of a field of interest (or space or volume to be surveyed) cannot be included in the field of view of an imaging apparatus (or scanner) from a single position. Multiple images, scans or surveys are taken at different times and with the imaging apparatus at different locations and orientations, or with multiple variously positioned and oriented scanners simultaneously, and the multiple images are subsequently aligned and/or merged.
Some techniques for determining rotational and translational alignment of the multiple 3D datasets can be labour intensive.
Statements of Invention
Examples of the present disclosure relate to determining a centre of rotation of a 3D dataset, being a cloud of points, so that the 3D dataset may be rotated, about a determined centre of rotation, to be brought in rotational alignment with another 3D dataset. Examples of the present disclosure also relate to determining a translational amount by which a 3D dataset, moved by that translational amount in a plane, or in a direction of an axis, may be brought into translational alignment with another 3D dataset. Some examples relate to calculating a rotational angle and first and second translational amounts by which a 3D dataset, rotated by the angle and translated by the first and second amounts in a plane and in a perpendicular axis (perpendicular to the plane), may be brought into rotational and translational alignment with another dataset (e g. brought into alignment in 3 dimensions). The 3D dataset may be a dataset of a space and may comprise a point cloud in three dimensions, with each point in the point cloud representing a reading of the space taken by an imaging apparatus.
The examples disclosed herein are advantageous when an imaging apparatus was moving at the time of taking the scan (e.g. a moving and/or a mobile scanner) as well as when the imaging apparatus was static, or when two or more imaging apparatuses each take a scan (whether moving or static) and the scans from the imaging apparatuses are aligned. The examples disclosed herein are advantageous in situations where the space (represented by the 3D dataset) comprises unknown features and/or no features and/or features that do not comprise distinct elements (for example, distinct edges or smooth surfaces). Accordingly, therefore, the examples disclosed herein are advantageous when the space is a natural scene (such as a forest or beach etc.) as well as when the space comprises a human-made structure.
According to a first example of the disclosure there is provided an apparatus according to claim 1 and a method according to claim 23. According to this example there is also provided a non- transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of claim 23, and a computer program which, when executed by a computing apparatus, causes the computing apparatus to execute the method of claim 23. Optional features are set out in the dependent claims.
The first example relates to determining a centre of rotation for a 3D dataset about which it may be rotated to be brought into rotational alignment with another dataset. The at least one processor may be configured to translate the rotated dataset, or the method may comprise translating the dataset. For example, the rotated 3D dataset may thereafter be translated in the direction of a plane in the common coordinate system and then further translated in an axis normal to the plane, according to the second and third examples.
According to a second example of the disclosure there is provided an apparatus according to claim 9 and a method according to claim 31. According to this example there is also provided a non-transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of claim 31, and a computer program which, when executed by a computing apparatus, causes the computing apparatus to execute the method of claim 31. Optional features are set out in the dependent claims.
The second example relates to rotating and translating a 3D dataset to align the 3D dataset with another 3D dataset in a plane in the common coordinate system onto which the datasets are projected. To do so, one dataset may be rotated in the plane about a centre, which may be the centre determined according to the first example, or which may be a different centre, for example an origin of the dataset to be rotated or point corresponding to a location of the imaging apparatus that took the readings of the first space. The dataset, rotated and translated in the plane according to the second example, may then be aligned in an axis perpendicular to the plane according to the third example.
According to the second example, the degree of match may be output and/or stored. In other words, determining the first angle and translational amount may comprise determining the degree of match. The rotated and translated dataset may be merged, or combined, with the un-aligned dataset, and the two datasets may be combined into a merged dataset (or merged point cloud) that may be output and/or stored.
According to a third example of the disclosure there is provided an apparatus according to claim 45 and a method according to claim 55. According to this example there is also provided a non-transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of claim 55, and a computer program which, when executed by a computing apparatus, causes the computing apparatus to execute the method of claim 55. Optional features are set out in the dependent claims.
The third example relates to aligning a 3D dataset with another 3D dataset along an axis in the common coordinate system. The dataset that is aligned in the axis according to the third example may have already been aligned in the plane (according to the second example, e.g. using the first example to determine the centre of rotation).
According to the third example, the translated point cloud and un-translated point cloud may be merged, or combined, with the un-aligned point cloud to form a merged point cloud that may be output/stored.
Any of the first to third examples may therefore be performed on any dataset, for example a rotated and/or translated (in the plane) and/or a translated (in an axis) dataset, or an un-rotated and/or translated dataset. For example, the first example may be used to determine a centre of rotation for any dataset, or a dataset rotated and translated in the plane according to the second example, or a dataset translated in an axis according to the third example, and the second example may be used to align, in a plane, a dataset translated in an axis according to the third example, and the third example may be used to align, in an axis, a dataset rotated and translated in a plane according to the second example, and in any of these examples the centre of rotation may be determined according to the first example.
Accordingly, the terms “first”, “second”, and “third”, etc. in this disclosure are used as labels having regard to the order in which certain features are presented in a method or in a process performed by a processor. They should not therefore be regarded as limiting and should not necessarily be regarded as distinguishing one element from another. In particular, in this regard, as used herein, various “first” and “second” objects, e.g. a “first 3D dataset”, “second 3D dataset,” “first 2D plane”, “second 2D plane” etc. are not intended to distinguish said objects from one another and, in this regard, the first 3D dataset could, wholly or partially, comprise the second 3D dataset and the first 2D plane could comprise or be the same as the second 2D plane etc. As such, these labels could refer to the same, or different, elements.
The datasets are 3D datasets. The imaging apparatus may be, for example, a 3D scanner. Each dataset contains data representing a 3D image of a subject of interest.
The first and/or second spaces may be scanned by an imaging apparatus, which may comprise a scanner etc. The first and/or second spaces may each comprise a different subset of a wider field of interest (such as a wider space) and, as stated elsewhere, the first and second spaces may overlap (comprising any degree of overlap). The first and/or second spaces may also comprise a volume. The first and/or second spaces may be scanned by the same imaging apparatus or by a different imaging apparatus. The spaces themselves may comprise a hollow space (such as a room etc.) or an open space (such as a field etc.) or an object (e.g. a chair etc.). In some examples one space may be contained in another. In these examples, one space may represent an object within the other space. In these examples, using the second and/or third alignment methods as recited above, a degree of match between the object (represented by one 3D dataset) and the space in which it is located (represented by the other 3D dataset) may be
proportional to a probability that the object is located within the space, for in that instance the 3D dataset of the space will contain the object. If the object is a known object, e.g. known to be a chair, then the degree of match may be proportional to a probability that the known object, or known object type, is located within the space. In this way, the apparatuses and methods disclosed herein may comprise an object recognition apparatus and an object recognition method, respectively.
The examples are analytical and obviate any requirement for artificial targets to be manually positioned and, where high accuracy is required, separately surveyed, within a space to be imaged. Known methods, which make use of natural or artificial targets, for imaging a large space, consisting of many scans, are set out in the ‘Appendix’ section at the end of the detailed description in this document. The examples herein do not require natural targets to be found manually by eye or automatically, so, in consequence, the examples herein can work in featureless environments or environments with random features or which have many types of features which could easily be confused for one another (where mismatches might be made between non-matching features). Spaces can be scanned by a scanner (imaging apparatus) to generate separate datasets representing said scanning without the need to maintain consistent orientation and/or position of an imaging apparatus between the datasets, thereby speeding up the setting up of the imaging procedure, whether or not successive scans were recorded sequentially using a static or mobile scanner, or different scanners. The process of aligning scans second by second can also be used to determine where the scanner is and its trajectory. Conversely, if the scanner trajectory is approximately known, by either having an onboard IMU and sensors or by locating the mobile phone of the user, the examples herein can be used to check the results.
The rotational and translational alignment of 3D datasets allows subsequent highly accurate measurement of distances on the aligned dataset, scan, point cloud, model or mesh without having to go back to the site to do the measurement. This may, in turn, allow an area of walls to be determined (e.g. to find out how much paint is needed to paint them), or floors for flooring or ceilings. This may also be used to find the volume of walls to work out how many skips are needed to take away the rubble if the wall is knocked down. This may also allow the volume of the air in a room to be determined to work out how much heating is needed and the window area to find out how much heat is leaking out of the windows or how much light is coming in. The rotational and translational alignment examples described herein can also allow the
determination of, for example, by how much a roof of a tunnel has sagged or by how much walls moved or changed, since a scan of the same region taken earlier (e.g. years ago or even the same region a few days before), where work on the tunnel has been conducted since and/or material has been removed from the fabric of the tunnel. The alignment could also allow scans from day to day to be compared to determine, e.g., if someone has dropped something somewhere or left a truck or crate somewhere, etc.
Of course, the examples herein may be extended to deal with more than one or more than two scans (e.g. >2 datasets) of a space and, in these examples, one of the 3D datasets for which a centre is to be determined (e.g. according to the first example) or rotated and translated in the plane (e.g. according to the second example) or translated in an axis (e.g. according to the third example) may comprise an aligned 3D dataset (e.g. a pair of overlapping datasets that can be considered a single dataset). In this way, individual pairs of overlapping scans may be aligned in turn or simultaneously in a parallel cloud processor for computational speed.
High quality 3D datasets and models may be created from two overlapping datasets of a subject or field of interest, without requiring knowledge of: the relative orientation of the imaging apparatus from dataset to dataset, the relative orientation of the imaging apparatus to the subject of interest, the placement of artificial targets or the surveying of their positions, or the identification of the location and orientation of natural targets. The datasets may comprise scan datasets and/or image datasets (e.g. data generated by scanning or imaging a physical space). The 3D datasets can be represented by 3D point clouds and viewed as 3D images.
The or each of the datasets, e.g. obtained by a 3D dataset acquisition unit of an apparatus according to any of the examples disclosed herein, may comprise data recorded at successive points in the imaging process, each point being separated from its neighbour to a degree dictated by the resolution and direction of the imaging, each point recording data about the position of the point relative to the imaging apparatus or relative to an already-scanned (in the same scan) point, and each point being generated by an imaging apparatus in response to a reading, in other words, each point in the dataset represents a reading by an imaging apparatus. In turn, a reading is a recording of the response received by the imaging apparatus from an interaction with the transmitted or reflected beam at the point of interrogation when the scanner is active. The imaging apparatus may comprise a moving camera, or two cameras with a fixed distance apart, or multiple cameras (e.g. photogrammetry). The imaging apparatus may
illuminate the scene or use natural/room light, and/or may comprise an RGB-D camera. In examples using an RGB-D camera, the scene can be illuminated using structured illumination. In this way, if something is being emitted by the object the imaging apparatus may be configured to receive it, which may constitute a reading of the object. For example, the imaging apparatus may comprise a thermal camera and/or may be configured to receive radiation. The 3D dataset acquisition unit according to any of the examples is configured to obtain a dataset, and may be configured to obtain the dataset directly from the or each imaging apparatus that took the readings of the or each space, or from another entity storing the dataset(s).
The input(s), e.g. the or each 3D dataset, may be restricted to being a scan, or scans, of a physical space or field or volume of interest. Alternatively, examples may obtain 3D datasets of a virtual space as one or both of the first and second 3D datasets. 3D point clouds may be represented as mesh models in virtual space to reduce memory, storage, rotation and translation calculation speed and transmission bandwidth requirements. 3D datasets may be generated as mesh models for use in virtual space. In order to align mesh models with point clouds or with other mesh models the mesh models may be converted into point clouds, for example, by changing the mesh nodes to points or by interpolation and sampling, after which the methods described in this document can be used.
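As a non-limiting sketch of such a conversion (Python with NumPy; the sampling scheme, function name and sample count are illustrative assumptions rather than a prescribed implementation), a triangle mesh may be turned into a point cloud by keeping its nodes and interpolating additional points on each face:

```python
import numpy as np

def mesh_to_point_cloud(vertices, faces, samples_per_face=10, rng=None):
    """Convert a triangle mesh into a point cloud by keeping the mesh nodes
    and adding uniformly sampled points on each triangular face.

    vertices : (V, 3) array of mesh node coordinates
    faces    : (F, 3) integer array of vertex indices per triangle
    """
    rng = np.random.default_rng() if rng is None else rng
    tri = vertices[faces]                       # (F, 3, 3) triangle corners
    # Uniform barycentric sampling: reflect (u, v) pairs that fall outside
    # the triangle back inside it.
    u = rng.random((len(faces), samples_per_face, 1))
    v = rng.random((len(faces), samples_per_face, 1))
    outside = (u + v) > 1.0
    u = np.where(outside, 1.0 - u, u)
    v = np.where(outside, 1.0 - v, v)
    sampled = (tri[:, None, 0]
               + u * (tri[:, None, 1] - tri[:, None, 0])
               + v * (tri[:, None, 2] - tri[:, None, 0]))
    # Mesh nodes plus the interpolated samples form the point cloud.
    return np.vstack([vertices, sampled.reshape(-1, 3)])
```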
The or each 3D dataset may be obtained from the same imaging apparatus or may be obtained from different imaging apparatuses. The imaging apparatuses are operable to take readings within a space. The readings are, for example, locations of emission or reflection or absorption of a wave or particle detected by the imaging apparatus. The readings may have been taken at any time, including time in the past. The imaging apparatuses are operable to interpret readings as physical features within 3D space, and to generate a data point in a point cloud corresponding to the reading. The point density is higher on surfaces or interfaces so the 3D space may be more strongly represented by surfaces or interfaces. A physical feature may be, for example, a surface or an interface between two materials, or any feature within a space such as a natural or man-made object, or any part of an indoor/outdoor space. The imaging apparatus used to generate the or each 3D dataset may image using different imaging techniques. For example, the first imaging apparatus may be an X-ray scanner and the second imaging apparatus an MRI scanner. Alternatively, in the optional further step of reduction of the respective datasets to one-dimensional arrays (e.g. in the second and third examples), an extra variable of scale or
magnification can be varied along with translations until a best degree of match is found. For the second example, where two images are aligned on a plane, after projection onto the plane of the two images, one image can be magnified or diminished relative to the other one and the best degree of match found corresponding to one value of scale, rotation, x and y translation. In this way the scale can be determined. This may be advantageous, for example if the scanner’s scale has not been calibrated by an object or pattern of a known size immediately before, during or immediately after scanning.
The reading can record the distance from the imaging apparatus to the point. Readings can record such values as position in x, y, z Cartesian coordinates or in cylindrical or spherical or geographic or other coordinates such as space-time. The reading can include date and time and person or instrument doing the recording as well as the resolution set and power of the laser used. The reading can record the strength of the reflected or transmitted or absorbed signal from the laser or sound wave. The reading may record the intensity and colour of any light or radiation or sound emitted from the point and detected by the apparatus. The reading may also include a property of the point and/or its neighbouring points such as the curvature of the surface and the position and orientation of a small flat plane patch fitted to the point and its neighbouring points, which can be represented as a surface normal vector. Generalized ICP (GICP) may be used to find the nearest point in the other point clouds having the same curvature and calculate and minimise the distance between them assuming that they match, for example as described in: Segal, Aleksandr, Dirk Haehnel, and Sebastian Thrun. "Generalized-ICP." In Robotics: Science and Systems, vol. 2, no. 4, p. 435. 2009; Ren, Zhuli, Liguan Wang, and Lin Bi. "Robust GICP-based 3D LiDAR SLAM for underground mining environment." Sensors 19, no. 13 (2019): 2915; and Peters, J. M. H. "Total curvature of surfaces (via the divergence of the normal)." International Journal of Mathematical Education in Science and Technology 32, no. 6 (2001): 795-810. ICP point to plane may be used, see: Low, Kok-Lim. "Linear least-squares optimization for point-to-plane ICP surface registration." Chapel Hill, University of North Carolina 4, no. 10 (2004): 1-3. The reading may also include derived properties such as the magnitude of the curvature and whether the curvature is positive or negative. The reading can record the resistivity or conductivity or capacitance or inductance or electrical complex permittivity or magnetic complex permeability of the space or the speed of travel of electromagnetic or sound waves at that point. The reading may record the colour of a surface or volume in r, g, b coordinates or in any of the following colour coordinates CIELAB, CIELUV, CIExyY, CIEXYZ, CMY, CMYK, HLS, HSI, HSV, HVC, LCC, NCS, PhotoYCC,
RGB, Y'CbCr, Y'IQ, YPbPr and YUV (reference https://people.sc.fsu.edu/~jburkardt/f_src/colors/colors.html). The reading may record the texture or roughness of a surface or the material of which the surface is made or material of which a surface or volume is made, or determine the density of the volume of material in examples using X-rays. The reading may record any local movement velocity and acceleration and vector direction (for periodic or vibrational motion, or for moving objects, either so that they can be captured in 3D or excluded from the 3D capture of the rest of the scene) of the point being read over a short time scale by using a method such as Doppler. Note that when the imaging apparatus looks in one direction it may receive back more than one reading. If there is a solid opaque surface there may be one reading. See for example Heinzel, Johannes, and Barbara Koch. "Exploring full-waveform LiDAR parameters for tree species classification." International Journal of Applied Earth Observation and Geoinformation 13, no. 1 (2011): 152-160; or Mandlburger, Gottfried, Martin Pfennigbauer, Roland Schwarz, Sebastian Flory, and Lukas Nussbaumer. "Concept and Performance Evaluation of a Novel UAV-Borne Topo-Bathymetric LiDAR Sensor." Remote Sensing 12, no. 6 (2020): 986. If the surface is slightly transparent or if there is no surface and it is just a volume or space being imaged there may be thousands of readings from different distances away from the imaging apparatus, which distinguishes them from each other. So, for example, a reading may be x,y,z,r,g,b. The or each dataset may be obtained from an imaging apparatus at a respective position and orientation, the position and orientation being of a or respective imaging apparatus(es) relative to the space being imaged or relative to the previous position of the previous scan or neighbouring scan.
In examples where two datasets are to be aligned (e.g. according to the second and third examples), the first space and the second space overlap. The extent of the overlap may be implementation dependent, and will vary according to each pair of datasets. The extent of the overlap may depend on the number of features in the overlap region and the overlap may be the overlap of common features. For example, the overlap may be at least a partial overlap, and may be total, for example, where the scans are taken at different times or with scanners using differing technologies. For example, one scanner may be aerial with a low ground resolution and another scanner may be terrestrial on a tripod with a high ground resolution. The overlap may be a common physical space represented in both the first and
second datasets. Examples may include a pre-processing step as part of the obtaining by the 3D acquisition unit to remove from the input datasets some non-overlapping data.
At the storage unit, the or each 3D datasets is stored as respective point clouds in a common coordinate system. The common coordinate system may comprise a workspace, e.g. a virtual workspace. The or each 3D dataset has a native coordinate system in which the respective clouds of points are defined. It may therefore be a prerequisite of the examples disclosed herein that the or each of the first and second 3D datasets are compatible with the common coordinate system of the storage unit. For example, the native coordinate system may be relative to the position of the imaging apparatus for that dataset as the origin and orientation of the imaging apparatus. The or each point cloud may be differently rotated and translated when placed into the workspace.
According to each example, the determined centre of rotation, determined angle and translational amount (in the plane) and/or the determined translational amount (in an axis) may be output. The output may comprise the determined values themselves or may comprise a dataset, e.g. a transformed dataset, e.g. a rotated and translated dataset (second example) or a translated dataset (third example). Herein, the term “aligned dataset” may refer to a dataset that has been rotated and/or translated (either in a plane or in an axis) or a dataset that has been rotated and translated in a plane and/or then translated in an axis perpendicular to the plane. Rotational and/or translational values may be determined relative to, or in, the common coordinate system and therefore may be regarded as in “common coordinate system units.” In some examples, therefore, to rotate and/or translate one of the point clouds a conversion may be performed in which the calculated amounts are converted into “point cloud units,” and then the point cloud is aligned by the converted units.
The output may therefore comprise a point cloud, rotated and/or translated in a plane and/or translated in an axis by the at least one processor. In examples with two 3D datasets, the output first point cloud may comprise an aligned point cloud, and therefore may be rotationally and/or translationally aligned with the stored second point cloud. The output may also include a copy of the stored second point cloud. For example, if the common coordinate system is different from the native coordinate system of the second 3D dataset, then in order to achieve two rotationally aligned datasets where physical entities represented in both datasets are co-aligned in the common coordinate system the apparatus may output both the aligned first 3D dataset
and the second 3D dataset as respective clouds of points in the common coordinate system. The rotationally aligned datasets may be merged and output as a single image dataset file. By rotationally aligned datasets, it may be taken to mean that physical entities represented in both datasets are co-aligned in the common coordinate system of the two datasets. By aligning multiple adjacent or overlapping scans, a 3D dataset of the scene may be built and also the trajectory of the scanner may be determined. The output (e.g. rotated and/or translated point cloud) may be combined with another scan and/or both merged together, optionally with the degree of match, or only the degree of match may be output, optionally with the rotation and translations to achieve it, and the same for any other scans etc. with the highest degrees of match. For finding the trajectory of a mobile scanner, drone, person, vehicle, etc. the following techniques may be used: Visual SLAM reference: Taketomi, Takafumi, Hideaki Uchiyama, and Sei Ikeda. "Visual SLAM algorithms: a survey from 2010 to 2016." IPSJ Transactions on Computer Vision and Applications 9, no. 1 (2017): 1-11; Visual Odometry references: Nister, David, Oleg Naroditsky, and James Bergen. "Visual odometry." In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., vol. 1, pp. I-I. IEEE, 2004; Scaramuzza, Davide, and Friedrich Fraundorfer. "Visual odometry [tutorial]." IEEE Robotics & Automation Magazine 18, no. 4 (2011): 80-92; Mobile phone tracking to find its trajectory: Wu, Bang, Chengqi Ma, Stefan Poslad, and David R. Selviah. "An Adaptive Human Activity-Aided Hand-Held Smartphone-Based Pedestrian Dead Reckoning Positioning System." Remote Sensing 13, no. 11 (2021): 2137; finding trajectories of mobile phones: Potorti, Francesco, Joaquin Torres-Sospedra, Darwin Quezada-Gaibor, Antonio Ramon Jimenez, Fernando Seco, Antoni Perez-Navarro, Miguel Ortiz et al. "Off-line Evaluation of Indoor Positioning Systems in Different Scenarios: The Experiences from IPIN 2020 Competition." IEEE Sensors Journal (2021), https://ieeexplore.ieee.org/document/9439493; tracking of mobile phones in tube stations underground and in offices: Ma, Chengqi, Chenyang Wan, Yuen Wun Chau, Soong Moon Kang, and David R. Selviah. "Subway station real-time indoor positioning system for cell phones." In 2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1-7. IEEE, 2017; Ma, Chengqi, Bang Wu, Stefan Poslad, and David R. Selviah. "Wi-Fi RTT Ranging Performance Characterization and Positioning System Design." IEEE Transactions on Mobile Computing (2020); Potorti, Francesco, Sangjoon Park, Antonino Crivello, Filippo Palumbo, Michele
Girolami, Paolo Barsocchi, Soyeon Lee et al. "The IPIN 2019 Indoor Localisation Competition — Description and Results." IEEE Access 8 (2020): 206674-206718.
The position of the or each point cloud in the common coordinate system may also be output. If this was not previously known or output, it may be output here in the common coordinate system. In some examples, the values or datasets may be stored, either as an alternative or in addition to being outputted. Any output may be in the form of an amount or a vector (e.g. indicating a direction of rotation or translation and/or an amount etc.). For example, a translational amount may be output as a direction of movement and an amount.
To determine the components of the centre of rotation in the axes defining the plane, a subset of grid cells may be removed. This subset may correspond to a boundary of the space. The subset to be removed may comprise those grid cells that are less than a predetermined threshold distance away from the nearest cell containing points, and, following their removal, the mean position of the remaining grid cells (of the 2D grid of cells, each cell comprising a number of points of the dataset that was projected onto the plane to define the 2D grid, i.e. the data is projected onto a plane and the projected image divided into pixels in a grid (the cells)) may be determined. For example, if the space to be imaged is a room or tunnel etc. the walls of said room or tunnel may correspond to a boundary of that space, with those cells corresponding to those walls being removed in this step. The grid cell nearest to the determined mean may be determined and the centre of that grid cell may be selected as the components of the centre of rotation in the axes defining the plane. To remove the subset of the grid cells (e.g. the boundary of the physical space that was imaged), a distance map may be created by assigning a value to each grid cell, the assigned value representing a distance to the nearest cell having points therein (which may be a measure of how close that cell is to the boundary of the first space). Either those grid cells having a distance value less than a predetermined threshold may be removed, or it may be determined which grid cells have distance values that are local maxima and the remaining grid cells may be removed.
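One possible, non-limiting realisation of this step is sketched below in Python (NumPy and SciPy); the cell size, the boundary-distance threshold, the particular distance map used and the fall-back behaviour are illustrative assumptions, and the plane is simply taken to be that of the first two coordinate axes:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def centre_of_rotation_in_plane(points, cell_size=0.5, min_dist=1.0):
    """Estimate the first/second-axis components of a centre of rotation.

    points   : (N, 3) array in the common coordinate system
    cell_size: edge length of a grid cell (common-coordinate units)
    min_dist : cells closer than this to a point-containing cell are
               treated as boundary (e.g. wall) cells and removed
    """
    xy = points[:, :2]
    mins = xy.min(axis=0)
    idx = np.floor((xy - mins) / cell_size).astype(int)
    occupied = np.zeros(idx.max(axis=0) + 1, dtype=bool)
    occupied[idx[:, 0], idx[:, 1]] = True

    # Distance map: for every cell, the distance to the nearest cell that
    # contains points, approximating how close the cell is to the boundary.
    dist = distance_transform_edt(~occupied) * cell_size
    remaining = np.argwhere(dist >= min_dist)
    if len(remaining) == 0:
        remaining = np.argwhere(occupied)        # degenerate fall-back

    # Mean position of the remaining cells, then the centre of the
    # point-containing cell nearest to that mean.
    mean_cell = remaining.mean(axis=0)
    occ_cells = np.argwhere(occupied)
    nearest = occ_cells[np.argmin(np.linalg.norm(occ_cells - mean_cell, axis=1))]
    return mins + (nearest + 0.5) * cell_size
```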
The coordinate of the centre of rotation in an axis perpendicular to the plane (thereby defining a normal to the plane) may be determined. To do so, for all of the points in the point cloud that correspond to the projected points contained in the grid cell nearest to the determined mean, the range of coordinates in the perpendicular axis may be determined and it may be further determined whether the range is less than a predetermined threshold and the coordinate of the
centre of rotation in the perpendicular axis may be determined as the mean value of all the coordinates of the points on the perpendicular axis. Optionally, the determined mean may be summed with a parameter that is proportional to the height of the imaging apparatus that took the readings of the space. If the range is not less than the predetermined threshold then it may be determined which points have the respective highest and lowest coordinate in the perpendicular axis. The midpoint between these two values may be determined and then the midpoint may be selected as the coordinate of the centre of rotation in the perpendicular axis.
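A minimal sketch of this selection of the perpendicular-axis coordinate, assuming the third-axis values of the points falling in the chosen grid cell have already been gathered (the 0.5 range threshold and the optional scanner-height offset are illustrative assumptions), might be:

```python
import numpy as np

def centre_coordinate_in_axis(z_values, max_range=0.5, scanner_height=0.0):
    """Pick the perpendicular-axis coordinate of the centre of rotation from
    the third-axis coordinates of the points in the selected grid cell.

    z_values       : (M,) array of third-axis coordinates
    max_range      : threshold on the spread of those coordinates
    scanner_height : optional offset proportional to the imaging apparatus height
    """
    lo, hi = z_values.min(), z_values.max()
    if hi - lo < max_range:
        # Small spread: use the mean, optionally raised by the scanner height.
        return z_values.mean() + scanner_height
    # Large spread (e.g. both floor and ceiling present): use the midpoint.
    return 0.5 * (hi + lo)
```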
To determine the first angle and translational amount in the plane, a grid of cells (as described above) contained in the plane onto which the datasets were projected may be defined for each projected dataset, if not defined already depending on the example. Each cell in each respective grid may be of a predetermined size. Each cell in each respective grid may comprise a number of points in one or both of the respective datasets. Optionally a filter may be applied to the projected datasets to produce filtered datasets. According to one example filter, for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, the number of points in the cell may be increased and/or for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, the number of points in the cell may be decreased. According to another filter, only those grid cells with points from each of the point clouds within it may be selected for further processing, in other words, those grid cells constituting an overlap region between the point clouds may be chosen.
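By way of illustration only, the per-cell binning, the density filter and the overlap-region selection might be sketched as follows (Python with NumPy; clipping the per-cell count stands in for increasing or decreasing the number of points in a cell, and the grid shape, origin and thresholds are assumptions):

```python
import numpy as np

def occupancy_image(points_2d, cell_size, origin, shape, lo=1, hi=10):
    """Per-cell point counts of a projected dataset, clipped to [lo, hi].

    Clipping approximates the density filter described above: sparsely
    occupied cells are raised towards `lo` and densely occupied cells are
    capped at `hi`, so no single surface dominates the later correlation.
    """
    idx = np.floor((points_2d - origin) / cell_size).astype(int)
    counts = np.zeros(shape, dtype=float)
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1.0)
    nonzero = counts > 0
    counts[nonzero] = np.clip(counts[nonzero], lo, hi)
    return counts

def overlap_region(counts_a, counts_b):
    """Boolean mask of grid cells containing points from both projected datasets."""
    return (counts_a > 0) & (counts_b > 0)
```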
In one example, to determine the first angle and translational amount in a plane, one of the projected datasets may be iteratively rotated about the axis of rotation by a predetermined amount at each iteration until the dataset has been rotated by 360 degrees. After each iteration a spatial correlation may be performed in the plane to determine a degree of offset between the rotated projected dataset and the un-rotated projected dataset. Based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the rotated projected dataset and the un-rotated projected dataset is the largest may be determined as the first angle and translational amount in the plane. The degree of offset may be determined based on the position of the highest correlation peak, e.g. peak correlation value, or based on the ratio of the peak to average level, or degree of match (e.g. best degree of match), or may be performed by any other degree of match method. The degree of match may be output or stored.
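A non-limiting sketch of this exhaustive rotation-and-correlation search over the projected occupancy images follows (Python with NumPy/SciPy); rotating about the image centre, the one-degree step and the FFT-based circular correlation are illustrative choices, and in practice the image may be arranged so that the determined centre of rotation lies at its centre:

```python
import numpy as np
from scipy.ndimage import rotate

def best_angle_and_shift(image_moving, image_fixed, angle_step=1.0):
    """Rotate one projected occupancy image in steps and cross-correlate it
    with the other; return the angle, in-plane shift and peak correlation
    giving the best degree of match.

    Both inputs are equally sized 2D arrays of per-cell counts.
    """
    best = (0.0, (0, 0), -np.inf)
    fixed_fft = np.fft.fft2(image_fixed)
    for angle in np.arange(0.0, 360.0, angle_step):
        rotated = rotate(image_moving, angle, reshape=False, order=1)
        # Circular cross-correlation via the FFT; the peak position gives the
        # translational offset and its height the degree of match.
        corr = np.real(np.fft.ifft2(fixed_fft * np.conj(np.fft.fft2(rotated))))
        peak = np.unravel_index(np.argmax(corr), corr.shape)
        if corr[peak] > best[2]:
            # Offsets beyond half the image size wrap around to negative shifts.
            shift = tuple(p if p <= s // 2 else p - s
                          for p, s in zip(peak, corr.shape))
            best = (float(angle), shift, float(corr[peak]))
    return best
```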
Following the determination of the overlap region as part of the determination of the translational amount in an axis, a translational amount of 0 may be determined for the axis if the overlap region is empty (e.g. comprises no cells for which there is overlap). Alternatively, the points in the point cloud to be translated having the largest and lowest values in the axis are determined and recorded for each cell, and the points in the other point cloud having the largest and lowest values in the axis are determined and recorded for each cell. Thereby, for each cell, two ranges, one for each point cloud, are determined and may be compared. For each cell, a midpoint for each range may be determined and then a translational amount in the first axis to align the midpoints may be determined, for each cell. The mean of these amounts may be determined as the translation amount in the axis.
By way of another alternative, for each cell, the point in the point cloud to be translated and having the highest (or lowest) value in the axis may be determined and the point in the other point cloud having the highest (or lowest) value in the axis may be determined. Then, for each cell, a translational amount in the axis may be determined to bring the highest (or lowest) values into alignment. The mean of these may be selected as the translation amount (or degree of translation) in the axis.
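For example (non-limiting; the dictionary-of-cells representation and the sign convention are assumptions made for brevity), the midpoint-of-ranges variant could be sketched in Python as:

```python
import numpy as np

def axis_translation_from_ranges(cells_a, cells_b):
    """Mean per-cell third-axis shift that aligns the midpoints of the ranges
    of the two clouds.

    cells_a, cells_b : dicts mapping a grid-cell index to a 1D array of the
    third-axis coordinates of the points of each cloud falling in that cell;
    only cells present in both dicts (the overlap region) are used.
    """
    shifts = []
    for cell in cells_a.keys() & cells_b.keys():
        za, zb = cells_a[cell], cells_b[cell]
        mid_a = 0.5 * (za.max() + za.min())
        mid_b = 0.5 * (zb.max() + zb.min())
        shifts.append(mid_b - mid_a)     # amount to move cloud A onto cloud B
    return float(np.mean(shifts)) if shifts else 0.0
```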
For each grid cell in the overlap region a measure of the distance between the respective ranges may be determined and those ranges for which the distance is above a predetermined threshold may be discarded. The above processes may then be performed for those remaining, un-discarded, ranges.
By way of another alternative, for each grid cell in the overlap region, the projection of each point in the point cloud to be translated and the other point cloud onto the axis may be determined, forming respective histograms in the axis for each point cloud. A translation amount (or degree of translation) to bring the histograms into alignment may then be determined, for each cell. The most common translation amount may then be determined as the translation amount in the first axis.
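A sketch of this histogram variant (Python with NumPy; the bin size, the rounding used to find the most common shift and the per-cell data representation are illustrative assumptions) might be:

```python
import numpy as np

def axis_translation_from_histograms(cells_a, cells_b, bin_size=0.05):
    """Per-cell third-axis shift from 1D histogram correlation; the most
    common shift over the overlap region is taken as the translation amount.

    cells_a, cells_b : dicts mapping a grid-cell index to a 1D array of the
    third-axis coordinates of the points of each cloud falling in that cell.
    """
    per_cell_shifts = []
    for cell in cells_a.keys() & cells_b.keys():
        za, zb = cells_a[cell], cells_b[cell]
        lo = min(za.min(), zb.min())
        hi = max(za.max(), zb.max())
        edges = np.arange(lo, hi + bin_size, bin_size)
        if len(edges) < 2:
            continue
        hist_a, _ = np.histogram(za, bins=edges)
        hist_b, _ = np.histogram(zb, bins=edges)
        # Full cross-correlation; the peak lag (in bins) is the shift that
        # best aligns cloud A's histogram with cloud B's.
        corr = np.correlate(hist_b, hist_a, mode="full")
        lag = np.argmax(corr) - (len(hist_a) - 1)
        per_cell_shifts.append(lag * bin_size)
    if not per_cell_shifts:
        return 0.0
    values, counts = np.unique(np.round(per_cell_shifts, 3), return_counts=True)
    return float(values[np.argmax(counts)])
```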
By way of another alternative, for each grid cell in the overlap region, determine the surface normal vectors for each point in the point cloud to be translated and the other point cloud to form sets of surface normal vectors for each point cloud. The projections of each set of vectors
onto a plane (whose normal extends parallel to the axis) may be determined and, then, those projections further projected onto the first axis, forming projected histograms for each point cloud. For each cell, an amount to bring the histograms into alignment may be determined to form a degree of match and the maximum degree of match may be determined as the translation amount in the first axis.
Optionally, in this example, the absolute value of each component of each surface normal vector may be determined along the first axis and those vectors whose absolute values are above (or below, depending on the implementation) a predetermined threshold may be discarded, and the above processes may be performed for the remaining, un-discarded vectors or remaining points corresponding to the un-discarded vectors.
In some examples, once the centre of rotation has been determined (e.g. its components in first, second, and third axes determined), it may then be determined whether the centre of rotation lies within a minimum bounding box of the point cloud of which it is the centre. If not, the centre may be projected into the minimum bounding box and this projected centre used as the centre of rotation (e.g. any coordinates thereof).
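A minimal sketch of this check (assuming, for simplicity, an axis-aligned minimum bounding box and a per-coordinate clamp as the projection into the box):

```python
import numpy as np

def clamp_centre_to_bounding_box(centre, points):
    """If the determined centre of rotation lies outside the cloud's minimum
    axis-aligned bounding box, project it onto the box by clamping each
    coordinate to the box extents; otherwise return it unchanged."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return np.clip(centre, lo, hi)
```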
In some examples, the or each point cloud may be filtered, e.g. prior to performing the process of any of the examples - for example prior to performing the translational alignment of the third example and/or prior to performing the alignment (rotational and translational) in the plane according to the second example. By way of one example filtering process, for each point in the or each point cloud a surface normal vector may be determined and their projections onto an axis perpendicular to the plane (e.g. the plane onto which a point cloud was projected to calculate the centre of rotation or the plane onto which point clouds were projected to calculate one of the translation amounts) may be determined to resolve the component of each vector along the first axis. The absolute value of each component of each vector along the first axis may be determined and those points whose vectors have absolute values below a predetermined threshold may be discarded, the un-discarded points forming the filtered point cloud.
In one example, where one point cloud has been transformed to become an aligned point cloud (e.g. rotated and/or translated etc.) (e.g. an aligned first point cloud), a line may be projected from either a centre of the point cloud, or an origin of the point cloud, or from a point corresponding to a location of the imaging apparatus that took the readings of the space
corresponding to the point cloud (applicable if the imaging apparatus was static or was moving, in which case the location may be on a trajectory of the imaging apparatus), the line being projected to a point on the aligned point cloud. It may then be determined if the line intersects a point in the other point cloud (e.g. a second point cloud), e.g. to ensure that there are no walls or objects, after alignment, which block the original view. In another example, a line may be projected from each point, with a length equal to the recorded distance from the scanner and in the direction of the scanner, and for each point a distance and an azimuth and an elevation angle may be recorded, if the format of the data has this information. In this example, a line can be projected back to the scanner and if it intersects any point from the other point cloud before it gets to the scanner then this can be output. In other words, this could be done in reverse by projecting from the recorded points back in the recorded angular direction and by the recorded distance to the scanner.
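By way of illustration only, the line-of-sight check might be sketched as below (Python with NumPy/SciPy); the sampled-segment test, the step size, the tolerance and the use of a KD-tree are assumptions standing in for whatever nearest-point search is actually used:

```python
import numpy as np
from scipy.spatial import cKDTree

def occluded_points(scanner_pos, aligned_points, other_points,
                    step=0.1, tolerance=0.05):
    """Flag aligned points whose line of sight back to the scanner position
    passes close to a point of the other cloud, i.e. points whose original
    view would be blocked after alignment.

    scanner_pos    : (3,) array, scanner location (or a point on its trajectory)
    aligned_points : (N, 3) array, the aligned point cloud
    other_points   : (M, 3) array, the other point cloud
    """
    tree = cKDTree(other_points)
    flags = np.zeros(len(aligned_points), dtype=bool)
    for i, p in enumerate(aligned_points):
        direction = p - scanner_pos
        length = np.linalg.norm(direction)
        if length < step:
            continue
        direction /= length
        # Sample along the open segment, stopping short of the end point.
        ts = np.arange(step, length - step, step)
        if len(ts) == 0:
            continue
        samples = scanner_pos + ts[:, None] * direction
        dists, _ = tree.query(samples, k=1)
        flags[i] = bool(np.any(dists < tolerance))
    return flags
```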
It will be appreciated that some examples herein relate to a mechanism by which a rotation and/or translation and/or further translation is determined that brings two clouds of points together such that points representing the same features in the scanned space(s) are co-located in the common coordinate system. Thereafter, the two separate datasets may be merged and treated as a single dataset. This computation and optional subsequent merging may be performed in a computationally efficient manner described in the following paragraph without any requirement for any targets to be manually placed in the overlapping region of the imaged space, or for their positions to be surveyed, or for natural targets to be identified and their position and orientation noted, and without the need for a person to manually recognise corresponding features in the two scans and manually rotate and translate the scans to bring the corresponding features into alignment.
An example of a computationally efficient manner is as follows. A projection of a dataset onto a plane is effectively the collapsing of the dataset onto the plane. The projection may be along a vector normal to the plane or along a vector angled with respect to the plane or along a curve intersecting the plane. A projection of a dataset onto a line is effectively the collapsing of the dataset onto the line. The projection may be along a vector normal to the line or along a vector angled with respect to the line or along a curve intersecting the line. Any line of projection may be along the Cartesian co-ordinate axes or defined in a geographic co-ordinate system or in spherical or cylindrical co-ordinates.
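For instance (a non-limiting sketch; projection along the plane or line normal is assumed, and both the plane and the line are taken to pass through the origin), such collapsing projections can be written as:

```python
import numpy as np

def project_onto_plane(points, normal):
    """Collapse a point cloud onto the plane through the origin with the given
    normal, by removing each point's component along that normal."""
    normal = np.asarray(normal, dtype=float)
    normal /= np.linalg.norm(normal)
    return points - np.outer(points @ normal, normal)

def project_onto_line(points, direction):
    """Collapse a point cloud onto the line through the origin with the given
    direction, keeping only each point's coordinate along that line."""
    direction = np.asarray(direction, dtype=float)
    direction /= np.linalg.norm(direction)
    return points @ direction
```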
Examples using surface normal vectors may comprise normalising the vectors.
Examples wherein a best match is determined may comprise using a mathematical correlation function.
Herein, the term “determining” should be understood to comprise any of identifying, obtaining, calculating, measuring, choosing or selecting. As such, determining a quantity comprises calculating or measuring that quantity or, indeed, obtaining that quantity (e.g. from another entity), or identifying, choosing or selecting that quantity (e.g. from a list).
As used herein, “a number” of entities (e.g. a grid cell comprising a number of points) should be understood to mean any number including zero, such that a grid of cells, each cell comprising a number of points, includes a grid of cells where some of those cells have no points.
Each of the methods described herein may comprise a computer-implemented method and/or an image processing method and/or a dataset alignment method. The individual steps of each method may be performed by one or more processors (e.g. executing machine-readable instructions). Each apparatus according to each example comprises one or more processors configured to perform tasks (e.g. executing machine-readable instructions). For example, each one of the tasks (e.g. record a projection, determine an amount etc.) may be performed by the same processor or by one processor core of a processing unit such as a CPU, or a plurality of processors may perform the set of tasks with one of the plurality performing one or more tasks and another one of the plurality performing one or more remaining tasks etc.
Description of the Figures
Examples of the disclosure will now be described with reference to the accompanying drawings, in which:
Figures 1-3 are each a schematic diagram of an example apparatus;
Figures 4-14 are each a flowchart illustrating part of a method;
Figure 15 is a schematic diagram of a machine-readable medium in association with a processor; and
Figure 16 is a schematic diagram of a hardware configuration.
Detailed Description
Each of Figures 1-3 depicts an example apparatus 10, 20, 30 according to one example of this disclosure, each apparatus comprising at least one processor. It will be appreciated that any one of the apparatuses 10-30 (e.g. the at least one processor thereof) may also be configured to perform the task according to any one of the other apparatuses (e.g. before or after) such that any one of the apparatuses 10-30 is configured to perform the methods of any of the first-third examples, in any order.
Figure 1 shows an apparatus 10 comprising a 3D dataset acquisition unit 12 configured to obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus. The apparatus 10 comprises a storage unit 14 that is configured to store the first 3D dataset as a first cloud of points in a coordinate system. The apparatus comprises at least one processor 16 that is configured to determine coordinates of a centre of rotation for the first point cloud in the coordinate system. The at least one processor 16 is configured to, for each point in the point cloud: record the projection of each point in the first point cloud onto a first 2D plane in the coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form a first 2D dataset, each point of the first 2D dataset being a projection of a corresponding point in the first point cloud onto the first 2D plane; define, for the first 2D dataset, a 2D grid of cells contained in the first plane, each cell being of a predetermined size and each cell comprising a number of points of the first 2D dataset; determine a mean position of the first 2D dataset or mean grid cell; select, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the 2D dataset or that is at the centre of the mean grid cell; and to store or output the determined components of the centre of rotation in the first and second axes. In other words, the apparatus 10 of Figure 1 is configured to determine a centre of rotation of a point cloud (3D dataset). Thereafter, the apparatus 10 (or at least one processor thereof) may be configured to determine a translational amount in a plane (relative to a second 3D dataset) (as per the apparatus of Figure 2) and/or a translational amount in an axis (e.g. perpendicular to the plane) (as per the apparatus of Figure 3).
Figure 2 shows an apparatus 20 comprising a 3D dataset acquisition unit 22 configured to obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; and to obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus. The apparatus 20 also comprises a storage unit 24 configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system. The apparatus 20 also comprises at least one processor that is configured to, for each of the first and second point clouds: record the projection of each point in the first and second point clouds onto a first 2D plane in the common coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form, for each respective point cloud, respective 2D datasets comprising a first 2D dataset and a second 2D dataset, wherein each point of the first 2D dataset is a projection of a corresponding point in the first point cloud onto the first 2D plane, and wherein each point of the second 2D dataset is a projection of a corresponding point in the second point cloud onto the first 2D plane; and to determine: a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the first 2D plane and through a point on the first 2D plane having its components in the first and second axes as the components of a centre of rotation of the first point cloud in the common coordinate system; and a first translational amount for which the first 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the first 2D plane by the first translational amount, aligns with the second 2D dataset in the first 2D plane; and to: store or output the determined first angle and first translational amount and/or a degree of match; or to determine a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first 2D plane, by an amount proportional to the first translational amount; and to: store or output the determined rotated and translated first 3D dataset and/or a degree of match. In other words, the apparatus 20 of Figure 2 is configured to determine a rotation and translation amount to align one point cloud with another in a plane. The angle of rotation about which the first point cloud is rotated in the plane may be as determined by the at least one processor 16 of Figure 1. The apparatus
20 (or at least one processor thereof) may be configured to determine a further translational amount in an axis perpendicular to the plane (as per the apparatus of Figure 3).
Figure 3 shows an example apparatus 30 comprising a 3D dataset acquisition unit 32 configured to obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; and obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus. The apparatus 30 also comprises a storage unit 34 configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system. The apparatus 30 also comprises at least one processor 36 configured to determine a first translational amount, being an amount in a first direction in the common coordinate system by which the first point cloud, moved by the first translational amount in the first direction, aligns with the second point cloud in the first direction. The at least one processor 36 is configured to: record, for each of the first point cloud and the second point cloud, the projection of each point in the respective point clouds onto a first 2D plane in the common coordinate system in the first direction, the first 2D plane having a normal vector parallel to the first direction, to form respective first and second 2D datasets, each point of the respective first and second 2D datasets being a projection of a corresponding point in the respective first point cloud and the second point cloud onto the first plane; define, for each of the first and second 2D datasets, a grid of cells contained in the first plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points (which may be 0 since some cells may not contain any points) of the respective first and second datasets; determine an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the first and second 2D datasets; and if the overlap subset contains no such cells and the overlap region is empty, then the at least one processor is configured to: determine 0 for the first translational amount; and, otherwise, the at least one processor is configured to, for each cell in the overlap region, either: determine the point in the first point cloud having the largest value in the first axis and the point in the first point cloud having the lowest value in the first axis; and record these largest and lowest values, these values defining a first range for the first point cloud, for each cell; and determine the point in the second point cloud having the largest value in the first axis and
determine the point in the second point cloud having the lowest value in the first axis; and record these largest and lowest values, these values defining a second range for the second point cloud, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the first axis for the points of the first point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the first axis for the points of the second point cloud contained in the cell; determine, for each cell, a midpoint for the first range and a midpoint for the second range; determine, for each cell, a translation amount in the first direction to bring the midpoint of the first range into alignment with the midpoint for the second range; and determine the first translation amount to be the mean value of the translation amounts; or, alternatively, for each cell in the overlap region: to determine the point in the first point cloud having the highest value in the first axis or having the lowest value in the first axis; determine the point in the second point cloud having the highest value in the first axis or having the lowest value in the first axis; determine, for each cell, a translation amount in the first direction to bring the highest value for the first point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the first point cloud into alignment with the lowest value for the second point cloud; and determine the first translation amount to be the mean value of the translation amounts; and, in either case: store or output the determined first translational amount; or to: determine a translated first 3D dataset, the translated first 3D dataset being the first point cloud translated, in the first direction, by an amount proportional to the first translational amount; and to: store or output the determined translated first 3D dataset. The apparatus 30 (or at least one processor thereof) may be configured to determine a further rotation and translational amount in a plane whose normal vector is parallel to the first axis (as per the apparatus of Figure 2), in which case the determined centre of rotation may be as per the apparatus of Figure 1.
Each of the apparatuses 10-30 may be configured to perform the method according to any of Figures 4-14, e.g. any of the blocks/steps thereof. For example, each of the at least one processors of the apparatuses 10-30 may be configured to perform the method according to any of Figures 4-14, e.g. any of the blocks/steps thereof. Figures 4-14 will be described later.
Examples of the hardware described with reference to Figures 1-3 (e.g. the storage unit, processor, imaging apparatus, scan acquisition unit etc.), including various concepts (such as readings, imaging a space, overlap, match etc.), are set out in, and may be according to, the patent
publication WO 2018/138516 A1 (PCT/GB2018/050233), titled “apparatus, method, and system for alignment of 3D datasets,” the entire disclosure of which is incorporated by reference.
Referring back to Figures 1-3, any of the scan acquisition units 12-32 may comprise a functional component to obtain one or more datasets, e.g. from an imaging apparatus that took a scan or from another device (e.g. a device storing the datasets but that did not take the scan). Any of the scan acquisition units 12-32 may comprise a programmed CPU and/or a GPU and/or an FPGA for real-time processing, or an array of them together with an array of memory, as in a computer cloud for parallel processing, and I/O (input/output) hardware configured to receive or read the or each dataset, and volatile or non-volatile memory for storage of processing instructions and image data during processing.
In some examples, each of the apparatuses 10-30 may comprise the imaging apparatus. The imaging apparatus, or the apparatuses 10-30, may be configured to use any of the following methods to calculate or derive data: shape from focus, shape from shadow, shape from texture, shape from silhouette, shape from shading, shape from template, shape from symmetry, shape from varying illumination and viewpoint, shape from movement or optical flow, shape from interaction, shape from positional contrast, shape from defocus, or virtual 3D computer generated images, e.g. in computer aided design, computer generated models or plans. Some methods that may be used by the imaging apparatus are: SIFT (V. Vijayan and P. Kp, 2019), SURF (H. Bay, et al., 2006), FAST, Harris corner points, ORB. The best and fastest may be ORB (Dong, Pengfei, et al., 2019), (R. Mur-Artal, et al., 2015), which detects keypoints using Features from Accelerated Segment Test (FAST) (R. Mur-Artal and J. D. Tardos, 2017) and stores the orientation and colour information in BRIEF descriptors (M. Calonder, et al., 2012). Recently there has been some research on finding lines (Ma, Jiayi, et al., 2019), (R. Wang, et al., 2018), (R. G. V. Gioi, et al., 2008), (Y.-H. Choi, et al., 2008), line intersections (Li, Xianlong, et al., 2018) or planes (Ming, H., et al., 2017) instead of points. Lines can be found using the Fast Line Detector (FLD) and matched by comparing their corresponding Line Band Descriptors (LBD), and RANSAC (Random Sample Consensus) may be used to remove outlier matches and keep inlier matches (Li and Selviah, 2011). The imaging apparatus may be an apparatus for generating photogrammetric scans, which use two or more adjacent, side-by-side or slightly spaced cameras, or two views from a single moving camera. When using a 2D camera to record images for later conversion to 3D it may be calibrated according to the following method of real-time
calibration: Shawash, Janti, and David R. Selviah. "Real-time nonlinear parameter estimation using the Levenberg-Marquardt algorithm on field programmable gate arrays." IEEE Transactions on Industrial Electronics 60, no. 1 (2012): 170-176. The slightly different spacing of features in one photo from those in the other is then used to work out the depth of each feature. The output format is a point cloud, as for LIDAR.
The imaging apparatus may be an apparatus for projection mapping: this makes use of structured light patterns which are projected onto the 3D scene. Examples include Kinect and Hololens. The patterns projected may be lines or grids or random spots. The scene is then viewed using a camera, often with a wide angle lens. The light projected is usually infrared so that it is not visible to the human eye. Other formats, such as unsigned shorts generated from a video stream, or meshes, may be converted into point clouds as a pre-processing step and then input to examples. For projection mapping, the following methods may be used: Endres, Felix, Jürgen Hess, Jürgen Sturm, Daniel Cremers, and Wolfram Burgard. "3-D mapping with an RGB-D camera." IEEE Transactions on Robotics 30, no. 1 (2013): 177-187; Huang, Albert S., Abraham Bachrach, Peter Henry, Michael Krainin, Daniel Maturana, Dieter Fox, and Nicholas Roy. "Visual odometry and mapping for autonomous flight using an RGB-D camera." In Robotics Research, pp. 235-252. Springer, Cham, 2017; Engelhard, Nikolas, Felix Endres, Jürgen Hess, Jürgen Sturm, and Wolfram Burgard. "Real-time 3D visual SLAM with a hand-held RGB-D camera." In Proc. of the RGB-D Workshop on 3D Perception in Robotics at the European Robotics Forum, Västerås, Sweden, vol. 180, pp. 1-15. 2011.
The data formats that are possible for the input 3D datasets include one or more from among .fls, ptx, pts, .ply, zfs, rdbx, res, leica format, ptl, e57, xyz, pod, wrl, obj, las, laz, which are typically used for surveying applications, and DICOM .dcm, PAR/REC, .ima, NIFTI, which are typically used for medical applications, and other formats for other applications. Data formats may be Cartesian or in spherical angular and radial co-ordinates or in other co-ordinates. Suitable point cloud structures, imaging apparatuses, and data formats that are possible for the input 3D datasets are set out in WO 2018/138516 A1, the entire disclosure of which is incorporated by reference. The following site gives information about building data formats: https://info.vercator.com/blog/what-are-the-most-common-3d-point-cloud-file-formats-and-how-to-solve-interoperability-issues.
The or each dataset may comprise a point cloud, or cloud of points, which may be derived directly from a laser scanner, and/or which may comprise mesh, voxel, or CAD data such as IFC which has been converted into a point cloud. The point cloud may have been derived from photogrammetry using one or more cameras, or from structured light projection as in RGB-D cameras (such as Kinect v1 or v2, Orbbec Astra Pro, Google Project Tango), in which the depth image has been converted into a point cloud and the RGB image projected onto it, or by a LIDAR laser scanner, or by a mobile Time of Flight scanner, or by a laser interferometric scanner.
Other methods to calculate 3D point clouds include at least one of: shape from focus, shape from shadow, shape from texture, shape from silhouette, shape from shading, shape from template, shape from symmetry, shape from varying illumination and viewpoint, shape from movement or optical flow, shape from interaction, shape from positional contrast, shape from defocus, and/or virtual 3D computer generated images, in computer aided design, computer generated models or plans. The dataset(s) described herein may be taken by an imaging apparatus including an optical 3D scanner. The imaging apparatus may comprise any one of: a handheld LIDAR scanner; a static tripod based LIDAR scanner; a ground penetrating RADAR, or RADAR and a doppler RADAR; a 3D mobile LIDAR, or a 2D LIDAR from which 3D datasets are generated; an Electrical Resistivity Tomography (ERT) or electrical resistivity imaging (ERI) scanner; a CT (computerised tomography) or CAT (computerised axial tomography) scanner; a positron emission tomography (PET) scanner; an MRI (magnetic resonance imaging) scanner; a scanner that is configured to detect radioactive decay emission points; a nuclear quadrupole scanner; a 3D terahertz wave scanner; a projection mapping scanner based on structured light patterns; a photogrammetric scanner using spaced-apart 2D cameras; a 3D ultrasound scanner; a 3D seismic scanner; a 3D sonar scanner; optical interferometer; photogrammetry; projection imaging; surface 3D profiling instruments such as Alphastep, Talystep, Atomic Force Microscope, SNOM; scanning focal plane, Z-stack.
The or each dataset may be subject to a prerequisite condition that it/they are compatible with being stored as a point cloud in a common coordinate system (common in this context meaning common to both datasets) and, in the case of aligning two datasets, that the datasets are of spaces that are at least partially overlapping. The first and second datasets may be obtained from the same imaging apparatus, or different imaging apparatuses.
The first and second 3D datasets may be transmitted, for example, from an imaging apparatus or pre-processor, and received by the 3D dataset acquisition units 12-32. Alternatively, the first and second 3D datasets may be stored in physical storage at an address accessible to the 3D dataset acquisition unit 12-32 for retrieval thereby. The processor may be inside the imaging apparatus.
The 3D dataset acquisition units 12-32 may be configurable by an apparatus user, to enable specification of a storage location from which to obtain the 3D datasets, such configuration being, for example, via an interface. Alternatively, the 3D dataset acquisition unit may comprise a storage location or interface to which 3D datasets are submitted or transmitted by a user of an imaging apparatus (user here means an imaging apparatus operator or any party analysing or processing the 3D datasets produced by the imaging apparatus).
The illustrated interconnections between the 3D dataset acquisition unit 12-32 and the storage unit 14-34 may represent the submission of the obtained first and second 3D datasets to the storage unit 14-34 by the respective 3D dataset acquisition unit 12-32. The first and second 3D datasets are stored as clouds of points in a common coordinate system by the storage unit 14-34.
The obtained 3D datasets are clouds of points in three dimensions. The 3D dataset acquisition unit 12-32 may execute processing on the obtained 3D datasets to define the respective clouds of points in a common coordinate system (common here denotes common to the two 3D datasets). Any output by the at least one processor 16-36, for example a value, vector, or a rotated and/or translated dataset, may be transmitted to the storage unit 14-34 for storage thereby.
The storage units 14-34 are configured to store the first 3D dataset and the second 3D dataset as respective clouds of points in a common coordinate system and may comprise volatile or non-volatile storage hardware that is accessible to the 3D dataset acquisition unit 12-32 and the at least one processor 16-36. The storage units 14-34 may comprise a controller or management unit to control read and write accesses to stored data.
The common coordinate system need not be expressed in the same way for both clouds of points. However, there is a defined spatial and angular relationship between the coordinate system in which the two clouds of points are stored, so that whatever the expression of the clouds of points in storage, they are defined in a common coordinate system.
The methods of Figures 4-14, each individual block of which, alone or in combination, may be performed by the apparatuses of Figures 1-3, will now be described. They may comprise computer-implemented methods and/or methods of aligning datasets and/or image processing methods.
Figure 4 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3. The method of Figure 4 comprises a method of determining the coordinates of a centre of rotation of a point cloud. At 402 a first 3D dataset is obtained. The first 3D dataset is a first point cloud in three dimensions, with each point in the point cloud representing a reading within the first space being taken by an imaging apparatus. At 404 the first 3D dataset is stored as a first cloud of points in a coordinate system. The method comprises, at 406, determining the coordinates of a centre of rotation for the first point cloud in the coordinate system, which comprises steps 408-416 described below.
At 408, the method comprises recording the projection of each point in the first point cloud onto a first 2D plane in the coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form a first 2D dataset, each point of the first 2D dataset being a projection of a corresponding point in the first point cloud onto the first 2D plane. Therefore, the point cloud is projected onto a plane. The plane may be regarded as a “horizontal” plane and the first and second axes defining the plane may be regarded as the x and y axes, according to one example implementation of a Cartesian coordinate system. The direction of projection for the point cloud onto the plane may be the direction normal or perpendicular to the plane (e.g. along the z direction) or may be along a different line, e.g. a non-straight or angled line with respect to the plane etc.
At 410, the method comprises defining, for the first 2D dataset, a 2D grid of cells contained in the first plane, each cell being of a predetermined size and each cell comprising a number of points of the first 2D dataset (that number may include 0 since some cells may not comprise
any projected points). At 410 a square grid of pixels is effectively placed over the points to create a sparse 2D grid in the plane, where cells inside the grid contain points of the point cloud which have a "height" in the direction perpendicular to the plane (e.g. the z-direction if the plane is an (x,y) plane). The length and width of each square cell may be chosen as 0.25 m.
At 412, the method comprises determining a mean position of the first 2D dataset or mean grid cell. At 412 the method therefore comprises calculating the mean position of all remaining grid cells. At 414, the method comprises selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the 2D dataset or that is at the centre of the mean grid cell. At 416, the method comprises storing or outputting the determined components of the centre of rotation in the first and second axes.
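By way of non-limiting illustration, the following sketch (in Python using NumPy; the function name, parameter names and the 0.25 m default are illustrative assumptions rather than requirements) shows one way the projection, gridding and mean-cell selection of steps 408-414 could be implemented, taking the first 2D plane to be the (x, y) plane:

```python
import numpy as np

def centre_of_rotation_xy(points, cell_size=0.25):
    """Project the point cloud onto the (x, y) plane, bin the projections
    into square grid cells, take the mean occupied-cell position and return
    the centre of the occupied cell nearest that mean (steps 408-414)."""
    xy = points[:, :2]                          # projection onto the first 2D plane
    cells = np.floor(xy / cell_size).astype(int)
    occupied = np.unique(cells, axis=0)         # sparse 2D grid of occupied cells
    centres = (occupied + 0.5) * cell_size      # centre of each occupied cell
    mean_pos = centres.mean(axis=0)             # mean grid-cell position
    nearest = centres[np.argmin(np.linalg.norm(centres - mean_pos, axis=1))]
    return nearest                              # (x, y) components of the centre of rotation
```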
Referring to Figure 7, which is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3, and which may be performed in conjunction with the method of Figure 4, determining the centre of rotation may comprise steps 702-708, optionally where step 702 comprises steps 710-714. At 702, the method comprises removing a subset of the grid cells, which may correspond to a boundary of the first space (e.g. a wall). The subset of grid cells to be removed may comprise those that are less than a predetermined threshold distance away from their nearest cells containing points. At step 704 the method comprises determining the mean position of the remaining grid cells. At step 706 the method comprises selecting the grid cell that is nearest to the determined mean. At step 708 the method comprises determining the centre of the grid cell that is nearest to the determined mean. The method then comprises selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the centre of the grid cell that is nearest to the determined mean.
Optionally, removing 702 a subset of the grid cells comprises, at 710, creating a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either removing (at 712) those grid cells having a distance value that is less than a predetermined threshold, or determining (at 714) which grid cells have distance values that are local maxima, and removing the remaining grid cells.
A mean grid cell position could correspond to a location outside the physical space that was scanned (e.g. it could correspond to an inaccessible location beyond the walls of a curved tunnel) and so a point that is part of the original dataset closest to the mean position may be selected to ensure that the chosen point is within the dataset. In other words, the grid cell that is closest to the calculated mean position of the cell centre of gravity may be chosen and the centre of this pixel may be used as the centre of rotation. If the plane is an (x,y) plane then the x, y co-ordinate values of this cell are used as the sensor location's (x,y) coordinates and therefore the (x,y) coordinates of the centre of rotation.
Step 710 of creating the distance map may comprise applying a distance transform and may, as a result, distinguish between locations that are near to, or internal to, the centre of the scanned environment and the edges of that environment. According to one example, for each pixel in turn the distance to the nearest pixel containing a point may be calculated in units of pixels, and the distance map may label the points or voxels furthest from the walls of the tunnel with the highest number (as being most inside) and label the points or voxels towards the walls with a lower number (as being less near the centre of the space). Cells at edges of the scanned regions may contain low values (e.g. 1) whereas internal cells have a larger value, with the value increasing with the distance to the edges. Removing grid cells may comprise removing cells that have a distance value of 1 and/or less than a predetermined threshold (e.g. less than 2 or 3 etc.) and/or that have neighbouring cells with a larger value (step 712). By applying a threshold to the output of the distance map operation the points sufficiently inside and near the centre of the space being scanned may be determined and used for further processing. As an alternative (step 714) pixels which are local maxima may be chosen and the remaining pixels discarded, which leaves the cells located near the centre of the scanned space.
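A minimal sketch of this distance-map filtering (steps 710-712), assuming the occupied cells are held in a 2D boolean array and using the SciPy Euclidean distance transform, might be as follows. The `occupancy` and `threshold` names are illustrative, and the distance recorded here for each occupied cell is its distance to the nearest empty cell, consistent with the described behaviour in which edge cells receive low values (e.g. 1) and interior cells larger values:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def keep_interior_cells(occupancy, threshold=2):
    """Distance-map filtering: keep only occupied cells that are at least
    `threshold` cells away from the nearest empty cell, i.e. cells well
    inside the scanned space rather than at its boundary."""
    # occupancy: 2D boolean array, True where a grid cell contains points.
    distance = distance_transform_edt(occupancy)   # distance (in cells) to the nearest empty cell
    return occupancy & (distance >= threshold)
```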
The methods as described above may comprise sparsely sampling the point cloud data to reduce the amount of data that needs to be processed in subsequent steps. For example, the method may comprise applying a KDtree or Octree method to sparsely sample the data. The resulting sampled point cloud may comprise a more uniform spacing between points, for example 2.5cm or 5cm.
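The sketch below shows one simple voxel-binning approach to such sparse sampling (the KD-tree or octree samplers mentioned above could be substituted); the function name, parameter name and the 5 cm default are illustrative assumptions:

```python
import numpy as np

def voxel_downsample(points, voxel=0.05):
    """Sparse sampling to a roughly uniform spacing (e.g. 5 cm): keep one
    representative point per occupied voxel."""
    keys = np.floor(points / voxel).astype(int)
    _, idx = np.unique(keys, axis=0, return_index=True)   # first point seen in each voxel
    return points[np.sort(idx)]
```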
In examples, the point cloud may be divided into 0.5 metre voxels and/or the projected 2D dataset and/or grid of cells may be divided into 0.5 metre pixels. The cell grid may then comprise a 2D grid of 601 × 601 pixels. The centre of rotation determined by Figure 4 may also
be alternatively termed a "centre of mass" or "centre of inertia" or "centre of density" or "centre of gravity" of the point cloud, and may be the mean of the point cloud. It may be determined by choosing one of the co-ordinate axes in the workspace and then calculating the distance from the axis to each point in the point cloud, multiplied by the strength of that point, and then adding these together. This is then made equal to the distance of the "centre of gravity" multiplied by the sum of all the strengths of all of the points in the point cloud. This may then be repeated for each co-ordinate axis to find the co-ordinates of the "centre of gravity". The strength of each point in the point cloud could be 1, or unity. Alternatively, they could be weighted according to the intensity of the reflected signal from them. This latter method could be used in examples where there is a lot of dust or smoke. Weak reflections could be filtered first. If the space is pixelated or voxelated, then this could be performed for every pixel or voxel and the strength may then be the number of points in that pixel or voxel.
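The weighted "centre of gravity" described above is equivalent to a per-axis mean of the point coordinates weighted by the point strengths, which might be sketched as follows (the names are illustrative; the strengths default to unity, or could be the reflected intensities):

```python
import numpy as np

def weighted_centre(points, strengths=None):
    """Per-axis mean of the point coordinates weighted by point strength
    (unity by default, or e.g. the reflected intensity of each point)."""
    if strengths is None:
        strengths = np.ones(len(points))
    return (points * strengths[:, None]).sum(axis=0) / strengths.sum()
```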
According to some examples, the method may proceed to determine a value of the centre of rotation in the axis perpendicular to the plane onto which the dataset was projected, e.g. considering the plane as an (x,y) plane, the method may determine a coordinate of the centre of rotation in the z-axis.
Figure 8 is a flowchart illustrating such a method 802 which may be performed by any of the apparatuses described with respect to Figures 1-3, and which may be performed in conjunction with the method of Figure 4 (e.g. in conjunction with the Figure 7 method). Method 802 is a method of determining the coordinate of the centre of rotation in a third axis. The third axis is perpendicular to the first and second axes (defining the first 2D plane) and so the third axis defines a direction normal to the first 2D plane. For example, if the plane is an (x,y) plane then the third axis is the z-direction and the method 802 is a method of determining the coordinate of the centre of rotation in the z direction. To determine the coordinate of the centre of rotation in the third axis, for the points in the first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean, at 804 it is determined whether the range of coordinates in the third axis is less than a predetermined threshold. If so (yes), the method comprises, at step 806, determining the coordinate of the centre of rotation in the third axis to be the mean value of all the third axis coordinates of the points. If otherwise (no), the method comprises, at step 810, determining the point having the highest value of the coordinate in the third axis and determining the point having the lowest value of the coordinate in the third axis and then, at 812, determining the midpoint between the lowest and highest
value. In this case, the midpoint may be selected as the coordinate of the centre of rotation in the third axis.
In other words, to determine the value in the third axis, it may be determined whether all scan coordinates are contained in a single plane (this may be determined at step 804 by determining whether the range of values in the third axis is less than the grid cell size parameter). Step 804 may therefore comprise determining whether all scan coordinates are contained in a single plane. If all scan coordinates are determined to be contained in a single plane then the cell may be determined to correspond to an outdoor location and/or to a part of a physical space that has no roof or ceiling. In this case, the third axis value can be taken as the mean value of all of the points (which will be approximately equal to the ground level in the cell). As represented by the dotted step 808, the method could optionally comprise summing the determined mean (at 806) with a parameter proportional to the height of the imaging apparatus that took the readings of the first space. The imaging apparatus offset parameter (e.g. scanner height offset parameter) may be 1.5 m in some examples but, of course, this may depend on the height of the imaging apparatus that took the readings of the space. If the scan coordinates within the cell are determined not to be contained in a single plane (e.g. the range of third axis values is larger than the grid cell size parameter), then the third axis value may be set as the half-way point between the lowest and highest point within that grid cell. This is approximately equal to the half-way point between a floor (or ground) and a ceiling (or upper surface) in a room or space.
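A sketch of this third-axis selection is given below, assuming the z values of the points whose projections fall in the selected grid cell have already been gathered; the function and parameter names are illustrative, and the 0.25 m and 1.5 m defaults are simply the exemplary parameters mentioned above:

```python
import numpy as np

def third_axis_coordinate(cell_point_zs, cell_size=0.25, scanner_height=1.5):
    """Choose the third-axis (z) component of the centre of rotation from the
    z values of the points whose projections fall in the selected grid cell."""
    z = np.asarray(cell_point_zs)
    if z.max() - z.min() < cell_size:        # effectively a single plane (e.g. outdoors)
        return z.mean() + scanner_height     # ground level plus the optional apparatus offset
    return 0.5 * (z.min() + z.max())         # midpoint between floor and ceiling
```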
Figure 13 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3, and which may be performed in conjunction with the method of Figure 8. Once the third axis value has been determined it may then be determined, at 1302, whether the determined coordinate of the centre of rotation in the coordinate system lies within a minimum bounding box of the first point cloud or of the first translated point cloud. If not, at 1304, the method comprises projecting the determined centre of rotation into the minimum bounding box. The projected centre of rotation may then be selected as the new coordinates of the centre of rotation. The Figure 13 method may be useful in the case of a scan having no roof or upper surface (such as an outdoor scan) or of a scan of a very flat environment where the elevation changes in the environment are less than the scanner height offset.
The parameters mentioned above (0.25 m for the grid cell size and 1.5 m for the offset parameter) are exemplary only and may be implementation specific. Some examples may use
these values as "base" parameters and then scale them by respective scaling factors to account for smaller or larger scale scans (e.g. aerial scans of cities or forests, or a scan of a doll house, etc.). Any such scaling may be performed automatically or by reading the scale of a scan and finding the maximum dimensions of the scan in the datafile input. If the scanned environment type is not known, the grid cell size parameter may be set to a value that is larger (e.g. by a factor of 10) than the typical scan point-to-point distance, which can ensure that the created 2D grid is contiguous, without many "holes" and separated regions (due to low scan point density). However, the value should be small enough that it is possible to distinguish between internal and external grid cells (if the value is too large, all cells would contain a distance value of 1, i.e. they would all be edge cells).
Some of the methods below rotate a 3D dataset about a centre of rotation. In these examples, the centre of rotation may be as determined according to Figure 4 or may be a different centre of rotation. For example, if the scan is a static scan then there may be a known location of the sensor and a point that corresponds to this known sensor location may be selected as the centre of rotation. Some examples may comprise a method of selecting a centre of rotation for a point cloud, this method comprising determining whether the scan (3D dataset) contains structure information. If yes, the scan may be assumed to be static and the sensor location of (0,0,0) (i.e. the origin of the data, corresponding to the sensor location) may be chosen as the centre of rotation. If no, it may be determined whether a file format of the data is assumed to be structured. If yes, as above, (0,0,0) may be chosen. If not, the scan may be assumed to be mobile such that the sensor location is not fixed and therefore unknown and, in these examples, either the sensor location may be estimated or the method of Figure 4 may be used to determine a centre of rotation. Thus, the Figure 4 example is particularly useful when the imaging apparatus was mobile and/or when the data is unstructured.
Before a centre of rotation is determined, methods may comprise pre-processing scan data. This may comprise reading a raw scan file (e.g. provided by a user). It may be determined, based on the read data and the scan file extension, whether the scan is structured or unstructured. If the data has structure information, the method may determine that the scan is static. Otherwise the method determines that the scan is mobile.
A summary of some file formats, and whether they are treated as static or mobile, is shown in the table below:
Scans in E57 format can be either structured or unstructured. For this format, the method may decide which centre of rotation to choose (e.g. according to the Figure 4 method or choose the origin of the data) based on the data inside the scan after reading the scan.
Structured (assumed static) scans may have their centre of rotation selected as (0,0,0) whereas unstructured scans (assumed to be mobile) may have their centre of rotation selected as per the method of Figure 4. The centre of rotation may be considered identical with the location of the imaging apparatus. Once a centre has been chosen, pre-processing may be performed according to a method as described above. This may comprise a point filtering operation and/or a surface normal calculation, where the determined centre is chosen as a reference point. For example, in methods comprising point distance threshold filtering, all points that are further than a predetermined threshold from the determined centre may be removed. For a static scan, the density of points reduces the further from the scanner they are recorded, so to maintain resolution, any points further than a predetermined threshold distance away from the scanner may be excluded. Mobile scanners may utilise a SLAM algorithm to remove distant points. As stated above, for a 3D dataset to be rotated a fixed, well-defined location should be chosen. For static scans, the origin (0,0,0), or scanner/sensor location, may be chosen but for mobile scans, the Figure 4 process may be used.
The Figure 4 process has the following advantages. A centre of rotation within a bounding box of the scan points may be chosen (see the method of Figure 13). The chosen centre of rotation will be approximately at the centre of the scan rather than near the edge. The chosen centre may represent a physical location that is accessible to the scan operator in the physical space, e.g. inside a room or corridor rather than located in a wall or “in air” or in a region that has not been scanned. It also provides a more “realistic” sensor location (e.g. from which a 360 degree or “bubble” view image may be created). It also means that when the point cloud data is placed into a virtual computer workspace (e.g. the common coordinate system), then the workspace only needs to be as long as the length of the scan for the rotation to be determined, and a higher resolution and level of detail can be retained.
In examples where surface normal vectors are calculated, the surface normal vectors may be made consistent with respect to the sensor location (initially the vectors may be randomly flipped by 180 degrees as a result of the Principal Component Analysis, PCA, algorithm). For example, in the case of a simple room, the surface normal vectors should point inwards into the room, but if the sensor location is outside the room, some vectors may point inwards and some outwards. In the case of a static terrestrial laser scanner on a tripod, the origin for measurements may be the centre of azimuthal and elevational rotation and all co-ordinates are assumed to be Cartesian or can be transformed into Cartesian co-ordinates relative to this origin. The original co-ordinates may be spherical in azimuthal and elevational angle, but these are often transformed into other co-ordinates in the scanner itself.
Figure 5 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3. The method of Figure 5 comprises a method of determining an aligned dataset, e.g. an aligned first point cloud, with respect to a second dataset, and/or may comprise a method of aligning a point cloud in a plane, and/or a method of rotating and translating a point cloud in a plane, and/or a method of determining a rotational and translational amount to align a point cloud in a plane. At 502, respective first and second 3D datasets are obtained. The first 3D dataset is of a first space at a first time and is a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus. The second 3D dataset is of a second space at a second time, the second space and the first space at least partially overlapping, and is a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus. At 504 the method comprises
storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system. At 506 the method comprises, for each of the first and second point clouds, recording the projection of each point in the first and second point clouds onto a first 2D plane in the common coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form, for each respective point cloud, respective 2D datasets comprising a first 2D dataset and a second 2D dataset, wherein each point of the first 2D dataset is a projection of a corresponding point in the first point cloud onto the first 2D plane, and wherein each point of the second 2D dataset is a projection of a corresponding point in the second point cloud onto the first 2D plane. At 508 the method comprises determining a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the first 2D plane and through a point on the first 2D plane having its components in the first and second axes as the components of a centre of rotation of the first point cloud in the common coordinate system; and a first translational amount for which the first 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the first 2D plane by the first translational amount, aligns with the second 2D dataset in the first 2D plane. At 512 the method comprises either storing or outputting (at 514) the determined first angle and first translational amount; or determining (at 516) a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first 2D plane, by an amount proportional to the first translational amount, and then storing or outputting (at 518) the determined rotated and translated first 3D dataset.
Referring to Figure 9, which is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3, and which may be performed in conjunction with the method of Figure 5, a method comprises, at 902, defining, for the first and second 2D datasets, respective 2D grids of cells contained in the first 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective first or second 2D dataset. As indicated by the dotted box, defining a grid may be regarded as an optional step. At 904 the method comprises applying a filter to the respective datasets to produce respective filtered datasets. As indicated by the dotted box, the filtering may also be regarded as an optional step. Any such filter may be used to derive the or each filtered dataset. One example filter is: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing
the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
The Figure 5 method results in a plan-view image being created for the point clouds, and this may be calculated as a number of pixels, or an image size based on a number of pixels. The size of the 2D grid(s) may be calculated based on the maximum ranges of overlapping scans (e.g. distance between the scan sensor location/determined centre of rotation (see methods above) and the farthest point) and/or the size of the grid(s) may be automatically limited to a user-defined pixel maximum. For long-range scans, each pixel corresponds to a larger distance and for short-range scans each pixel represents a shorter distance. The total image size can then never exceed some maximum value. For example, if the maximum image size has been defined as 1000 pixels, and the largest scan range is 50 metres, each pixel in the plan view images represents a square region of (2 × 50) / 1000 = 0.1 metres on each side in a plane (e.g. an (x,y) plane). The square pixels may have sides of 0.5 m.
When each of the two (filtered or unfiltered) point clouds is projected onto a (horizontal) plane to obtain the two 2D datasets, effectively two plan view images are obtained (as a note, this is similar to the method of Figure 4, in which the projection was performed for one scan; either the plane onto which the or each dataset is projected, or the projected dataset itself, according to Figure 5 may comprise the same plane/projected dataset as that described above with reference to Figure 4. Put another way, the "first plane" and "first dataset" described above with reference to Figure 4 may comprise the "first plane" and "first dataset" as described above with reference to Figure 5). For the projected sets and grids, the pixel intensities represent the number of points classified as being on a vertical surface, within the plane (e.g. (x,y)) region corresponding to the pixel. That is, darker pixels may have fewer or no vertical surfaces above them, while brighter pixels may contain a higher number of points on vertical surfaces above them; thus, each grid of cells may constitute a point density plan view image. For any filtering (at 904), the pixel intensity values may be scaled according to a logarithmic scale (e.g. log10) to reduce pixel intensity variations due to changes in point density (e.g. any walls far from the scanner which have fewer points on them should be as visible as any walls that are nearby).
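By way of illustration, a plan-view point-density image with logarithmic intensity scaling might be built as follows. The function name, parameter names and defaults are illustrative assumptions, and any surface-normal or other filtering of the input points is assumed to have been applied separately, if at all:

```python
import numpy as np

def plan_view_image(points, pixel_size, image_px=1000):
    """Point-density plan view: count projected points per pixel, then apply
    a log10 scaling so sparsely sampled distant walls remain visible."""
    xy = points[:, :2]
    cols = np.clip(((xy[:, 0] - xy[:, 0].min()) / pixel_size).astype(int), 0, image_px - 1)
    rows = np.clip(((xy[:, 1] - xy[:, 1].min()) / pixel_size).astype(int), 0, image_px - 1)
    counts = np.zeros((image_px, image_px))
    np.add.at(counts, (rows, cols), 1)       # accumulate point counts per pixel
    return np.log10(1.0 + counts)            # log-scaled pixel intensities
```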
In examples where one or more datasets have been filtered (e.g. to reduce noise or improve the signal-to-noise ratio), e.g. by the method of Figure 14 discussed below, this may, in turn, improve the signal-to-noise ratio of the 2D plan view image and create clear features or patterns on the horizontal plane for accurate alignment. Any such filtering is performed when projecting datasets onto a plane or onto an axis (e.g. perpendicular to the plane). The surface normal filtering may effectively remove floor and/or ceiling and/or wall points, resulting in a large change in point density between vertical features and non-vertical features. In this way the filtered dataset creates features in the 2D projected plan view image which can be used for alignment, even in examples where the "prior" 3D point cloud (i.e. prior to its collapse to 2D) does not have distinct or reproducibly found features or keypoints.
As a result of the method of Figure 5 (optionally performed in conjunction with the method of Figure 9) overlapping plan-view images (projected sets on a plane) may have been created and a transformation for best overlap (or best degree of match) is calculated. This transformation comprises a rotation (by the first angle) and a translation (by the first amount). The rotation is about an axis extending through the plane in a direction perpendicular to the plane (e.g. the z direction if the plane is the (x,y) plane).
Referring back to Figure 9, the method comprises, to determine the first angle and first translation amount, at 906, rotating the first 2D dataset about the axis of rotation by a predetermined amount and, at 908, performing a spatial correlation in the first 2D plane to determine a degree of offset between the rotated first 2D dataset and the second 2D dataset. As indicated by the looping arrow, steps 906 and 908 may be repeated, or performed multiple times, and at 910 the method comprises determining, based on the rotation(s) and spatial correlation(s), depending on the number of repetitions of steps 906-908, an angle and a translational amount for which a degree of match between the first 2D dataset, rotated by the angle and translated by the translational amount, and the second 2D dataset is the largest. That angle and translational amount may be recorded as the first angle and first translational amount. In one example, step 906 is performed iteratively and step 908 is performed after each iteration (or incremental rotation). In some examples, each incremental rotation is by 1 degree. In some examples, steps 906 and 908 are repeated until the first dataset has been rotated 360 degrees to its starting position. Therefore, in one example, steps 906 and 908 are performed 360 times, with the amount of rotation in step 906 being 1 degree each time but, of course, the amount of rotation could be by any other amount.
The spatial correlation (step 908) may comprise a phase correlation and may be calculated between the rotated scan (2D scan, e.g. in plan view) and the non-rotated scan (2D scan, e.g. in plan view). The result that gives the highest correlation value may be chosen. Performing the spatial correlation (at 908) may comprise performing a 2D Fourier transform on both point clouds (the rotated and non-rotated 2D datasets), then multiplying those Fourier-transformed point clouds, then inverse 2D Fourier transforming the point clouds to obtain the spatial correlation. The position, or coordinates, in the plane (e.g. (x,y) positions if the plane is considered to be an (x,y) plane) of the local maxima may be determined and recorded in some examples. In some examples, the magnitudes of the local maxima themselves may be determined and recorded. The ratio of the magnitude of the local maxima to the average level of the correlation may be determined in some examples and output to a user. The maximum correlation may also be output to a user. The ratio of the magnitude of the local maximum to the average level of the correlation may also be output to a user.
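One common way to implement such a spatial correlation is a phase correlation; the sketch below adds the usual conjugation and normalisation of the cross-power spectrum, which the passage above leaves implicit, and the function name and return values are illustrative:

```python
import numpy as np

def phase_correlation(img_a, img_b):
    """2D FFT both plan-view images, form the normalised cross-power
    spectrum, inverse FFT, and return the peak location (the translational
    offset) and its magnitude (a degree of match)."""
    fa, fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
    cross = fa * np.conj(fb)
    cross /= np.maximum(np.abs(cross), 1e-12)    # keep phase information only
    corr = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    return peak, corr[peak]
```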
The rotation and 2D spatial correlation and local maxima positions and magnitudes calculation may all be repeated after each iteration until the rotated point cloud has performed a 360-degree rotation.
As stated above, in one example each iterative rotation is by 1 degree, but it could be another amount, e.g. 2 or 3 degrees. In one example, the rotation angles are not uniform. In one example the iterative rotation begins with a large angular step which then decreases. For example, the rotations may start at a large step size (e.g. 3 degrees), increasing the sampling density around a known or newly identified "best" angle down to an angular step size (in one example, equal to arctangent (2 / image width)) to achieve an accuracy (e.g. of 1-pixel movement at the edge of the image). So, for example, a 1000 × 1000 pixel image may have a rotation accuracy of arctangent(1/500) ≈ 0.1 degrees. The translation accuracy may correspond to 1 pixel, which represents some distance in metres, depending on the scan range.
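A sketch of such a coarse-to-fine angle search, reusing the phase_correlation() sketch above as the degree-of-match measure, might look as follows. For simplicity it rotates about the image centre rather than about the determined centre of rotation, and the function name, parameter names and defaults are illustrative:

```python
import numpy as np
from scipy.ndimage import rotate

def best_rotation(img_moving, img_fixed, image_width=1000, coarse_step=3.0):
    """Coarse-to-fine angle search: sweep 360 degrees at a coarse step, then
    refine around the best angle down to roughly arctan(2 / image width)."""
    fine_step = np.degrees(np.arctan(2.0 / image_width))   # ~0.1 degrees for a 1000 px image
    best_angle, best_score, best_offset = 0.0, -np.inf, (0, 0)
    angles = np.arange(0.0, 360.0, coarse_step)
    for _ in range(2):                                      # one coarse pass, one fine pass
        for angle in angles:
            rotated = rotate(img_moving, angle, reshape=False, order=1)
            offset, score = phase_correlation(rotated, img_fixed)
            if score > best_score:
                best_angle, best_score, best_offset = angle, score, offset
        angles = np.arange(best_angle - coarse_step, best_angle + coarse_step, fine_step)
    return best_angle, best_offset, best_score
```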
Once the first angle and translational amounts are calculated, at steps 516 and 518 amounts proportional to these amounts may be calculated and the dataset itself may be rotated and translated by these proportional amounts. These steps may be utilised in examples where the angle and translational amount are in “image units”, and so need to be converted to units compatible with the point clouds. So the method 500 may comprise converting the image angle
and translation result to units that are compatible with the point clouds. For example, if the pair of plan-view images has a transformation result of 5 degrees rotation around the perpendicular (e.g. z-axis) and translation of 10 and -15 pixels in the two directions of the plane (e.g. x and y directions), with the metres per pixel calculated as 0.05 (see the plan-view image size calculation above), then the corresponding point cloud transformation would be 5 degrees rotation around z and (0.05 * 10) and (0.05 * -15) metres translations in the x and y directions.
Regarding step 910 (finding greatest, or best, match), out of all the recorded local maximum correlation peaks (for the spatial correlation), the largest one may be selected and then the rotation corresponding to that local maximum correlation may be stored. The position/coordinates in the plane (e.g. x and y positions) of that local maximum correlation may then be stored. In some examples, this may be repeated for several of the local maxima correlations in descending order of magnitude and it may then be determined which is best by applying them to one point cloud and calculating a degree of match such as the point-to-plane Iterative Closest Point (ICP) Root Mean Square (RMS) error. The local maximum giving the highest degree of match or lowest point-to-plane RMS error may then be selected. The ratio of the magnitude of the local maxima to the average level of the correlation may be calculated.
Referring back to Figure 5, the first angle and translational amounts may be output or stored. Amounts proportional to these (e.g. in point cloud units) may be output or stored. A rotated and translated first 3D dataset (e.g. rotated and translated by the determined amounts) may be determined and/or output and/or stored. Therefore, in these latter examples, the outcome of the Figure 5 process is a 3D point cloud, rotated and translated in a plane, brought into alignment with another 3D point cloud in the plane.
Figure 6 is a flowchart illustrating a method which may be performed by any of the apparatuses described with respect to Figures 1-3. Figure 6 is a method of aligning (e.g. translating) two datasets in an axis. When used in conjunction with the method of Figure 5, the axis may be perpendicular (e.g. orthogonal) to the plane onto which the datasets are projected and therefore, used together, in any order, the combination of Figures 5 and 6 may result in a dataset rotated and translated in a plane and translated in an axis perpendicular to the plane (e.g. rotated and aligned in three axes). The method of Figure 4 may be performed to determine the centre of rotation (about which the point cloud is rotated in the Figure 5 method) and, therefore, in this sense, the methods of Figures 4-6 (optionally performed with any of the methods of Figures 7-14, alone or in combination) are fully compatible and may be performed in any order.
The Figure 6 method comprises, at 602, obtaining first and second 3D datasets. The first 3D dataset is of a first space at a first time, and is a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus. The second 3D dataset is of a second space at a second time, the second space and the first space at least partially overlapping, and is a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus. At 604, the method comprises storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system. At 606, the method comprises determining a first translational amount, which is an amount in a first direction in the common coordinate system by which the first point cloud, moved by the first translational amount in the first direction, aligns with the second point cloud in the first direction. The determining, at 606, comprises steps 616-638 as will now be described.
At 616, the method comprises recording, for each of the first point cloud and the second point cloud, the projection of each point in the respective point clouds onto a first 2D plane in the common coordinate system in the first direction, the first 2D plane having a normal vector parallel to the first direction, to form respective first and second 2D datasets, each point of the respective first and second 2D datasets being a projection of a corresponding point in the respective first point cloud and the second point cloud onto the first plane. At 618, the method comprises defining, for each of the first and second 2D datasets, a grid of cells contained in the first plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points of the respective first and second datasets. At 620 the method comprises determining an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the first and second 2D datasets.
The third example method may then comprise blocks 622-638 as will now be described or, alternatively, may proceed to the method of Figure 11 or the method of Figure 12.
At 622 it is determined whether the overlap subset contains no such cells, i.e. whether the overlap region is empty. If it is determined that the overlap region is empty (yes), then at step 624 the method comprises determining 0 for the first translational amount. Otherwise (no), the method comprises either steps 626-630 or steps 632-638.
Step 632 comprises, for each cell in the overlap region, determining the point in the first point cloud having the largest value in the first axis and the point in the first point cloud having the lowest value in the first axis, and recording these largest and lowest values, these values defining a first range for the first point cloud, for each cell, and determining the point in the second point cloud having the largest value in the first axis and the point in the second point cloud having the lowest value in the first axis, and recording these largest and lowest values, these values defining a second range for the second point cloud, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the first axis for the points of the first point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the first axis for the points of the second point cloud contained in the cell. At 634, the method comprises determining, for each cell, a midpoint for the first range and a midpoint for the second range. At 636, the method comprises determining, for each cell, a translation amount in the first direction to bring the midpoint of the first range into alignment with the midpoint for the second range. At 638, the method comprises determining the second translation amount to be the mean value of the translation amounts.
Alternatively, at step 626, the method comprises determining the point in the first point cloud having the highest value in the first axis or having the lowest value in the first axis and determining the point in the second point cloud having the highest value in the first axis or having the lowest value in the first axis. At step 628 the method comprises determining, for each cell, a translation amount in the first direction to bring the highest value for the first point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the first point cloud into alignment with the lowest value for the second point cloud. At step 630 the method comprises determining the first translation amount to be the mean value of the translation amounts.
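As a hedged illustration of the midpoint-based branch (steps 632-638) only, and not a definitive implementation, the per-cell ranges and the mean midpoint-to-midpoint shift could be sketched as follows, reusing the grid assumptions of the earlier sketch (NumPy, 0.25 m cells, alignment along the z axis).

```python
import numpy as np

def midpoint_translation(cloud_a, cloud_b, cells, cell_size=0.25):
    """Mean shift along the third axis aligning per-cell range midpoints of A with B.

    cells: the overlap cells, e.g. as returned by the overlap_cells sketch above.
    """
    def ranges(cloud):
        # Per-cell (lowest, highest) value along the third axis.
        ij = np.floor(cloud[:, :2] / cell_size).astype(int)
        out = {}
        for key, z in zip(map(tuple, ij), cloud[:, 2]):
            lo, hi = out.get(key, (z, z))
            out[key] = (min(lo, z), max(hi, z))
        return out

    ra, rb = ranges(cloud_a), ranges(cloud_b)
    # Per-cell translation that moves the midpoint of A's range onto B's midpoint.
    shifts = [((rb[c][0] + rb[c][1]) - (ra[c][0] + ra[c][1])) / 2.0 for c in cells]
    return float(np.mean(shifts)) if shifts else 0.0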
Whichever method (or example) is used to calculate the translational amount in the axis, the determined amount may be stored or output (at 610) and/or a translated first 3D dataset, being the first point cloud translated, in the first direction, by an amount proportional to the second translational amount may be determined (at step 612) and this may be stored or output (at 614).
The Figure 6 method may be used on two point clouds correctly registered in a plane (e.g. according to the Figure 5 method) and the Figure 6 method determines a translational amount in the perpendicular (orthogonal) direction to bring them into alignment in that direction, and in three dimensions.
The method relates to determining whether and, if so, in which region of a 2D plane (which may be the same plane onto which the clouds were projected in the Figure 5 method and which may be the same plane onto which the first cloud was projected in Figure 4 to determine its centre of rotation), the projections of the two scans to the plane have some overlap. This can be done by laying out a grid of pixels on the plane and projecting each point cloud onto the plane into the grid pixels while retaining their label as to which point cloud they came from. Then the method may search through the pixels to determine those which have points from both point clouds. These pixels may define the overlap region. If there is no overlap, a translation of 0 may be determined.
Within the 2D overlap region, separate 2D grids may be defined for each of the overlapping scans in the plane, having a cell size (e.g., 0.25 x 0.25 metres). For each of the scans, the highest and lowest points within each grid cell (measured in the axis along which the method is determining the translation, e.g. perpendicular to the plane; a point is said to be inside a grid cell if its plane projection is inside the 2D cell) may be determined and recorded in the respective grids. The grid cells can now be thought of as square columns, or pillars, along the direction of the axis, where the bottom of the column is at the local “floor” level and the top is at the “ceiling” level of the scan, at the plane locations of the columns.
The grids for each of the overlapping scans have an equal number of “columns” (or ranges, the terms may be regarded as synonymous) (although some may be empty, or 0, if no point cloud data exists within that cell), and the columns have the same locations in the plane. Outlier columns may be removed by iterating over each pair of columns (of matching grid indices in each axis defining the plane) in the overlapping scans (where the column for each scan is at the same location in the plane), comparing their heights, and removing those pairs of columns whose heights differ markedly (the predetermined threshold being, e.g., if column height difference > grid cell size). For each remaining column, the translation along the axis that would align the centre of one scan’s column with the centre of the overlapping scan’s column may be determined, and these translations may be recorded in a list of translation candidates. A further outlier rejection may be performed
by creating a histogram of the available translation candidates and then rejecting some translations if they are higher than a predetermined threshold. This may be useful for mobile scans where any single scan may contain multiple floors above each other. The final translation value may then be calculated as the mean value of the remaining, un-discarded, translation candidates.
In these examples, the width of this histogram distribution about the final translation value may be determined and output.
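The outlier rejection described above might be sketched as follows; this is illustrative only. The height-difference test uses the grid cell size, as stated above, while the rejection threshold on the candidate translations (max_shift) is an assumed parameter rather than a value taken from the description, and the spread of the kept candidates could also be reported as the width of the distribution.

```python
import numpy as np

def filtered_mean_translation(range_pairs, cell_size=0.25, max_shift=2.0):
    """range_pairs: list of ((lo_a, hi_a), (lo_b, hi_b)), one pair per overlap column."""
    candidates = []
    for (lo_a, hi_a), (lo_b, hi_b) in range_pairs:
        # Outlier columns: pairs whose heights differ by more than the cell size.
        if abs((hi_a - lo_a) - (hi_b - lo_b)) > cell_size:
            continue
        # Translation aligning the centre of column A with the centre of column B.
        candidates.append(((lo_b + hi_b) - (lo_a + hi_a)) / 2.0)
    # Further rejection: discard candidates above a predetermined threshold,
    # e.g. to avoid jumping by a whole floor in a multi-storey mobile scan.
    kept = [c for c in candidates if abs(c) <= max_shift]
    # np.std(kept) could additionally be reported as the width of the distribution.
    return float(np.mean(kept)) if kept else 0.0
```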
Regarding steps 612 and 614, the method may comprise translating one point cloud into alignment with the other point cloud in the direction in which the translation has been determined.
The method of Figure 6 may be advantageous in open-air environments. In these examples, the highest voxels may correspond to, for example, the top of trees or grass, or clouds and the lowest voxels may correspond to, for example, the soil, earth, dirt level. In these examples, the tops of the trees or grass or clouds may be swaying in the wind and so may not form a reproducibly reliable height datum level for vertical alignment.
The method of Figure 6 may also be used for aerial laser scanning from helicopters and drones, e.g. of rain forests or over areas where people lived in pre-history but which have now been covered in vegetation. As the area is scanned, some laser beams find their way down to the forest or vegetation floor and the lowest point of the point cloud in each vertical column of voxels can be found and joined to form a map of the topology of the forest or vegetation floor, and the methods can also be used to determine previously unknown earthworks and dwellings which cannot otherwise be recognized. See, for example: Chase, Arlen F., Diane Z. Chase, John F. Weishampel, Jason B. Drake, Ramesh L. Shrestha, K. Clint Slatton, Jaime J. Awe, and William E. Carter. "Airborne LiDAR, archaeology, and the ancient Maya landscape at Caracol, Belize." Journal of Archaeological Science 38, no. 2 (2011): 387-398; Chase, Arlen F., Diane Z. Chase, Christopher T. Fisher, Stephen J. Leisz, and John F. Weishampel. "Geospatial revolution and remote sensing LiDAR in Mesoamerican archaeology." Proceedings of the National Academy of Sciences 109, no. 32 (2012): 12916-12921; Rowlands, Aled, and Apostolos Sarris. "Detection of exposed and subsurface archaeological remains using multi-sensor remote sensing." Journal of Archaeological Science 34, no. 5 (2007): 795-803.
For an aerial laser scanner or a terrestrial tripod-mounted scanner scanning underwater, in a mirror-flat pond, the pond surface may form the lower or upper level of the scan, respectively. In any case, as per the flowchart of Figure 6, the method is adaptable in that it may use ranges and/or midpoints, or only the uppermost points or only the lowermost points, in each column (or range) of voxels for translational alignment in the perpendicular axis, and therefore the method is applicable to a wide range of scans representing a wide range of physical spaces.
The method of Figure 6 is not the only method according to which a translational amount in an axis may be determined. Further methods may be used according to Figures 10-12, which will now be described. Each of these methods may be performed by any of the apparatuses described with respect to Figures 1-3 and may be used in conjunction with the method of Figure 4 or 5. For example, any of the methods of Figures 10-12 may be used on a pair of datasets having been aligned in a plane using the method of Figure 5.
Referring to Figure 10, at step 1002 the method comprises, for each grid cell in the overlap region, determining a measure of the distance between the first and second ranges. At step 1004 the method comprises discarding those first and second ranges for which the distance between them is above a predetermined threshold, wherein the determination of the translation amounts (as per Figure 6, 11 or 12) is performed for the remaining, un-discarded, ranges. Essentially, Figure 10 represents a filter being applied to the grid cells.
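A brief, illustrative sketch of such a filter is given below; the distance measure between the two ranges (here, the gap between their midpoints) and the threshold value are assumptions for this example, since the description leaves the measure open.

```python
def filter_range_pairs(range_pairs, max_distance=0.25):
    """Discard per-cell range pairs whose ranges lie too far apart (Figure 10-style filter).

    range_pairs: list of ((lo_a, hi_a), (lo_b, hi_b)), one pair per overlap cell.
    """
    kept = []
    for (lo_a, hi_a), (lo_b, hi_b) in range_pairs:
        # Distance measure: gap between the midpoints of the two ranges.
        midpoint_gap = abs((lo_a + hi_a) / 2.0 - (lo_b + hi_b) / 2.0)
        if midpoint_gap <= max_distance:
            kept.append(((lo_a, hi_a), (lo_b, hi_b)))
    return kept
```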
Referring to Figure 11, the method continues following step 620 of the method of Figure 6, at which the overlap region is defined. At 1102 the method comprises, for each grid cell in the overlap region, recording the projection of each point in the given point cloud whose projection is contained in the grid cell onto the third axis to form a first histogram in the third axis and recording the projection of each point in the second point cloud whose projection is contained in the grid cell onto the third axis to form a second histogram in the third axis. At 1104, the method comprises determining, for each cell, a translation amount to bring the first histogram into alignment with the second histogram in the third direction. At 1106, the method comprises determining the second translation amount to be the most common translation amount. As per steps 610-614, the translational amount may be stored/output, and/or an aligned 3D dataset (being the given point cloud translated in the third direction by an amount proportional to the determined translational amount) may be determined and/or stored or output.
The projection of a point or pixel etc. onto the perpendicular axis may be regarded as a virtual rod. With the method of Figure 6 as described above, one such rod (extending in the direction of the perpendicular axis) may extend for each pixel in the 2D/plan view image. According to the Figure 11 method, rather than creating one vertical rod for all the points in the overlap region, there is one vertical rod for each pixel in the plan view image overlap region, where only the points above each pixel are projected onto its vertical rod and form a vertical histogram. The bins having the most points may be identified for each pixel, or the bins which have a local maximum in the number of points may be kept. This may be done for each scan, and the histogram for each pixel for each scan is slid over the other to find the best degree of match in the perpendicular axis. The degree of match can be the correlation of the two histograms, and the highest correlation chosen. The translations across all the rods extending in the perpendicular axis above each pixel in the overlap region may be collected, and the most common translation and direction of translation may be found. The most common translation and direction may be chosen to be applied to the whole point cloud to bring it into alignment in the perpendicular direction, or a histogram can be plotted of the number of rods that give a particular z translation versus the z translation and all the local peaks in this histogram can be chosen to investigate further, as explained in subsequent paragraphs. These methods can be advantageous for scans of spaces where the floor level changes, such as at steps, landings or continuous slopes, or for ceilings where the height to the ceiling changes, such as with partially suspended false ceilings, air ducts, pipes next to the ceiling, or electrical conduits hanging from the ceiling. By using multiple vertical rods in pixels and only projecting the points in the point cloud above each pixel onto the rod corresponding to that pixel, the signal-to-noise ratio can be improved. Alternatively, the highest and lowest point in the point cloud above each pixel can be determined and used for alignment. The method of doing a 1D correlation on each vertical rod in each pixel can be advantageous in example spaces containing, for example, a spiral ramp in a car park, or in a building with multiple floors equally spaced in the vertical direction, or in a lift shaft. The vertical correlation of the histograms may take into account other objects and features within each floor level for better translation. The averaging over several rods can overcome any problems which might occur for any one rod (depending on the example).
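As an illustrative sketch only (not the original implementation), the per-pixel "vertical rod" matching could be expressed as follows; the bin size, histogram extent and function names are assumptions, and the per-pixel height arrays are assumed to have been gathered beforehand.

```python
import numpy as np
from collections import Counter

def per_pixel_shift(z_a, z_b, bin_size=0.05, z_min=-5.0, z_max=15.0):
    """Best shift along the perpendicular axis (in metres) aligning the height
    histogram of scan A with that of scan B, for the points above one pixel."""
    edges = np.arange(z_min, z_max + bin_size, bin_size)
    h_a, _ = np.histogram(z_a, bins=edges)
    h_b, _ = np.histogram(z_b, bins=edges)
    corr = np.correlate(h_b, h_a, mode="full")   # slide one histogram over the other
    lag = int(np.argmax(corr)) - (len(h_a) - 1)  # lag (in bins) of the best match
    return lag * bin_size

def most_common_shift(per_pixel_heights):
    """per_pixel_heights: iterable of (z_a, z_b) height arrays, one pair per overlap pixel."""
    shifts = [per_pixel_shift(a, b) for a, b in per_pixel_heights]
    return Counter(shifts).most_common(1)[0][0]  # most common translation across the rods
```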
If there are several different translations in the perpendicular direction having similar magnitudes of correlation or degree of match, and these are almost equally common amongst the rods above each pixel, then those translations may be applied in turn to the whole point cloud and the RMS point-to-plane error of ICP calculated, with the translation giving the lowest ICP error being kept. This may be advantageous, e.g. for a building with many floors above one another or a spiral ramp or a lift shaft, where many vertical translations by one floor level might give a similar level of match, due to the periodicity in the vertical direction.
The number of pixels in the plane image which give the same perpendicular direction translation for alignment may then be output to the user. If there are two such translations giving similar magnitudes of correlation and which are almost equally common amongst the rods above each pixel, the ratio of the number of pixels having the best translation to the number of pixels having the second best translation may be calculated and output to the user.
Referring to Figure 12, the method comprises, at 1202, for each grid cell in the overlap region, determining a surface normal vector for each point in the given point cloud whose projection is contained in the grid cell to form a first set of surface normal vectors and determining a surface normal vector for each point in the second point cloud whose projection is contained in the grid cell to form a second set of surface normal vectors. At 1208 the method comprises recording a projection onto a plane parallel to the first 2D plane of the first and second sets of surface normal vectors to form respective first and second projected sets and recording a projection onto the third axis of the first and second projected sets to form first and second projected histograms. Prior to step 1208, the surface normals may be filtered according to the method of Figure 14. At 1210, the method comprises determining, for each cell, a translation amount to bring the first projected histogram into alignment with the second projected histogram in the third direction to form, for each cell, a degree of match. Step 1210 may comprise dividing the axis into bins, each bin having a sub-histogram dividing the points into azimuthal and elevational bins according to the rotational angle of the surface normal relative to the axis. Then both the vertical histogram and the two angle sub-histograms may be matched. Alternatively, each bin may comprise sub-histograms, with the points in each bin being divided according to the projected component magnitudes of the surface normal vectors, and both the histograms and sub-histograms may be matched. At 1212 the method comprises determining the second translation amount to be the translation amount giving the maximum degree of match. As per steps 610-614, the translational amount may be stored/output, and/or an aligned 3D dataset (being the given point cloud translated in the third direction by an amount proportional to the determined translational amount) may be determined and/or stored or output, and/or merged with the other dataset and output and/or stored as a merged point cloud.
Steps 1204 and 1206 are dotted and should be regarded as optional to the method of Figure 12. They comprise, at step 1204, determining the absolute value of each component of each surface normal vector in the first and second sets of surface normal vectors along the third axis, and, at step 1206, discarding those surface normal vectors from their respective sets whose absolute values of their components along the third axis are above or below a predetermined threshold. Therefore, the steps 1204-1206 represent a filter being applied to the surface normal vectors before the translational amount is determined.
The surface normals are projected onto a plane and therefore keep their component magnitudes in the axes spanning the plane. Those points whose surface normal vectors have a horizontal component magnitude less than a predetermined threshold (e.g. 0.1) may be discarded for the translation calculation in the axis (the points are not permanently discarded, however; later, all points in the point cloud will be translated according to the determined amount). This has the effect of discarding points whose surface normals are “vertical” (e.g. in the direction of the perpendicular axis) while keeping those points whose surface normals are either exactly or almost “horizontal” (e.g. parallel to the plane). Then, the points in the parts of each point cloud above the overlapping region are projected onto rods extending in the perpendicular direction. Then a histogram is formed of point density bins along the rods. Then the histogram for one point cloud is moved relative to that for the other point cloud until they come into alignment, and the best alignments are found by measuring the degrees of match. The degree of match can be, for example, a mathematical correlation and may be determined, e.g. by 1D Fourier transforming each histogram along the perpendicular direction, multiplying them, and inverse 1D Fourier transforming the product to get the correlation function. Then the maximum value of this function is found. The maximum correlation may then be output. The ratio of the magnitude of the local maximum to the average level of the correlation may also be output.
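The Fourier-based correlation mentioned above might be sketched as follows; this is an illustration only, in which one spectrum is conjugated before the multiplication so that the inverse transform yields a correlation rather than a convolution, and the zero-padding length and function name are choices made for this example.

```python
import numpy as np

def fft_degree_of_match(h_a, h_b):
    """Cross-correlate two histograms via the FFT and return
    (best shift in bins, peak correlation, peak-to-mean ratio)."""
    n = len(h_a) + len(h_b) - 1                      # zero-pad to avoid wrap-around
    fa = np.fft.rfft(h_a, n)
    fb = np.fft.rfft(h_b, n)
    corr = np.fft.irfft(fb * np.conj(fa), n)         # correlation of h_a against h_b
    best = int(np.argmax(corr))
    shift = best if best <= n // 2 else best - n     # indices past the middle are negative lags
    peak = float(corr[best])
    ratio = peak / (float(np.mean(corr)) + 1e-12)    # peak relative to the average level
    return shift, peak, ratio
```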
Figure 14 is a flowchart of a method which may be performed by any of the apparatuses described with respect to Figures 1-3 and which may be performed in conjunction with any of the methods described herein. The method is a method 1402 of filtering first and second datasets to be aligned by projecting the datasets onto a plane or onto a line (or rod). At 1404 the method comprises determining, for each point in the respective first and second point clouds, a surface normal vector. At 1406 the method comprises recording the projection of each surface normal vector onto an axis (such as the third axis) or onto a plane (such as a plane whose normal is parallel to the third axis) to resolve the component of each surface normal
vector along an axis perpendicular to the plane onto which the datasets have been projected, or are to be projected (e.g., in examples where the datasets have been projected onto a plane spanned by first and second axes, the third axis). At 1408 the method comprises determining the absolute value of each component of each surface normal vector along the perpendicular axis or onto the plane, respectively. At 1410 the method comprises discarding (albeit temporarily for the purpose of the calculation; any discarded points remain in the point clouds for later alignment/outputting) those points from their respective point clouds that have surface normal vectors whose absolute values of their components along the perpendicular axis are below a predetermined threshold. A filter processing, such as the one of Figure 14, may result in an improved signal-to-noise ratio. The outcome of the Figure 14 method may be that only those points on walls or vertical surfaces are kept in the filtered dataset when the surface normals are projected onto the plane, and that points at floor and ceiling levels are kept when the surface normals are projected onto a vertical rod (the perpendicular axis). To calculate the surface normals, the method of Figure 14 may comprise determining whether the surface normals point away from the determined centre of the respective point cloud and, if so, reversing their directions by 180 degrees. If a space is convex in shape a convex hull method may be used. The filter of Figure 14 may, as a result, keep those points having surface normals within a certain angle (e.g. of the horizontal or vertical) by resolving the component on the plane or perpendicular axis and, if that absolute value is below a threshold (e.g. 0.1 in some examples), the point may be discarded as it may be assumed to be on a horizontal surface.
Filtering the dataset(s) in this way may keep points on walls of the space (e.g. tunnel walls) or on almost vertical surfaces (such as the trunks of trees). If, for example, a tunnel wall has been formed by chipping away at the surface, it will have facets with surface normals in a wide range of directions, or if the trunk of a tree has deeply creviced bark, it may also give surfaces with normals in a wide range of directions. So, this filtering process may lose some of the wall or vertical surface points, but it will keep most of them. If the filtering process causes too many points to be lost from the walls or vertical surfaces for good alignment, then this step may be preceded by applying another filter to smooth the surfaces of both point clouds first, taking care not to move the surfaces, otherwise the later alignment will not be accurate. This can be automated by calculating the signal-to-noise ratio at the start and at the end (as described at the start of this section): if the signal processing gain in signal-to-noise ratio is above a threshold, the result may be kept and, if not, the method may return to apply another filter. The decision of whether to filter or not, which method to use, and whether to repeat or not is implementation specific.
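A minimal sketch of the Figure 14 filter, assuming unit surface normals have already been estimated for each point (normal estimation itself, e.g. by local plane fitting, is outside the sketch), could look like the following; the 0.1 threshold follows the example value given above, while the keyword names are illustrative.

```python
import numpy as np

def filter_by_normal_component(points, normals, threshold=0.1, keep="vertical_surfaces"):
    """Filter points by a component of their (unit) surface normals.

    keep="vertical_surfaces": keep walls/tree trunks (large in-plane normal component).
    keep="horizontal_surfaces": keep floors/ceilings (large component along the axis).
    The discarded points remain in the full cloud for the later alignment step.
    """
    if keep == "vertical_surfaces":
        component = np.linalg.norm(normals[:, :2], axis=1)  # magnitude projected onto the plane
    else:
        component = np.abs(normals[:, 2])                   # magnitude along the perpendicular axis
    return points[component >= threshold]
```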
Figure 15 depicts, schematically, a non-transitory and machine-readable medium 1500 in association with a processor 1502. The medium 1500 comprises a set of executable instructions 1504 stored thereon which, when executed by the processor 1502, cause the processor 1502 to perform the method according to any of Figures 4-14 (e.g. any one or more of the steps thereof).
Figure 16 is a block diagram of a computing device, such as a server, which may be used to implement a method of any of the examples as described above (e.g. any one or more of the steps of any method alone or in combination). In particular, the computing device of Figure 16 may be the hardware configuration of one or more from among the 3D image acquisition unit 12-32, the storage unit 14-34, and/or the one or more processors 36. The computing device of Figure 16 may be used to implement the method of any of Figures 4-14.
The computing device comprises a device processing unit 1601 (such as a CPU if the device is a computer; the device may alternatively comprise a smart device such as a phone or tablet, or the processing unit may be embedded inside an imaging apparatus, such as a scanner), a memory, such as Random Access Memory (RAM) 1603, and storage, such as a hard disk, 1604. Optionally, the computing device also includes a network interface 1607 for communication with other such computing devices of examples herein. An example may be composed of a network of such computing devices operating as a cloud computing cluster. Optionally, the computing device also includes Read Only Memory 1602, one or more input mechanisms such as keyboard and mouse 1606, and a display unit such as one or more monitors 1605. The components are connectable to one another via a bus 1600.
The CPU 1601 is configured to control the computing device and execute processing operations. The RAM 1603 stores data being read and written by the CPU 1601. In addition, there may be a GPU. The storage unit 1604 may be, for example, a non-volatile storage unit, and is configured to store data.
The optional display unit 1605 displays a representation of data stored by the computing device and displays a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device. The input mechanisms 1606 enable a user to input data and instructions to the computing device. The display could be a 3D display using stereo glasses, or a display in a helmet such as a HoloLens, or a holographic display, or an autostereoscopic display, none of which need special glasses.
The network interface (network I/F) 1607 is connected to a network, such as the Internet, and is connectable to other such computing devices via the network. The network I/F 1607 controls data input/output from/to other apparatus via the network. The network I/F may provide a connection to a computing device from which the 3D datasets were obtained, and may receive guidance or instructions defining elements of the processing (for example, selecting algorithms).
Other peripheral devices such as microphone, speakers, printer, power supply unit, fan, case, scanner, trackerball etc may be included in the computing device.
Any one of the 3D image alignment apparatus 10-30 may be embodied as functionality realised by a computing device such as that illustrated in Figure 16. The functionality of the 3D image alignment apparatus 10-30 may be realised by a single computing device or by a plurality of computing devices functioning cooperatively or by a cloud processing network. Methods embodying the present invention may be carried out on, or implemented by, a computing device such as that illustrated in Figure 16. One or more such computing devices may be used to execute a computer program of any of the first-third examples disclosed herein. Computing devices embodying or used for implementing examples need not have every component illustrated in Figure 16, and may be composed of a subset of those components.
The at least one processor may be programmed processor hardware, comprising processing instructions stored on a storage unit, a processor to execute the processing instructions, and a RAM to store information objects during the execution of the processing instructions. In either case the processor could be a CPU or a GPU or FPGA or an array of them.
Appendix
The teaching of any of the following documents may be used to image a space (e.g. to take readings of a space). The entire contents of the following documents are incorporated by reference:
Selviah, David R., and Epaminondas Stamos. "Similarity suppression algorithm for designing pattern discrimination filters." Asian Journal of Physics 11, no. 2 (2002): 367-389.
Stamos, Epaminondas, and David R. Selviah. "Feature enhancement and similarity suppression algorithm for noisy pattern recognition." In Optical Pattern Recognition IX, vol. 3386, pp. 182-189. International Society for Optics and Photonics, 1998.
Stamos, Epaminondas. "Algorithms for designing filters for optical pattern recognition." PhD diss., University College London (United Kingdom), 2001.
Li, Zhaowei, and David R. Selviah. "Comparison of Image Alignment Algorithms." In Proceedings Paper, London Communications Symposium, University College London. 2011. Ma, Chengqi, Bang Wu, Stefan Poslad, and David R. Selviah. "Wi-Fi RTT Ranging Performance Characterization and Positioning System Design." IEEE Transactions on Mobile Computing (2020).
Potorti, Francesco, Sangjoon Park, Antonino Crivello, Filippo Palumbo, Michele Girolami, Paolo Barsocchi, Soyeon Lee et al. "The IPIN 2019 Indoor Localisation Competition — Description and Results." IEEE Access 8 (2020): 206674-206718.
Ma, Chengqi, Chenyang Wan, Yuen Wun Chau, Soong Moon Kang, and David R. Selviah. "Subway station real-time indoor positioning system for cell phones." In 2017 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1-7. IEEE, 2017.
H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded Up Robust Features”, in Computer Vision- ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 404-417.
M. Calonder, V. Lepetit, M. Ozuysal, T. Trzcinski, C. Strecha, and P. Fua, “BRIEF: Computing a local binary descriptor very fast,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 7, pp. 1281-1298, Nov. 2012.
Y.-H. Choi, T.-K. Lee, and S.-Y. Oh, “A line feature based SLAM with low grade range sensors using geometric constraints and active exploration for mobile robot”, Autonomous Robots, vol. 24, no. 1, pp. 13-27, 2008. [Online] Available: https://doi.org/10.1007/s10514-007-9050-y Dong, Pengfei, Ruan, Xiaogang, Huang, Jing, Zhu, Xiaoqing, and Xiao, Yao, “A RGB-D SLAM algorithm combining ORB features and bow,” ACM International Conference Proceeding Series, vol. 118, no. 4, pp. 1-6, 2018, doi: 10.1145/3207677.3278061. Galvez-Lopez, Dorian and Tardos, J. D., “Bags of Binary Words for Fast Place Recognition in Image Sequences”, IEEE Transactions on Robotics, vol. 28, no. 5, pp. 1188-1197, 2012, doi: 10.1109/TRO.2012.2197158.
S. Hong, I. Kim, J. Pyo, and S. C. Yu, “A robust loop-closure method for visual SLAM in unstructured seafloor environments”, Autonomous Robots, vol. 40, no. 6, pp. 1095-1109, 2016, doi: 10.1007/s10514-015-9512-6
Hübner, Patrick, Kate Clintworth, Qingyi Liu, Martin Weinmann, and Sven Wursthorn. "Evaluation of HoloLens tracking and depth sensing for indoor mapping applications." Sensors 20, no. 4 (2020): 1021.
Koide, Kenji et al, “Voxelized GICP for Fast and Accurate 3D Point Cloud Registration”, 16th Intelligent Autonomous Systems Conference (IAS16), EasyChair Preprint, vol.5, no. 2703, pp. 1-13, 2020. [Online] Available: https://easychair.org/publications/preprint/ftvV Li, Shao peng, Zhang, Tao, Gao, Xiang, Wang, Duo, and Xian, Yong, “Semi-direct monocular visual and visual-inertial SLAM with loop closure detection”, Robotics and Autonomous Systems, vol. 112, no. 2, pp. 201-210, 2019, doi: 10.1016/j.robot.2018.11.009.
Li, Xianlong, and Chongyang Zhang. "Robust RGB-D Visual Odometry Based on the Line Intersection Structure Feature in Low-Textured Scenes." In 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), pp. 390-394. IEEE, 2018. Ma, Jiayi, Wang, Xinya, He, Yijia, Mei, Xiaoguang, and Zhao, Ji, “Line-Based Stereo SLAM by Junction Matching and Vanishing Point Alignment”, IEEE Access, vol. 7, no. 5, pp. 181800-181811, 2019, doi: 10.1109/ACCESS.2019.2960282.
Hsiao, Ming, Eric Westman, Guofeng Zhang, and Michael Kaess. "Keyframe-based dense planar SLAM." In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 5110-5117. IEEE, 2017.
R. Mur-Artal, J. Montiel, and J. Tardos, “ORB-SLAM: A Versatile and Accurate Monocular SLAM System”, IEEE transactions on robotics: a publication of the IEEE Robotics and Automation Society., vol. 31, no. 5, pp. 1147-1163, 2015.
R. Mur-Artal and J. D. Tardos, "ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras," in IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255-1262, Oct. 2017, doi: 10.1109/TRO.2017.2705103.
C. Park, S. Kim, P. Moghadam, J. Guo, S. Sridharan, and C. Fookes, “Robust photogeometric localization over time for map-centric loop closure,” IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1768-1775, 2019
Ren, Zhuli, Wang, Liguan, and Bi, Lin, “Robust GICP-based 3D LiDAR SLAM for underground mining environment”, Sensors (Switzerland), vol. 19, no. 6, pp. 122-155, 2019, doi: 10.3390/s19132915.
X. Shi, J. Peng, J. Li, P. Yan, and H. Gong, “The Iterative Closest Point Registration Algorithm Based on the Normal Distribution Transformation”, Procedia Computer Science, vol. 147, pp. 181-190, 2019, doi: 10.1016/j.procs.2019.01.219
Torroba, Ignacio, Sprague, Christopher Iliffe, Bore, Nils, and Folkesson, John, “PointNetKL: Deep Inference for GICP Covariance Estimation in Bathymetric SLAM”, IEEE Robotics and Automation Letters, vol. 5, no. 3, pp. 4078-4085, 2020, doi: 10.1109/LRA.2020.2988180.
V. Vijayan and P. Kp, “FLANN Based Matching with SIFT Descriptors for Drowsy Features Extraction,” Proceedings of the IEEE International Conference Image Information Processing, vol. 2019-November, pp. 600-605, 2019, doi: 10.1109/ICIIP47207.2019.8985924.
Von Gioi, Rafael Grompone, Jeremie Jakubowicz, Jean-Michel Morel, and Gregory Randall. "LSD: A fast line segment detector with a false detection control." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 4, pp. 722-732, 2008. R. Wang, Y. Wang, W. Wan and K. Di, "A Point-Line Feature based Visual SLAM Method in Dynamic Indoor Scene", 2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS), Wuhan, 2018, pp. 1-6, doi: 10.1109/UPINLBS.2018.8559749.
Y. Zhang, H. Zhang, Z. Xiong, and X. Sheng, “A visual SLAM System with Laser Assisted Optimization,” IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM, vol. 2019- July, pp. 187-192, 2019, doi: 10.1109/AIM.2019.8868664.
In any of the above aspects, the various features may be implemented in hardware, or as software modules running on one or more processors. Features of one aspect may be applied to any of the other aspects.
The invention also provides a computer program or a computer program product or a cloud-based service for carrying out any of the methods described herein, and a computer readable medium having stored thereon a program for carrying out any of the methods described herein. A computer program embodying the invention may be stored on a computer-readable medium, or it could, for example, be in the form of a signal such as a downloadable data signal provided from an Internet website, or it could be in any other form.
A computing device, such as a data storage server, may embody the present invention, and may be used to implement a method of an example of the invention. The computing device may comprise a processor and memory. The computing device may also include a network interface for communication with other computing devices, for example with other computing devices of invention examples.
The memory may include a computer readable medium, which may refer to a single medium or multiple media (e.g., a centralized or distributed database and/or associated caches and servers) configured to carry computer-executable instructions or have data structures stored thereon. Computer-executable instructions may include, for example, instructions and data accessible by and causing a general purpose computer, special purpose computer, or special purpose processing device (e.g., one or more processors) to perform one or more functions or operations. Thus, the term “computer-readable storage medium” may also include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methods of the present disclosure. The term “computer-readable storage medium” may accordingly be taken to include, but not be limited to, solid-state memories, optical media and magnetic media. By way of example, and not limitation, such computer-readable media may include non-transitory computer-readable storage media, including Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices).
The processor may be configured to control the computing device and execute processing operations, for example executing code stored in the memory to implement the various different functions of modules described here and in the claims. The memory may store data being read and written by the processor. As referred to herein, a processor may include one or more general-purpose processing devices such as a microprocessor, central processing unit, general processing unit, or a distributed cloud network of processors. The processor may include a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets or processors implementing a combination of instruction sets. The processor may also include one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. In one or more examples, a processor is configured to execute instructions for performing the operations and steps discussed herein.
The display unit may display a representation of data stored by the computing device and may also display a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device. The input mechanisms may enable a user to input data and instructions to the computing device.
The network interface (network I/F) may be connected to a network, such as the Internet, and may be connectable to other such computing devices via the network. The network I/F may control data input/output from/to other apparatus via the network. Other peripheral devices such as microphone, speakers, printer, power supply unit, fan, case, scanner, trackerball etc may be included in the computing device.
The examples of the present disclosure and the various features and advantageous details thereof are explained more fully with reference to the non-limiting examples that are described and/or illustrated in the drawings and detailed in the following description. It should be noted that the features illustrated in the drawings are not necessarily drawn to scale, and features of one example may be employed with other examples as the skilled artisan would recognize, even if not explicitly stated herein. Descriptions of well-known components and processing techniques may be omitted so as to not unnecessarily obscure the examples of the present disclosure. The examples used herein are intended merely to facilitate an understanding of ways in which the examples of the present disclosure may be practiced and to further enable those of skill in the art to practice the same. Accordingly, the examples herein should not be construed as limiting the scope of the examples of the present disclosure, which is defined solely by the appended claims and applicable law.
It is understood that the examples of the present disclosure are not limited to the particular methodology, protocols, devices, apparatus, materials, applications, etc., described herein, as these may vary. It is also to be understood that the terminology used herein is used for the purpose of describing particular examples only, and is not intended to be limiting in scope of the examples as claimed. It must be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which the examples of the present
disclosure belong. Preferred methods, devices, and materials are described, although any methods and materials similar or equivalent to those described herein may be used in the practice or testing of the examples.
Although only a few exemplary examples have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the examples without materially departing from the novel teachings and advantages of the examples of the present disclosure. The above-described examples of the present invention may advantageously be used independently of any other of the examples or in any feasible combination with one or more others of the examples. Accordingly, all such modifications are intended to be included within the scope of the examples of the present disclosure as defined in the following claims.
In addition, any reference signs placed in parentheses in one or more claims shall not be construed as limiting the claims. The words "comprising" and "comprises," and the like, do not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole. The singular reference of an element does not exclude the plural references of such elements and vice-versa. One or more of the examples may be implemented by means of hardware comprising several distinct elements. In a device or apparatus claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to an advantage.
Claims
1. An apparatus comprising: a 3D dataset acquisition unit configured to: obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; a storage unit configured to store the first 3D dataset as a first cloud of points in a coordinate system; and at least one processor configured to determine coordinates of a centre of rotation for the first point cloud in the coordinate system, the at least one processor being configured to, for each point in the point cloud: record the projection of each point in the first point cloud onto a first 2D plane in the coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form a first 2D dataset, each point of the first 2D dataset being a projection of a corresponding point in the first point cloud onto the first 2D plane; define, for the first 2D dataset, a 2D grid of cells contained in the first plane, each cell being of a predetermined size and each cell comprising a number of points of the first 2D dataset; determine a mean position of the first 2D dataset or mean grid cell; select, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the 2D dataset or that is at the centre of the mean grid cell; and to store or output the determined components of the centre of rotation in the first and second axes.
2. The apparatus of claim 1, wherein, to determine the components of the centre of rotation in the first and second axes, the at least one processor is configured to: remove a subset of the grid cells that are less than a predetermined threshold distance away from their nearest cells containing points; determine the mean position of the remaining grid cells; select the grid cell that is nearest to the determined mean;
determine the centre of the grid cell that is nearest to the determined mean; and select, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the centre of the grid cell that is nearest to the determined mean.
3. The apparatus of claim 2, wherein, to remove a subset of the grid cells, the at least one processor is configured to: create a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either remove those grid cells having a distance value that is less than a predetermined threshold; or determine which grid cells have distance values that are local maxima, and remove the remaining grid cells.
4. The apparatus of any preceding claim, wherein the at least one processor is further configured to determine the coordinate of the centre of rotation in a third axis, the third axis being perpendicular to the first and second axes, the third axis thereby defining a direction normal to the first 2D plane, wherein, to determine the coordinate of the centre of rotation in the third axis, the at least one processor is configured to, for the points in the first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean: if the range of coordinates in the third axis is less than a predetermined threshold, determine the coordinate of the centre of rotation in the third axis to be the mean value of all the third axis coordinates of the points, optionally summed with a parameter proportional to the height of the imaging apparatus that took the readings of the first space; and, otherwise: determine the point having the highest value of the coordinate in the third axis; determine the point having the lowest value of the coordinate in the third axis; determine the midpoint between the lowest and highest value; and select the midpoint as the coordinate of the centre of rotation in the third axis.
5. The apparatus of any preceding claim, wherein the 3D dataset acquisition unit is further configured to:
obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus, and wherein the storage unit is configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system, and wherein the at least one processor is configured to, for each of the first and second point clouds: record the projection of each point in the first and second point clouds onto a second 2D plane in the common coordinate system, the second 2D plane being parallel to the first 2D plane, to form, for each respective point cloud, respective 2D datasets comprising a second 2D dataset and a third 2D dataset, wherein each point of the second 2D dataset is a projection of a corresponding point in the first point cloud onto the second 2D plane, and wherein each point of the third 2D dataset is a projection of a corresponding point in the second point cloud onto the second 2D plane; and determine: a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the second 2D plane and through the point on the second 2D plane having its components in the first and second axes as the determined components of the centre of rotation; and a first translational amount for which the second 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the second 2D plane by the first translational amount, aligns with the third 2D dataset in the second 2D plane; and to: store or output the determined first angle and first translational amount; and/or to determine a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the second 2D plane, by an amount proportional to the first translational amount; and to: store or output the determined rotated and translated first 3D dataset.
6. The apparatus of claim 5, wherein the at least one processor is configured to: define, for the second and third 2D datasets, respective 2D grids of cells contained in the second 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective second or third 2D dataset.
7. The apparatus of claim 5 or 6, wherein the at least one processor is configured to: apply a filter to the second and third 2D datasets to produce respective second and third filtered datasets, by: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
8. The apparatus of any of claims 5-7 wherein, to determine the first angle and first translational amount, the at least one processor is configured to: iteratively rotate the second 2D dataset about the axis of rotation by a predetermined amount at each iteration until the second 2D dataset has been rotated by 360 degrees back to its starting point; and, perform, after each iteration, a spatial correlation in the second 2D plane to determine a degree of offset between the rotated second 2D dataset and the third 2D dataset; and determine, based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the second 2D dataset, rotated by the angle and translated by the translational amount, and the third 2D dataset is the largest; and record that angle and translational amount as the first angle and first translational amount.
9. An apparatus comprising: a 3D dataset acquisition unit configured to: obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus; a storage unit configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system; and
at least one processor is configured to, for each of the first and second point clouds: record the projection of each point in the first and second point clouds onto a first 2D plane in the common coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form, for each respective point cloud, respective 2D datasets comprising a first 2D dataset and a second 2D dataset, wherein each point of the first 2D dataset is a projection of a corresponding point in the first point cloud onto the first 2D plane, and wherein each point of the second 2D dataset is a projection of a corresponding point in the second point cloud onto the first 2D plane; and determine: a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the first 2D plane and through a point on the first 2D plane having its components in the first and second axes as the components of a centre of rotation of the first point cloud in the common coordinate system; and a first translational amount for which the first 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the first 2D plane by the first translational amount, aligns with the second 2D dataset in the first 2D plane; and to: store or output the determined first angle and first translational amount; and/or to determine a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first 2D plane, by an amount proportional to the first translational amount; and to: store or output the determined rotated and translated first 3D dataset.
10. The apparatus of claim 9, wherein the at least one processor is configured to: define, for the first and second 2D datasets, respective 2D grids of cells contained in the first 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective first or second 2D dataset.
11. The apparatus of claim 9 or 10, wherein the at least one processor is configured to: apply a filter to the first and second 2D datasets to produce respective first and second filtered datasets, by: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and
for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
12. The apparatus of any of claims 9-11 wherein, to determine the first angle and first translational amount, the at least one processor is configured to: iteratively rotate the first 2D dataset about the axis of rotation by a predetermined amount at each iteration until the first 2D dataset has been rotated by 360 degrees back to its starting point; and, after each iteration, perform a spatial correlation in the first 2D plane to determine a degree of offset between the rotated first 2D dataset and the second 2D dataset based on the position of the highest correlation peak or degree of match; and determine, based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the first 2D dataset, rotated by the angle and translated by the translational amount, and the second 2D dataset is the largest; and record that angle and translational amount as the first angle and first translational amount.
13. The apparatus of any of claims 9-12, wherein the at least one processor is configured to determine coordinates of the centre of rotation of the first point cloud in the coordinate system, the at least one processor being configured to, for each point in the first point cloud: record the projection of each point in the first point cloud onto a second 2D plane in the coordinate system, the second 2D plane being parallel to the first 2D plane, to form a third 2D dataset, each point of the third 2D dataset being a projection of a corresponding point in the first point cloud onto the second 2D plane; define, for the third 2D dataset, a 2D grid of cells contained in the second plane, each cell being of a predetermined size and each cell comprising a number of points of the third 2D dataset; determine a mean position of the third 2D dataset or mean grid cell; select, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the third 2D dataset or that is at the centre of the mean grid cell; and to store or output the determined components of the centre of rotation in the first and second axes.
14. The apparatus of claim 13, wherein, to determine the components of the centre of rotation in the first and second axes, the at least one processor is configured to: remove a subset of the grid cells that are less than a predetermined threshold distance away from their nearest cells containing points; determine the mean position of the remaining grid cells; select the grid cell that is nearest to the determined mean; determine the centre of the grid cell that is nearest to the determined mean; and select, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the centre of the grid cell that is nearest to the determined mean.
15. The apparatus of claim 13, wherein, to remove a subset of the grid cells, the at least one processor is configured to: create a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either remove those grid cells having a distance value that is less than a predetermined threshold; or determine which grid cells have distance values that are local maxima, and remove the remaining grid cells.
16. The apparatus of any of claims 13-15, wherein the at least one processor is further configured to determine the coordinate of the centre of rotation in a third axis, the third axis being perpendicular to the first and second axes, the third axis thereby defining a direction normal to the second 2D plane, wherein, to determine the coordinate of the centre of rotation in the third axis, the at least one processor is configured to, for the points in the first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean: if the range of coordinates in the third axis is less than a predetermined threshold, determine the coordinate of the centre of rotation in the third axis to be the mean value of all the third axis coordinates of the points, optionally summed with a parameter proportional to the height of the imaging apparatus that took the readings of the first space; and, otherwise: determine the point having the highest value of the coordinate in the third axis; determine the point having the lowest value of the coordinate in the third axis;
determine the midpoint between the lowest and highest value; and select the midpoint as the coordinate of the centre of rotation in the third axis.
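An illustrative sketch of the third-axis (height) selection in claim 16, in which heights holds the third-axis coordinates of the points whose projections fall in the selected grid cell; the range threshold and the hypothetical scanner_height parameter (standing in for the height of the imaging apparatus) are assumptions for this example:

import numpy as np

def third_axis_coordinate(heights, range_threshold=0.5, scanner_height=1.5):
    heights = np.asarray(heights, dtype=float)
    if heights.max() - heights.min() < range_threshold:
        # Near-flat cell (e.g. a floor patch): use the mean, optionally lifted
        # by a parameter proportional to the scanner height.
        return heights.mean() + scanner_height
    # Otherwise take the midpoint between the lowest and highest point.
    return 0.5 * (heights.min() + heights.max())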
17. The apparatus of any preceding claim, wherein the at least one processor is configured to: determine a second translational amount, being an amount in a third direction, being the direction perpendicular to the first and second axes, the third direction thereby defining a third axis in the common coordinate system, by which a given point cloud, the given point cloud comprising either the first point cloud or the rotated and translated first point cloud, being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first plane, by an amount proportional to the first translational amount, moved by the second translational amount in the third direction, aligns with the second point cloud in the third direction; wherein, to determine the second translational amount, the at least one processor is configured to: record, for each of the given point cloud and the second point cloud, the projection of each point in the respective point clouds onto a third 2D plane in the common coordinate system in the third direction, the third 2D plane being parallel to the first 2D plane, to form respective fourth and fifth 2D datasets, each point of the respective fourth and fifth 2D dataset being a projection of a corresponding point in the respective given point cloud and the second point cloud onto the third plane; define, for each of the fourth and fifth 2D datasets, a grid of cells contained in the third plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points of the respective fourth and fifth datasets; and to determine an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the fourth and fifth 2D datasets.
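The overlap region of claim 17 can be sketched as the set of grid cells occupied by both projected datasets; the cell size is an example value and the projection plane is assumed to be x-y:

import numpy as np

def overlap_cells(given_points, second_points, cell_size=0.25):
    """Return the set of (i, j) cells containing points from both clouds' projections."""
    def occupied(points):
        cells = np.floor(points[:, :2] / cell_size).astype(int)
        return set(map(tuple, cells))
    return occupied(given_points) & occupied(second_points)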
18. The apparatus of claim 17 wherein, if the overlap subset contains no such cells and the overlap region is empty, then the at least one processor is configured to: determine 0 for the second translational amount; and, otherwise, the at least one processor is configured to, for each cell in the overlap region, either: determine the point in the given point cloud having the largest value in the third axis and the point in the given point cloud having the lowest value in the third axis; and
record these largest and lowest values, these values defining a first range for the given point cloud, for each cell; and determine the point in the second point cloud having the largest value in the third axis and determine the point in the second point cloud having the lowest value in the third axis; and record these largest and lowest values, these values defining a second range for the second point cloud, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the third axis for the points of the given point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the third axis for the points of the second point cloud contained in the cell; determine, for each cell, a midpoint for the first range and a midpoint for the second range; determine, for each cell, a translation amount in the third direction to bring the midpoint of the first range into alignment with the midpoint for the second range; and determine the second translation amount to be the mean value of the translation amounts; or determine the point in the given point cloud having the highest value in the third axis or having the lowest value in the third axis; determine the point in the second point cloud having the highest value in the third axis or having the lowest value in the third axis; determine, for each cell, a translation amount in the third direction to bring the highest value for the given point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the given point cloud into alignment with the lowest value for the second point cloud; and determine the second translation amount to be the mean value of the translation amounts; and, in either case, to: store or output the determined second translational amount; and/or to: determine an aligned first 3D dataset, the aligned first 3D dataset being the given point cloud translated, in the third direction, by an amount proportional to the second translational amount; and to: store or output the determined aligned first 3D dataset.
19. The apparatus of claim 17 or 18, wherein the at least one processor is configured to, for each grid cell in the overlap region: determine a measure of the distance between the first and second ranges; and discard those first and second ranges for which the distance between them is above a predetermined threshold, wherein the determination of the translation amounts is performed for the remaining, un-discarded, ranges.
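A hedged sketch combining claims 18 and 19: for each overlap cell the third-axis ranges of both clouds are found, cells whose ranges are separated by more than a threshold are discarded, and the second translational amount is taken as the mean of the per-cell midpoint shifts (0 when the overlap is empty). The cell size and the discard threshold are example values:

import numpy as np

def vertical_translation(given_points, second_points, cell_size=0.25, max_gap=1.0):
    def ranges(points):
        """Map each occupied cell to the (min z, max z) range of its points."""
        cells = np.floor(points[:, :2] / cell_size).astype(int)
        out = {}
        for cell, z in zip(map(tuple, cells), points[:, 2]):
            lo, hi = out.get(cell, (np.inf, -np.inf))
            out[cell] = (min(lo, z), max(hi, z))
        return out
    first, second = ranges(given_points), ranges(second_points)
    shifts = []
    for cell in first.keys() & second.keys():                  # overlap region
        (lo1, hi1), (lo2, hi2) = first[cell], second[cell]
        gap = max(lo1 - hi2, lo2 - hi1, 0.0)                   # separation of the two ranges
        if gap > max_gap:
            continue                                           # discard outlier cells (claim 19)
        shifts.append(0.5 * (lo2 + hi2) - 0.5 * (lo1 + hi1))   # align range midpoints
    return float(np.mean(shifts)) if shifts else 0.0           # empty overlap -> 0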
20. The apparatus of claim 17, wherein the at least one processor is configured to, for each grid cell in the overlap region: record the projection of each point in the given point cloud whose projection is contained in the grid cell onto the third axis to form a first histogram in the third axis; record the projection of each point in the second point cloud whose projection is contained in the grid cell onto the third axis to form a second histogram in the third axis; and determine, for each cell, a translation amount to bring the first histogram into alignment with the second histogram in the third direction; and determine the second translation amount to be the most common translation amount; and to: store or output the determined second translational amount; or to: determine an aligned first 3D dataset, the aligned first 3D dataset being the given point cloud translated, in the third direction, by an amount proportional to the second translational amount; and to: store or output the determined aligned first 3D dataset.
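The histogram variant of claim 20 could look like the following sketch: the per-cell third-axis values of each cloud are histogrammed, each cell votes for the shift maximising the 1D cross-correlation of its histograms, and the most common vote is taken. The bin width and span are example values, and the cell-to-values dictionaries are assumed to have been built from the overlap region beforehand:

import numpy as np

def histogram_translation(z_first_by_cell, z_second_by_cell, bin_width=0.1, span=10.0):
    """Inputs map each overlap cell to the third-axis values it contains."""
    edges = np.arange(-span, span + bin_width, bin_width)
    shifts = []
    for cell in z_first_by_cell.keys() & z_second_by_cell.keys():
        h1, _ = np.histogram(z_first_by_cell[cell], bins=edges)
        h2, _ = np.histogram(z_second_by_cell[cell], bins=edges)
        corr = np.correlate(h2, h1, mode="full")        # best alignment of h1 onto h2
        lag = np.argmax(corr) - (len(h1) - 1)           # shift in bins
        shifts.append(lag * bin_width)
    if not shifts:
        return 0.0
    values, counts = np.unique(np.round(shifts, 3), return_counts=True)
    return float(values[np.argmax(counts)])             # most common translation amount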
21. The apparatus of claim 17, wherein the at least one processor is configured to, for each grid cell in the overlap region: determine a surface normal vector for each point in the given point cloud whose projection is contained in the grid cell to form a first set of surface normal vectors; determine a surface normal vector for each point in the second point cloud whose projection is contained in the grid cell to form a second set of surface normal vectors;
record a projection onto a plane parallel to the first 2D plane of the first and second sets of surface normal vectors to form respective first and second projected sets; record a projection onto the third axis of the first and second projected sets to form first and second projected histograms; determine, for each cell, a translation amount to bring the first projected histogram into alignment with the second projected histogram in the third direction to form, for each cell, a degree of match; and determine the second translation amount to be the translation amount having the maximum degree of match; and to: store or output the determined second translational amount; or to: determine an aligned first 3D dataset, the aligned first 3D dataset being the given point cloud translated, in the third direction, by an amount proportional to the second translational amount; and to: store or output the determined aligned first 3D dataset.
22. The apparatus of claim 21 wherein the at least one processor is configured to: determine the absolute value of each component of each surface normal vector in the first and second sets of surface normal vectors along the third axis; and discard those surface normal vectors from their respective sets whose absolute values of their components along the third axis are below a predetermined threshold.
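Claims 21 and 22 rely on surface normals; the sketch below keeps only points whose normals have a large third-axis component (the filter of claim 22), histograms their heights and correlates the histograms, as a simplified whole-overlap version of the per-cell procedure. Surface-normal estimation itself (for example by local plane fitting) is assumed to have been done already, and all thresholds are illustrative:

import numpy as np

def normal_filtered_shift(points_first, normals_first, points_second, normals_second,
                          min_abs_nz=0.7, bin_width=0.1, span=10.0):
    def keep_vertical(points, normals):
        # Discard points whose normal has a weak third-axis component (claim 22).
        mask = np.abs(normals[:, 2]) >= min_abs_nz
        return points[mask]
    p1 = keep_vertical(points_first, normals_first)
    p2 = keep_vertical(points_second, normals_second)
    edges = np.arange(-span, span + bin_width, bin_width)
    h1, _ = np.histogram(p1[:, 2], bins=edges)
    h2, _ = np.histogram(p2[:, 2], bins=edges)
    corr = np.correlate(h2, h1, mode="full")
    lag = np.argmax(corr) - (len(h1) - 1)
    return lag * bin_width, corr.max()        # candidate shift and its degree of match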
23. A method comprising: obtaining a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; storing the first 3D dataset as a first cloud of points in a coordinate system; determining the coordinates of a centre of rotation for the first point cloud in the coordinate system, by, for each point in the point cloud: recording the projection of each point in the first point cloud onto a first 2D plane in the coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form a first 2D dataset, each point of the first 2D dataset being a projection of a corresponding point in the first point cloud onto the first 2D plane;
defining, for the first 2D dataset, a 2D grid of cells contained in the first plane, each cell being of a predetermined size and each cell comprising a number of points of the first 2D dataset; determining a mean position of the first 2D dataset or mean grid cell; selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the 2D dataset or that is at the centre of the mean grid cell; and storing or outputting the determined components of the centre of rotation in the first and second axes.
24. The method of claim 23, wherein determining the components of the centre of rotation in the first and second axes comprises: removing a subset of the grid cells that are less than a predetermined threshold distance away from their nearest cells containing points; determining the mean position of the remaining grid cells; selecting the grid cell that is nearest to the determined mean; determining the centre of the grid cell that is nearest to the determined mean; and selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the centre of the grid cell that is nearest to the determined mean.
25. The method of claim 24, wherein removing a subset of the grid cells comprises: creating a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either removing those grid cells having a distance value that is less than a predetermined threshold; or determining which grid cells have distance values that are local maxima, and removing the remaining grid cells.
26. The method of any of claims 23-25, further comprising: determining the coordinate of the centre of rotation in a third axis, the third axis being perpendicular to the first and second axes, the third axis thereby defining a direction normal to the first 2D plane, by, for the points in the first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean: if the range of coordinates in the third axis is less than a predetermined threshold, determining the coordinate of the centre of rotation in the third axis to be the mean value of all the third axis coordinates of the points, optionally summed with a parameter proportional to the height of the imaging apparatus that took the readings of the first space; and, otherwise: determining the point having the highest value of the coordinate in the third axis; determining the point having the lowest value of the coordinate in the third axis; determining the midpoint between the lowest and highest value; and selecting the midpoint as the coordinate of the centre of rotation in the third axis.
27. The method of any of claims 23-26, further comprising: obtaining a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus; storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system, and, for each of the first and second point clouds: recording the projection of each point in the first and second point clouds onto a second 2D plane in the common coordinate system, the second 2D plane being parallel to the first 2D plane, to form, for each respective point cloud, respective 2D datasets comprising a second 2D dataset and a third 2D dataset, wherein each point of the second 2D dataset is a projection of a corresponding point in the first point cloud onto the second 2D plane, and wherein each point of the third 2D dataset is a projection of a corresponding point in the second point cloud onto the second 2D plane; and determining: a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the second 2D plane and through the point on the second 2D plane having its components in the first and second axes as the determined components of the centre of rotation; and a first translational amount for which the second 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the second 2D plane by the first translational amount, aligns with the third 2D dataset in the second 2D plane; and:
storing or outputting the determined first angle and first translational amount; and/or determining a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the second 2D plane, by an amount proportional to the first translational amount; and: storing or outputting the determined rotated and translated first 3D dataset.
28. The method of claim 27, further comprising: defining, for the second and third 2D datasets, respective 2D grids of cells contained in the second 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective second or third 2D dataset.
29. The method of claim 27 or 28, further comprising: applying a filter to the second and third 2D datasets to produce respective second and third filtered datasets, by: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
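The per-cell point-count filter of claim 29 (and its counterparts in the other claims) might be sketched as below: over-full cells are randomly thinned and under-full cells have their points duplicated, so that the later correlation is not dominated by locally dense structure. The low and high counts and the cell size are example values only:

import numpy as np

def clamp_cell_density(points_xy, cell_size=0.25, low=3, high=20, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    cells = np.floor(points_xy / cell_size).astype(int)
    filtered = []
    for cell in set(map(tuple, cells)):
        in_cell = points_xy[np.all(cells == cell, axis=1)]
        if len(in_cell) > high:
            # Too dense: randomly thin the cell down to the upper threshold.
            in_cell = in_cell[rng.choice(len(in_cell), high, replace=False)]
        elif len(in_cell) < low:
            # Too sparse: duplicate points up to the lower threshold.
            reps = int(np.ceil(low / len(in_cell)))
            in_cell = np.tile(in_cell, (reps, 1))[:low]
        filtered.append(in_cell)
    return np.vstack(filtered)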
30. The method of any of claims 27-29 wherein determining the first angle and first translational amount comprises: iteratively rotating the second 2D dataset about the axis of rotation by a predetermined amount at each iteration until the second 2D dataset has been rotated by 360 degrees back to its starting point; and, after each iteration, performing a spatial correlation in the second 2D plane to determine a degree of offset between the rotated second 2D dataset and the third 2D dataset based on the position of the highest correlation peak or degree of match; and determining, based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the second 2D dataset, rotated by the angle and translated by the translational amount, and the third 2D dataset is the largest; and recording that angle and translational amount as the first angle and first translational amount.
31. A method comprising: obtaining a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; obtaining a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus; storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system; and, for each of the first and second point clouds: recording the projection of each point in the first and second point clouds onto a first 2D plane in the common coordinate system, the first 2D plane being defined by first and second perpendicular axes in the coordinate system, each of the first and second axes being perpendicular to each other, to form, for each respective point cloud, respective 2D datasets comprising a first 2D dataset and a second 2D dataset, wherein each point of the first 2D dataset is a projection of a corresponding point in the first point cloud onto the first 2D plane, and wherein each point of the second 2D dataset is a projection of a corresponding point in the second point cloud onto the first 2D plane; and determining: a first angle about an axis of rotation, the axis of rotation extending in a direction perpendicular to the first 2D plane and through a point on the first 2D plane having its components in the first and second axes as the components of a centre of rotation of the first point cloud in the common coordinate system; and a first translational amount for which the first 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the first 2D plane by the first translational amount, aligns with the second 2D dataset in the first 2D plane; and: storing or outputting the determined first angle and first translational amount; or determining a rotated and translated first 3D dataset, the rotated and translated first 3D dataset being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first 2D plane, by an amount proportional to the first translational amount; and: storing or outputting the determined rotated and translated first 3D dataset.
32. The method of claim 31, further comprising: defining, for the first and second 2D datasets, respective 2D grids of cells contained in the first 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective first or second 2D dataset.
33. The method of claim 31 or 32, further comprising: applying a filter to the first and second 2D datasets to produce respective first and second filtered datasets, by: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
34. The method of any of claims 31-33 wherein determining the first angle and first translational amount comprises: iteratively rotating the first 2D dataset about the axis of rotation by a predetermined amount at each iteration until the first 2D dataset has been rotated by 360 degrees back to its starting point; and, after each iteration, performing a spatial correlation in the first 2D plane to determine a degree of offset between the rotated first 2D dataset and the second 2D dataset based on the position of the highest correlation peak or the degree of match; and determining, based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the first 2D dataset, rotated by the angle and translated by the translational amount, and the second 2D dataset is the largest; and recording that angle and translational amount as the first angle and first translational amount.
35. The method of any of claims 31-34, further comprising determining the coordinates of the centre of rotation of the first point cloud in the coordinate system, by, for each point in the first point cloud: recording the projection of each point in the first point cloud onto a second 2D plane in the coordinate system, the second 2D plane being parallel to the first 2D plane, to form a third 2D dataset, each point of the third 2D dataset being a projection of a corresponding point in the first point cloud onto the second 2D plane;
defining, for the third 2D dataset, a 2D grid of cells contained in the second plane, each cell being of a predetermined size and each cell comprising a number of points of the third 2D dataset; determining a mean position of the third 2D dataset or mean grid cell; selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the point that is either the mean of the third 2D dataset or that is at the centre of the mean grid cell; and storing or outputting the determined components of the centre of rotation in the first and second axes.
36. The method of claim 35, wherein determining the components of the centre of rotation in the first and second axes comprises: removing a subset of the grid cells that are less than a predetermined threshold distance away from their nearest cells containing points; determining the mean position of the remaining grid cells; selecting the grid cell that is nearest to the determined mean; determining the centre of the grid cell that is nearest to the determined mean; and selecting, as the components of the centre of rotation in the first and second axes, the coordinates in the first and second axes of the centre of the grid cell that is nearest to the determined mean.
37. The method of claim 36, wherein removing the subset of the grid cells comprises: creating a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either removing those grid cells having a distance value that is less than a predetermined threshold; or determining which grid cells have distance values that are local maxima, and removing the remaining grid cells.
38. The method of any of claims 35-37, further comprising: determining the coordinate of the centre of rotation in a third axis, the third axis being perpendicular to the first and second axes, the third axis thereby defining a direction normal
to the second 2D plane, by, for the points in the first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean: if the range of coordinates in the third axis is less than a predetermined threshold, determining the coordinate of the centre of rotation in the third axis to be the mean value of all the third axis coordinates of the points, optionally summed with a parameter proportional to the height of the imaging apparatus that took the readings of the first space; and, otherwise: determining the point having the highest value of the coordinate in the third axis; determining the point having the lowest value of the coordinate in the third axis; determining the midpoint between the lowest and highest value; and selecting the midpoint as the coordinate of the centre of rotation in the third axis.
39. The method of any of claims 23-38, further comprising: determining a second translational amount, being an amount in a third direction, being the direction perpendicular to the first and second axes, the third direction thereby defining a third axis in the common coordinate system, by which a given point cloud, the given point cloud comprising either the first point cloud or the rotated and translated first point cloud, being the first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the first plane, by an amount proportional to the first translational amount, moved by the second translational amount in the third direction, aligns with the second point cloud in the third direction; by: recording, for each of the given point cloud and the second point cloud, the projection of each point in the respective point clouds onto a third 2D plane in the common coordinate system in the third direction, the third 2D plane being parallel to the first 2D plane, to form respective fourth and fifth 2D datasets, each point of the respective fourth and fifth 2D dataset being a projection of a corresponding point in the respective given point cloud and the second point cloud onto the third plane; defining, for each of the fourth and fifth 2D datasets, a grid of cells contained in the third plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points of the respective fourth and fifth datasets; and determining an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the fourth and fifth 2D datasets.
40. The method of claim 39 wherein, if the overlap subset contains no such cells and the overlap region is empty, then the method comprises: determining 0 for the second translational amount; and, otherwise, the method comprises, for each cell in the overlap region, either: determining the point in the given point cloud having the largest value in the third axis and the point in the given point cloud having the lowest value in the third axis; and recording these largest and lowest values, these values defining a first range for the given point cloud, for each cell; and determining the point in the second point cloud having the largest value in the third axis and the point in the second point cloud having the lowest value in the third axis; and recording these largest and lowest values, these values defining a second range for the second point cloud, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the third axis for the points of the given point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the third axis for the points of the second point cloud contained in the cell; determining, for each cell, a midpoint for the first range and a midpoint for the second range; determining, for each cell, a translation amount in the third direction to bring the midpoint of the first range into alignment with the midpoint for the second range; and determining the second translation amount to be the mean value of the translation amounts; or determining the point in the given point cloud having the highest value in the third axis or having the lowest value in the third axis; determining the point in the second point cloud having the highest value in the third axis or having the lowest value in the third axis; determining, for each cell, a translation amount in the third direction to bring the highest value for the given point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the given point cloud into alignment with the lowest value for the second point cloud; and determining the second translation amount to be the mean value of the translation amounts; and, in either case: storing or outputting the determined second translational amount; or:
determining an aligned first 3D dataset, the aligned first 3D dataset being the given point cloud translated, in the third direction, by an amount proportional to the second translational amount; and: storing or outputting the determined aligned first 3D dataset.
41. The method of claim 39 or 40, further comprising, for each grid cell in the overlap region: determining a measure of the distance between the first and second ranges; and discarding those first and second ranges for which the distance between them is above a predetermined threshold, wherein the determination of the translation amounts is performed for the remaining, un-discarded, ranges.
42. The method of claim 39, further comprising, for each grid cell in the overlap region: recording the projection of each point in the given point cloud whose projection is contained in the grid cell onto the third axis to form a first histogram in the third axis; recording the projection of each point in the second point cloud whose projection is contained in the grid cell onto the third axis to form a second histogram in the third axis; and determining, for each cell, a translation amount to bring the first histogram into alignment with the second histogram in the third direction; determining the second translation amount to be the most common translation amount; and: storing or outputting the determined second translational amount; or: determining an aligned first 3D dataset, the aligned first 3D dataset being the given point cloud translated, in the third direction, by an amount proportional to the second translational amount; and: storing or outputting the determined aligned first 3D dataset.
43. The method of claim 39, further comprising, for each grid cell in the overlap region: determining a surface normal vector for each point in the given point cloud whose projection is contained in the grid cell to form a first set of surface normal vectors; determining a surface normal vector for each point in the second point cloud whose projection is contained in the grid cell to form a second set of surface normal vectors; recording a projection onto a plane parallel to the first 2D plane of the first and second sets of surface normal vectors to form respective first and second projected sets;
recording a projection onto the third axis of the first and second projected sets to form first and second projected histograms; determining, for each cell, a translation amount to bring the first projected histogram into alignment with the second projected histogram in the third direction to form, for each cell, a degree of match; and determining the second translation amount to be the translation amount having the maximum degree of match; and: storing or outputting the determined second translational amount; or: determining an aligned first 3D dataset, the aligned first 3D dataset being the given point cloud translated, in the third direction, by an amount proportional to the second translational amount; and: storing or outputting the determined aligned first 3D dataset.
44. The method of claim 43 further comprising: determining the absolute value of each component of each surface normal vector in the first and second sets of surface normal vectors along the third axis; and discarding those surface normal vectors from their respective sets whose absolute values of their components along the third axis are below a predetermined threshold.
45. An apparatus comprising: a 3D dataset acquisition unit configured to: obtain a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; obtain a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus; a storage unit configured to store the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system; and at least one processor configured to determine a first translational amount, being an amount in a first direction in the common coordinate system by which the first point cloud, moved by the first translational amount in the first direction, aligns with the second point cloud in the first direction, the at least one processor being configured to:
record, for each of the first point cloud and the second point cloud, the projection of each point in the respective point clouds onto a first 2D plane in the common coordinate system in the first direction, the first 2D plane having a normal vector parallel to the first direction, to form respective first and second 2D datasets, each point of the respective first and second 2D datasets being a projection of a corresponding point in the respective first point cloud and the second point cloud onto the first plane; define, for each of the first and second 2D datasets, a grid of cells contained in the first plane, each cell of each respective grid being of the same predetermined size, each cell in each respective grid comprising a number of points of the respective first and second datasets; and to: determine an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the first and second 2D datasets.
46. The apparatus of claim 45 wherein if the overlap subset contains no such cells and the overlap region is empty, then the at least one processor is configured to: determine 0 for the first translational amount; and, otherwise, the at least one processor is configured to, for each cell in the overlap region, either: determine the point in the first point cloud having the largest value in the first axis and the point in the first point cloud having the lowest value in the first axis; and record these largest and lowest values, these values defining a first range for the first point cloud, for each cell; and determine the point in the second point cloud having the largest value in the first axis and determine the point in the second point cloud having the lowest value in the first axis; and record these largest and lowest values, these values defining a second range for the second point cloud for each cell, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the first axis for the points of the first point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the first axis for the points of the second point cloud contained in the cell; determine, for each cell, a midpoint for the first range and a midpoint for the second range;
determine, for each cell, a translation amount in the first direction to bring the midpoint of the first range into alignment with the midpoint for the second range; and determine the first translation amount to be the mean value of the translation amounts; or determine the point in the first point cloud having the highest value in the first axis or having the lowest value in the first axis; determine the point in the second point cloud having the highest value in the first axis or having the lowest value in the first axis; determine, for each cell, a translation amount in the first direction to bring the highest value for the first point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the first point cloud into alignment with the lowest value for the second point cloud; and determine the first translation amount to be the mean value of the translation amounts; and, in either case: store or output the determined first translational amount; or to: determine a translated first 3D dataset, the translated first 3D dataset being the first point cloud translated, in the first direction, by an amount proportional to the first translational amount; and to: store or output the determined translated first 3D dataset.
47. The apparatus of claim 45 or 46, wherein the at least one processor is configured to, for each grid cell in the overlap region: determine a measure of the distance between the first and second ranges; and discard those first and second ranges for which the distance between them is above a predetermined threshold, wherein the determination of the translation amounts is performed for the remaining, un-discarded, ranges.
48. The apparatus of claim 45, wherein the at least one processor is configured to, for each grid cell in the overlap region: record the projection of each point in the first point cloud whose projection is contained in the grid cell onto the first axis to form a first histogram in the first axis; record the projection of each point in the second point cloud whose projection is contained in the grid cell onto the first axis to form a second histogram in the first axis; and determine, for each cell, a translation amount to bring the first histogram into alignment with the second histogram in the first direction; and determine the first translation amount to be the most common translation amount; and to: store or output the determined first translational amount; or to: determine a translated first 3D dataset, the translated first 3D dataset being the first point cloud translated, in the first direction, by an amount proportional to the first translational amount; and to: store or output the determined translated first 3D dataset.
49. The apparatus of any of claims 45-48, wherein the at least one processor is configured to: determine coordinates of a centre of rotation of the translated first point cloud in the coordinate system, the at least one processor being configured to, for each point in the translated first point cloud: record the projection of each point in the translated first point cloud onto a second 2D plane in the coordinate system, the second 2D plane being parallel to the first 2D plane, to form a third 2D dataset, each point of the third 2D dataset being a projection of a corresponding point in the translated first point cloud onto the second 2D plane; define, for the third 2D dataset, a 2D grid of cells contained in the second plane, each cell being of a predetermined size and each cell comprising a number of points of the third 2D dataset; determine a mean position of the third 2D dataset or mean grid cell; select, as the components of the centre of rotation in second and third axes, the second and third axes being normal to the first axis and being perpendicular to each other, the coordinates in the second and third axes of the point that is either the mean of the third 2D dataset or that is at the centre of the mean grid cell; and to store or output the determined components of the centre of rotation in the second and third axes.
50. The apparatus of claim 49, wherein, to determine the components of the centre of rotation in the second and third axes, the at least one processor is configured to: remove a subset of the grid cells that are less than a predetermined threshold distance away from their nearest cells containing points;
determine the mean position of the remaining grid cells; select the grid cell that is nearest to the determined mean; determine the centre of the grid cell that is nearest to the determined mean; and select, as the components of the centre of rotation in the second and third axes, the coordinates in the second and third axes of the centre of the grid cell that is nearest to the determined mean.
51. The apparatus of claim 50, wherein, to remove a subset of the grid cells, the at least one processor is configured to: create a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either remove those grid cells having a distance value that is less than a predetermined threshold; or determine which grid cells have distance values that are local maxima, and remove the remaining grid cells.
52. The apparatus of any of claims 49-51, wherein the at least one processor is further configured to determine the coordinate of the centre of rotation in the first axis, the at least one processor being configured to, for the points in the translated first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean: if the range of coordinates in the first axis is less than a predetermined threshold, determine the coordinate of the centre of rotation in the first axis to be the mean value of all the first axis coordinates of the points, optionally summed with a parameter proportional to the height of the imaging apparatus that took the readings of the first space; and, otherwise: determine the point having the highest value of the coordinate in the first axis; determine the point having the lowest value of the coordinate in the first axis; determine the midpoint between the lowest and highest value; and select the midpoint as the coordinate of the centre of rotation in the first axis.
53. The apparatus of any of claims 49-52, wherein the at least one processor is configured to: record the projection of each point in the translated first point cloud and the second point cloud onto a third 2D plane in the common coordinate system, the third 2D plane being
parallel to the first 2D plane, to form, for each respective point cloud, respective 2D datasets comprising a fourth 2D dataset and a fifth 2D dataset, wherein each point of the fourth 2D dataset is a projection of a corresponding point in the translated first point cloud onto the third 2D plane, and wherein each point of the fifth 2D dataset is a projection of a corresponding point in the second point cloud onto the third 2D plane; and determine: a first angle about an axis of rotation, the axis of rotation extending in the first direction through the point on the third 2D plane having its components in the second and third axes as the determined components of the centre of rotation; and a second translational amount for which the fourth 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the third 2D plane by the second translational amount, aligns with the fifth 2D dataset in the third 2D plane; and to: store or output the determined first angle and second translational amount; or to determine an aligned first 3D dataset, the aligned first 3D dataset being the translated first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the third 2D plane, by an amount proportional to the second translational amount; and to: store or output the determined aligned first 3D dataset.
54. The apparatus of claim 53, wherein the at least one processor is configured to: define, for the fourth and fifth 2D datasets, respective 2D grids of cells contained in the third 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective fourth or fifth 2D dataset.
55. The apparatus of claim 53 or 54, wherein the at least one processor is configured to: apply a filter to the fourth and fifth 2D datasets to produce respective fourth and fifth filtered datasets, by: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
56. The apparatus of any of claims 53-55 wherein, to determine the first angle and second translational amount, the at least one processor is configured to:
iteratively rotate the fourth 2D dataset about the axis of rotation by a predetermined amount at each iteration until the fourth 2D dataset has been rotated by 360 degrees back to its starting point; and, after each iteration, perform a spatial correlation in the third 2D plane to determine a degree of offset between the rotated fourth 2D dataset and the fifth 2D dataset based on the position of the highest correlation peak or degree of match; and determine, based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the fourth 2D dataset, rotated by the angle and translated by the translational amount, and the fifth 2D dataset is the largest; and record that angle and translational amount as the first angle and second translational amount.
57. A method comprising: obtaining a first 3D dataset of a first space at a first time, the first 3D dataset being a first point cloud in three dimensions, each point in the first point cloud representing a reading within the first space being taken by an imaging apparatus; obtaining a second 3D dataset of a second space at a second time, the second space and the first space at least partially overlapping, the second 3D dataset being a second point cloud in three dimensions, each point in the second point cloud representing a reading within the second space being taken by an imaging apparatus; storing the first 3D dataset and the second 3D dataset as, respectively, a first and second clouds of points in a common coordinate system; and determining a first translational amount, being an amount in a first direction in the common coordinate system by which the first point cloud, moved by the first translational amount in the first direction, aligns with the second point cloud in the first direction, by: recording, for each of the first point cloud and the second point cloud, the projection of each point in the respective point clouds onto a first 2D plane in the common coordinate system in the first direction, the first 2D plane having a normal vector parallel to the first direction, to form respective first and second 2D datasets, each point of the respective first and second 2D datasets being a projection of a corresponding point in the respective first point cloud and the second point cloud onto the first plane; defining, for each of the first and second 2D datasets, a grid of cells contained in the first plane, each cell of each respective grid being of the same predetermined size, each cell in
each respective grid comprising a number of points of the respective first and second datasets; and determining an overlap subset of grid cells defining an overlap region, being those grid cells that contain a point from both the first and second 2D datasets.
58. The method of claim 57 wherein, if the overlap subset contains no such cells and the overlap region is empty, then the method comprises: determining 0 for the first translational amount; and, otherwise, for each cell in the overlap region, either: determining the point in the first point cloud having the largest value in the first axis and the point in the first point cloud having the lowest value in the first axis; and recording these largest and lowest values, these values defining a first range for the first point cloud, for each cell; and determining the point in the second point cloud having the largest value in the first axis and determining the point in the second point cloud having the lowest value in the first axis; and recording these largest and lowest values, these values defining a second range for the second point cloud, to thereby define, for each cell in the overlap region, first and second ranges, the first range having as its endpoints the lowest and largest values in the first axis for the points of the first point cloud contained in the cell, and the second range having as its endpoints the lowest and largest values in the first axis for the points of the second point cloud contained in the cell; determining, for each cell, a midpoint for the first range and a midpoint for the second range; determining, for each cell, a translation amount in the first direction to bring the midpoint of the first range into alignment with the midpoint for the second range; and determining the first translation amount to be the mean value of the translation amounts; or determining the point in the first point cloud having the highest value in the first axis or having the lowest value in the first axis; determining the point in the second point cloud having the highest value in the first axis or having the lowest value in the first axis;
determining, for each cell, a translation amount in the first direction to bring the highest value for the first point cloud into alignment with the highest value for the second point cloud or a translation amount to bring the lowest value for the first point cloud into alignment with the lowest value for the second point cloud; and determining the first translation amount to be the mean value of the translation amounts; and, in either case: storing or outputting the determined first translational amount; or: determining a translated first 3D dataset, the translated first 3D dataset being the first point cloud translated, in the first direction, by an amount proportional to the first translational amount; and: storing or outputting the determined translated first 3D dataset.
59. The method of claim 57 or 58, further comprising, for each grid cell in the overlap region: determining a measure of the distance between the first and second ranges; and discarding those first and second ranges for which the distance between them is above a predetermined threshold, wherein the determination of the translation amounts is performed for the remaining, un-discarded, ranges.
60. The method of claim 57, further comprising, for each grid cell in the overlap region: recording the projection of each point in the first point cloud whose projection is contained in the grid cell onto the first axis to form a first histogram in the first axis; recording the projection of each point in the second point cloud whose projection is contained in the grid cell onto the first axis to form a second histogram in the first axis; and determining, for each cell, a translation amount to bring the first histogram into alignment with the second histogram in the first direction; and determining the first translation amount to be the most common translation amount; and: storing or outputting the determined first translational amount; or: determining a translated first 3D dataset, the translated first 3D dataset being the first point cloud translated, in the first direction, by an amount proportional to the first translational amount; and: storing or outputting the determined translated first 3D dataset.
61. The method of any of claims 57-60, further comprising: determining coordinates of a centre of rotation of the translated first point cloud in the coordinate system, by, for each point in the translated first point cloud: recording the projection of each point in the translated first point cloud onto a second 2D plane in the coordinate system, the second 2D plane being parallel to the first 2D plane, to form a third 2D dataset, each point of the third 2D dataset being a projection of a corresponding point in the translated first point cloud onto the second 2D plane; defining, for the third 2D dataset, a 2D grid of cells contained in the second plane, each cell being of a predetermined size and each cell comprising a number of points of the third 2D dataset; determining a mean position of the third 2D dataset or mean grid cell; selecting, as the components of the centre of rotation in second and third axes, the second and third axes being normal to the first axis and being perpendicular to each other, the coordinates in the second and third axes of the point that is either the mean of the third 2D dataset or that is at the centre of the mean grid cell; and storing or outputting the determined components of the centre of rotation in the second and third axes.
62. The method of claim 61, wherein determining the components of the centre of rotation in the second and third axes comprises: removing a subset of the grid cells that are less than a predetermined threshold distance away from their nearest cell containing points; determining the mean position of the remaining grid cells; selecting the grid cell that is nearest to the determined mean; determining the centre of the grid cell that is nearest to the determined mean; and selecting, as the components of the centre of rotation in the second and third axes, the coordinates in the second and third axes of the centre of the grid cell that is nearest to the determined mean.
63. The method of claim 62, wherein removing the subset of the grid cells comprises: creating a distance map for each cell in the grid by assigning a value to each cell, the assigned value representing a distance between that cell and the nearest cell containing points; and either
removing those grid cells having a distance value that is less than a predetermined threshold; or determining which grid cells have distance values that are local maxima, and removing the remaining grid cells.
64. The method of any of claims 61-63, further comprising: determining the coordinate of the centre of rotation in the first axis, by, for the points in the translated first point cloud that correspond to the projected points contained in the grid cell that is nearest to the determined mean: if the range of coordinates in the first axis is less than a predetermined threshold, determining the coordinate of the centre of rotation in the first axis to be the mean value of all the first axis coordinates of the points, optionally summed with a parameter proportional to the height of the imaging apparatus that took the readings of the first space; and, otherwise: determining the point having the highest value of the coordinate in the first axis; determining the point having the lowest value of the coordinate in the first axis; determining the midpoint between the lowest and highest value; and selecting the midpoint as the coordinate of the centre of rotation in the first axis.
65. The method of any of claims 61-64, further comprising: recording the projection of each point in the translated first point cloud and the second point cloud onto a third 2D plane in the common coordinate system, the third 2D plane being parallel to the first 2D plane, to form, for each respective point cloud, respective 2D datasets comprising a fourth 2D dataset and a fifth 2D dataset, wherein each point of the fourth 2D dataset is a projection of a corresponding point in the translated first point cloud onto the third 2D plane, and wherein each point of the fifth 2D dataset is a projection of a corresponding point in the second point cloud onto the third 2D plane; and determining: a first angle about an axis of rotation, the axis of rotation extending in the first direction through the point on the third 2D plane having its components in the second and third axes as the determined components of the centre of rotation; and a second translational amount for which the fourth 2D dataset, rotated about the axis of rotation by the determined angle, and moved in the third 2D plane by the second translational amount, aligns with the fifth 2D dataset in the third 2D plane; and: storing or outputting the determined first angle and second translational amount; or
determining an aligned first 3D dataset, the aligned first 3D dataset being the translated first point cloud rotated, about the axis of rotation, by an amount proportional to the first angle and translated, in a direction of the third 2D plane, by an amount proportional to the second translational amount; and: storing or outputting the determined aligned first 3D dataset.
66. The method of claim 65, further comprising: defining, for the fourth and fifth 2D datasets, respective 2D grids of cells contained in the third 2D plane, each cell in each respective grid being of a predetermined size and each cell comprising a number of points of the respective fourth or fifth 2D dataset.
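A non-limiting sketch of the gridding of claim 66, assuming a hypothetical cell_size and using the minimum point coordinates as the grid origin.

```python
import numpy as np

def bin_into_cells(points_2d, cell_size=0.1):
    pts = np.asarray(points_2d, dtype=float)
    mins = pts.min(axis=0)                       # grid origin in the plane
    # Integer cell index of each point along the two in-plane axes.
    idx = np.floor((pts - mins) / cell_size).astype(int)
    counts = np.zeros(idx.max(axis=0) + 1, dtype=int)
    # Accumulate the number of points falling in each cell.
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)
    return counts, mins
```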
67. The method of claim 65 or 66, further comprising: applying a filter to the fourth and fifth 2D datasets to produce respective fourth and fifth filtered datasets, by: for each cell in each respective 2D dataset having a number of points of the respective point cloud below a predetermined threshold, increasing the number of points in the cell; and for each cell in each respective 2D dataset having a number of points of the respective point cloud above a predetermined threshold, decreasing the number of points in the cell.
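One possible, non-limiting reading of the filter of claim 67 is to clamp the per-cell point count into a band so that sparse and dense cells contribute comparably to the later correlation; the bounds low and high are hypothetical.

```python
import numpy as np

def clamp_cell_counts(counts, low=1, high=10):
    counts = np.asarray(counts)
    filtered = counts.copy()
    occupied = counts > 0
    # Raise under-populated (but non-empty) cells up to the lower bound...
    filtered[occupied & (counts < low)] = low
    # ...and cut over-populated cells down to the upper bound.
    filtered[counts > high] = high
    return filtered
```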
68. The method of any of claims 65-67, wherein determining the first angle and second translational amount comprises: iteratively rotating the fourth 2D dataset about the axis of rotation by a predetermined amount at each iteration until the fourth 2D dataset has been rotated by 360 degrees back to its starting point; and, after each iteration, performing a spatial correlation in the third 2D plane to determine a degree of offset between the rotated fourth 2D dataset and the fifth 2D dataset; and determining, based on the iterative rotations and spatial correlations, an angle and a translational amount for which a degree of match between the fourth 2D dataset, rotated by the angle and translated by the translational amount, and the fifth 2D dataset is the largest; and recording that angle and translational amount as the first angle and second translational amount.
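An illustrative sketch of the search of claim 68, assuming both 2D datasets have been rasterised into equally sized grids; scipy.ndimage.rotate and an FFT-based circular cross-correlation stand in for the claimed rotation and spatial-correlation steps, and the one-degree step is an assumption. Using the FFT keeps each correlation at O(N log N) in the number of cells.

```python
import numpy as np
from numpy.fft import fft2, ifft2
from scipy.ndimage import rotate

def best_rotation_and_shift(grid_a, grid_b, step_deg=1.0):
    grid_a = np.asarray(grid_a, dtype=float)
    grid_b = np.asarray(grid_b, dtype=float)
    fb_conj = np.conj(fft2(grid_b))              # FFT of the fixed grid, reused
    best_score, best_angle, best_shift = -np.inf, 0.0, (0, 0)
    for angle in np.arange(0.0, 360.0, step_deg):
        rotated = rotate(grid_a, angle, reshape=False, order=1)
        # Circular cross-correlation via the FFT; the peak index is the
        # in-plane offset (in cells, modulo the grid size) at which the
        # rotated grid best matches grid_b.
        corr = np.real(ifft2(fft2(rotated) * fb_conj))
        peak = np.unravel_index(np.argmax(corr), corr.shape)
        if corr[peak] > best_score:
            best_score, best_angle, best_shift = corr[peak], angle, peak
    return best_angle, best_shift
```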
69. The apparatus of any of claims 4, 16, or 52, wherein the at least one processor is configured to determine whether the determined coordinate of the centre of rotation in the
coordinate system lies within a minimum bounding box of the first point cloud or of the first translated point cloud and, if not: project the determined centre of rotation into the minimum bounding box; and select, as new coordinates for the centre of rotation, the projected centre of rotation.
70. The apparatus of any of claims 1-22 or 45-56 or 69, wherein the at least one processor is configured to filter the respective first and second point clouds by being configured to: determine, for each point in the respective first and second point clouds, a surface normal vector; record the projection of each surface normal vector onto the third axis to resolve the component of each surface normal vector along the third axis; determine the absolute value of each component of each surface normal vector along the third axis; and discard, from their respective point clouds, those points whose surface normal vectors have components along the third axis with absolute values below a predetermined threshold.
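A minimal sketch of the filter of claims 70 and 73, assuming unit surface normals are already available for each point (for example from a local plane fit) and that column index 2 of the arrays corresponds to the third axis; the threshold value is hypothetical.

```python
import numpy as np

def filter_by_normal_component(points, normals, threshold=0.7, axis=2):
    points = np.asarray(points)
    normals = np.asarray(normals)
    # Absolute value of each unit normal's component along the chosen axis.
    component = np.abs(normals[:, axis])
    # Discard points whose normal components fall below the threshold.
    return points[component >= threshold]
```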
71. The apparatus of any of claims 18-22 or 53-56 wherein the at least one processor is configured to determine whether a line projected from a centre or origin of the first point cloud or from a point corresponding to the location of the imaging apparatus that took the readings of the first space to a point of the first aligned 3D dataset intersects a point in the second 2D dataset before it intersects the point of the first aligned 3D dataset.
72. The method of any of claims 26, 38, or 64, further comprising: determining whether the determined coordinate of the centre of rotation in the coordinate system lies within a minimum bounding box of the first point cloud or of the first translated point cloud and, if not: projecting the determined centre of rotation into the minimum bounding box; and selecting, as new coordinates for the centre of rotation, the projected centre of rotation.
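For an axis-aligned minimum bounding box, the projection of claims 69 and 72 amounts to clamping each coordinate of the centre of rotation to the box; a non-limiting sketch follows.

```python
import numpy as np

def project_into_bounding_box(centre, points):
    points = np.asarray(points)
    lo, hi = points.min(axis=0), points.max(axis=0)   # axis-aligned box extents
    # If the centre already lies inside the box this leaves it unchanged;
    # otherwise each out-of-range coordinate is pulled back to the box face.
    return np.clip(np.asarray(centre, dtype=float), lo, hi)
```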
73. The method of any of claims 23-44 or 69 or 72, further comprising: filtering the respective first and second point clouds by:
determining, for each point in the respective first and second point clouds, a surface normal vector; recording the projection of each surface normal vector onto the third axis to resolve the component of each surface normal vector along the third axis; determining the absolute value of each component of each surface normal vector along the third axis; and discarding, from their respective point clouds, those points whose surface normal vectors have components along the third axis with absolute values below a predetermined threshold.
74. The method of any of claims 40-44 or 65-68, further comprising determining whether a line projected from a centre or origin of the first point cloud or from a point corresponding to the location of the imaging apparatus that took the readings of the first space to a point of the first aligned 3D dataset intersects a point in the second 2D dataset before it intersects the point of the first aligned 3D dataset.
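A non-limiting sketch of the line-of-sight test of claims 71 and 74, stepping along the ray from the sensor origin towards the target point; the occupancy lookup is_occupied and the step size are hypothetical stand-ins for whatever spatial index an implementation uses.

```python
import numpy as np

def occluded(origin, target, is_occupied, step=0.05):
    origin = np.asarray(origin, dtype=float)
    target = np.asarray(target, dtype=float)
    direction = target - origin
    length = np.linalg.norm(direction)
    if length == 0.0:
        return False
    direction /= length
    # Walk the ray in fixed increments, stopping just short of the target so
    # the target point does not count as its own occluder.
    for t in np.arange(step, length - step, step):
        if is_occupied(origin + t * direction):
            return True
    return False
```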
75. A non-transitory machine-readable medium comprising a set of machine-readable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to perform the method of any of claims 23-44, 57-68, or 72-74.
76. A computer program which, when executed by a computing apparatus, causes the computing apparatus to execute a method according to any of claims 23-44, 57-68, or 72-74.
77. A pair of point clouds defined in a common coordinate system obtained by the method according to any of claims 23-44, 57-68, or 72-74.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2108116.1 | 2021-06-07 | ||
GB2108116.1A GB2607598A (en) | 2021-06-07 | 2021-06-07 | Aligning 3D datasets |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022258947A1 true WO2022258947A1 (en) | 2022-12-15 |
Family
ID=76838875
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB2022/051325 WO2022258947A1 (en) | Aligning 3d datasets | 2021-06-07 | 2022-05-25 |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2607598A (en) |
WO (1) | WO2022258947A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115439644B (en) * | 2022-08-19 | 2023-08-08 | 广东领慧数字空间科技有限公司 | Similar point cloud data alignment method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109243289B (en) * | 2018-09-05 | 2021-02-05 | 武汉中海庭数据技术有限公司 | Method and system for extracting parking spaces of underground garage in high-precision map manufacturing |
US11151734B2 (en) * | 2018-09-14 | 2021-10-19 | Huawei Technologies Co., Ltd. | Method and system for generating synthetic point cloud data using a generative model |
2021
- 2021-06-07: GB application GB2108116.1A filed (published as GB2607598A); status: not active, withdrawn
2022
- 2022-05-25: PCT application PCT/GB2022/051325 filed (published as WO2022258947A1); status: active, application filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9558564B1 (en) * | 2014-05-02 | 2017-01-31 | Hrl Laboratories, Llc | Method for finding important changes in 3D point clouds |
WO2018138516A1 (en) | 2017-01-27 | 2018-08-02 | Ucl Business Plc | Apparatus, method, and system for alignment of 3d datasets |
CN109655039A (en) * | 2018-12-30 | 2019-04-19 | 中国南方电网有限责任公司超高压输电公司检修试验中心 | The determination method and device of tilt angle, storage medium, electronic device |
CN109741374A (en) * | 2019-01-30 | 2019-05-10 | 重庆大学 | Point cloud registering rotation transformation methods, point cloud registration method, equipment and readable storage medium storing program for executing |
US20200242820A1 (en) * | 2019-01-30 | 2020-07-30 | Hyundai Motor Company | Apparatus and method for clustering point cloud |
Also Published As
Publication number | Publication date |
---|---|
GB202108116D0 (en) | 2021-07-21 |
GB2607598A (en) | 2022-12-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP3574473B1 (en) | Apparatus, method, and system for alignment of 3d datasets | |
Xiao et al. | Planar segment based three‐dimensional point cloud registration in outdoor environments | |
US9958269B2 (en) | Positioning method for a surveying instrument and said surveying instrument | |
CN103959307B (en) | The method of detection and Expressive Features from gray level image | |
Budroni et al. | Automatic 3D modelling of indoor manhattan-world scenes from laser data | |
Alidoost et al. | An image-based technique for 3D building reconstruction using multi-view UAV images | |
US10432915B2 (en) | Systems, methods, and devices for generating three-dimensional models | |
Özdemir et al. | A multi-purpose benchmark for photogrammetric urban 3D reconstruction in a controlled environment | |
Dold | Extended Gaussian images for the registration of terrestrial scan data | |
WO2022258947A1 (en) | Aligning 3d datasets | |
Budianti et al. | Background blurring and removal for 3d modelling of cultural heritage objects | |
Elkhrachy | Feature extraction of laser scan data based on geometric properties | |
Foryś et al. | Fast adaptive multimodal feature registration (FAMFR): an effective high-resolution point clouds registration workflow for cultural heritage interiors | |
Arnaud et al. | On the fly plane detection and time consistency for indoor building wall recognition using a tablet equipped with a depth sensor | |
Sohn et al. | Sequential modelling of building rooftops by integrating airborne LiDAR data and optical imagery: preliminary results | |
Arıcan et al. | Research on 3D reconstruction of small size objects using structure from motion photogrammetry via smartphone images | |
Altuntas | Pair-wise automatic registration of three-dimensional laser scanning data from historical building by created two-dimensional images | |
Daghigh | Efficient automatic extraction of discontinuities from rock mass 3D point cloud data using unsupervised machine learning and RANSAC | |
Wolters | Automatic 3D reconstruction of indoor Manhattan world scenes using Kinect depth data | |
Boerner et al. | DEM based registration of multi-sensor airborne point clouds exemplary shown on a river side in non urban area | |
García-Moreno | Towards Robust Methodology in Large-Scale Urban Reconstruction | |
Onyango | Multi-resolution automated image registration | |
Hajji et al. | Which Data Sources for the BIM Model? | |
Pashova et al. | Photogrammetry | |
Bethmann et al. | Multi-image semi-global matching in object space |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22729268; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 22729268; Country of ref document: EP; Kind code of ref document: A1 |