CN114283046A

CN114283046A - Point cloud file registration method and device based on ICP algorithm and storage medium

Info

Publication number: CN114283046A
Application number: CN202111408323.0A
Authority: CN
Inventors: 刘洋; 崔家武; 王清泉; 李海军; 甄兆聪; 柴向浩; 陈满花
Original assignee: Guangzhou Urban Planning Survey and Design Institute
Current assignee: Guangzhou Urban Planning Survey and Design Institute
Priority date: 2021-11-19
Filing date: 2021-11-19
Publication date: 2022-04-05
Anticipated expiration: 2041-11-19
Also published as: CN114283046B

Abstract

The invention discloses a point cloud file registration method based on the ICP (Iterative Closest Point) algorithm, comprising the following steps: acquiring a point cloud file set and starting a plurality of processes; dividing the point cloud file set into a plurality of point cloud segments, and sequentially pre-allocating the point cloud segments to the corresponding processes; creating a shared memory containing a first variable and a second variable, and initializing the shared memory; in each process, scheduling an idle GPU or a CPU in real time according to the first variable to execute the ICP algorithm, and registering the target point clouds in each process through the ICP algorithm to obtain registered point cloud files; and, when any process finishes registering all of its allocated target point clouds ahead of the others, searching through the second variable for the point cloud segment with the largest number of unregistered target point clouds, and dynamically allocating the unregistered target point clouds in that segment to the idle process for registration. The method can thereby improve the degree of parallelization of registering a series of point cloud files, the utilization of computing resources, and the efficiency of parallel point cloud file registration.

Description

Point cloud file registration method and device based on ICP algorithm and storage medium

Technical Field

The invention relates to the technical field of signal processing, and in particular to a point cloud file registration method and device based on the ICP (Iterative Closest Point) algorithm, and a computer-readable storage medium.

Background

At present, methods for ICP (Iterative Closest Point) registration of a point cloud file set mainly include a serial ICP registration algorithm for a series of point cloud files and parallel ICP algorithms. Serial ICP registration of a series of point cloud files: the Point Cloud Library (PCL) provides a scheme that registers the point cloud files pair by pair; the idea is to transform every point cloud file so that all of them end up in the coordinate system of the source point cloud. Parallel ICP algorithms: these mainly study the parallel acceleration of ICP between two point cloud files; the idea is to use a GPU or multiple CPU cores to accelerate certain steps of the ICP algorithm in parallel.

However, both registration approaches use CPU + GPU or multiple CPU cores to accelerate the registration of two consecutive point cloud files at a time; that is, the accelerated registration of the next pair of consecutive point cloud files starts only after that of the current pair has finished. The prior art therefore has two main disadvantages:

1. The degree of parallelization is insufficient. The prior art focuses on parallel registration between two point cloud files and mainly accelerates certain steps of the ICP algorithm. This is single-level parallelism at a low level, which is insufficient for the higher-level task of registering a whole point cloud file set.

2. The utilization of the computer's computing resources is low. The prior art mainly uses either a CPU + GPU mode or a multi-core CPU mode for parallel acceleration, so the computing resources of the computer cannot be fully utilized. (1) In the existing CPU + GPU mode, the part of the ICP algorithm that is suitable for GPU acceleration is accelerated on the GPU, while the rest of the algorithm still runs serially on the CPU. During the serial CPU part, the computer uses only one CPU core, and the remaining CPU cores and the GPU are idle; once the serial CPU part finishes, the computer starts the GPU to accelerate the parallel part, and all CPU cores become idle or wait. In other words, throughout the registration process the CPU and the GPU do not work concurrently: at any moment only one CPU core or the GPU is running, so the computing resources of the computer cannot be fully used. (2) In the existing multi-core CPU mode, the part of the ICP algorithm that is suitable for multi-core parallelism is parallelized, while the rest of the algorithm still runs serially on the CPU. Likewise, during the serial CPU part only one CPU core is computing, and the remaining CPU cores and the GPU are idle; while multiple CPU cores execute in parallel, the GPU is idle, so the computing resources of the computer are again not fully utilized.

Disclosure of Invention

The embodiments of the invention provide a point cloud file registration method and device based on the ICP (Iterative Closest Point) algorithm, and a computer-readable storage medium, which can improve the degree of parallelization of registering a series of point cloud files, the utilization of computing resources, and the efficiency of parallel point cloud file registration.

In order to achieve the above object, an embodiment of the present invention provides a point cloud file registration method based on an ICP algorithm, including:

acquiring a point cloud file set, and starting a plurality of processes through the Message Passing Interface (MPI); the point cloud file set comprises 1 source point cloud and k target point clouds to be registered;

dividing the point cloud file set into a plurality of point cloud segments, and sequentially pre-allocating the point cloud segments to the corresponding processes;

creating a shared memory containing 1 first variable and k second variables, and initializing the shared memory; the first variable corresponds to the number of idle GPUs, and the second variables correspond one-to-one to the registration states of the target point clouds;

in each process, scheduling an idle GPU or a CPU in real time according to the first variable to execute the ICP algorithm, and registering the target point clouds in each process through MPI and the ICP algorithm to obtain k registered point cloud files;

when any process finishes registering all of its allocated target point clouds ahead of the others, searching through the second variables for the point cloud segment with the largest number of unregistered target point clouds, and dynamically allocating the unregistered target point clouds in that segment to the idle process for registration.

As an improvement of the above scheme, the initializing the shared memory specifically includes:

initializing the value of the first variable to N, where N is the number of GPUs in the computer;

initializing the values of the k second variables to 0.

As an improvement of the above scheme, the scheduling, in real time, an idle GPU or a CPU in each process to execute an ICP algorithm according to the first variable specifically includes:

when any process reads that the first variable is 0, scheduling a CPU in real time to execute the ICP algorithm;

and when any process reads that the first variable is not 0, scheduling a GPU in real time to execute the ICP algorithm.

As an improvement of the above scheme, registering the target point clouds in each process through the Message Passing Interface (MPI) and the ICP algorithm to obtain k registered point cloud files specifically includes:

performing local registration on the target point clouds in each process through the ICP algorithm to obtain the local transformation matrix corresponding to each target point cloud;

collecting the local transformation matrix corresponding to each target point cloud through MPI, and converting each local transformation matrix through a preset conversion formula to obtain the global transformation matrix corresponding to each target point cloud;

and performing global registration on each target point cloud through the global transformation matrix corresponding to each target point cloud to obtain k registered point cloud files.

As an improvement of the above scheme, when any one of the processes completes the registration of all the allocated target point clouds in advance, the point cloud segment with the largest number of unregistered target point clouds is searched through the second variable, and the unregistered target point clouds in the point cloud segment are dynamically allocated to the idle process for registration, specifically:

when any one process finishes the local registration of all the distributed target point clouds in advance, searching a point cloud section with the maximum number of the target point clouds which are not locally registered through the second variable, and dynamically distributing the target point clouds which are not locally registered in the point cloud section to an idle process for local registration;

when any one process completes the global registration of all the allocated target point clouds in advance, the point cloud section with the maximum number of target point clouds which are not subjected to global registration is searched through the second variable, and the target point clouds which are not subjected to global registration in the point cloud section are allocated to the idle process again for global registration.

As an improvement of the above scheme, the preset conversion formula specifically includes:

wherein T_i is the local transformation matrix corresponding to the i-th target point cloud, and G_i is the global transformation matrix corresponding to the i-th target point cloud.

As an improvement of the above solution, the i-th target point cloud is globally registered according to the following formula:

O_i = Q_i * G_i (1 ≤ i ≤ k);

wherein O_i is the registered point cloud file corresponding to the i-th target point cloud, Q_i is the i-th target point cloud, and G_i is the global transformation matrix corresponding to the i-th target point cloud.

In order to achieve the above object, an embodiment of the present invention correspondingly provides a point cloud file registration apparatus based on an ICP algorithm, including:

the data preprocessing module is used for acquiring a point cloud file set and starting a plurality of processes through the Message Passing Interface (MPI); the point cloud file set comprises 1 source point cloud and k target point clouds to be registered;

the data pre-allocation module is used for dividing the point cloud file set into a plurality of point cloud segments and sequentially pre-allocating the point cloud segments to the corresponding processes;

the shared data module is used for creating a shared memory containing 1 first variable and k second variables and initializing the shared memory; the first variable corresponds to the number of idle GPUs, and the second variables correspond one-to-one to the registration states of the target point clouds;

the point cloud registration module is used for scheduling, in each process, an idle GPU or a CPU in real time according to the first variable to execute the ICP algorithm, and registering the target point clouds in each process through MPI and the ICP algorithm to obtain k registered point cloud files;

and the dynamic allocation module is used for, when any process finishes registering all of its allocated target point clouds ahead of the others, searching through the second variables for the point cloud segment with the largest number of unregistered target point clouds, and dynamically allocating the unregistered target point clouds in that segment to the idle process for registration.

As an improvement of the above solution, the point cloud registration module includes:

the local registration unit is used for carrying out local registration on the target point clouds in each process through the ICP algorithm to obtain a local transformation matrix corresponding to each target point cloud;

the matrix transformation unit is used for collecting the local transformation matrix corresponding to each target point cloud through the Message Passing Interface (MPI) and converting each local transformation matrix through a preset conversion formula to obtain the global transformation matrix corresponding to each target point cloud;

and the global registration unit is used for carrying out global registration on each target point cloud through a global transformation matrix corresponding to each target point cloud to obtain k registered point cloud files.

In order to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, the apparatus where the computer-readable storage medium is located is controlled to execute the ICP algorithm-based point cloud file registration method according to the above embodiment of the present invention.

Compared with the prior art, the ICP-based point cloud file registration method, device and computer-readable storage medium disclosed in the embodiments of the invention work as follows. First, a plurality of processes are started through the Message Passing Interface (MPI), a series of point cloud files with consecutive serial numbers is divided into a plurality of point cloud segments, and the point cloud segments allocated to the processes are registered in parallel, which realizes first-level parallelism. Second, the CPU and GPU computing resources of the computer are scheduled in real time according to the first variable, and two paths, CPU execution and GPU acceleration, are provided for point cloud file registration, so that multiple CPU cores and multiple GPUs run concurrently; the GPU accelerates the registration of two consecutive point cloud files, which realizes second-level parallelism. Finally, registration tasks are pre-allocated and dynamically allocated to the processes through the second variables in the shared memory, which balances the load across the MPI processes. As a result, the degree of parallelization of point cloud file set registration, the utilization of the computer's computing resources, and the efficiency of parallel point cloud file registration can all be improved.

Drawings

Fig. 1 is a schematic flow chart of a point cloud file registration method based on an ICP algorithm according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of a point cloud file registration method based on an ICP algorithm according to an embodiment of the present invention;

Fig. 3 illustrates an ICP-CPU and ICP-GPU co-processing framework according to an embodiment of the present invention;

Fig. 4 is a schematic flow chart of pre-allocation and dynamic allocation of point cloud files according to an embodiment of the present invention;

Fig. 5 is a schematic flow chart of dynamic allocation of point cloud files according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a point cloud file registration apparatus based on an ICP algorithm according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Fig. 1 is a schematic flow chart of a point cloud file registration method based on an ICP algorithm according to an embodiment of the present invention.

The point cloud file registration method based on the ICP algorithm provided by the embodiment of the invention comprises the following steps:

S11, acquiring a point cloud file set, and starting a plurality of processes through the Message Passing Interface (MPI); the point cloud file set comprises 1 source point cloud and k target point clouds to be registered;

S12, dividing the point cloud file set into a plurality of point cloud segments, and sequentially pre-allocating the point cloud segments to the corresponding processes;

S13, creating a shared memory containing 1 first variable and k second variables, and initializing the shared memory; the first variable corresponds to the number of idle GPUs, and the second variables correspond one-to-one to the registration states of the target point clouds;

S14, in each process, scheduling an idle GPU or a CPU in real time according to the first variable to execute the ICP algorithm, and registering the target point clouds in each process through MPI and the ICP algorithm to obtain k registered point cloud files;

S15, when any process finishes registering all of its allocated target point clouds ahead of the others, searching through the second variables for the point cloud segment with the largest number of unregistered target point clouds, and dynamically allocating the unregistered target point clouds in that segment to the idle process for registration.

It should be noted that the MPI parallel mode used is peer-to-peer: every process performs the same function and follows the same computation flow.

Preferably, the number of processes equals the total number of CPU cores of the computer or the number of CPUs of the computer.

Further, in step S12, a first parameter and a second parameter are obtained through a preset formula;

when the second parameter is 0, m consecutive point cloud files are allocated to each process in sequence;

when the second parameter is not 0, m+1 consecutive point cloud files are allocated in sequence to each of the first R processes, and m consecutive point cloud files are allocated in sequence to each of the remaining n+1-R processes; wherein m is the first parameter and R is the second parameter.

Specifically, the preset formula is as follows:

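m = ⌊[(k+1)+n-1]/(n+1)⌋, R = [(k+1)+n-1] mod (n+1);

wherein m is the integer quotient and R is the remainder of the division, n+1 is the total number of processes, k+1 is the total number of point cloud files, and k is a positive integer.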

Illustratively, referring to Fig. 2, the k+1 point cloud files are first divided into n+1 point cloud segments according to the total number of processes, and, according to the formula m = ((k+1)+n-1)/(n+1), m point cloud files with consecutive serial numbers are pre-allocated to each process. For example, process P₀ is pre-allocated point cloud files [Q₀, Q_{m-1}], P₁ is pre-allocated [Q_{m-1}, Q_{2m-2}], P₂ is pre-allocated [Q_{2m-2}, Q_{3m-3}], and so on, up to P_n, which is pre-allocated [Q_{n*m-n}, Q_k]. When ((k+1)+n-1)/(n+1) does not divide evenly and the second parameter is R, the first R processes [P₀, P_{R-1}] are each pre-allocated m+1 consecutive point cloud files and the other processes are each pre-allocated m consecutive point cloud files; for example, process P₀ is pre-allocated [Q₀, Q_m] and process P₁ is pre-allocated [Q_m, Q_{2m}].
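The pre-allocation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preallocate` is hypothetical, the shared boundary file between neighbouring segments is read off the example intervals above, and the last segment is extended to Q_k because the text ends P_n's interval at Q_k.

```python
def preallocate(k, n):
    """Return (m, R, segments) for files Q_0..Q_k and processes P_0..P_n.

    Each segment is an inclusive (first, last) pair of file indices.
    """
    # First parameter m (quotient) and second parameter R (remainder).
    m, r = divmod((k + 1) + n - 1, n + 1)
    segments = []
    start = 0
    for p in range(n + 1):
        size = m + 1 if p < r else m      # the first R processes get one extra file
        end = min(start + size - 1, k)
        if p == n:
            end = k                       # assumption: P_n's segment runs up to Q_k
        segments.append((start, end))
        start = end                       # neighbouring segments share a boundary file
    return m, r, segments
```

For instance, with 10 files and 3 processes (k = 9, n = 2) the formula gives m = 3 and R = 2, so P₀ and P₁ each receive four files and P₂ receives the rest.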

Referring to fig. 2, for the convenience of those skilled in the art, the reading and registration of the point cloud file will be briefly described below by taking the second parameter R as 0:

Each MPI process reads the point cloud files pre-allocated to it in sequence. In the first round of reading, each process reads the first two of its pre-allocated point cloud files: for example, process P₀ reads point cloud files Q₀ and Q₁ in sequence, process P₁ reads Q_{m-1} and Q_m, and process P_n reads Q_{n*m-n} and Q_{n*m-n+1}. After the first round of reading, each process takes the earlier point cloud file as the source point cloud and the later one as the target point cloud, and registers them with the ICP (Iterative Closest Point) algorithm to obtain the local transformation matrix T_i corresponding to each target point cloud; T_i denotes the registration of point cloud file Q_i to the preceding point cloud file Q_{i-1}. In the first round of computation, process P₀ obtains the local transformation matrix T₁, process P₁ obtains T_m, and process P_n obtains T_{n*m-n+1}.

After the first-round local transformation matrices are obtained, each process continues to read its remaining point cloud files in order of serial number. Unlike the first round, in which two files are read, only one point cloud file is read in each subsequent round. For example, in round 2, process P₀ reads point cloud file Q₂, process P₁ reads Q_{m+1}, and process P_n reads Q_{n*m-n+2}; after reading, the current point cloud file Q_i is taken as the target point cloud and the point cloud file Q_{i-1} read in the previous round as the source point cloud, and the two are registered. Taking process P₀ as an example: in round 2, Q₂ is the target point cloud and Q₁ the source point cloud, and registration yields the local transformation matrix T₂; in round 3, point cloud file Q₃ is read, Q₃ is the target point cloud and Q₂ the source point cloud, and registration yields T₃. This continues until some process has read and processed all of its pre-allocated point cloud files, at which point the pre-allocated reading stage ends and the dynamic allocation stage begins. Dynamic allocation is needed because the workload and computing resources of the processes differ, so the processes take different amounts of time to finish their pre-allocated tasks; to balance the load, the unprocessed point cloud files must be allocated again.

Specifically, in step S13, a shared memory including 1 first variable and k second variables is created through the MPI_Win_allocate_shared() function.

It should be noted that the shared memory serves the processes, and the data in it are of integer type; that is, the first variable and the second variables are integers.

In some preferred embodiments, in step S13, the initializing the shared memory specifically includes:

initializing the value of the first variable to N, where N is the number of GPUs in the computer;

initializing the values of the k second variables to 0.

Preferably, the first variable and the second variables are initialized by the first process.

It can be understood that the first variable tells every process, in real time, how many GPUs are currently idle, so that idle GPUs are scheduled preferentially to execute the part of the ICP algorithm that is suitable for GPU acceleration, which accelerates the registration of the target point clouds. The second variables record the current processing state of each target point cloud, so that every process can update and learn the processing state of each point cloud file in real time, which enables dynamic task allocation among the processes.
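As a rough illustration of this layout, the sketch below allocates one integer slot for the first variable and k slots for the second variables. It is not the MPI call named in the text: Python's standard `multiprocessing.Array` is swapped in for `MPI_Win_allocate_shared()` purely to show the initial contents, and the function name `create_shared_state` is hypothetical.

```python
from multiprocessing import Array


def create_shared_state(num_gpus, k):
    """Allocate k+1 shared C ints: slot 0 is S_0, slots 1..k are S_1..S_k."""
    state = Array('i', k + 1)   # integer size => the array starts zeroed
    state[0] = num_gpus         # first variable: number of idle GPUs (N)
    # Slots 1..k stay 0: no target point cloud has been registered yet.
    return state
```

With 2 GPUs and 5 target point clouds, `create_shared_state(2, 5)[:]` yields `[2, 0, 0, 0, 0, 0]`.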

In a specific embodiment, in step S14, scheduling an idle GPU or a CPU in each process in real time according to the first variable to execute the ICP algorithm includes:

when any process reads that the first variable is 0, scheduling a CPU in real time to execute the ICP algorithm;

and when any process reads that the first variable is not 0, scheduling a GPU in real time to execute the ICP algorithm.

It is worth mentioning that before an idle GPU is scheduled to execute the ICP algorithm, the first variable must be decremented by one, and after the GPU-accelerated computation is completed, the first variable is incremented by one, which ensures that the first variable reflects the number of idle GPUs in real time.

It is worth mentioning that an existing computer or single node generally has multiple CPU cores and several GPUs. For example, if the computer has A CPUs and each CPU has B cores, the computer has A × B CPU cores, and the number of cores generally far exceeds the number of GPUs. Suppose a computer has N CPU cores and 1 GPU: MPI can start N processes that run concurrently, and each process uses GPU acceleration during its computation. However, because the computer has only 1 GPU, the N processes cannot all be GPU-accelerated at the same time; each process has to queue for the GPU, and each CPU core sits idle while waiting for it, which seriously degrades the parallel effect. To solve this problem, as shown in Fig. 3, in the ICP-CPU and ICP-GPU co-processing framework provided in an embodiment of the invention, for the part of the ICP algorithm that is suitable for GPU acceleration, the first variable S₀ is checked first. If S₀ is 0, all GPUs are occupied by other processes, so the CPU is used for execution. If S₀ is not 0, there is an idle GPU in the computer: the number of idle GPUs is first updated according to S₀ = S₀ - 1, the part is then accelerated on the GPU, the GPU is released when the acceleration finishes, the number of idle GPUs is updated according to S₀ = S₀ + 1, and finally the result of this part is output to the CPU.
Through the ICP-CPU and ICP-GPU co-processing framework, two paths, CPU execution and GPU acceleration, are provided for point cloud file registration, so that the computing resources of all CPU cores and GPUs in the computer can be fully utilized and the CPUs and GPUs compute concurrently, which greatly reduces their idle time and increases their utilization.
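The check-then-update logic on S₀ can be sketched as follows. The sketch is an assumption-laden stand-in: it uses a Python `threading.Lock` to make the read and update of S₀ atomic (the text does not say how the update is protected), and `GpuCounter`, `run_step`, `gpu_fn` and `cpu_fn` are hypothetical names rather than anything defined by the source.

```python
import threading


class GpuCounter:
    """Stand-in for the first variable S_0 (number of idle GPUs)."""

    def __init__(self, num_gpus):
        self.idle = num_gpus
        self._lock = threading.Lock()   # assumption: updates to S_0 must be atomic

    def try_acquire(self):
        with self._lock:
            if self.idle == 0:          # S_0 == 0: every GPU is busy
                return False
            self.idle -= 1              # S_0 = S_0 - 1
            return True

    def release(self):
        with self._lock:
            self.idle += 1              # S_0 = S_0 + 1


def run_step(counter, gpu_fn, cpu_fn, data):
    """Run the GPU-suitable part on a GPU if one is idle, otherwise on the CPU."""
    if counter.try_acquire():
        try:
            return gpu_fn(data)
        finally:
            counter.release()           # always return the GPU to the pool
    return cpu_fn(data)
```

A process that finds S₀ at 0 does not wait; it immediately takes the CPU path, which is the behaviour the framework relies on to keep all cores busy.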

It should be noted that whether GPU acceleration is used depends on the ICP algorithm adopted: the suitability of GPU acceleration and the acceleration method both differ from algorithm to algorithm. In addition, many different GPU acceleration algorithms exist for the different ICP algorithms, so the ICP-CPU and ICP-GPU co-processing framework provided by the invention is not tied to a specific ICP algorithm.

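In some preferred embodiments, in step S14, registering the target point clouds in each process through the Message Passing Interface (MPI) and the ICP algorithm to obtain k registered point cloud files specifically includes:

performing local registration on the target point clouds in each process through the ICP algorithm to obtain the local transformation matrix corresponding to each target point cloud;

collecting the local transformation matrix corresponding to each target point cloud through MPI, and converting each local transformation matrix through the preset conversion formula to obtain the global transformation matrix corresponding to each target point cloud;

and performing global registration on each target point cloud through the global transformation matrix corresponding to each target point cloud to obtain k registered point cloud files.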

Further, before the target point cloud in each process is locally registered, the second variable corresponding to the target point cloud is updated to 1; and updating the second variable corresponding to each target point cloud to be 2 before carrying out global registration on each target point cloud through the global transformation matrix.

It should be noted that the ICP algorithm can generally be divided into 4 steps: step 1, searching for corresponding points between the two point cloud files; step 2, estimating a transformation matrix; step 3, transforming the target point cloud data with the transformation matrix; and step 4, judging whether the precision meets the requirement, outputting the transformation matrix if it does, and returning to step 1 for another iteration if it does not. There are many concrete implementations of ICP. For searching corresponding points between the two point cloud files, there are, for example, the conventional method (brute-force traversal of all points), the spatial cell division method, the octree method and the kd-tree method; for estimating the transformation matrix, there are the LM method, the unit quaternion method, the SVD (singular value decomposition) method, and the like.
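The four steps can be illustrated with a deliberately small 2D sketch in plain Python. It is an assumption-heavy toy, not the method of the invention: it uses brute-force nearest-neighbour search for step 1, swaps in a closed-form 2D rigid estimate for step 2 instead of the LM, quaternion or SVD methods named above, and returns the aligned points and the error rather than a transformation matrix. All function names are hypothetical.

```python
import math


def nearest(p, pts):
    """Step 1 (brute force): index of the point in pts closest to p."""
    return min(range(len(pts)),
               key=lambda j: (pts[j][0] - p[0]) ** 2 + (pts[j][1] - p[1]) ** 2)


def estimate_rigid_2d(moving, fixed):
    """Step 2: least-squares rotation angle and translation mapping moving onto fixed."""
    n = len(moving)
    mcx = sum(p[0] for p in moving) / n
    mcy = sum(p[1] for p in moving) / n
    fcx = sum(q[0] for q in fixed) / n
    fcy = sum(q[1] for q in fixed) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(moving, fixed):
        ax, ay, bx, by = px - mcx, py - mcy, qx - fcx, qy - fcy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, fcx - (c * mcx - s * mcy), fcy - (s * mcx + c * mcy)


def transform(theta, tx, ty, pts):
    """Step 3: rotate by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]


def icp_2d(source, target, tol=1e-9, max_iter=50):
    """Align target onto source; returns (aligned points, mean squared error)."""
    moved, err = list(target), float("inf")
    for _ in range(max_iter):
        matches = [source[nearest(p, source)] for p in moved]          # step 1
        theta, tx, ty = estimate_rigid_2d(moved, matches)              # step 2
        moved = transform(theta, tx, ty, moved)                        # step 3
        err = sum((x - qx) ** 2 + (y - qy) ** 2
                  for (x, y), (qx, qy) in zip(moved, matches)) / len(moved)
        if err < tol:                                                  # step 4
            break
    return moved, err
```

The brute-force search in `nearest` is the part a GPU or a kd-tree would typically replace; the rest of the loop is unchanged by that choice.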

In a specific embodiment, the step S15 specifically includes:

Preferably, when a second variable is 0, the target point cloud corresponding to it has not been locally registered; when a second variable is 1, the corresponding target point cloud is undergoing local registration or has completed local registration; when a second variable is 2, the corresponding target point cloud is undergoing global registration or has completed global registration.

Preferably, when any one of the processes completes the local registration of all the allocated target point clouds in advance, the point cloud segment with the maximum number of target point clouds which are not locally registered is searched through the second variable, and the target point clouds which are not locally registered in the point cloud segment are dynamically allocated to an idle process for local registration, specifically:

when any one process finishes the local registration of all the distributed target point clouds in advance, all the second variables are traversed, all the target point clouds with the second variable values being 0 are searched, and the point cloud section with the maximum number of the target point clouds which are not locally registered is obtained;

and starting from the last two target point clouds of the point cloud segment with the maximum number of the target point clouds which are not locally registered, dynamically allocating the target point clouds which are not locally registered to an idle process for local registration.

Preferably, when any one of the processes completes the global registration of all the allocated target point clouds in advance, the point cloud segment with the maximum number of target point clouds which are not globally registered is searched through the second variable, and the target point clouds which are not globally registered in the point cloud segment are reallocated to an idle process for global registration, specifically:

when any process completes the global registration of all of its allocated target point clouds ahead of the others, all the second variables are traversed to find all target point clouds whose second variable is 1, and the point cloud segment with the largest number of target point clouds that have not been globally registered is obtained;

and starting from the last target point cloud of the point cloud segment with the maximum number of target point clouds which are not globally registered, dynamically allocating the target point clouds which are not globally registered to an idle process for global registration.

It can be understood that the invention divides the point cloud file registration process into two modes of pre-allocation and dynamic allocation. Pre-allocation mode: pre-distributing point cloud files for each process according to the total amount of the point cloud files and the total amount of MPI processes; dynamic allocation mode: and when one process in the processes finishes processing the pre-distributed point cloud files, searching the point cloud section with the largest unprocessed number of the point cloud files, and starting reading and registering from the last two point cloud files of the point cloud section.

For example, process P₀ prepares to read the point cloud file Q₂ and first accesses the second variable S₂. If S₂ = 1, the point cloud file Q₂ has been registered or is being registered by another process, which indicates that the dynamic allocation phase has been entered; P₀ then traverses the second variables [S₁, S_k] and starts reading and registering from the last two point cloud files of the longest continuous interval whose second variable values are 0. If S₂ = 0, P₀ sets S₂ to 1, starts reading the point cloud file Q₂, and calculates the local transformation matrix T₂ that converts the point cloud file Q₂ to the point cloud file Q₁.

Illustratively, before calculating a registered point cloud file O_i according to its pre-assigned task, each process reads the second variable S_i. If S_i = 2, the registered point cloud file O_i has been calculated or is being calculated by another process; the process then traverses the second variables [S₁, S_k], starts from the last point cloud file of the longest continuous interval whose second variable values are 1, and calculates and outputs the registered point cloud file corresponding to that point cloud file. If S_i = 1, the registered point cloud file O_i has not been calculated; S_i is updated to 2, and the registered point cloud file O_i is then calculated and output. This process repeats until the second variables [S₁, S_k] are all 2, which indicates that all registered point cloud files [O₁, O_k] are being calculated or have been output.

In a specific embodiment, the preset conversion formula is specifically:

In a preferred embodiment, the i-th target point cloud is globally registered according to the following formula:

O_i = Q_i * G_i (1 ≤ i ≤ k);

wherein O_i is the registered point cloud file corresponding to the i-th target point cloud, Q_i is the i-th target point cloud, and G_i is the global transformation matrix corresponding to the i-th target point cloud.

For convenience of explanation, the ICP registration process is divided into steps 1 to n, and the steps are not specific to a particular ICP algorithm, but are merely for simplicity of description. Referring to fig. 2 to 5, a specific flow of the point cloud file registration method based on the ICP algorithm is described below by a specific embodiment.

1) The message passing interface MPI starts n+1 processes [P₀, P_n].

2) Point cloud files are pre-allocated according to the total number of point cloud files and the total number of processes;

it should be noted that only the serial numbers of the point cloud files are pre-allocated; the point cloud files themselves are not read at this point.
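The pre-allocation of serial numbers in step 2) can be illustrated with a minimal Python sketch. The source does not state the exact splitting rule, so the function below assumes that the k target point clouds are divided into contiguous, near-equal segments and that the source point cloud of each segment is the file immediately preceding its first target. The name `preallocate` and the handling of a remainder are hypothetical.

```python
def preallocate(k, n_procs):
    """Split target indices 1..k into contiguous segments, one per process.

    Returns a list of (source_index, target_indices) pairs. Only serial
    numbers are assigned here; no point cloud file is read.
    Assumption: any remainder is given to the lowest-ranked processes.
    """
    base, extra = divmod(k, n_procs)
    segments = []
    start = 1
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        targets = list(range(start, start + size))
        # The file just before the first target serves as the segment's source.
        segments.append((start - 1, targets))
        start += size
    return segments


segments = preallocate(16, 4)
```

With 16 target point clouds and 4 processes, this yields the sources Q₀, Q₄, Q₈ and Q₁₂ with four targets each, that is, five files per process with shared segment boundaries.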

3) The MPI_Win_allocate_shared(·) function is used to create a shared memory for each process; the shared memory data are of type int, and k+1 integer data [S₀, S_k] are stored.

4) Process P₀ is used to initialize the integer data: S₀ is assigned the number of GPUs in the computer, and [S₁, S_k] are all assigned 0, indicating that the point cloud files [Q₁, Q_k] are not registered.

5) Each process carries out the first round of reading according to the point cloud files pre-allocated to it;

it should be noted that the first round of reading needs to read two point cloud files: when the 1st point cloud file Q_(i-1) is read, no operation on [S₁, S_k] is needed; before the 2nd point cloud file Q_i is read, S_i needs to be updated and assigned the value 1, indicating that the point cloud file Q_i is about to be read.

6) After the first round of point cloud file reading, the ICP-CPU and ICP-GPU cooperative processing stage begins, which includes data preprocessing, ICP registration and the like;

referring to fig. 3, each step of the ICP algorithm actually used is analyzed to determine whether it is suitable for GPU acceleration. The parts that are not suitable for GPU acceleration are still executed by the CPU. For the parts that are suitable for GPU acceleration, the value of the integer data S₀ is first checked: if S₀ is 0, all the GPUs are occupied by other processes, so the CPU is used for execution; if S₀ is not 0, there is an idle GPU in the computer, so the number of idle GPUs is first updated according to S₀ = S₀ - 1, the part is accelerated using the GPU, the GPU is released after the acceleration is finished, the number of idle GPUs is updated according to S₀ = S₀ + 1, and finally the result of the part is output to the CPU. The remaining steps of the ICP algorithm are executed according to this ICP-CPU and ICP-GPU cooperative processing method until the first round local transformation matrix T is obtained.
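The scheduling rule based on S₀ can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the invention: a thread lock stands in for whatever mutual exclusion the shared memory requires (the text does not specify one), and the names `GpuCounter`, `run_step`, `gpu_fn` and `cpu_fn` are hypothetical.

```python
import threading


class GpuCounter:
    """Holds the number of idle GPUs, playing the role of S0."""

    def __init__(self, n_gpus):
        self._free = n_gpus
        # Assumption: the check and the decrement must happen atomically.
        self._lock = threading.Lock()

    def try_acquire(self):
        """Return True and apply S0 = S0 - 1 if a GPU is idle, else False."""
        with self._lock:
            if self._free == 0:
                return False
            self._free -= 1
            return True

    def release(self):
        """Apply S0 = S0 + 1 after the accelerated part has finished."""
        with self._lock:
            self._free += 1

    @property
    def free(self):
        with self._lock:
            return self._free


def run_step(counter, gpu_fn, cpu_fn, data):
    """Run one GPU-suitable step on a GPU if one is idle, otherwise on the CPU."""
    if counter.try_acquire():
        try:
            return gpu_fn(data)
        finally:
            counter.release()  # always hand the GPU back
    return cpu_fn(data)
```

Steps that are not suitable for GPU acceleration would simply call the CPU routine directly, without consulting the counter.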

7) After obtaining the first round local transformation matrix T, each process starts the 2nd round of point cloud file reading;

for example, P₀ prepares to read the point cloud file Q₂. Before reading Q₂, it first accesses the integer data S₂. If S₂ = 1, Q₂ has been registered or is being registered by another process, which indicates that the dynamic allocation stage has been entered; P₀ traverses [S₁, S_k] and starts reading and registering from the last two point cloud files of the longest continuous interval whose integer data values are 0. If S₂ = 0, P₀ sets S₂ = 1 and starts reading Q₂, and then calculates, according to step 6) above, the local transformation matrix T₂ that converts Q₂ to Q₁. Specific examples are shown in fig. 4 and 5.

8) All processes proceed according to step 7) until all values of [S₁, S_k] are 1, which indicates that all point cloud files have been registered or are being registered; the registration in this step is local registration, and the dynamic allocation is then finished.

9) All local transformation matrices [T₁, T_k] are collected using the MPI_Allgather(·) function, and the i-th global transformation matrix G_i is computed according to formula (1);

wherein, in formula (1), T_i represents the local transformation matrix that registers the point cloud file Q_i to the point cloud file Q_(i-1), and G_i represents the global transformation matrix that registers the point cloud file Q_i to the point cloud file Q₀ (the source point cloud).
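Formula (1) is not reproduced in this text. As a hedged illustration only, the Python sketch below assumes 4×4 homogeneous transformation matrices acting on row-vector points, which is the convention suggested by the product O_i = Q_i * G_i used for global registration. Under that assumption a point of Q_i is carried to Q₀ by applying T_i, then T_(i-1), and so on down to T₁, so that G_i = T_i * G_(i-1) with G₀ the identity. The helper names are hypothetical, and the actual formula (1) may differ.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n=4):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def global_transforms(local_ts):
    """local_ts[i-1] is T_i (Q_i -> Q_(i-1)); returns [G_1, ..., G_k]."""
    result = []
    g = identity()
    for t in local_ts:
        g = matmul(t, g)  # G_i = T_i * G_(i-1) under the row-vector assumption
        result.append(g)
    return result


def apply_transform(points, g):
    """Compute O_i = Q_i * G_i for homogeneous row-vector points [x, y, z, 1]."""
    return [[sum(p[k] * g[k][j] for k in range(4)) for j in range(4)]
            for p in points]


# T1: 90-degree rotation about z (x axis -> y axis); T2: translation by (1, 0, 0).
T1 = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
T2 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
G1, G2 = global_transforms([T1, T2])
origin_of_q2 = apply_transform([[0, 0, 0, 1]], G2)[0]
```

In this toy case the origin of Q₂ is first translated into the frame of Q₁ and then rotated into the frame of Q₀, which is why the order of multiplication matters.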

10) Each process unifies the coordinates of the point cloud file Q_i to the source point cloud Q₀ according to formula (2), and outputs the registered point cloud file O_i:

O_i = Q_i * G_i (1 ≤ i ≤ k)    formula (2)

In this step, each process is assigned tasks according to the local transformation matrices it calculated. Assuming P₀ calculated T₁, T₂, T₅ and T₁₀, P₀ is pre-allocated the calculation and output of the registered point cloud files O₁, O₂, O₅ and O₁₀. When a process finishes all of its pre-allocated tasks first, it assists other processes in completing the remaining tasks. Before calculating a registered point cloud file O_i according to its pre-allocated task, each process first reads the integer data S_i. If S_i = 2, the registered point cloud file O_i has been calculated or is being calculated by another process; the process then traverses the integer data [S₁, S_k], starts from the last point cloud file of the longest continuous interval whose integer data values are 1, and calculates and outputs the corresponding registered point cloud file O. If S_i = 1, the registered point cloud file O_i has not been calculated; S_i is updated to 2, and the registered point cloud file O_i is then calculated and output. This operation repeats until the integer data [S₁, S_k] are all updated to 2, which indicates that all registered point cloud files [O₁, O_k] are being calculated or have been output;

when each process performs the calculation according to formula (2), it first accesses the integer data S₀. If its value is 0, all the GPUs in the computer are occupied by other processes, so the CPU is used for execution; if S₀ is not 0, there is an idle GPU in the computer, so the number of idle GPUs is first updated according to S₀ = S₀ - 1, the calculation of the formula is accelerated using the GPU, the GPU is released after the acceleration is finished, the number of idle GPUs is updated according to S₀ = S₀ + 1, and finally the registered point cloud file O_i is output.

To further describe the pre-allocation and dynamic allocation process of the point cloud files, 4 processes [P₀, P₃] and 17 point cloud files [Q₀, Q₁₆] are used for illustration. Referring to fig. 4, each process is pre-allocated 5 point cloud files to process: process P₀ is responsible for the point cloud files [Q₀, Q₄], process P₁ for [Q₄, Q₈], process P₂ for [Q₈, Q₁₂], and process P₃ for [Q₁₂, Q₁₆]. As shown in fig. 4, each process reads its pre-allocated 1st point cloud file as the source point cloud in parallel: process P₀ reads Q₀, process P₁ reads Q₄, process P₂ reads Q₈, and process P₃ reads Q₁₂. After this reading, each process updates the corresponding integer data before reading its pre-allocated 2nd point cloud file. Taking process P₀ as an example, P₀ first updates S₁ to 1, then reads the point cloud file Q₁, and after reading Q₁ calculates the local transformation matrix T₁. It then reads S₂; S₂ = 0 indicates that Q₂ is not registered, so S₂ is updated to 1, Q₂ is read, and the local transformation matrix T₂ that registers Q₂ to Q₁ is calculated. These steps repeat until some process finishes its pre-allocated tasks first, which marks the end of the pre-allocation stage.

Because the amount of computation in each process's tasks and the computing resources allocated to it differ during the computation (as shown in fig. 3, GPUs are used on a first-come, first-served basis, so the time and the number of times each process uses a GPU cannot be guaranteed to be the same), the calculation progress of the processes differs. For example: process P₁ has completed the local registration of all its pre-allocated point cloud files [Q₅, Q₈], while process P₀ still has Q₃ and Q₄ unregistered, process P₂ still has Q₁₀, Q₁₁ and Q₁₂ unregistered, and process P₃ still has Q₁₅ unregistered. Process P₁ has finished all its pre-allocated tasks first and needs to assist other processes in completing the remaining tasks. It first traverses the integer data [S₁, S_k] and finds 3 continuous intervals of 0, corresponding to [Q₃, Q₄], [Q₁₀, Q₁₂] and [Q₁₅]. Since P₂ has the most unfinished tasks, with 3 unregistered point cloud files in [Q₁₀, Q₁₂], process P₁ selects the last point cloud file Q₁₂ of [Q₁₀, Q₁₂] for registration. To register Q₁₂, as in the first round of reading in step 5), it first reads the source point cloud Q₁₁ corresponding to Q₁₂, then updates S₁₂ to 1 and reads Q₁₂, and finally calculates the local transformation matrix T₁₂. After completing their pre-allocated tasks, the other processes, like process P₁, assist other processes in completing unfinished tasks until all values of [S₁, S_k] are 1, which indicates that all point cloud files have been registered or are being registered, and the dynamic allocation is finished.
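The search performed by process P₁ in this example can be sketched in Python as follows. A plain list stands in for the shared integer data, with index 0 reserved for S₀, and no inter-process synchronization is modelled. The function names are hypothetical, and the tie-breaking rule (the first of several equally long intervals wins) is an assumption, since the text does not specify one.

```python
def zero_runs(flags):
    """Return (first, last) index pairs of the maximal runs of 0 in flags[1:]."""
    runs = []
    start = None
    for i in range(1, len(flags)):
        if flags[i] == 0:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs


def pick_dynamic_target(flags):
    """Claim the last file of the longest run of 0.

    Returns (source_index, target_index), or None if nothing is left.
    """
    runs = zero_runs(flags)
    if not runs:
        return None
    first, last = max(runs, key=lambda r: r[1] - r[0])
    flags[last] = 1          # mark Q_last as being registered
    return last - 1, last    # read Q_(last-1) as source, then Q_last


# flags[0] stands for S0; flags[1..16] stand for S1..S16 in the example above.
flags = [2] + [1] * 16
for i in (3, 4, 10, 11, 12, 15):
    flags[i] = 0
runs_before = zero_runs(flags)
choice = pick_dynamic_target(flags)
```

Here `runs_before` lists the three intervals [Q₃, Q₄], [Q₁₀, Q₁₂] and [Q₁₅], and `choice` names Q₁₁ as the source and Q₁₂ as the target. Taking the last file of the interval leaves the process that owns the segment free to continue from the front without conflict.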

Correspondingly, the embodiment of the invention also provides a point cloud file registration device based on the ICP algorithm, which can realize all the processes of the point cloud file registration method based on the ICP algorithm.

The point cloud file registration device based on the ICP algorithm provided by the embodiment of the invention comprises:

the data preprocessing module 21 is configured to obtain a point cloud file set, and start multiple processes through a message transmission interface; the point cloud file set comprises 1 source point cloud and k target point clouds to be registered;

the data pre-distribution module 22 is configured to divide the point cloud file set into a plurality of point cloud segments, and sequentially pre-distribute the point cloud segments to the corresponding processes;

a shared data module 23, configured to create a shared memory including 1 first variable and k second variables, and initialize the shared memory; the first variable corresponds to the number of idle GPUs, and the second variable corresponds to the registration condition of the target point cloud one by one;

the point cloud registration module 24 is configured to respectively schedule an idle GPU or a CPU in each process in real time to execute an ICP algorithm according to the first variable, and register the target point cloud in each process through the message transmission interface and the ICP algorithm to obtain k registered point cloud files;

and the dynamic allocation module 25 is configured to, when any one of the processes completes the registration of all the allocated target point clouds in advance, search for a point cloud segment with the largest number of unregistered target point clouds through the second variable, and dynamically allocate the unregistered target point clouds in the point cloud segment to an idle process for registration.

As one optional implementation, the apparatus for registering a point cloud file based on an ICP algorithm further includes:

a data initialization module 26, configured to initialize a value of the first variable to be N; wherein N is the number of the computer GPUs; and is also used for initializing the values of k second variables to be 0.

As one of the optional embodiments, the point cloud registration module 24 includes:

As one optional implementation, the point cloud registration module 24 further includes:

the resource scheduling unit is used for scheduling the CPU to execute the ICP algorithm in real time when any process reads that the first variable is 0; and when any process reads that the first variable is not 0, scheduling the GPU to execute the ICP algorithm in real time.

Preferably, the dynamic allocation module 25 includes:

the local registration dynamic allocation unit is used for searching a point cloud section with the maximum number of target point clouds which are not locally registered through the second variable when any process finishes the local registration of all the distributed target point clouds in advance, and dynamically allocating the target point clouds which are not locally registered in the point cloud section to an idle process for local registration;

and the global registration dynamic allocation unit is used for searching a point cloud segment with the maximum number of target point clouds which are not subjected to global registration through the second variable when the global registration of all the target point clouds allocated to any process is completed in advance, and reallocating the target point clouds which are not subjected to global registration in the point cloud segment to an idle process for global registration.

Preferably, in the matrix transformation unit, the preset transformation formula specifically includes:

Further, the global registration unit is specifically configured to:

globally registering the ith target point cloud according to the following formula:

O_i = Q_i * G_i (1 ≤ i ≤ k);

wherein, O_iFor the registered point cloud file corresponding to the ith target point cloud, Q_iFor the ith target point cloud, G_iAnd the transformation matrix is a global transformation matrix corresponding to the ith target point cloud.

It should be noted that, for the specific description and the beneficial effects related to each embodiment of the point cloud file registration apparatus based on the ICP algorithm in this embodiment, reference may be made to the specific description and the beneficial effects related to each embodiment of the point cloud file registration method based on the ICP algorithm, which are not described herein again.

It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

Accordingly, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program; wherein the computer program controls, when running, a device where the computer readable storage medium is located to execute the ICP algorithm based point cloud file registration method according to any one of the above embodiments.

To sum up, according to the point cloud file registration method, device and computer-readable storage medium based on the ICP algorithm provided by the embodiments of the present invention, first, a plurality of processes are started through the message passing interface MPI, a series of point cloud files with continuous serial numbers is divided into a plurality of point cloud segments, and the point cloud segments allocated to each process are registered in parallel, which realizes first-level parallelism. Second, the CPU and GPU computing resources of the computer are scheduled in real time according to the first variable, and two paths, CPU execution and GPU acceleration, are provided for point cloud file registration, which realizes synchronous parallelism of a plurality of CPU cores and a plurality of GPUs; the GPU realizes second-level parallel accelerated registration of two consecutive point cloud files. Then, the registration tasks of the point cloud files are pre-allocated and dynamically allocated to each process through the second variables in the shared memory, which realizes load balancing among the processes of the message passing interface MPI. As a result, the degree of parallelization of point cloud file set registration, the utilization rate of computer computing resources and the efficiency of parallel point cloud file registration can be improved.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. An ICP algorithm-based point cloud file registration method is characterized by comprising the following steps:

respectively scheduling an idle GPU or a CPU in each process in real time to execute an ICP algorithm according to the first variable, and registering the target point cloud in each process through the message transmission interface and the ICP algorithm to obtain k registered point cloud files;

2. An ICP algorithm based point cloud file registration method according to claim 1, wherein the initializing the shared memory specifically comprises:

initializing the values of k of the second variables to 0.

3. An ICP algorithm based point cloud file registration method according to claim 2, wherein the scheduling of an idle GPU or CPU in each process in real time to execute an ICP algorithm according to the first variable respectively is specifically:

4. The ICP algorithm based point cloud file registration method of claim 1, wherein the target point cloud in each process is registered through the message passing interface and the ICP algorithm to obtain k registered point cloud files, specifically:

5. The ICP algorithm-based point cloud file registration method according to claim 4, wherein when any one of the processes completes the registration of all the allocated target point clouds in advance, a point cloud segment with the largest number of unregistered target point clouds is searched through the second variable, and the unregistered target point clouds in the point cloud segment are dynamically allocated to an idle process for registration, specifically:

6. An ICP algorithm based point cloud file registration method according to claim 4, wherein the preset conversion formula specifically is:

7. An ICP algorithm based point cloud file registration method according to claim 4, wherein the i-th target point cloud is globally registered according to the following formula:

O_i = Q_i * G_i (1 ≤ i ≤ k);

8. An ICP algorithm-based point cloud file registration device is characterized by comprising:

9. An ICP algorithm based point cloud file registration apparatus as claimed in claim 8, wherein the point cloud registration module comprises:

10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls a device on which the computer-readable storage medium is located to perform the ICP algorithm based point cloud file registration method according to any one of claims 1 to 7.