CN116912295A - Target tracking method and readable storage medium - Google Patents

Target tracking method and readable storage medium

Info

Publication number
CN116912295A
CN116912295A CN202310897525.9A
Authority
CN
China
Prior art keywords
point cloud
data frame
face
facial
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310897525.9A
Other languages
Chinese (zh)
Inventor
戴安乐
毕馨方
夏瑞隆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yinhe Fangyuan Technology Co ltd
Original Assignee
Younao Yinhe Hunan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Younao Yinhe Hunan Technology Co ltd filed Critical Younao Yinhe Hunan Technology Co ltd
Priority to CN202310897525.9A
Publication of CN116912295A
Legal status: Pending (Current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a target tracking method and a readable storage medium. The target tracking method comprises the following steps: obtaining a scalp point cloud based on a medical image of the head of the subject, and obtaining a first data frame face point cloud based on a first data frame face image of the subject; registering the scalp point cloud with the first data frame face point cloud based on a first registration method to obtain a first spatial transformation relationship between the scalp point cloud and the first data frame face point cloud; obtaining a second data frame facial point cloud based on a second data frame face image of the subject; registering the first data frame face point cloud with the second data frame face point cloud based on a second registration method to obtain a second spatial transformation relationship between the first data frame face point cloud and the second data frame face point cloud; acquiring the position information of the target point in the face image of the second data frame based on the initial position information of the target point and the first and second spatial transformation relationships; and obtaining real-time position information of the target point based on the real-time spatial transformation relationship between the facial point cloud of the previous frame and that of the current frame.

Description

Target tracking method and readable storage medium
Technical Field
The invention relates to the field of medical image processing, in particular to a target tracking method and a readable storage medium.
Background
Target navigation is a surgical assistance system: the lesion is reconstructed from preoperative imaging data such as CT (computed tomography) and MRI (magnetic resonance imaging), the actual target is accurately tracked during surgery with an ultrasonic, electromagnetic, or optical tracking system, and the relative positions of the surgical instrument and the actual target are displayed in real time to assist the operation.
The target point cloud is typically obtained by image reconstruction from preoperative images of the subject. The target point cloud and the point cloud generated by the optical camera are heterogeneous (cross-source) point clouds. Compared with registration between homogeneous point clouds, registration between heterogeneous point clouds suffers from density differences, only partial overlap, and heavy noise and outliers, which often leads to slow registration and low accuracy. If heterogeneous point cloud registration is performed continuously in the target navigation system, target localization tends to drift and real-time performance is poor. Meanwhile, because the scanning range of the optical camera in the system is limited, only part of the facial surface point cloud participates in registration, which causes a larger registration error.
Disclosure of Invention
In order to solve at least one of the above problems and disadvantages of the prior art, the present invention provides a target tracking method and a readable storage medium. The technical solution is as follows:
according to one aspect of the present invention, there is provided a target tracking method comprising the steps of:
obtaining a scalp point cloud based on a medical image of the head of the subject, and obtaining a first data frame face point cloud based on a first data frame face image of the subject;
registering the scalp point cloud with the first data frame face point cloud based on a first registration method to obtain a first spatial transformation relationship between the scalp point cloud and the first data frame face point cloud;
obtaining a second data frame facial point cloud based on a second data frame facial image of the subject;
registering the first data frame face point cloud with the second data frame face point cloud based on a second registration method to obtain a second spatial transformation relationship between the first data frame face point cloud and the second data frame face point cloud;
acquiring the position information of the target point in the face image of the second data frame based on the initial position information of the target point, the first space transformation relation and the second space transformation relation;
and obtaining real-time position information of the target point based on the real-time space transformation relation between the face point cloud of the previous frame and the face point cloud of the current frame.
Further, the method of obtaining a facial point cloud of a current frame based on a facial image of the current frame of a subject includes the steps of:
reconstructing a point cloud image based on a face image of a current frame of the subject to obtain a reconstructed face point cloud of the current frame;
and obtaining the facial point cloud of the current frame through downsampling based on the reconstructed facial point cloud of the current frame.
Specifically, the downsampling method comprises the following steps:
rasterizing a spatial voxel where the reconstructed facial point cloud of the current frame is located;
judging whether each voxel grid contains points in the reconstructed face point cloud of the current frame, and, when the current voxel grid contains points in the reconstructed face point cloud of the current frame, selecting the point closest to the center of the current voxel grid from the reconstructed face point cloud of the current frame as a sampling point;
traversing the reconstructed facial point cloud of the current frame to obtain a sampling point set, wherein the sampling point set is the facial point cloud of the current frame.
Preferably, the method for judging whether each voxel grid contains a point in the reconstructed facial point cloud of the current frame is to judge whether the coordinates of the point are within the coordinate range of the current voxel grid; when the coordinates of the point are within the coordinate range, it is determined that the current voxel grid contains the point.
Further, the method of registering the scalp point cloud with the first data frame facial point cloud based on the first registration method comprises the steps of:
performing point cloud preprocessing on the basis of the scalp point cloud and the first data frame face point cloud to obtain preprocessed scalp point cloud and preprocessed first data frame face point cloud;
and registering the curved surface features of the scalp point cloud and the curved surface features of the first data frame face point cloud through a first registration method based on the preprocessed scalp point cloud and the preprocessed first data frame face point cloud to obtain the first spatial transformation relation.
Specifically, the method for preprocessing the point cloud comprises the following steps:
downsampling based on the scalp point cloud and the first data frame facial point cloud to obtain a downsampled scalp point cloud and a downsampled first data frame facial point cloud;
and filtering based on the downsampled scalp point cloud and the downsampled first data frame face point cloud to obtain a preprocessed scalp point cloud and a preprocessed first data frame face point cloud.
Preferably, the filtering processing method is a statistical outlier removal method or a radius outlier removal method.
Specifically, the first registration method includes the steps of:
extracting curved surface features of the preprocessed scalp point cloud and curved surface features of the preprocessed first data frame facial point cloud;
obtaining a preliminary first spatial transformation relationship between the scalp point cloud and the first data frame face point cloud based on the curved surface features of the scalp point cloud and the curved surface features of the first data frame face point cloud;
the preliminary first spatial transformation relationship is optimized to obtain the first spatial transformation relationship.
Specifically, the method for registering the facial point cloud of the previous frame with the facial point cloud of the current frame based on the second registration method comprises the following steps:
downsampling the face point cloud of the previous frame and the face point cloud of the current frame;
image segmentation is carried out on the face point cloud of the previous frame after downsampling and the face point cloud of the current frame after downsampling;
extracting curved surface features in the face point cloud of the previous frame after image segmentation and curved surface features in the face point cloud of the current frame after image segmentation;
registering the curved surface features in the facial point cloud of the previous frame with the curved surface features in the facial point cloud of the current frame to obtain a preliminary second spatial transformation relationship;
optimizing the preliminary second spatial transformation relationship to obtain the second spatial transformation relationship.
According to still another aspect of the present invention, there is provided a readable storage medium, wherein,
the readable storage medium stores a program or instructions that when executed by a processor perform the target tracking method described above.
The target tracking method and the readable storage medium according to the embodiments of the present invention have at least one of the following advantages:
(1) The target tracking method and the readable storage medium realize real-time tracking of the target through a single initial heterogeneous point cloud registration followed by real-time registration of homogeneous point clouds;
(2) The target tracking method and the readable storage medium provided by the invention perform target tracking by combining heterogeneous point cloud registration with homogeneous point cloud registration, which improves the target tracking speed while maintaining the target tracking accuracy.
Drawings
These and/or other aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the preferred embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a target tracking method according to one embodiment of the invention.
Detailed Description
The technical scheme of the invention is further specifically described below through examples and with reference to the accompanying drawings. In the specification, the same or similar reference numerals denote the same or similar components. The following description of embodiments of the present invention with reference to the accompanying drawings is intended to illustrate the general inventive concept and should not be taken as limiting the invention.
Referring to FIG. 1, a target tracking method of one embodiment of the present invention is shown. The target tracking method comprises the following steps:
obtaining a scalp point cloud based on a medical image of the head of the subject, and obtaining a first data frame face point cloud based on a first data frame face image of the subject;
registering the scalp point cloud with the first data frame face point cloud based on a first registration method to obtain a first spatial transformation relationship between the scalp point cloud and the first data frame face point cloud;
obtaining a second data frame facial point cloud based on a second data frame facial image of the subject;
registering the first data frame face point cloud with the second data frame face point cloud based on a second registration method to obtain a second spatial transformation relationship between the first data frame face point cloud and the second data frame face point cloud;
acquiring the position information of the target point in the face image of the second data frame based on the initial position information of the target point, the first space transformation relation and the second space transformation relation;
and obtaining real-time position information of the target point based on the real-time space transformation relation between the face point cloud of the previous frame and the face point cloud of the current frame.
A point cloud is a set of data points in space that can represent a three-dimensional shape or object and is typically acquired by a three-dimensional scanner. The position of each point in the point cloud is described by a set of Cartesian coordinates (x, y, z); some points may also carry color information (R, G, B) or reflection intensity (Intensity) information of the object surface.
Point cloud registration refers, in computer vision, pattern recognition, and robotics, to the process of finding a spatial transformation (e.g., scaling, rotation, and translation) that aligns two point clouds. The purposes of finding such a transformation include merging multiple datasets into one globally consistent model (or coordinate frame) and mapping new measurements onto a known dataset to identify features or estimate their pose. Raw three-dimensional point cloud data is typically obtained from lidar and RGB-D cameras. Three-dimensional point clouds can also be generated by computer vision algorithms such as triangulation, bundle adjustment, and, more recently, deep-learning-based monocular depth estimation.
Heterogeneous point cloud registration, i.e., cross-source point cloud registration, is the process of transforming point clouds acquired by different three-dimensional sensors into the same space to achieve point cloud alignment.
The medical image of the subject may be an MRI image, a CT image, or an ultrasound image, preferably an MRI image, and more preferably an MRI image with a resolution of 1 mm.
In one example, the coordinate information of the target point in the medical image coordinate system is obtained from the medical image of the head of the subject by an automatic target point positioning method; this coordinate information is the initial position of the target point.
In one example, an automatic target location method includes the steps of:
preprocessing a medical image of the head of the subject; preferably, the preprocessing includes bias field (intensity non-uniformity, INU) correction, skull stripping, and the like.
Performing cerebral cortex reconstruction on the preprocessed head medical image of the subject to obtain a cerebral cortex model of the subject; preferably, the cerebral cortex reconstruction may be achieved by FreeSurfer or FastCSR software.
Mapping the subject's cerebral cortex model into a brain structure partition model (i.e., a standard brain) based on the subject's cerebral cortex model to obtain a brain structure partition of the subject;
the cortical model of the subject with a partition of brain structure is further partitioned by the k-means method to obtain multiple large regions (e.g., five regions).
Taking the central point of each large area in the large areas as a reference target point, and then screening the multiple target points according to preset rules (for example, screening based on the treatment risk probability of the target points) so as to obtain a final target point (for example, a target point with small treatment risk).
In one example, the point cloud belonging to the scalp portion of the subject's head is extracted from the medical image of the subject's head by a ray method, thereby obtaining a scalp point cloud of the subject that no longer contains intracranial feature attributes.
The ray method is described below by way of example using an MRI image of the subject's head; the principle when it is applied to a CT image or an ultrasound image is entirely consistent with that for an MRI image and will not be described again here.
The ray method specifically comprises the following steps:
establishing an O'-RAS coordinate system based on the MRI image of the head of the subject, wherein the R axis is parallel to the horizontal plane with its positive direction pointing to the subject's right, the S axis is perpendicular to the R axis with its positive direction pointing upward from the subject, and the A axis is perpendicular to the RO'S plane with its positive direction pointing to the front of the subject;
recursively iterating over the A axis and the S axis: in the O'-RAS coordinate system, the plane passing through the straight line A = v and parallel to the RO'S plane is taken, where v denotes a value on the A axis; the straight line S = b is then extracted within that plane to obtain the one-dimensional sequence S_{A=v, S=b}, where b denotes a value on the S axis. Traversing the A axis and the S axis yields the set of all such straight lines; the first and last non-zero coordinates at the two ends of each straight line are obtained and stored, giving the scalp coordinate set, namely the scalp point cloud C1.
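The ray method can be pictured as scanning every line of voxels parallel to the R axis and keeping only the outermost non-zero samples. The sketch below is a minimal illustration in Python with NumPy; the (R, A, S) axis ordering of the volume array, the zero-valued background, and the uniform voxel size are assumptions made only for the illustration, not values taken from the patent.

```python
import numpy as np

def extract_scalp_point_cloud(volume: np.ndarray, voxel_size: float = 1.0) -> np.ndarray:
    """For every line with fixed A and S indices, keep the first and last non-zero
    voxels along the R axis; their coordinates form the scalp point cloud C1."""
    points = []
    for a in range(volume.shape[1]):
        for s in range(volume.shape[2]):
            line = volume[:, a, s]          # one-dimensional sequence S_{A=a, S=s}
            nz = np.nonzero(line)[0]
            if nz.size == 0:
                continue
            for r in (nz[0], nz[-1]):       # first and last non-zero samples
                points.append((r * voxel_size, a * voxel_size, s * voxel_size))
    return np.asarray(points, dtype=np.float64)
```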
The first data frame image of the subject is obtained by a stereo camera, and the first data frame image is an image with depth information. The stereoscopic camera is a device for stereoscopic imaging by stereoscopic image technology, and comprises a structured light stereoscopic camera, a TOF3D camera, a binocular stereoscopic vision camera and the like. The person skilled in the art can choose according to the actual need, as long as a stereoscopic image of the subject can be obtained.
In one example, the first data frame facial point cloud C2 is computed from the first data frame facial image D of the subject and the stereo camera intrinsic matrix M_i. In one example, the stereo camera intrinsic matrix M_i can be obtained by calibrating the stereo camera.
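A depth image together with the camera intrinsics is conventionally back-projected into a point cloud with the pinhole model. The NumPy sketch below shows this standard back-projection; it is only an assumption of how C2 might be computed, since the patent's own formula is not reproduced in this text.

```python
import numpy as np

def depth_to_point_cloud(depth: np.ndarray, M_i: np.ndarray) -> np.ndarray:
    """Back-project a depth image (values in mm) into a camera-frame point cloud
    using the pinhole model with intrinsics M_i = [[fx,0,cx],[0,fy,cy],[0,0,1]]."""
    fx, fy = M_i[0, 0], M_i[1, 1]
    cx, cy = M_i[0, 2], M_i[1, 2]
    v, u = np.indices(depth.shape)                # pixel rows (v) and columns (u)
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]               # drop pixels without valid depth
```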
After the first data frame facial point cloud C2 of the subject is obtained, the scalp point cloud C1 and the first data frame facial point cloud C2 are subjected to heterogeneous point cloud registration using the first registration method.
A person skilled in the art may perform point cloud preprocessing on the scalp point cloud C1 and the first data frame facial point cloud C2 before registering them, and then register the preprocessed scalp point cloud C1 with the preprocessed first data frame facial point cloud C2.
The method for preprocessing the point cloud comprises the following steps:
respectively downsampling the scalp point cloud and the first data frame face point cloud to obtain a preprocessed scalp point cloud and a downsampled first data frame face point cloud;
and carrying out filtering processing on the face point cloud of the first data frame after the downsampling so as to obtain a face point cloud of the first data frame after the preprocessing.
Specifically, the downsampling method is a voxel grid downsampling method, which includes the following steps:
An axis-aligned bounding box is established for each of the scalp point cloud and the first data frame facial point cloud, and the bounding box is divided equally into n parts along each coordinate axis; an extra layer of voxels is added at the topmost, rightmost, and rearmost faces of the bounding box so that points lying on the topmost, rightmost, and rearmost surfaces or edges of the bounding box are also captured. The voxels are then grouped, the barycenter of all points within each voxel grid is computed, and the barycenter is taken as the sampling point. Both point clouds are traversed in this way to obtain all sampling points, giving the preprocessed scalp point cloud and the downsampled first data frame facial point cloud.
The downsampling may also be performed by downsampling the scalp point cloud to a resolution of 10 mm through a voxel grid using Open3D, with the side length of the voxel grid set to 10 mm. Similarly, the first data frame facial point cloud is downsampled to a 10 mm resolution through a voxel grid using Open3D.
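A minimal Open3D sketch of this step is given below. Open3D's voxel_down_sample keeps the averaged point (barycenter) of each occupied voxel, which matches the barycenter variant described above; the variable names and the assumption that coordinates are in millimetres are illustrative.

```python
import open3d as o3d

# scalp_pcd and face_pcd are o3d.geometry.PointCloud objects built from C1 and C2.
scalp_down = scalp_pcd.voxel_down_sample(voxel_size=10.0)   # 10 mm voxel grid
face_down = face_pcd.voxel_down_sample(voxel_size=10.0)
```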
The downsampled first data frame facial point cloud is filtered to remove outliers generated by the stereo camera. The filtering method is a statistical outlier removal method or a radius outlier removal method. Of course, a person skilled in the art may also use, for example, a k-nearest neighbor point cloud filtering method to filter the downsampled first data frame facial point cloud and remove outliers generated by the stereo camera. This example is merely illustrative, and other existing filtering and noise reduction methods may be employed by those skilled in the art to reduce or eliminate noise in the point cloud.
For example, the statistical outlier removal method is specifically: first, for each point in the downsampled first data frame facial point cloud, the average distance to its 20 (an adjustable parameter) nearest neighboring points is computed and recorded as the neighbor distance of that point; then, the mean and standard deviation of the neighbor distances of all points in the first data frame facial point cloud are computed, the mean being recorded as the average neighbor distance and the standard deviation as the neighbor distance standard deviation. When the absolute value of the difference between a point's neighbor distance and the average neighbor distance is greater than 2 times the neighbor distance standard deviation, the point is determined to be an outlier and is removed.
For example, the radius outlier removal method is specifically: for each point in the first data frame facial point cloud, the distances from all other points in the cloud to the current point are computed, and the number of points whose distance to the current point is less than 5 mm is recorded; the points within 5 mm are the neighbors of the current point, and their count is the neighbor number of the current point. When the neighbor number of the current point is less than 20, the current point is determined to be an outlier and is removed.
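Both removal methods have direct counterparts in Open3D; the sketch below plugs in the parameter values from the two examples above (20 neighbours, 2 standard deviations, 5 mm radius) as an illustrative assumption.

```python
import open3d as o3d

# Statistical outlier removal: 20 neighbours, threshold of 2 standard deviations.
face_filtered, kept_idx = face_down.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Alternative: radius outlier removal, at least 20 neighbours within 5 mm.
# face_filtered, kept_idx = face_down.remove_radius_outlier(nb_points=20, radius=5.0)
```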
In one example, the first registration method may include extracting FPFH features of the scalp point cloud and FPFH features of the first data frame facial point cloud by the FPFH algorithm, then matching the two point clouds in the feature space by a sample consensus initial alignment (SAC-IA) algorithm, a RANSAC algorithm, a random sample maximum likelihood algorithm (MLESAC), or a progressive sample consensus algorithm (PROSAC) to obtain the correspondence between points in the two point clouds, and then computing, from that correspondence, the translation transformation matrix and rotation transformation matrix between the preprocessed scalp point cloud and the preprocessed first data frame facial point cloud. The translation transformation matrix and the rotation transformation matrix are the first spatial transformation relationship M1.
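As a sketch of this feature-based coarse registration, the Open3D pipeline below estimates normals, computes FPFH descriptors, and runs RANSAC-based feature matching (Open3D 0.12 or later is assumed for the mutual_filter argument). The search radii, iteration counts, and distance thresholds are illustrative assumptions, not values taken from the patent.

```python
import open3d as o3d

def coarse_register_fpfh(source, target, voxel=10.0):
    """FPFH + RANSAC coarse registration; returns a 4x4 transformation matrix
    that maps `source` (e.g. the scalp point cloud C1) onto `target` (e.g. C2)."""
    for pcd in (source, target):
        pcd.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
    fpfh_s = o3d.pipelines.registration.compute_fpfh_feature(
        source, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
    fpfh_t = o3d.pipelines.registration.compute_fpfh_feature(
        target, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
    result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        source, target, fpfh_s, fpfh_t, True, 1.5 * voxel,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(1.5 * voxel)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
    return result.transformation   # preliminary first spatial transformation relationship
```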
A person skilled in the art may also extract the curved surface features of the scalp point cloud and the curved surface features of the first data frame facial point cloud through a deep learning algorithm, such as PointNet, PerfectMatch, or FCGF; this is only an illustrative example and should not be construed as limiting the present invention.
For example, FCGF (Fully Convolutional Geometric Features) features are extracted from the scalp point cloud and from the first data frame facial point cloud using a pre-trained deep learning model, or local features of each point in the scalp point cloud and in the first data frame facial point cloud are extracted using a multi-layer perceptron (MLP) in a pre-trained deep learning model; the scalp point cloud and the first data frame facial point cloud are then registered based on the extracted features using the DGR (Deep Global Registration) method, and the first spatial transformation relationship M1 is output.
As shown in fig. 1, a second data frame facial point cloud is obtained based on a second data frame facial image of the subject. The "first data frame" and "second data frame" are used herein for purposes of illustration only, and should not be construed as a description or limitation of the order of frames, nor of the number of frames, i.e., the second data frame may be one frame or a plurality of different frames.
In one example, after a depth image of the face of the subject (i.e., the face image of the second data frame) is obtained by the stereo camera, the depth image is reconstructed into a three-dimensional point cloud image, which is the reconstructed facial point cloud of the second data frame. For example, the depth information of the face image of the second data frame is converted into a point cloud using PCL (Point Cloud Library).
After the face image of the second data frame is reconstructed into a point cloud image, point cloud preprocessing is performed on the reconstructed facial point cloud C3 of the second data frame to obtain the facial point cloud of the second data frame.
The point cloud preprocessing is first a downsampling process to speed up the subsequent process of homologous point cloud registration (described in more detail below).
The down-sampling method comprises the following steps:
rasterizing a spatial voxel where the reconstructed facial point cloud C3 of the second data frame is located;
judging whether each voxel grid contains a point in a reconstructed face point cloud C3 of a second data frame, and selecting a point closest to the center point of the current voxel grid from the reconstructed face point cloud C3 of the second data frame as a sampling point when the current voxel grid contains the point in the reconstructed face point cloud C3 of the second data frame;
traversing the reconstructed facial point cloud C3 of the second data frame to obtain a sampling point set, wherein the sampling point set is the facial point cloud C4 of the second data frame.
In one example, a method of determining whether a point in the reconstructed facial point cloud of the second data frame is contained in each voxel grid is to determine whether the coordinates of the point are within the coordinate range of the current voxel grid, and when the coordinates of the point are within the coordinate range, then determining that the current voxel grid contains the point.
For example, if the coordinate range of the current voxel grid is (1, 1, 1) - (2, 2, 2) and the coordinates of at least one point in the reconstructed facial point cloud C3 of the second data frame, for example the point with coordinates (1.5, 1.7, 2), fall within that coordinate range, then it is determined that the current voxel grid contains points of the reconstructed facial point cloud C3 of the second data frame, and the point closest to the center point of the current voxel grid among the points falling within the coordinate range is selected as the sampling point. When only one point falls within the coordinate range of the current voxel grid, that point is the sampling point. The reconstructed facial point cloud C3 of the second data frame is traversed in this way, whereby all sampling points are obtained; these sampling points form the facial point cloud C4 of the second data frame.
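The sketch below illustrates this "closest point to the voxel center" variant of voxel grid downsampling in NumPy; placing the grid origin at the minimum corner of the point cloud is an assumption made only for the illustration.

```python
import numpy as np

def voxel_downsample_nearest_center(points: np.ndarray, voxel: float) -> np.ndarray:
    """Keep, for each occupied voxel grid cell, the point closest to the cell center."""
    origin = points.min(axis=0)
    indices = np.floor((points - origin) / voxel).astype(np.int64)
    best = {}                                   # voxel index -> (distance, point)
    for p, idx in zip(points, map(tuple, indices)):
        center = origin + (np.asarray(idx) + 0.5) * voxel
        d = np.linalg.norm(p - center)
        if idx not in best or d < best[idx][0]:
            best[idx] = (d, p)
    return np.asarray([p for _, p in best.values()])
```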
In one example, the downsampled second data frame facial point cloud is filtered to obtain the preprocessed second data frame facial point cloud C4. The filtering method is a k-nearest neighbor point cloud filtering method, a statistical outlier removal method, or a radius outlier removal method. The filtering methods and principles of the k-nearest neighbor point cloud filtering method, the statistical outlier removal method, and the radius outlier removal method for the facial point cloud of the second data frame are identical to those for the first data frame and are not repeated here.
In one example, the first data frame facial point cloud is registered with the second data frame facial point cloud based on a second registration method to obtain a second spatial transformation relationship M2 between the first data frame facial point cloud and the second data frame facial point cloud. The second registration method includes coarse registration and fine registration. The coarse registration extracts the FPFH features of the first data frame facial point cloud C2 and the FPFH features of the second data frame facial point cloud C4 using the FPFH algorithm, then matches the two point clouds in the feature space by a sample consensus initial alignment (SAC-IA) algorithm, a RANSAC algorithm, a random sample maximum likelihood algorithm (MLESAC), or a progressive sample consensus algorithm (PROSAC) to obtain the correspondence between points in the two point clouds, and then computes the translation transformation matrix and rotation transformation matrix between the two point clouds from that correspondence. The fine registration optimizes the translation transformation matrix and the rotation transformation matrix obtained by coarse registration through the ICP algorithm to obtain the optimized translation transformation matrix and the optimized rotation transformation matrix, which are the second spatial transformation relationship M2.
A person skilled in the art may also extract the curved surface features of the first data frame facial point cloud and the curved surface features of the second data frame facial point cloud through a deep learning algorithm, such as PointNet, PerfectMatch, or FCGF; this is only an illustrative example and should not be construed as limiting the present invention.
Of course, a person skilled in the art will understand that, when the first registration method and the second registration method use the same curved surface feature extraction method, the curved surface features of the first data frame facial point cloud C2 need not be extracted again in the second registration method; the curved surface features already extracted in the first registration method can be used directly, which accelerates real-time tracking of the target point.
In one example, before the first data frame facial point cloud C2 and the second data frame facial point cloud C4 are registered, the two point clouds may also each be image-segmented to obtain the subject facial point cloud. The image segmentation method includes a nose tip method and a deep-learning-based facial point cloud segmentation method. The nose tip method is specifically: the nose tip point in the first data frame facial point cloud C2 and in the second data frame facial point cloud C4 is first estimated using the RANSAC algorithm, and all points in each point cloud are then traversed to obtain all points within 8 cm of the nose tip point; these points form the subject facial point cloud.
The deep-learning-based facial point cloud segmentation method specifically performs face segmentation on the first data frame facial point cloud C2 and the second data frame facial point cloud C4 using trained models to obtain the subject facial point cloud.
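The cropping step of the nose tip method reduces to a radius query around the estimated nose tip; the sketch below shows only that step (nose tip estimation itself is not shown), with the 80 mm radius corresponding to the 8 cm mentioned above.

```python
import numpy as np

def crop_face_around_nose_tip(points: np.ndarray, nose_tip: np.ndarray,
                              radius_mm: float = 80.0) -> np.ndarray:
    """Keep all points within `radius_mm` of the (already estimated) nose tip point."""
    distances = np.linalg.norm(points - nose_tip, axis=1)
    return points[distances < radius_mm]
```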
In one example, the method and principle of extracting curved surface features from the first data frame facial point cloud C2 are identical to those of extracting curved surface features from the second data frame facial point cloud C4; therefore, the method of extracting curved surface features is exemplified below for the second data frame facial point cloud C4 and is not repeated for the first data frame facial point cloud C2.
First, a face normal vector of all points therein will be obtained based on the second data frame face point cloud C4. The specific method comprises the following steps:
determining all nearest neighbors of each point in the second data frame facial point cloud C4 within its k1 neighborhood by a k-d tree search, where the k1 neighborhood is a sphere whose radius is a first predetermined multiple of the voxel length of the MRI image of the subject's head;
based on the nearest neighbors within the k1 neighborhood of each point, fitting a tangent plane at each point by least squares, the tangent plane being the plane that minimizes the sum of the distances from all nearest neighbors to the fitted plane; the normal vector of the tangent plane at each point is the facial normal vector of that point, denoted n_i(h, j, l).
For example, all nearest neighbors of an arbitrary point k' in the second data frame facial point cloud C4 within its k1 neighborhood are determined by a k-d tree search, where the k1 neighborhood is a sphere centered on point k' with a radius of 2 voxel lengths; the nearest neighbors are all points within the sphere, each forming a point pair with point k'. Suppose 20 nearest neighbors are found; these 20 nearest neighbors are fitted by least squares to obtain the tangent plane of point k', whose normal vector is computed to give the facial normal vector of point k', denoted n_{k'}(h', j', l'). The facial normal vector of point k' may also be called the normal of point k'.
Since the number of nearest neighbors directly determines the quality of the fitted tangent plane, a neighborhood with a radius of 2 voxel lengths is preferred: the number of nearest neighbors covered by such a neighborhood yields a better-fitting tangent plane and hence better facial normal vectors, so that during subsequent FPFH feature extraction, features (i.e., descriptors) that match each other more closely can be extracted.
In one example, after the facial normal vector n_i(h, j, l) of each point of the second data frame facial point cloud C4 is obtained, the facial normal vectors n_i(h, j, l) are further adjusted by a normal vector optimization method to ensure the stability of the features of the second data frame facial point cloud C4, thereby improving the quality of the corresponding FPFH descriptors.
The optimization method of the face normal vector comprises the following steps:
setting a center point of the second data frame face point cloud C4 as a coordinate origin O and establishing an XYZ coordinate system, wherein in the XYZ coordinate system, a Y axis is parallel to the ground and the positive direction points to the front of the subject, an X axis is perpendicular to the Y axis and the positive direction points to the right of the subject, and a Z axis is perpendicular to the XOY plane and the positive direction points to the upper side of the subject;
obtaining, based on the second data frame facial point cloud C4, the vector from the coordinate origin O to each point in the second data frame facial point cloud C4;
and judging whether to adjust the direction of the normal vector of the face of each point based on the inner product of the vector from the origin of coordinates O to each point and the normal vector of the face of each point, wherein when the value of the inner product is larger than 0, the direction of the normal vector of the point is inverted, and when the value of the inner product is smaller than or equal to 0, the direction of the normal vector of the point is not adjusted.
In this way, the facial normal vectors of all points of the second data frame facial point cloud C4 are directed toward the coordinate origin O, i.e., toward the interior of the subject's cranium. Once the facial normal vectors of all points of the second data frame facial point cloud C4 are directed toward the cranium of the subject, the curved surface features formed by the second data frame facial point cloud C4 become consistent and can be well distinguished from other noise point clouds, thereby improving the quality of the FPFH features of the facial point cloud.
For example, for an arbitrary point k', the vector from the coordinate origin O to the point k' is denoted Ok'. The inner product between Ok' and the facial normal vector n_{k'}(h', j', l') of point k' is computed. When the value of the inner product is greater than 0, the facial normal vector n_{k'}(h', j', l') of point k' is inverted to give the optimized facial normal vector -n_{k'}(-h', -j', -l'). When the value of the inner product is less than or equal to 0, the facial normal vector n_{k'}(h', j', l') of point k' is already the optimized facial normal vector.
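In Open3D, the tangent-plane normal estimation and the inner-product-based flip can be sketched as follows; using the centroid of the cloud as the coordinate origin O and the variable voxel_len for the MRI voxel length are assumptions made only for the illustration.

```python
import numpy as np
import open3d as o3d

voxel_len = 1.0   # assumed MRI voxel length in mm

# Least-squares plane fit over a 2-voxel-length radius neighbourhood.
face_pcd.estimate_normals(o3d.geometry.KDTreeSearchParamRadius(radius=2.0 * voxel_len))

# Flip every normal whose inner product with the vector from the origin O
# (here: the cloud centroid) to the point is positive, so all normals face O.
points = np.asarray(face_pcd.points)
normals = np.asarray(face_pcd.normals)
origin = points.mean(axis=0)
flip = np.einsum('ij,ij->i', points - origin, normals) > 0
normals[flip] *= -1.0
face_pcd.normals = o3d.utility.Vector3dVector(normals)
```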
In one example, a method for obtaining FPFH features of the second data frame facial point cloud C4 based on the optimized facial normal vector includes the steps of:
determining all point pairs of each point in the second data frame facial point cloud C4 within its k3 neighborhood by a k-d tree search, where the k3 neighborhood is a sphere whose radius is a third predetermined multiple of the voxel length of the MRI image of the subject's head;
obtaining the FPFH feature of each point based on all point pairs within the k3 neighborhood of each point in the second data frame facial point cloud C4 and the optimized facial normal vectors of the points in those point pairs.
For example, all point pairs of an arbitrary point k' in the second data frame facial point cloud C4 within its k3 neighborhood are determined by a k-d tree search, where the k3 neighborhood is a sphere centered on point k' with a radius of 5 voxel lengths of the MRI image of the subject's head. For point k', point pairs (k', k'') are constructed, and the optimized facial normal vector -n_{k'}(-h', -j', -l') of point k' and the optimized facial normal vector n_{k''}(h'', j'', l'') of point k'' are obtained. Based on the point pair (k', k'') and these optimized facial normal vectors, the simplified point feature histogram SPFH(k') of point k' and the simplified point feature histogram SPFH(k'') of point k'' are computed, and the FPFH feature (i.e., the FPFH descriptor) FPFH(k') is then computed from the neighboring SPFH values. The expression of FPFH(k') is:
FPFH(k') = SPFH(k') + (1/k) * Σ_{i=1}^{k} (1/ω_i) · SPFH(k''_i)
where k represents the number of nearest neighbors of point k', and ω_i represents the weight, whose value is the distance from point k' to its neighbor point k''_i.
And traversing all points in the face point cloud C4 of the second data frame to obtain the FPFH characteristic of each point.
The FPFH characteristic for each point in the first data frame facial point cloud C2 is obtained in the same manner as described above.
In one example, feature matching is performed by a registration method (e.g., the RANSAC algorithm) based on the FPFH feature of each point of the first data frame facial point cloud C2 and the FPFH feature of each point of the second data frame facial point cloud C4, to obtain a preliminary second spatial transformation relationship that converts the first data frame facial point cloud C2 into the space of the second data frame facial point cloud C4. The preliminary second spatial transformation relationship includes a preliminary translation transformation matrix and a preliminary rotation transformation matrix. Mismatched point pairs are then removed according to the projection error after the first data frame facial point cloud C2 is registered into the space of the second data frame facial point cloud C4, thereby completing the coarse registration of the first data frame facial point cloud C2 and the second data frame facial point cloud C4.
Of course, a person skilled in the art can also use a random sampling maximum likelihood algorithm (MLESAC) to match two point clouds in the feature space, so as to obtain a relationship between corresponding points in the two point clouds, thereby obtaining a preliminary second spatial transformation relationship between the first data frame face point cloud C2 and the second data frame face point cloud C4.
In one example, the fine registration optimizes the preliminary translation transformation matrix and the preliminary rotation transformation matrix through the ICP algorithm to obtain the optimized translation transformation matrix and the optimized rotation transformation matrix; the optimized translation transformation matrix and the optimized rotation transformation matrix are the second spatial transformation relationship M2, based on which the first data frame facial point cloud C2 and the second data frame facial point cloud C4 are then aligned.
For example, the first data frame facial point cloud C2 is transformed by the preliminary registration matrix (i.e., the preliminary translation transformation matrix and the preliminary rotation transformation matrix) obtained by coarse registration to obtain the coarsely registered first data frame facial point cloud. The Point2Plane ICP algorithm is then applied to the coarsely registered first data frame facial point cloud to compute the registration root mean square error between it and the second data frame facial point cloud C4, together with a new registration matrix (a new translation transformation matrix and a new rotation transformation matrix). The new registration matrix is iterated onto the preliminary registration matrix obtained by coarse registration, and the current registration root mean square error is computed again; when the root mean square error is less than 1 mm, the iteration stops, and the resulting registration matrix is the optimized translation transformation matrix and the optimized rotation transformation matrix. When the root mean square error between the point cloud obtained by transformation with the new registration matrix and the second data frame facial point cloud is still greater than 1 mm, a further registration matrix is obtained from the new registration matrix by the Point2Plane ICP algorithm and iterated, until the root mean square error is less than 1 mm and the iteration stops. The second spatial transformation relationship M2 is thereby obtained.
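A compact Open3D rendering of this coarse-to-fine refinement is shown below; the 5 mm correspondence distance and the 50-iteration cap are assumptions, while the 1 mm RMSE check mirrors the stopping criterion described above. frame1_pcd and frame2_pcd stand for the (preprocessed) first and second data frame facial point clouds, and coarse_T for the preliminary registration matrix.

```python
import open3d as o3d

# Point-to-plane ICP requires normals on the target cloud.
frame2_pcd.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=10.0, max_nn=30))

result = o3d.pipelines.registration.registration_icp(
    frame1_pcd, frame2_pcd, 5.0, coarse_T,
    o3d.pipelines.registration.TransformationEstimationPointToPlane(),
    o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration=50))

# Accept the refined matrix only if the registration RMSE is below 1 mm.
M2 = result.transformation if result.inlier_rmse < 1.0 else coarse_T
```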
Then, the position information V2 of the target point in the face image of the second data frame is obtained based on the initial position information V1 of the target point, the first spatial transformation relation M1 and the second spatial transformation relation M2. The expression of the position coordinate V2 of the target point is as follows:
V2=M2×M1×V1 (2)
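In homogeneous 4x4 form, applying equation (2) amounts to composing the two transforms and multiplying the target point, as in the short NumPy sketch below (M1, M2, and V1 are placeholders for the matrices and initial position described above).

```python
import numpy as np

def transform_point(T: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous transformation matrix T to a 3-D point p."""
    return (T @ np.append(p, 1.0))[:3]

# Equation (2): V2 = M2 x M1 x V1
V2 = transform_point(M2 @ M1, V1)
```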
in one example, the acquisition time of the first data frame facial image is located before the acquisition time of the second data frame facial image. Preferably, the first data frame face image is a previous frame image of the second data frame face image.
When acquiring the real-time position information of the target point in each frame of facial image after the second data frame facial image, each data frame image is processed according to the processing methods used for the second data frame facial image, such as the image reconstruction method, the point cloud preprocessing method, and the second registration method. A real-time spatial transformation relationship between the facial point cloud of the current frame and the facial point cloud of the previous frame (analogous to the second spatial transformation relationship between the first data frame facial point cloud and the second data frame facial point cloud) is thus obtained, and the real-time position information of the target point in the point cloud image of the current frame can then be calculated based on this real-time spatial transformation relationship and the position information of the target point in the point cloud of the previous frame, thereby obtaining the real-time position information of the target point in the point cloud image of every frame.
For example, when the real-time spatial transformation relationship between the fifth frame and the sixth frame and the real-time position information of the target point in the facial point cloud of the fifth frame have already been calculated, the real-time position information of the target point in the facial point cloud of the sixth frame can be obtained by multiplying that real-time spatial transformation relationship by the real-time position information of the target point. The facial point cloud of the seventh frame is then obtained by reconstructing the facial image of the seventh frame according to the method used for the second data frame facial image, the real-time spatial transformation relationship between the sixth frame and the seventh frame is obtained, and the real-time position information of the target point in the facial point cloud of the seventh frame can then be calculated based on this relationship and the real-time position information of the target point in the sixth frame. In this way, heterogeneous point cloud registration (i.e., registration between the scalp point cloud and the first data frame facial point cloud) is needed only for the first registration, and in the subsequent real-time tracking process only homogeneous point cloud registration is required to achieve real-time tracking of the target. This example is merely illustrative and should not be construed as limiting the invention.
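The overall tracking loop can therefore be sketched as follows. reconstruct_point_cloud, preprocess, register_frames, and camera_stream are hypothetical placeholders for the reconstruction, preprocessing, and second registration steps described above and for the stereo camera feed; transform_point is the helper from the previous sketch.

```python
# Initial cross-source registration is done once: M1 (scalp -> first frame),
# then M2 (first frame -> second frame); afterwards only frame-to-frame
# homogeneous registration is repeated.
target_pos = transform_point(M2 @ M1, V1)        # target position in the second data frame
prev_pcd = frame2_pcd

for face_image in camera_stream:                 # subsequent data frames
    cur_pcd = preprocess(reconstruct_point_cloud(face_image))
    M_rt = register_frames(prev_pcd, cur_pcd)    # real-time spatial transformation relationship
    target_pos = transform_point(M_rt, target_pos)
    prev_pcd = cur_pcd
```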
In one example, a readable storage medium is provided according to yet another embodiment of the present invention. A "readable storage medium" in embodiments of the present invention refers to any medium that participates in providing programs or instructions to a processor for execution. Such a medium may take many forms, including but not limited to non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage devices. Volatile media include dynamic memory, such as main memory. Transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise a bus. Transmission media can also take the form of acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of readable storage media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM, a DVD, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
The readable storage medium has stored thereon a program or instructions which when executed by a processor perform the target tracking method described above.
In one example, the readable storage medium is disposed on a server, such as a cloud server, in the form of memory. The cloud server is further provided with a processor that executes the programs or instructions stored in the memory. The processor may be a central processing unit (CPU). In one example, the cloud server may be a virtual server formed by mapping a physical server through virtualization technology. There may be one or more physical servers; when there are multiple physical servers, the cloud server may be a virtual server formed by mapping a server cluster through virtualization technology. There may also be one or more virtual servers. In one example, the cloud server may be made available to users through a cloud platform.
The target tracking method and the readable storage medium according to the embodiments of the present invention have at least one of the following advantages:
(1) The target tracking method and the readable storage medium realize real-time tracking of the target through a single initial heterogeneous point cloud registration followed by real-time registration of homogeneous point clouds;
(2) The target tracking method and the readable storage medium provided by the invention perform target tracking by combining heterogeneous point cloud registration with homogeneous point cloud registration, which improves the target tracking speed while maintaining the target tracking accuracy.
Although a few embodiments of the present general inventive concept have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the general inventive concept, the scope of which is defined in the claims and their equivalents.

Claims (10)

1. A target tracking method, the target tracking method comprising the steps of:
obtaining a scalp point cloud based on a medical image of the head of the subject, and obtaining a first data frame face point cloud based on a first data frame face image of the subject;
registering the scalp point cloud with the first data frame face point cloud based on a first registration method to obtain a first spatial transformation relationship between the scalp point cloud and the first data frame face point cloud;
obtaining a second data frame facial point cloud based on a second data frame facial image of the subject;
registering the first data frame face point cloud with the second data frame face point cloud based on a second registration method to obtain a second spatial transformation relationship between the first data frame face point cloud and the second data frame face point cloud;
acquiring the position information of the target point in the face image of the second data frame based on the initial position information of the target point, the first space transformation relation and the second space transformation relation;
and obtaining real-time position information of the target point based on the real-time space transformation relation between the face point cloud of the previous frame and the face point cloud of the current frame.
2. The target tracking method according to claim 1, wherein,
the method for obtaining the facial point cloud of the current frame based on the facial image of the current frame of the subject comprises the following steps:
reconstructing a point cloud image based on a face image of a current frame of the subject to obtain a reconstructed face point cloud of the current frame;
and obtaining the facial point cloud of the current frame through downsampling based on the reconstructed facial point cloud of the current frame.
3. The target tracking method according to claim 2, wherein,
the downsampling method comprises the following steps:
rasterizing a spatial voxel where the reconstructed facial point cloud of the current frame is located;
judging whether each voxel grid contains points in the reconstructed face point cloud of the current frame, and, when the current voxel grid contains points in the reconstructed face point cloud of the current frame, selecting the point closest to the center of the current voxel grid from the reconstructed face point cloud of the current frame as a sampling point;
traversing the reconstructed facial point cloud of the current frame to obtain a sampling point set, wherein the sampling point set is the facial point cloud of the current frame.
4. The target tracking method according to claim 3, wherein,
the method of judging whether each voxel grid contains a point in the reconstructed facial point cloud of the current frame is to judge whether the coordinates of the point are within the coordinate range of the current voxel grid; when the coordinates of the point are within the coordinate range, it is determined that the current voxel grid contains the point.
5. The target tracking method according to any one of claims 1 to 4, wherein,
registering the scalp point cloud with the first data frame facial point cloud based on the first registration method comprises:
performing point cloud preprocessing on the scalp point cloud and the first data frame facial point cloud to obtain a preprocessed scalp point cloud and a preprocessed first data frame facial point cloud; and
registering curved surface features of the scalp point cloud with curved surface features of the first data frame facial point cloud through the first registration method, based on the preprocessed scalp point cloud and the preprocessed first data frame facial point cloud, to obtain the first spatial transformation relationship.
6. The target tracking method according to claim 5, wherein,
the point cloud preprocessing comprises:
downsampling the scalp point cloud and the first data frame facial point cloud to obtain the preprocessed scalp point cloud and a downsampled first data frame facial point cloud; and
filtering the downsampled first data frame facial point cloud to obtain the preprocessed first data frame facial point cloud.
7. The target tracking method according to claim 6, wherein,
the filtering is performed using a statistical outlier removal method or a radius outlier removal method.
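A brief sketch of the filtering step of claims 6 and 7 using the Open3D library, which provides both statistical and radius outlier removal; the neighbour count, standard-deviation ratio and radius below are illustrative values, not parameters taken from the application.

```python
import open3d as o3d

def filter_face_cloud(pcd, method="statistical"):
    """Remove outliers from the downsampled first data frame facial point cloud.

    pcd : open3d.geometry.PointCloud
    """
    if method == "statistical":
        # Statistical outlier removal: drop points whose mean neighbour distance
        # deviates too far from the cloud-wide average.
        filtered, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    else:
        # Radius outlier removal: drop points with too few neighbours within a radius.
        filtered, _ = pcd.remove_radius_outlier(nb_points=16, radius=0.01)
    return filtered
```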
8. The target tracking method according to claim 5, wherein,
the first registration method comprises:
extracting curved surface features of the preprocessed scalp point cloud and curved surface features of the preprocessed first data frame facial point cloud; and
obtaining the first spatial transformation relationship based on the curved surface features of the scalp point cloud and the curved surface features of the first data frame facial point cloud.
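Claim 8 does not name a specific curved surface feature descriptor or matching strategy. As one possible reading, the sketch below uses FPFH features with RANSAC-based feature matching from Open3D (version >= 0.12 API assumed) to estimate the first spatial transformation relationship; the function name and parameter values are illustrative only.

```python
import open3d as o3d

def coarse_register(scalp_pcd, face_pcd, voxel_size=0.005):
    """Illustrative first (coarse) registration of two preprocessed point clouds."""
    radius_normal = voxel_size * 2
    radius_feature = voxel_size * 5
    # Surface normals are needed for FPFH feature computation.
    for pcd in (scalp_pcd, face_pcd):
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=radius_normal, max_nn=30))
    scalp_fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        scalp_pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=radius_feature, max_nn=100))
    face_fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        face_pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=radius_feature, max_nn=100))
    result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        scalp_pcd, face_pcd, scalp_fpfh, face_fpfh,
        True,                      # mutual_filter
        voxel_size * 1.5,          # maximum correspondence distance
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
        3,                         # points sampled per RANSAC iteration
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(voxel_size * 1.5)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
    return result.transformation  # 4x4 first spatial transformation relationship
```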
9. The target tracking method according to claim 1, wherein,
registering the facial point cloud of a previous frame with the facial point cloud of a current frame based on the second registration method comprises:
downsampling the facial point cloud of the previous frame and the facial point cloud of the current frame;
performing image segmentation on the downsampled facial point cloud of the previous frame and the downsampled facial point cloud of the current frame;
extracting curved surface features from the image-segmented facial point cloud of the previous frame and curved surface features from the image-segmented facial point cloud of the current frame;
registering the curved surface features in the facial point cloud of the previous frame with the curved surface features in the facial point cloud of the current frame to obtain a preliminary second spatial transformation relationship; and
optimizing the preliminary second spatial transformation relationship to obtain the second spatial transformation relationship.
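Claim 9 leaves the optimization of the preliminary second spatial transformation relationship unspecified. A common refinement, assumed here purely as an illustration, is point-to-plane ICP initialised with the preliminary transform; the Open3D sketch below uses illustrative parameter values.

```python
import open3d as o3d

def refine_second_transform(prev_pcd, curr_pcd, T_preliminary, voxel_size=0.005):
    """Refine the preliminary transform mapping the previous-frame facial point cloud
    onto the current-frame facial point cloud (assumed point-to-plane ICP)."""
    # Point-to-plane ICP needs surface normals.
    for pcd in (prev_pcd, curr_pcd):
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 2, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        prev_pcd, curr_pcd,
        voxel_size * 0.8,                 # maximum correspondence distance (illustrative)
        T_preliminary,                    # initial guess from the coarse registration
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation          # refined second spatial transformation relationship
```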
10. A readable storage medium, characterized in that,
the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the target tracking method of any one of claims 1 to 9.
CN202310897525.9A 2023-07-20 2023-07-20 Target tracking method and readable storage medium Pending CN116912295A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310897525.9A CN116912295A (en) 2023-07-20 2023-07-20 Target tracking method and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310897525.9A CN116912295A (en) 2023-07-20 2023-07-20 Target tracking method and readable storage medium

Publications (1)

Publication Number Publication Date
CN116912295A true CN116912295A (en) 2023-10-20

Family

ID=88367726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310897525.9A Pending CN116912295A (en) 2023-07-20 2023-07-20 Target tracking method and readable storage medium

Country Status (1)

Country Link
CN (1) CN116912295A (en)

Similar Documents

Publication Publication Date Title
US10789739B2 (en) System and method for generating partial surface from volumetric data for registration to surface topology image data
CN111095354B (en) Improved 3-D vessel tree surface reconstruction
US9025858B2 (en) Method and apparatus for automatically generating optimal 2-dimensional medical image from 3-dimensional medical image
KR102050649B1 (en) Method for extracting vascular structure in 2d x-ray angiogram, computer readable medium and apparatus for performing the method
Tang et al. Robust multiscale stereo matching from fundus images with radiometric differences
JP6539303B2 (en) Transforming 3D objects to segment objects in 3D medical images
EP3788596B1 (en) Lower to higher resolution image fusion
CN109493943B (en) Three-dimensional visual scalp craniotomy positioning method combined with optical surgical navigation
CN116580068B (en) Multi-mode medical registration method based on point cloud registration
Thirion et al. Automatic registration of 3D images using surface curvature
JP2021529622A (en) Method and computer program for segmentation of light interference tomography images of the retina
US20160135776A1 (en) Method and system for intraoperative imaging of soft tissue in the dorsal cavity
CN117315210B (en) Image blurring method based on stereoscopic imaging and related device
US10169874B2 (en) Surface-based object identification
US11551371B2 (en) Analyzing symmetry in image data
Kuok et al. Vertebrae segmentation from X-ray images using convolutional neural network
CN109816665B (en) Rapid segmentation method and device for optical coherence tomography image
CN116912295A (en) Target tracking method and readable storage medium
CN116363181A (en) Feature-based CT image and ultrasonic image liver registration method
CN116934813A (en) Target real-time tracking method and readable storage medium
CN113256693A (en) Multi-view registration method based on K-means and normal distribution transformation
CN116563561B (en) Point cloud feature extraction method, point cloud registration method and readable storage medium
CN112884765A (en) 2D image and 3D image registration method based on contour features
Suzani Automatic vertebrae localization, identification, and segmentation using deep learning and statistical models
CN116912294A (en) Registration method and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240508

Address after: Room 504, floor 5, building 2, hospital 9, Yiyi Road, Life Science Park, Changping District, Beijing 102206

Applicant after: Beijing Yinhe Fangyuan Technology Co.,Ltd.

Country or region after: China

Address before: 410221 room 1201-1208, 12th floor, building 1, haipingyuan, No. 229, Guyuan Road, Changsha high tech Development Zone, Changsha, Hunan

Applicant before: Younao Yinhe (Hunan) Technology Co.,Ltd.

Country or region before: China