CN115187550A - Target registration method, device, equipment, storage medium and program product

Target registration method, device, equipment, storage medium and program product

Info

Publication number
CN115187550A
Authority
CN
China
Prior art keywords
image
target
registration
registered
matrix
Prior art date
Legal status
Pending
Application number
CN202210822686.7A
Other languages
Chinese (zh)
Inventor
刘翌勋
秦陈陈
姚建华
常健博
陈亦豪
冯铭
王任直
Current Assignee
Tencent Technology Shenzhen Co Ltd
Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Original Assignee
Tencent Technology Shenzhen Co Ltd
Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd and Peking Union Medical College Hospital, Chinese Academy of Medical Sciences
Priority to CN202210822686.7A
Publication of CN115187550A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a target registration method, apparatus, device, storage medium and program product, relating to the technical field of artificial intelligence. The method comprises the following steps: acquiring an image to be registered and a reference image which contain the same target; acquiring an initial registration relation between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; acquiring a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relation and the reference image; acquiring an optimized registration relation between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result; and adjusting the preliminary registration result according to the optimized registration relation to obtain an optimized registration result corresponding to the target in the image to be registered, wherein the optimized registration result is used for guiding an object (e.g., a user) in processing the target. By optimizing the preliminary registration result through the depth image, the method and the device can improve the registration accuracy of the target.

Description

Target registration method, device, equipment, storage medium and program product
Technical Field
The embodiment of the application relates to the technical field of artificial intelligence, in particular to a target registration method, a target registration device, target registration equipment, a storage medium and a program product.
Background
In the field of surgical navigation, there are surgical navigation applications for various human body parts, such as the head, brain, femur, abdomen, and lungs. The registration of a target (e.g., a body part) has a significant impact on surgical navigation.
Taking the head as an example, in the related art, head surgical navigation is realized by manually marking key points both on the head in the preoperative image and on the real head during the operation, and then registering the head based on these key points through a registration algorithm. However, because the related art requires manual operation, human error is introduced, resulting in insufficient registration accuracy.
Disclosure of Invention
The embodiment of the application provides a target registration method, a target registration device, a target registration apparatus, a storage medium and a program product, which can improve the target registration accuracy.
According to an aspect of an embodiment of the present application, there is provided a target registration method, including:
acquiring an image to be registered and a reference image which contain the same target;
detecting the image to be registered to obtain a pose matrix of the target in the image to be registered, wherein the pose matrix is used for representing the position and the posture of the target in the image;
acquiring an initial registration relation between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; the initial registration relation is used for preliminarily characterizing the transformation relation between the image to be registered and the reference image;
acquiring a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relation and the reference image;
acquiring an optimized registration relation between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result; wherein the effective region of the depth image includes the target;
adjusting the preliminary registration result according to the optimized registration relation to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used to guide an object (e.g., a user) in processing the target.
According to an aspect of an embodiment of the present application, there is provided a target registration apparatus, the apparatus including:
the registration image acquisition module is used for acquiring an image to be registered and a reference image which contain the same target;
a pose matrix acquisition module, configured to detect the image to be registered to obtain a pose matrix of the target in the image to be registered, where the pose matrix is used to represent a position and a posture of the target in the image;
an initial relationship obtaining module, configured to obtain an initial registration relationship between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; wherein the initial registration relation is used for preliminarily characterizing a transformation relation between the image to be registered and the reference image;
a preliminary result obtaining module, configured to obtain a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relationship and the reference image;
an optimization relation obtaining module, configured to obtain an optimized registration relation between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result; wherein the effective region of the depth image includes the target;
an optimized result obtaining module, configured to adjust the preliminary registration result according to the optimized registration relation to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used to guide an object (e.g., a user) in processing the target.
According to an aspect of embodiments of the present application, there is provided a computer device comprising a processor and a memory, the memory having stored therein a computer program, the computer program being loaded and executed by the processor to implement the above-mentioned target registration method.
According to an aspect of embodiments of the present application, there is provided a computer-readable storage medium having a computer program stored therein, the computer program being loaded and executed by a processor to implement the above-mentioned target registration method.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the target registration method described above.
The technical scheme provided by the embodiment of the application at least comprises the following beneficial effects.
The target is preliminarily registered based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image, and the preliminary registration result is then optimally registered based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result to obtain the optimized registration result corresponding to the target. Fully automatic registration of the target is thereby realized, the introduction of human error is avoided, and the registration accuracy of the target is improved. Meanwhile, the preliminary registration result of the target is optimized in a targeted manner through the effective region of the depth image, so that the registration error in the preliminary registration result is effectively reduced and the registration accuracy of the target is further improved.
In addition, the target can be automatically registered only based on the image to be registered and the reference image, so that the target navigation is realized, the target navigation (such as the operation navigation of the target) is not required to be performed through expensive large-scale equipment, the usability and the convenience of the target navigation are greatly improved, and the cost of the target navigation is reduced.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be obtained by those skilled in the art based on these drawings without creative effort.
FIG. 1 is a schematic illustration of an environment for implementing an embodiment provided by an embodiment of the present application;
FIG. 2 is a flow chart of a target registration method provided by an embodiment of the present application;
fig. 3 is a flowchart of a pose matrix obtaining method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a common head pose provided by one embodiment of the present application;
FIG. 5 is a graph and thermodynamic diagram of key points provided by one embodiment of the present application;
FIG. 6 is a schematic diagram of a selected rotation matrix detection mode provided by an embodiment of the present application;
fig. 7 is a flowchart of a target registration method provided by another embodiment of the present application;
FIG. 8 is a schematic diagram of active area segmentation provided by one embodiment of the present application;
fig. 9 is a schematic diagram of a semi-automatic registration method provided by an embodiment of the present application;
FIG. 10 is a schematic illustration of an environment for implementation of an embodiment in a surgical navigation scenario provided by an embodiment of the present application;
FIG. 11 is a schematic view of a surgical navigation system using method in a surgical navigation scenario provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a target registration method in a surgical navigation scenario provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of a preliminary registration head model provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of an optimized registration head model provided by an embodiment of the present application;
FIG. 15 is a schematic view of rendered views under different display options from one viewing angle, provided by one embodiment of the present application;
fig. 16 is a block diagram of a target registration apparatus provided in an embodiment of the present application;
fig. 17 is a block diagram of a target registration apparatus provided in another embodiment of the present application;
fig. 18 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Computer Vision technology (CV) is a science that studies how to make a machine "see": it uses cameras and computers in place of human eyes to perform machine vision tasks such as recognition and measurement on a target, and further performs image processing so that the processed image becomes more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR (Optical Character Recognition), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and map construction, among other techniques.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specially studies how a computer simulates or realizes human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning.
The technical scheme provided by the embodiment of the application relates to a computer vision technology and a machine learning technology of artificial intelligence, an effective region, a target, key points of the target and a rotation matrix of the target corresponding to an image are obtained by using the computer vision technology, and an effective region segmentation model, a target detection model, a key point detection model and a rotation matrix detection model are obtained by training respectively based on the effective region, the target, the key points of the target and the rotation matrix of the target by using the machine learning technology.
According to the method provided by the embodiment of the application, the execution main body of each step can be a computer device, and the computer device refers to an electronic device with data calculation, processing and storage capabilities. The Computer device may be a terminal such as a PC (Personal Computer), a tablet Computer, a smartphone, a wearable device, a smart robot, a vehicle, or the like; or may be a server. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing service.
The technical scheme provided by the embodiment of the application is suitable for any scene needing target registration, such as an operation navigation scene, a target discrimination scene, a target registration scene, an intelligent traffic scene, an auxiliary driving scene, an image analysis scene, a safety detection scene and the like. The technical scheme provided by the embodiment of the application can improve the registration accuracy of the target.
In one example, please refer to fig. 1, which illustrates a schematic diagram of an implementation environment of an embodiment provided by an embodiment of the present application. The embodiment implementation environment can be realized as an architecture of a target registration system. The implementation environment may include: a terminal 10 and a server 20.
The terminal 10 may be an electronic device such as a mobile phone, a tablet Computer, a PC (Personal Computer), a wearable device, a smart robot, and the like. A client of the target application may be installed in the terminal 10. The target application may be a surgical navigation application, an object registration application, an image registration application, a navigation application, a simulation learning application, a security detection application, and any application that can be used for target registration, which is not limited in this embodiment of the present application.
The server 20 may be an independent physical server, a server cluster or a distributed system including a plurality of physical servers, or a cloud server providing a cloud computing service. The server 20 is used to provide background services for clients of target applications in the terminal 10. For example, the server 20 may be a backend server for the target application (e.g., surgical navigation-like application) described above.
The terminal 10 and the server 20 can communicate with each other through the network 30.
Exemplarily, referring to fig. 1, after acquiring an image to be registered and a reference image, a client of a target application in a terminal 10 detects the image to be registered, acquires a pose matrix corresponding to a target in the image to be registered, acquires a pose matrix of a reference model corresponding to the target constructed based on the reference image, and performs preliminary registration based on the two pose matrices to obtain a preliminary registration result corresponding to the target. And then obtaining an effective area of the depth image corresponding to the image to be registered, performing optimized registration based on the effective area and the preliminary registration result to obtain an optimized registration result corresponding to the target in the image to be registered, and further displaying the optimized registration result to an object (namely a user).
Optionally, the target registration process may also be executed in the server 20, and after the server 20 obtains the image to be registered, the depth image corresponding to the image to be registered, and the reference image, the two stages of target registration are performed to obtain an optimized registration result corresponding to the target in the image to be registered, and then the optimized registration result is sent to the client of the target application program to be displayed to the object.
Referring to fig. 2, a flowchart of a target registration method provided in an embodiment of the present application is shown. The execution subject of the steps of the method may be the terminal 10 or the server 20 in the embodiment environment shown in fig. 1, and the method may include the following steps (201 to 206).
Step 201, an image to be registered and a reference image containing the same target are obtained.
The embodiment of the application does not limit the target, and different scenes can correspond to different targets. Illustratively, in a surgical navigation scenario, the target may indicate a head, a brain, a face, a thigh femur, an abdomen, a lung, and so forth. In an intelligent traffic scene, the target may refer to a license plate, a traffic signboard, a vehicle, a road, a bridge, and the like. In a target registration scenario, the target may refer to a person, a part of a person, an animal, a part of an animal, an object, and so on.
The image to be registered refers to an image that includes the target and in which the target needs to be registered. The image to be registered may be acquired in real time or in advance, which is not limited in the embodiments of the present application. For example, taking a surgical navigation scene as an example, the image to be registered may refer to an intraoperative image including the target acquired in real time during an operation; e.g., the head may be photographed in real time by a video capture device or an image acquisition device to acquire the image to be registered. Alternatively, target registration may be performed in real time on a temporally continuous sequence of images to be registered (e.g., in a surgical navigation scene), or only on an image set including the target (e.g., in a target identification scene), which is not limited in the embodiments of the present application. The image to be registered may be an RGB (Red-Green-Blue) image, i.e., a color image.
The reference image refers to an image that includes the target and is used for registration with the target in the image to be registered; the target in the reference image serves as the registration reference for the target in the image to be registered. Illustratively, the reference image may refer to an image including the target acquired before the image to be registered. For example, taking a surgical navigation scene as an example, the reference image may refer to a preoperative image including the target acquired before the surgery, such as a CT (Computed Tomography) image, an MRI (Magnetic Resonance Imaging) image, or an RGB image extracted from preoperative imaging.
Optionally, the image to be registered may be an image obtained by shooting at any shooting angle for the target, and the reference image may be an image corresponding to the target in a standard posture, that is, the embodiment of the present application supports registration of the target at any angle.
Step 202, detecting the image to be registered to obtain a pose matrix of the target in the image to be registered, wherein the pose matrix is used for representing the position and the posture of the target in the image.
The pose of the target may include two partial transformations: translation and rotation. In the embodiment of the application, the pose matrix may accordingly comprise two parts: a translation vector and a rotation matrix. The translation vector is used for representing the position of the target in the image, and the rotation matrix is used for representing the posture of the target in the image. Illustratively, the pose matrix of the target may be represented as P_1 ∈ R^(3×4), which consists of a translation vector and a rotation matrix. The translation vector may be represented as a three-dimensional vector (tdx, tdy, tdz); for example, it may refer to the translation of the target's coordinates relative to the origin of the world coordinate system. The rotation matrix may be represented as R ∈ R^(3×3), where the elements of R reflect the rotation angles corresponding to the target. Optionally, the pose matrix of the target may be constructed based on a world coordinate system.
In one example, the detection may include three detection processes of target detection, key point detection and rotation matrix detection, and then step 202 may further include the following steps.
Step 202a, performing target detection on the image to be registered to obtain a bounding box corresponding to the target.
Optionally, target detection may be performed on the image to be registered through a target detection model, so as to obtain a Bounding Box (i.e., bounding Box) corresponding to the target, where the Bounding Box is used to represent a prediction region of the target in the image to be registered.
Exemplarily, the image to be registered is input into the target detection model, so that a plurality of bounding boxes corresponding to the image to be registered and the confidence corresponding to each bounding box can be obtained, the bounding boxes can be screened based on the confidence to obtain candidate bounding boxes, and then the bounding box corresponding to the target is obtained from the candidate bounding boxes. For example, taking single target registration as an example, since only the bounding box of one target needs to be obtained, the bounding box with the confidence greater than 0.5 may be screened first, and then the bounding box with the maximum confidence in the candidate bounding boxes may be determined as the bounding box corresponding to the target. For multiple targets, the bounding boxes corresponding to each target can be obtained first, and then the screening process is performed.
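As an illustration, the screening described above can be sketched in a few lines of Python (a minimal sketch; the function name and the detector output format of paired boxes and confidences are assumptions for illustration, not part of the embodiment):

def select_target_box(boxes, confidences, threshold=0.5):
    """Keep boxes whose confidence exceeds the threshold, then pick the best."""
    candidates = [(b, c) for b, c in zip(boxes, confidences) if c > threshold]
    if not candidates:
        return None  # no target detected
    # For single-target registration, the candidate with the maximum
    # confidence is taken as the bounding box corresponding to the target.
    return max(candidates, key=lambda bc: bc[1])[0]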
The above target detection model is a neural network for detecting a target. Optionally, the target detection model may be constructed based on a lightweight SSD (Single Shot Detector), or based on other detection models such as YOLO (You Only Look Once), RetinaFace (a face detection network), CNN (Convolutional Neural Network), R-CNN (Region-based Convolutional Neural Network), or Fast R-CNN, which is not limited in the embodiments of the present application.
In one example, the target detection model may be trained with a larger data set (e.g., sample images including a large number of targets at arbitrary viewing angles) and data enhancement, resulting in a target detection model that can be used to detect targets at arbitrary viewing angles.
Illustratively, take a surgical navigation scenario as an example. Surgical navigation is mainly performed during hospital operations, and the surgical navigation scene differs greatly from a common natural scene: in a natural scene, targets generally face the image acquisition device, so a conventional target detection algorithm can be used. In the surgical navigation scene, however, the relative pose between the target and the camera is more varied. For example, referring to fig. 4 and taking the head as the target, the head is often in pose 401 (the side of the head facing the camera), pose 402 (the top of the head facing the camera), pose 403 (the front of the head facing the camera), or pose 404 (the back of the head facing the camera). During an operation, the head may be rotated at any angle depending on the site to be operated on.
After sample images including a large number of targets at arbitrary view angles are acquired, random data enhancement is performed on each sample image to obtain data-enhanced sample images, where the data enhancement may include random flipping, random 360-degree rotation, random stretching, random zooming, and the like. A bounding box corresponding to the data-enhanced sample image is then acquired through the target detection model, a training loss of the target detection model is constructed based on the bounding box and/or the confidence, and the target detection model is iteratively trained based on this training loss to obtain the trained target detection model.
Step 202b, based on the bounding box corresponding to the target, intercepting an intercepted image including the target from the image to be registered.
Optionally, the bounding box may be enlarged to obtain an enlarged bounding box, and an intercepted image including the target is intercepted from the image to be registered according to the enlarged bounding box.
Illustratively, the side length of the bounding box may first be scaled up, such as by 1.1 times, 1.2 times, 1.3 times, etc. The region corresponding to the enlarged bounding box is then intercepted from the image to be registered, thereby obtaining the intercepted image. During training of the target detection model, the enlarged bounding box can also be used to train the target detection model, thereby improving its robustness. In addition, performing key point detection and rotation matrix detection based on the intercepted image can improve the accuracy of key point detection and rotation matrix detection, further improving the registration accuracy of the target.
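A minimal sketch of this enlargement-and-interception step, assuming axis-aligned (x1, y1, x2, y2) boxes and a NumPy image; the scale factor corresponds to the 1.1x-1.3x side-length enlargement mentioned above:

import numpy as np

def crop_with_margin(image, box, scale=1.2):
    """Enlarge (x1, y1, x2, y2) about its centre by `scale`, then crop."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * scale, (y2 - y1) * scale
    h_img, w_img = image.shape[:2]
    # Clamp the enlarged box to the image bounds before cropping.
    nx1 = max(int(cx - w / 2), 0)
    ny1 = max(int(cy - h / 2), 0)
    nx2 = min(int(cx + w / 2), w_img)
    ny2 = min(int(cy + h / 2), h_img)
    return image[ny1:ny2, nx1:nx2]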
And step 202c, carrying out key point detection on the intercepted image to obtain a target key point corresponding to the target.
Optionally, at least one key point is needed to calculate the translation vector corresponding to the target. Illustratively, taking the head in a surgical navigation scene as an example, because the posture of the head varies across different surgeries or within the same surgery, in order to avoid the situation where no key point is detected, the embodiment of the present application detects at least the following 5 key points: the left and right canthi, the nose tip, and the left and right corners of the mouth. Optionally, key points such as the ears and the chin may also be added for detection, which is not limited in the embodiments of the present application.
The key point with the highest confidence level in the plurality of key points can be determined as the target key point. For example, when the head faces the camera, the confidence corresponding to the eye corner is strong, and the eye corner may be determined as the target key point. When the side of the head faces the camera, the confidence coefficient of the nose tip is strong, and the nose tip can be determined as a target key point.
In one example, the key point detection model may be used to perform key point detection on the captured image to obtain a plurality of key points corresponding to the target, and then the target key points corresponding to the target may be determined from the plurality of key points according to the confidence.
The key point detection model is a neural network used for detecting key points. The key point detection model can be constructed based on segmentation networks such as a Keypoint Heatmap network (a key point heat map network with 2D Unet as the backbone), a DeepLab network (a semantic segmentation network), or an Hourglass network.
For example, the keypoint detection model may output a keypoint heat map corresponding to the target. Referring to fig. 5, the keypoint detection model may output a keypoint map 501 in coordinate form corresponding to the target, and/or a keypoint thermodynamic diagram 502, and the target key point is obtained based on the keypoint map 501 or the keypoint thermodynamic diagram 502. Compared with directly regressing the coordinates of specified feature points, heat-map-based keypoint regression also yields a confidence for each key point, so whether a key point (or the target key point) exists can be judged by comparing the confidences.
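A minimal sketch of reading key points and confidences off the predicted thermodynamic diagrams, assuming one heat-map channel per key point and taking the peak value as the confidence (names are illustrative):

import numpy as np

def keypoints_from_heatmaps(heatmaps):
    """heatmaps: (K, H, W) array, one channel per key point."""
    coords, confs = [], []
    for hm in heatmaps:
        idx = np.unravel_index(np.argmax(hm), hm.shape)
        coords.append((idx[1], idx[0]))   # (x, y) from (row, col)
        confs.append(float(hm[idx]))      # peak value used as confidence
    return coords, confs

The target key point is then the one with the highest confidence, e.g. coords[int(np.argmax(confs))].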
Optionally, in the training process of the keypoint detection model, for each image sample, data enhancement may be performed randomly to obtain an image sample after data enhancement, where the data enhancement may include random flipping, random 360-degree rotation, random stretching, random scaling, and the like. And then, acquiring a predicted key point thermodynamic diagram corresponding to the image sample after data enhancement through a key point detection model, further calculating the training loss of the key point detection model based on the predicted key point thermodynamic diagram, and finally performing iterative training on the key point detection model through the training loss of the key point detection model to obtain the trained key point detection model.
Alternatively, a mean square error algorithm may be used to calculate the training loss of the keypoint detection model, and then the training loss of the keypoint detection model may be expressed as follows:
Loss = (H_y − H_gt)^2;
where H_y is the predicted keypoint thermodynamic diagram output by the keypoint detection model, and H_gt is the true keypoint thermodynamic diagram (i.e., the gold standard) corresponding to the image sample.
And 202d, detecting the posture of the intercepted image to obtain a rotation matrix corresponding to the target, wherein the rotation matrix is used for representing the posture of the target.
In the embodiment of the application, a rotation matrix corresponding to the target is obtained by adopting a mode of directly regressing the rotation matrix by the model. Illustratively, the captured image may be subjected to posture detection through a rotation matrix detection model, so as to obtain a rotation matrix corresponding to the target.
The rotation matrix detection model is a neural network used for rotation matrix detection. For example, referring to fig. 6, the rotation matrix detection model may include a backbone network 601 (i.e., the backbone), a global average pooling layer 602, and a linear layer 603. The backbone network 601 may be constructed based on ResNet18, MobileNet (a lightweight convolutional neural network), DenseNet (a densely connected convolutional neural network), or the like, and the linear layer 603 includes two linear layers.
Feature extraction is performed on the intercepted image through a backbone network 601 to obtain a feature map, the feature map is converted through a global average pooling layer 602 to obtain an intermediate feature vector, finally, linear transformation is performed on the intermediate feature vector through a linear layer 603 to obtain an output vector (such as a 6-dimensional vector), and then a rotation matrix is obtained based on the output vector.
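A minimal PyTorch sketch of this architecture, assuming a ResNet18 backbone; the hidden width of the first linear layer is an assumption, since the text only specifies two linear layers and a 6-dimensional output:

import torch
import torch.nn as nn
import torchvision

class RotationHead(nn.Module):
    """Backbone -> global average pooling -> two linear layers -> 6-D vector."""
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet18(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])  # feature map
        self.gap = nn.AdaptiveAvgPool2d(1)      # global average pooling layer
        self.linear = nn.Sequential(            # the two linear layers
            nn.Linear(512, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 6),                  # 6-D rotation representation
        )

    def forward(self, x):
        f = self.gap(self.backbone(x)).flatten(1)
        return self.linear(f)                   # (N, 6) output vector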
In one example, a specific acquisition method of the rotation matrix may be as follows:
1. Posture detection is performed on the intercepted image to obtain an output vector corresponding to the target, where the output vector is used to represent the posture of the target in the intercepted image.
For example, referring to fig. 6, a 6-dimensional vector corresponding to the target is obtained through the rotation matrix detection model.
2. And splitting the output vector to obtain a first orthogonal rotation vector and a second orthogonal rotation vector.
Optionally, the output vector may be split in half to obtain the first orthogonal rotation vector and the second orthogonal rotation vector. Exemplarily, referring to fig. 6, the first three dimensions of the 6-dimensional vector may be determined as the first orthogonal rotation vector, and the last three dimensions may be determined as the second orthogonal rotation vector.
3. And converting the first orthogonal rotation vector to obtain a first sub-matrix.
Alternatively, the acquisition process of the first sub-matrix may be represented as follows:
v_1 = x_1 / ||x_1||;
where x_1 is the first orthogonal rotation vector.
4. And obtaining a second sub-matrix based on the first sub-matrix and the second orthogonal rotation vector.
Alternatively, the acquisition process of the second sub-matrix may be represented as follows:
v_2 = x_2 − (v_1 · x_2) v_1; where x_2 is the second orthogonal rotation vector.
5. And converting the second orthogonal rotation vector to obtain a third sub-matrix.
Alternatively, the acquisition process of the third sub-matrix may be represented as follows:
v_3 = v_2 / ||v_2||.
6. and obtaining a fourth sub-matrix based on the first sub-matrix and the third sub-matrix.
Alternatively, the acquisition process of the fourth sub-matrix may be represented as follows: v_4 = v_1 × v_3 (the cross product, so that v_4 is orthogonal to both v_1 and v_3).
7. And obtaining a rotation matrix corresponding to the target based on the first sub-matrix, the third sub-matrix and the fourth sub-matrix.
Alternatively, the acquisition process of the rotation matrix may be represented as follows:
M = [v_1 v_3 v_4]; where M is the rotation matrix and M ∈ R^(3×3).
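Steps 2-7 above can be transcribed into a minimal NumPy sketch (the function name is illustrative; this is the Gram-Schmidt construction of a rotation matrix from a 6-dimensional representation):

import numpy as np

def rotation_from_6d(v6):
    """6-D output vector -> rotation matrix M, following steps 2-7 above."""
    x1, x2 = v6[:3], v6[3:]
    v1 = x1 / np.linalg.norm(x1)            # first sub-matrix (unit vector)
    v2 = x2 - np.dot(v1, x2) * v1           # second sub-matrix (orthogonalised)
    v3 = v2 / np.linalg.norm(v2)            # third sub-matrix (unit vector)
    v4 = np.cross(v1, v3)                   # fourth sub-matrix
    return np.stack([v1, v3, v4], axis=1)   # M in R^(3x3), columns [v1 v3 v4]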
Optionally, the training loss of the rotation matrix detection model may be calculated based on the rotation matrix corresponding to the target, and the rotation matrix detection model may then be iteratively trained based on this training loss to obtain a trained rotation matrix detection model that directly regresses the rotation matrix. Illustratively, a Geodesic Distance Loss may be employed to calculate the training loss of the rotation matrix detection model based on the rotation matrix corresponding to the target, which may be represented as follows:
Loss = arccos((tr(M_p · M_gt^T) − 1) / 2);
where M_p is the rotation matrix output by the rotation matrix detection model, M_gt is the true rotation matrix (i.e., the gold standard), and tr() is the trace of a matrix.
Expressed in terms of angles, the posture of the target can be represented as Euler angles, quaternions, or a rotation matrix. The related art mainly uses a convolutional neural network to regress the Euler angles, and uses the center of the target's bounding box to calculate a 2D translation vector (tdx, tdy). However, Euler angles suffer from the gimbal lock problem, the same posture can correspond to multiple Euler-angle solutions, and Euler-angle-based regression networks in the related art cannot handle the Large Pose problem (excessive rotation relative to the frontal side of the target) at arbitrary view angles. By directly regressing the rotation matrix, the method and the device can reduce the error in posture acquisition and avoid the gimbal lock and Large Pose problems that arise when regressing Euler angles, thereby improving the accuracy of posture acquisition and further improving the accuracy of target registration.
Optionally, in order to enable the rotation matrix detection model to achieve a better posture prediction effect, the embodiment of the present application uses a posture-enhanced training scheme when training the rotation matrix detection model. The scheme improves the prediction capability of the rotation matrix detection model by simulating 2D translation and rotation of the target relative to the vertical shooting plane of the image acquisition device. Illustratively, the image acquisition device moving closer to or farther from the target may be simulated by Crop (random cropping), Resize (random scaling), and the like. The posture change of the target may be simulated by random flipping, random 360-degree rotation, and the like.
Data enhancement modes such as Crop and Resize can translate the target without changing its relative rotation angle, and can therefore closely simulate the movement of the image acquisition device in a real scene.
In one example, the pose enhancement algorithm (i.e., rotation, flipping, filling, etc.) may be as follows: the 6DoF representation of the target posture can be converted into a matrix expression, the corresponding transformation is applied to obtain a posture matrix, and the 6DoF representation and the posture matrix can be converted into each other, so that the true rotation matrix corresponding to the image sample can be updated.
This process can be expressed by the following equation:
6DoF → Matrix M_0;
M_t = T · M_0;
Matrix M_t → 6DoF;
where M_0 is the rotation matrix before the update and M_t is the updated rotation matrix.
For example, taking the image center as the rotation center, the update process of the true rotation matrix (i.e., the gold standard) can be as follows. Rotation involves the interconversion of Euler angles and the rotation matrix. Assuming the three rotation angles are α, β, and γ, respectively, the Euler angles are converted into a rotation matrix as follows:
R_x(α) = [[1, 0, 0], [0, cos α, −sin α], [0, sin α, cos α]];
R_y(β) = [[cos β, 0, sin β], [0, 1, 0], [−sin β, 0, cos β]];
R_z(γ) = [[cos γ, −sin γ, 0], [sin γ, cos γ, 0], [0, 0, 1]];
M = R_y(β) · R_x(α) · R_z(γ).
the transformation process of the rotation matrix into the euler angle can be expressed as follows:
α = arcsin(−M_23);
β = atan2(M_13, M_33);
γ = atan2(M_21, M_22).
the 2D rotation enhancement formula for pose is as follows:
M t =TM 0
wherein T = to_rotation_matrix(0, 0, γ_1), γ_1 = γ_0 + θ, θ is the rotation angle of the image acquisition device, (0, 0, γ_0) is the initial rotation angle corresponding to the target, and (0, 0, γ_1) is the updated rotation angle corresponding to the target.
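The conversions and the 2D rotation enhancement above can be transcribed into a minimal NumPy sketch (a literal transcription of the formulas given; function names are illustrative, and the 1-indexed M_ij in the text become 0-indexed below):

import numpy as np

def to_rotation_matrix(alpha, beta, gamma):
    """Euler angles -> rotation matrix, M = R_y(beta) @ R_x(alpha) @ R_z(gamma)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Ry @ Rx @ Rz

def to_euler(M):
    """Rotation matrix -> Euler angles (inverse of the formulas above)."""
    alpha = np.arcsin(-M[1, 2])
    beta = np.arctan2(M[0, 2], M[2, 2])
    gamma = np.arctan2(M[1, 0], M[1, 1])
    return alpha, beta, gamma

def enhance_rotation_label(M0, gamma0, theta):
    """M_t = T @ M_0 with T = to_rotation_matrix(0, 0, gamma_1), gamma_1 = gamma_0 + theta.

    gamma0 is the target's initial in-plane rotation angle and theta the
    simulated rotation of the image acquisition device."""
    T = to_rotation_matrix(0.0, 0.0, gamma0 + theta)
    return T @ M0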
And step 202e, obtaining a pose matrix of the target in the image to be registered based on the target key point and the rotation matrix.
Optionally, a translation vector corresponding to the target may be obtained based on the target key point, and then the pose matrix of the target may be constructed based on the translation vector and the rotation matrix.
Exemplarily, the depth information of the target key point is obtained based on the depth image corresponding to the image to be registered; acquiring a translation vector of a target key point in a world coordinate system based on a plane coordinate of the target key point in an image to be registered and depth information of the target key point; the translation vector is used for representing the position of the target key point in the image to be registered; and constructing and obtaining a pose matrix of the target in the image to be registered based on the translation vector and the rotation matrix.
In the embodiment of the application, the depth image corresponding to the image to be registered is aligned with the image to be registered, that is, the pixels in the depth image correspond one-to-one with the pixels in the image to be registered. The depth information of the target key point can therefore be obtained simply by looking up the pixel corresponding to the target key point in the depth image. The three-dimensional coordinates of the target key point in the world coordinate system are obtained by taking the plane coordinates of the target key point in the image to be registered as the x-axis and y-axis coordinates and the depth information of the target key point as the z-axis coordinate; the translation vector of the target key point in the world coordinate system is then obtained based on these three-dimensional coordinates and the origin of the world coordinate system. The pose matrix of the target is constructed by taking the data corresponding to the translation vector and the data corresponding to the rotation matrix as its elements.
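A minimal sketch of assembling the pose matrix as described (a simplification that, as in the text, uses the pixel coordinates directly as x/y and the aligned depth value as z; a full system might instead back-project through the camera intrinsics):

import numpy as np

def build_pose_matrix(rotation, keypoint_xy, depth_image):
    """Assemble P = [R | t] from the rotation matrix and the target key point.

    rotation: (3, 3) matrix from the rotation matrix detection model.
    keypoint_xy: (x, y) pixel coordinates of the target key point.
    depth_image: depth map aligned pixel-for-pixel with the RGB image.
    """
    x, y = keypoint_xy
    z = float(depth_image[int(y), int(x)])      # depth lookup at the key point
    t = np.array([x, y, z], dtype=np.float64)   # translation vector (tdx, tdy, tdz)
    return np.hstack([rotation, t[:, None]])    # P in R^(3x4)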
Step 203, acquiring an initial registration relation between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; the initial registration relation is used for preliminarily representing the transformation relation between the image to be registered and the reference image.
In the embodiment of the application, the pose of the target in the reference image and the pose of the target in the image to be registered may be the same or different.
In one example, the obtaining of the initial registration relationship may be performed based on a reference model of the target, which may be as follows:
1. and constructing a reference model corresponding to the target based on the reference image.
Optionally, a binary segmentation image corresponding to the target may be obtained based on surface data of the target in the reference image; converting the binary segmentation image to obtain a reference model corresponding to the target; wherein the reference model is composed of a plurality of triangular patches.
The surface data is used to characterize the surface topography of the target. For example, in a surgical navigation scenario, the surface data may refer to data corresponding to the skin; in a target authentication scenario, the surface data may refer to data of the object's surface. The binary segmentation image corresponding to the target may be used to reflect the region of the target in the reference image. Optionally, a Marching Cubes algorithm may be used to convert the binary segmentation image into a reference model composed of a plurality of triangular patches, and the reference model may be a three-dimensional model.
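A minimal sketch of this conversion using the marching cubes implementation from scikit-image (an assumed library choice; the patent only names the algorithm):

import numpy as np
from skimage import measure

def mesh_from_binary_segmentation(seg_volume):
    """Binary segmentation volume -> surface model of triangular patches."""
    verts, faces, _normals, _values = measure.marching_cubes(
        seg_volume.astype(np.float32), level=0.5)
    return verts, faces  # vertices and triangle indices of the reference model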
2. And performing rigid body transformation on the reference model based on the standard data corresponding to the target to obtain the transformed reference model.
Because the pose of the target may differ at different moments and the reference model constructed based on the reference image may contain errors, the pose of the reference model may be rigidly transformed using standard data to reduce the influence of such errors. The standard data may be mean data of the target's pose, making the reference model of the target more standard and accurate. The rigid body transformation only applies rigid changes (such as fine adjustment of translation and rotation) to the reference model and does not deform it (such as stretching or scaling), so the fidelity of the reference model is preserved.
Optionally, an ICP (Iterative Closest Point) algorithm may be adopted to register the standard model corresponding to the standard data with the reference model to obtain the transformed reference model. Another PBR (Point-Based Registration) algorithm may also be used to register the standard model corresponding to the standard data with the reference model to obtain the transformed reference model, which is not limited in the embodiments of the present application.
3. And acquiring a pose matrix of the target in the reference image based on the transformed reference model.
Optionally, the same key point detection and rotation matrix detection as described above may be employed to obtain the pose matrix of the target in the reference image. The pose matrix of the target in the reference image is also constructed based on the world coordinate system and can be denoted as P_img ∈ R^(3×4).
4. And dividing the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image to obtain the initial registration relation between the image to be registered and the reference image.
Alternatively, the initial registration relationship may be represented as follows:
T_init = P_rest · P_img^(−1);
where P_rest is the pose matrix of the target in the image to be registered and P_img is the pose matrix of the target in the reference image. The initial registration relationship may be used to characterize the transformation relationship between the target in the image to be registered and the target in the reference image.
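Reading the "division" above as right-multiplication by the inverse, and lifting the 3×4 pose matrices to 4×4 homogeneous form so that they can be inverted and composed (an assumption made explicit here), a minimal sketch is:

import numpy as np

def to_homogeneous(P):
    """Lift a 3x4 pose matrix to 4x4 so it can be inverted/composed."""
    return np.vstack([P, [0.0, 0.0, 0.0, 1.0]])

def initial_registration(P_rest, P_img):
    """T_init = P_rest @ P_img^(-1): maps the target's pose in the reference
    image onto its pose in the image to be registered."""
    return to_homogeneous(P_rest) @ np.linalg.inv(to_homogeneous(P_img))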
According to the method and the device, the target is registered based on the reference model corresponding to the target, so that the registration better fits the real scene and the registration accuracy can be improved. In addition, compared with image-level registration performed directly on the reference image and the image to be registered, target registration based on the reference model can effectively reduce the amount of computation in the registration process, thereby improving registration efficiency.
In another example, image-level registration may also be performed directly based on the reference image and the image to be registered, which is not limited in the embodiments of the present application. Illustratively, matching pixels between the target in the reference image and the target in the image to be registered are obtained, and registration between the target in the reference image and the target in the image to be registered is then performed based on these matching pixels.
And 204, acquiring a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relation and the reference image.
The preliminary registration result refers to a result of preliminary registration of the target in the image to be registered. The pose of the target in the preliminary registration result is almost consistent with the pose of the target in the reference image. Optionally, the target in the image to be registered may be transformed according to the initial registration relationship, so as to obtain a preliminary registration result.
Under the condition of target registration based on the reference model, the target in the image to be registered can be converted according to the initial registration relation, and a preliminary registration model corresponding to the target in the image to be registered is obtained.
Under the condition of image-level target registration, the image to be registered can be converted according to the initial registration relation to obtain a preliminary registration image, and then model construction is carried out on the preliminary registration image to obtain a preliminary registration model corresponding to the target in the image to be registered.
Step 205, acquiring an optimized registration relation between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result; wherein the effective region of the depth image includes the target.
Alternatively, the effective region of the depth image may refer to the region of the target in the depth image. The optimized registration relationship may be used to characterize the transformation relationship between the target in the preliminary registration result and the target in the depth image.
In one example, the acquisition process to optimize the registration relationship may be as follows:
1. and acquiring point clouds corresponding to the preliminary registration result.
Alternatively, the point cloud corresponding to the preliminary registration result may be obtained by sampling from the preliminary registration result (e.g., the preliminary registration model) through methods such as uniform sampling, geometric sampling, random sampling, lattice point sampling, and the like.
2. And removing abnormal values in the point cloud corresponding to the preliminary registration result to obtain the optimized point cloud.
In the embodiment of the present application, outliers refer to points in the point cloud that are too far from or too close to the target. Illustratively, taking a surgical navigation scene as an example, a point-cloud extraction range can be set by combining the point cloud depth with the normal surgical distance, and points whose depth is below the lower limit or above the upper limit of the extraction range are removed to obtain the optimized point cloud. This removes noise from the point cloud and improves the registration accuracy of the target.
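A minimal sketch of this depth-range filtering, assuming an (N, 3) point cloud whose z component is the depth and an extraction range derived from the normal operating distance:

import numpy as np

def remove_depth_outliers(points, z_min, z_max):
    """Drop points outside the extraction range [z_min, z_max]."""
    z = points[:, 2]
    return points[(z >= z_min) & (z <= z_max)]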
3. Extract a target point cloud from the optimized point cloud according to the shooting direction corresponding to the image to be registered, where the target point cloud refers to the points observable from the shooting direction.
Illustratively, the points facing the image acquisition device can be directly extracted from the optimized point cloud to obtain the target point cloud, further removing noise and further improving the registration accuracy of the target.
4. Obtain the optimized registration relationship between the image to be registered and the reference image through iterative optimization based on the target point cloud and the point cloud corresponding to the effective region.
Optionally, the point cloud corresponding to the effective area may be obtained by sampling from the effective area through methods such as uniform sampling, geometric sampling, random sampling, lattice point sampling, and the like.
In one example, the effective region may be obtained through an effective region segmentation model. The effective region segmentation model may be constructed based on a segmentation network such as 2D UNet, DeepLab, or FCN (Fully Convolutional Network). The training loss of the effective region segmentation model may be calculated using a cross-entropy loss function, a Dice loss function, or the like.
In one example, an ICP (Iterative Closest Point) algorithm may be used to perform iterative optimization based on the target point cloud and the point cloud corresponding to the effective region, so as to obtain the optimized registration relationship. The optimized registration relationship may also be obtained by other PBR (point-based registration) algorithms based on these point clouds, which is not limited in the embodiment of the present application.
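Illustratively, steps 1 to 4 above may be sketched as follows with the Open3D library; the sampling count, the depth extraction range, the ICP correspondence distance, and the use of hidden-point removal to approximate "points observed in the shooting direction" are illustrative assumptions rather than values fixed by the embodiment.

```python
import numpy as np
import open3d as o3d

def optimize_registration(prelim_mesh, valid_region_pcd,
                          cam_center=(0.0, 0.0, 0.0), depth_range=(0.3, 1.5)):
    cam = np.asarray(cam_center, dtype=float)

    # Step 1: sample a point cloud from the preliminary registration model.
    src = prelim_mesh.sample_points_uniformly(number_of_points=20000)

    # Step 2: remove outliers outside the assumed working-distance range (meters).
    pts = np.asarray(src.points)
    depth = np.linalg.norm(pts - cam, axis=1)
    keep = np.where((depth > depth_range[0]) & (depth < depth_range[1]))[0]
    src = src.select_by_index(keep)

    # Step 3: keep only points visible from the shooting direction
    # (hidden-point removal approximates "observed in the shooting direction").
    diameter = np.linalg.norm(src.get_max_bound() - src.get_min_bound())
    _, visible_idx = src.hidden_point_removal(cam, radius=diameter * 100)
    src = src.select_by_index(visible_idx)

    # Step 4: ICP between the visible model points and the effective-region points.
    result = o3d.pipelines.registration.registration_icp(
        src, valid_region_pcd,
        max_correspondence_distance=0.01,  # assumed threshold, in meters
        init=np.eye(4),
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation  # the optimized registration relation
```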
Step 206, adjusting the preliminary registration result according to the optimized registration relationship to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used for guiding the object to process the target.
Optionally, the preliminary registration result may be transformed according to the optimized registration relationship to obtain an optimized registration result.
Under the condition of target registration based on the reference model, the preliminary registration model can be adjusted according to the optimized registration relationship to obtain the optimized registration model corresponding to the target in the image to be registered.
Under the condition of image-level target registration, the preliminary registration image can be converted according to the optimized registration relationship to obtain an optimized registration image, and model construction is then performed on the optimized registration image to obtain the optimized registration model corresponding to the target in the image to be registered.
According to the method and the device, the preliminary registration result of the target is optimized in a targeted manner through the effective area of the depth image, so that the registration error in the preliminary registration result is effectively reduced, and the registration accuracy of the target is further improved.
In one example, after the optimized registration result (such as the optimized registration model) is obtained, it may be further converted from the world coordinate system into the coordinate system of the image acquisition device, and VR (Virtual Reality) or AR (Augmented Reality) views may be rendered from the perspective of the image acquisition device, so as to more intuitively guide the object in processing the target.
In summary, according to the technical scheme provided by the embodiment of the application, the targets are preliminarily registered based on the pose matrix of the targets in the image to be registered and the pose matrix of the targets in the reference image, and then the preliminary registration result is optimally registered based on the effective area of the depth image corresponding to the image to be registered and the preliminary registration result, so that the optimized registration result corresponding to the targets is obtained, full-automatic registration of the targets is realized, introduction of artificial errors is avoided, and the registration accuracy of the targets is improved. Meanwhile, the preliminary registration result of the target is subjected to targeted optimization through the effective region of the depth image, so that the registration error in the preliminary registration result is effectively reduced, and the registration accuracy of the target is further improved.
In addition, the target can be automatically registered only based on the image to be registered and the reference image, so that the target navigation is realized, the target navigation (such as the operation navigation of the target) is not required to be performed through expensive large-scale equipment, the usability and the convenience of the target navigation are greatly improved, and the cost of the target navigation is reduced.
In addition, by directly regressing the rotation matrix, the embodiment of the application can reduce the error of pose acquisition and avoid the gimbal-lock and large-pose problems that occur when regressing Euler angles, thereby improving the accuracy of pose acquisition and further improving the accuracy of target registration.
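Illustratively, the direct rotation-matrix regression described above (detailed later as splitting the output vector into two orthogonal rotation vectors) can be sketched as below; this mirrors the continuous 6D rotation representation, and the assumption that the network outputs exactly six values is illustrative.

```python
import numpy as np

def rotation_from_6d(v):
    """Build a 3x3 rotation matrix from a regressed 6-D output vector.

    Gram-Schmidt sketch: the first 3 values give the first column, the next 3
    are orthogonalized against it, and the third column is their cross product.
    The embodiment describes a split into two orthogonal rotation vectors but
    does not publish exact formulas, so this construction is an assumption.
    """
    a1, a2 = np.asarray(v[:3], float), np.asarray(v[3:6], float)
    b1 = a1 / np.linalg.norm(a1)              # first sub-matrix (first column)
    b2 = a2 - np.dot(b1, a2) * b1             # remove the component along b1
    b2 /= np.linalg.norm(b2)
    b3 = np.cross(b1, b2)                     # completes a right-handed basis
    return np.stack([b1, b2, b3], axis=1)     # columns form the rotation matrix
```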
Referring to fig. 7, a flowchart of a target registration method provided in another embodiment of the present application is shown. The execution subject of the steps of the method may be the terminal 10 or the server 20 in the embodiment environment shown in fig. 1, and the method may include the following steps (701-705).
Step 701, acquiring an image to be registered and a reference image containing the same target.
Step 701 is the same as that described in the above embodiments, and details not described in the embodiments of the present application may refer to the above embodiments, which are not described herein again.
Step 702, acquiring a preliminary registration result after manual registration.
The preliminary registration result refers to a result of preliminary registration of the target in the image to be registered. The pose of the target in the preliminary registration result is almost consistent with the pose of the target in the reference image.
When the rear side or the top of the target faces the image acquisition device, there may be no obvious key points, the target may be occluded, or the target may be incomplete, making it difficult for the rotation matrix detection model to predict the pose of the target; in such cases, a semi-automatic registration method is needed to register the target. Illustratively, an object (such as the surgery performer in a surgical navigation scene) manually performs preliminary registration between the target in the image to be registered and the target in the reference image to obtain a preliminary registration result, and the client may acquire this preliminary registration result.
Illustratively, the object marks key points on the target in the image to be registered and on the target in the reference image respectively, so that the client automatically obtains an initial registration relationship between the two targets; the target in the image to be registered is then converted through the initial registration relationship to obtain the preliminary registration result.
Step 703, dividing the effective area of the depth image to obtain a rigid area and a deformation area corresponding to the effective area.
Alternatively, the effective region of the depth image may be acquired by the effective region segmentation model. The active area includes a target.
The rigid region refers to a region that does not deform easily. For example, taking the head as an example, the forehead, nose, and the like are regions that hardly deform. The deformed region refers to a region that deforms easily; for example, the mouth, face, and the like are such regions.
For example, referring to fig. 8, taking the head as an example, the effective region corresponding to the head may be automatically segmented to obtain the regions such as the skin 801, the ears (not shown), the face 802, the eyes 803, the eyebrows 804, the mouth 805, and the nose 806. The rigid region and the deformed region are divided according to the eye 803. For example, the forehead (not shown) or the like above the eyes 803 is determined as a rigid region (which may also include the nose 806), and the mouth 805, the face 802, or the like below the eyes 803 is determined as a deformed region.
Step 704, performing iterative optimization to obtain an optimized registration relationship between the image to be registered and the reference image based on the point cloud corresponding to the preliminary registration result, the point cloud corresponding to the effective region, the point cloud weight parameter corresponding to the rigid region, and the point cloud weight parameter corresponding to the deformed region.
The point cloud weight parameter corresponding to the rigid region is larger than that corresponding to the deformed region, so that interference from the deformed region can be reduced while the deformed region is still utilized, which can improve the registration accuracy of the target. A sketch of one weighted iteration is given after this paragraph.
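Illustratively, one weighted alignment iteration may be sketched as follows: each model point is matched to its nearest effective-region point, and a weighted Kabsch step solves for the rigid update, with rigid-region points weighted more heavily. The weight values and the nearest-neighbor matching are illustrative assumptions; the embodiment does not publish its weighting scheme.

```python
import numpy as np
from scipy.spatial import cKDTree

def weighted_icp_step(src, dst, dst_weights):
    """One ICP-style step with per-point weights.

    src: (N, 3) preliminary-model points; dst: (M, 3) effective-region points;
    dst_weights: (M,) weights, e.g. 1.0 for rigid-region points and 0.3 for
    deformed-region points (assumed values).
    """
    _, idx = cKDTree(dst).query(src)   # nearest effective-region point per model point
    w = dst_weights[idx][:, None]
    mu_s = (w * src).sum(0) / w.sum()  # weighted centroids
    mu_d = (w * dst[idx]).sum(0) / w.sum()
    H = ((src - mu_s) * w).T @ (dst[idx] - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # weighted Kabsch rotation
    t = mu_d - R @ mu_s
    return R, t  # one rigid update; iterate until convergence
```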
Exemplarily, referring to fig. 9, taking the head as an example, the semi-automatic registration process may be as follows: the client constructs a preoperative head model 901 based on the reference image corresponding to the head, and then acquires a preliminary registration model 904 obtained by the surgery performer manually registering the preoperative head model 901 with the head in the image 902 to be registered at the current moment. The client performs effective-region segmentation on the depth image 903 at the current moment to obtain the rigid region and the deformed region corresponding to the effective region, then respectively acquires the point cloud of the preliminary registration model 904 and the point cloud corresponding to the effective region, and obtains the optimized registration relationship through iterative optimization with the ICP (Iterative Closest Point) algorithm based on these point clouds. The client performs registration optimization on the preliminary registration model 904 according to the optimized registration relationship to obtain an optimized registration model 905. Optionally, the client may render the optimized registration model 905 into a VR or AR view to present it to the surgery performer.
Step 705, adjusting the preliminary registration result according to the optimized registration relationship to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used for guiding the object to process the target.
Optionally, the preliminary registration result may be transformed according to the optimized registration relationship to obtain an optimized registration result. For example, in the case of performing target registration based on the target reference model, the preliminary registration model corresponding to the target may be adjusted according to the optimized registration relationship, so as to obtain the optimized registration model corresponding to the target in the registration image.
In summary, according to the technical scheme provided by the embodiment of the application, the preliminary registration result obtained through manual registration is optimized in a targeted manner based on the automatically acquired effective region of the depth image, realizing semi-automatic registration of the target; this avoids requiring the object to manually outline the effective region of the depth image and improves both the registration efficiency and the registration accuracy of the target.
In an exemplary embodiment, in a surgical navigation scenario, the target registration method provided by the embodiment of the present application may be used to navigate the head, the brain, the femur, the abdomen, the lungs, and other parts during surgery. The following describes the target registration method provided in the embodiments of the present application by taking the head as an example; the specific content may be as follows:
In order to improve the ease of target registration in surgical navigation, in the embodiment of the present application the target registration method is ported to a mobile terminal, such as a client corresponding to a target application (e.g., a surgical navigation application) installed and running in the terminal 10.
Illustratively, referring to fig. 10, the surgical navigation system may include a tablet computer 1001, a depth camera 1002, and a navigation stick 1003. The tablet computer 1001 serves as the computing platform; a surgical navigation application is installed on it and may be used to implement the target registration method provided by the embodiment of the present application. The depth camera 1002 may capture images of the head during surgery (hereinafter referred to as intra-operative images) in real time, each including an RGB image (i.e., a color image) and a depth image, i.e., an RGB-D image. The navigation stick 1003 may be a visual navigation stick and may be used to obtain the position of the needle tip in real time.
Communication between tablet 1001 and depth camera 1002 may be over a network. Tablet 1001 may acquire intraoperative images from depth camera 1002.
In an example, referring to fig. 11, the usage process of the client corresponding to the target application (e.g. the surgical navigation application) may be as follows:
1. A preoperative image of the head of the person undergoing surgery is acquired. Illustratively, in a hematoma emergency surgery scenario, a fast-scan CT may be acquired; in a tumor surgery scenario, an MRI may be acquired.
2. A preoperative plan is formulated, which mainly includes segmenting organs, planning the surgical approach, etc. based on the preoperative image. The segmented organs may include registration organs, used for registration with the intra-operative images (such as the RGB images and depth images described above), and navigation organs. Illustratively, a registration organ may be a rigid organ or structure, such as a bony part (skull, femur, spine, etc.). The navigation organs may include lesions, ventricles, and the like, and are used to assist the surgery performer in positioning.
Optionally, the segmented organs may also include non-rigid organs and structures, such as the abdomen and lungs, which may drift intraoperatively due to heartbeat and respiration, adding considerable difficulty to surgical navigation. Rigid organs or structures are therefore preferably selected for target registration.
Since the embodiment of the application is a surgical navigation scene, the organ segmentation may be performed manually or automatically. The automatic segmentation algorithm is not limited here; common 3D segmentation networks such as 3D UNet may be used.
3. Intra-operative registration: the client in the tablet computer 1001 acquires intra-operative images in real time through the depth camera 1002, and obtains the transformation relationship between the head in the intra-operative images and the head in the preoperative image by adopting the target registration method provided by the embodiment of the present application, so as to obtain the final registration result.
4. Intra-operative navigation: the client in the tablet computer 1001 performs spatial transformation on the final registration result, renders it into a VR or AR view, and displays it on the screen. Optionally, the client may automatically detect the position of the navigation stick 1003, display the needle tip in the VR or AR view, and display key structures such as the registration organs and navigation organs to assist the surgery performer in operating on the head.
In the embodiment of the present application, intra-operative registration is divided into automatic registration and semi-automatic registration. Semi-automatic registration may be employed when automatic registration is unavailable (e.g., key points cannot be detected) or may occasionally be erroneous (e.g., the head is incomplete).
In one example, referring to fig. 12, the specific process of intraoperative registration and intraoperative navigation may be as follows:
Based on the preoperative image, a preoperative head model 1203 corresponding to the head is constructed. For example, a binary segmentation image corresponding to the head may be obtained based on the skin corresponding to the head in the preoperative image, and the Marching Cubes algorithm is then adopted to convert the binary segmentation image into the preoperative head model 1203, which is composed of a plurality of triangular patches. A sketch of this conversion follows.
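Illustratively, the conversion from a binary segmentation volume to a triangular-patch surface may be sketched with scikit-image's Marching Cubes implementation; the embodiment names the algorithm but not a library, so this choice (and the voxel spacing) is an assumption.

```python
import numpy as np
from skimage import measure

def binary_mask_to_mesh(mask, spacing=(1.0, 1.0, 1.0)):
    """mask: 3D binary array from the skin segmentation; spacing: voxel size (mm).

    Returns vertices and faces; each row of faces indexes three vertices,
    i.e., one triangular patch of the head surface model.
    """
    verts, faces, _normals, _values = measure.marching_cubes(
        mask.astype(np.float32), level=0.5, spacing=spacing)
    return verts, faces
```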
The preoperative head model 1203 is registered to a standard head model corresponding to the person undergoing surgery using the ICP algorithm to realize standard pose alignment, obtaining a calibrated preoperative head model 1203.
Head detection is performed on the RGB image 1201 corresponding to the intra-operative image through a target detection model to obtain a bounding box corresponding to the head, and a head image is then cropped from the RGB image 1201 based on the enlarged bounding box.
Pose detection is performed on the head image through the rotation matrix detection model to obtain a rotation matrix corresponding to the head; key point detection is performed on the head image through the key point detection model to obtain a plurality of key points corresponding to the head (such as the left and right eye corners, the nose tip, and the left and right mouth corners); target key points are determined from these key points, and their 2D coordinates are obtained. For example, the eye corners may be determined as the target key points when the head faces the depth camera frontally, and the nose tip may be determined as the target key point when the side of the head faces the depth camera.
The depth image 1202 corresponding to the intra-operative image is aligned with the RGB image 1201 to obtain an aligned depth image 1202, and the depth information corresponding to the target key points is acquired from the aligned depth image 1202.
The 3D coordinates of the target key points are calculated based on the depth information corresponding to the target key points and their 2D coordinates, and the translation vector of the target key points in the world coordinate system is further calculated based on the 3D coordinates.
Based on the translation vector and the rotation matrix, the pose matrix of the head in the RGB image is calculated; the pose matrix corresponding to the calibrated preoperative head model 1203 is acquired; and the two pose matrices are divided to obtain the initial registration relationship corresponding to the head. A sketch of this computation is given below.
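Illustratively, the back-projection and pose-matrix division above may be sketched as below; the pinhole intrinsics and the direction of the division (which matrix is inverted) are illustrative assumptions consistent with the description, not formulas fixed by the embodiment.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) plus aligned depth -> 3D point, pinhole camera model."""
    return np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])

def pose_matrix(R, t):
    """Assemble a 4x4 pose matrix from a 3x3 rotation and a translation vector."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def initial_registration(T_model, T_image):
    # Matrix "division": one plausible reading is the transform taking the
    # head pose in the RGB image onto the calibrated preoperative model pose.
    return T_model @ np.linalg.inv(T_image)
```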
The head in the RGB image is transformed according to the initial registration relationship to obtain a preliminary registration head model. For example, referring to fig. 13, the preliminary registration head model 1301 nearly coincides with the preoperative head model 1302.
Both the key point detection and the rotation matrix detection are based on the RGB image, and a certain error exists between the pose matrix obtained from the RGB image and the actual pose of the intra-operative head; the preliminary registration head model therefore needs to be further optimized based on the depth image 1202 to reach the registration accuracy required for surgical navigation.
The point cloud of the preliminary registration head model is acquired, outliers are removed from it to obtain an optimized point cloud, and a target point cloud is extracted from the optimized point cloud according to the shooting direction of the depth camera, where the target point cloud refers to the points observable from the shooting direction.
The effective region corresponding to the depth image 1202 and the head three-dimensional point cloud corresponding to the effective region are acquired, and outliers are removed from that point cloud to obtain an optimized head three-dimensional point cloud corresponding to the effective region. The effective region includes the head.
The effective region is divided into a rigid region and a deformed region, and the point cloud weight parameter corresponding to the rigid region is set larger than that corresponding to the deformed region.
Using the ICP algorithm, the transformation relationship between the preliminary registration head model and the head in the depth image, i.e., the optimized registration relationship, is obtained iteratively based on the target point cloud corresponding to the preliminary registration head model, the optimized head three-dimensional point cloud corresponding to the effective region, the point cloud weight parameter corresponding to the rigid region, and the point cloud weight parameter corresponding to the deformed region.
The preliminary registration head model is optimized according to the optimized registration relationship to obtain an optimized registration head model. For example, referring to fig. 14, the optimized registration head model 1401 coincides with the preoperative head model 1402.
Optionally, based on the above embodiment, after the optimized registration result is obtained, it may be converted from the world coordinate system into the coordinate system of the image acquisition device; the navigation stick and the needle tip may be synchronously displayed in the optimized registration result under the image acquisition device coordinate system according to their positions in the image to be registered; and the optimized registration result includes a navigation organ, which is used to assist the object in processing the target.
For example, the optimized registration head model may be converted from the world coordinate system into the depth camera coordinate system, the optimized registration head model under the depth camera coordinate system may be rendered into a VR or AR view, and the navigation stick and the needle tip may be synchronously displayed in it according to their positions in the intra-operative image. For example, fig. 15 shows the optimized registration head model 1502 under different shooting angles and different display options, and the positions of the navigation stick 1501 and its needle tip relative to the optimized registration head model 1502.
When no obvious key points can be detected on the head (for example, the back of the head faces the depth camera) or the head is occluded, the surgery performer may manually register the preoperative head model with the real intra-operative head using the semi-automatic registration method. To ensure the accuracy and efficiency of surgical navigation, the embodiment of the present application automatically acquires the effective region corresponding to the depth image and automatically optimizes the preliminary registration head model through the effective region to obtain the optimized registration head model, thereby improving the accuracy of surgical navigation.
Illustratively, the preliminary registration head model after manual registration is acquired; the effective region of the depth image is divided to obtain the corresponding rigid region and deformed region; the optimized registration relationship is obtained through iterative optimization based on the point cloud corresponding to the preliminary registration head model, the point cloud corresponding to the effective region, the point cloud weight parameter corresponding to the rigid region, and the point cloud weight parameter corresponding to the deformed region; and finally the preliminary registration head model is optimized according to the optimized registration relationship to obtain the optimized registration head model.
In summary, according to the technical scheme provided by the embodiment of the application, the target (such as the head) is preliminarily registered based on its pose matrix in the intra-operative image and the pose matrix of its preoperative model, and the preliminary registration model is then optimally registered based on the effective region of the depth image corresponding to the intra-operative image, obtaining the optimized registration model corresponding to the target. This realizes fully automatic intra-operative registration of the target, avoids the introduction of human error, and improves the intra-operative registration accuracy of the target. Meanwhile, the preliminary registration model of the target is optimized in a targeted manner through the effective region of the depth image, so that the registration error in the preliminary registration model is effectively reduced and the intra-operative registration accuracy of the target is further improved.
In addition, automatic intra-operative registration of the target can be realized based only on the intra-operative images (RGB and depth images) and the preoperative image, thereby realizing intra-operative navigation of the target without expensive large-scale equipment, greatly improving the usability and convenience of intra-operative target navigation and reducing its cost.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 16, a block diagram of a target registration apparatus provided in an embodiment of the present application is shown. The apparatus may be used to implement the target registration method described above. The apparatus 1600 may include: a registration image acquisition module 1601, a pose matrix acquisition module 1602, an initial relationship acquisition module 1603, a preliminary result acquisition module 1604, an optimization relationship acquisition module 1605, and an optimization result acquisition module 1606.
A registration image obtaining module 1601, configured to obtain an image to be registered and a reference image that contain the same target.
A pose matrix obtaining module 1602, configured to detect the image to be registered, to obtain a pose matrix of the target in the image to be registered, where the pose matrix is used to represent a position and a posture of the target in the image.
An initial relationship obtaining module 1603, configured to obtain an initial registration relationship between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; wherein the initial registration relationship is used for preliminarily characterizing a transformation relationship between the image to be registered and the reference image.
A preliminary result obtaining module 1604, configured to obtain a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relationship and the reference image.
An optimization relation obtaining module 1605, configured to obtain an optimized registration relation between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result; wherein the active area of the depth image includes the target.
An optimized result obtaining module 1606, configured to adjust the preliminary registration result according to the optimized registration relationship to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used to guide the object to process the target.
In an exemplary embodiment, the optimization relationship obtaining module 1605 is configured to:
acquiring point clouds corresponding to the preliminary registration result;
removing abnormal values in the point cloud corresponding to the preliminary registration result to obtain an optimized point cloud;
extracting a target point cloud from the optimized point cloud according to the shooting direction corresponding to the image to be registered, wherein the target point cloud is observed in the shooting direction;
and based on the target point cloud and the point cloud corresponding to the effective area, carrying out iterative optimization to obtain an optimized registration relation between the image to be registered and the reference image.
In an exemplary embodiment, as shown in fig. 17, the pose matrix acquisition module 1602 includes: a bounding box obtaining sub-module 1602a, a screenshot image obtaining sub-module 1602b, a target point obtaining sub-module 1602c, a rotation matrix obtaining sub-module 1602d, and a pose matrix obtaining sub-module 1602e.
The bounding box obtaining sub-module 1602a is configured to perform target detection on the image to be registered, so as to obtain a bounding box corresponding to the target.
The screenshot image obtaining sub-module 1602b is configured to crop a captured image including the target from the image to be registered based on the bounding box corresponding to the target.
A target point obtaining sub-module 1602c, configured to perform key point detection on the captured image to obtain a target key point corresponding to the target.
A rotation matrix obtaining sub-module 1602d, configured to perform posture detection on the captured image to obtain a rotation matrix corresponding to the target, where the rotation matrix is used to represent the posture of the target.
A pose matrix obtaining sub-module 1602e, configured to obtain a pose matrix of the target in the image to be registered based on the target key point and the rotation matrix.
In an exemplary embodiment, the screenshot image obtaining sub-module 1602b is configured to:
enlarge the bounding box to obtain an enlarged bounding box;
and crop a captured image including the target from the image to be registered according to the enlarged bounding box.
In an exemplary embodiment, the pose matrix acquisition sub-module 1602e is configured to:
acquiring depth information of the target key point based on the depth image corresponding to the image to be registered;
acquiring a translation vector of the target key point in a world coordinate system based on the plane coordinate of the target key point in the image to be registered and the depth information of the target key point; the translation vector is used for representing the position of the target key point in the image to be registered;
and constructing and obtaining a pose matrix of the target in the image to be registered based on the translation vector and the rotation matrix.
In an exemplary embodiment, the rotation matrix obtaining sub-module 1602d is configured to:
performing posture detection on the captured image to obtain an output vector corresponding to the target, wherein the output vector is used for representing the posture of the target in the captured image;
splitting the output vector to obtain a first orthogonal rotation vector and a second orthogonal rotation vector;
converting the first orthogonal rotation vector to obtain a first sub-matrix;
obtaining a second sub-matrix based on the first sub-matrix and the second orthogonal rotation vector;
converting the second orthogonal rotation vector to obtain a third sub-matrix;
obtaining a fourth sub-matrix based on the first sub-matrix and the third sub-matrix;
and obtaining a rotation matrix corresponding to the target based on the first sub-matrix, the third sub-matrix and the fourth sub-matrix.
In an exemplary embodiment, as shown in fig. 17, the initial relationship obtaining module 1603 includes: a reference model building sub-module 1603a, a reference model transformation sub-module 1603b, and an initial relationship obtaining sub-module 1603c.
A reference model building sub-module 1603a, configured to build a reference model corresponding to the target based on the reference image.
The reference model transformation submodule 1603b is configured to perform rigid-body transformation on the reference model based on the standard data corresponding to the target to obtain a transformed reference model.
The pose matrix obtaining module 1602 is further configured to obtain a pose matrix of the target in the reference image based on the transformed reference model.
An initial relationship obtaining sub-module 1603c, configured to divide the pose matrix of the target in the image to be registered by the pose matrix of the target in the reference image, so as to obtain an initial registration relationship between the image to be registered and the reference image.
In an exemplary embodiment, the reference model building submodule 1603a is configured to:
acquiring a binary segmentation image corresponding to the target based on the surface data of the target in the reference image;
converting the binary segmentation image to obtain a reference model corresponding to the target; wherein the reference model is composed of a plurality of triangular patches.
In an exemplary embodiment, the preliminary result obtaining module 1604 is configured to adjust the target in the image to be registered according to the initial registration relationship to obtain a preliminary registration model corresponding to the target in the image to be registered.
The optimization result obtaining module 1606 is configured to adjust the preliminary registration model according to the optimized registration relationship to obtain an optimized registration model corresponding to the target in the image to be registered.
In an exemplary embodiment, as shown in fig. 17, the apparatus 1600 further includes: a coordinate system conversion module 1607 and an optimization result presentation module 1608.
A coordinate system conversion module 1607, configured to convert the optimized registration result from the world coordinate system to the optimized registration result in the coordinate system of the image acquisition device.
An optimized result display module 1608, configured to synchronously display the navigation stick and the needle tip in the optimized registration result under the image acquisition device coordinate system according to the positions of the navigation stick and the needle tip in the image to be registered; wherein the optimized registration result includes a navigation organ, and the navigation organ is used for assisting the object in processing the target.
In an exemplary embodiment, as shown in fig. 17, the apparatus 1600 further includes: an active area division module 1609.
The preliminary result obtaining module 1604 is further configured to obtain a preliminary registration result after the manual registration.
An effective region dividing module 1609, configured to divide the effective region of the depth image to obtain a rigid region and a deformation region corresponding to the effective region.
The optimized relation obtaining module 1605 is configured to perform iterative optimization to obtain an optimized registration relation between the image to be registered and the reference image based on the point cloud corresponding to the preliminary registration result, the point cloud corresponding to the effective region, the point cloud weight parameter corresponding to the rigid region, and the point cloud weight parameter corresponding to the deformed region; and the point cloud weight parameter corresponding to the rigid area is greater than the point cloud weight parameter corresponding to the deformed area.
In summary, according to the technical scheme provided by the embodiment of the application, the targets are preliminarily registered based on the pose matrix of the targets in the image to be registered and the pose matrix of the targets in the reference image, and then the preliminary registration result is optimally registered based on the effective area of the depth image corresponding to the image to be registered and the preliminary registration result, so that the optimized registration result corresponding to the targets is obtained, full-automatic registration of the targets is realized, introduction of artificial errors is avoided, and the registration accuracy of the targets is improved. Meanwhile, the preliminary registration result of the target is subjected to targeted optimization through the effective region of the depth image, so that the registration error in the preliminary registration result is effectively reduced, and the registration accuracy of the target is further improved.
In addition, the target can be automatically registered only based on the image to be registered and the reference image, so that the target navigation is realized, the target navigation (such as the operation navigation of the target) is not required to be performed through expensive large-scale equipment, the usability and the convenience of the target navigation are greatly improved, and the cost of the target navigation is reduced.
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
Referring to fig. 18, a schematic structural diagram of a computer device according to an embodiment of the present application is shown. The computer device may be any electronic device with data calculation, processing, and storage functions, and may be used to implement the target registration method provided in the above embodiments. Specifically, the computer device may include the following components.
The computer device 1800 includes a central processing unit (e.g., a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field Programmable Gate Array), etc.) 1801, a system memory 1804 including a RAM (Random-Access Memory) 1802 and a ROM (Read-Only Memory) 1803, and a system bus 1805 connecting the system memory 1804 and the central processing unit 1801. The computer device 1800 also includes a basic Input/Output system (I/O system) 1806 that facilitates the transfer of information between the components within the computer device, and a mass storage device 1807 for storing an operating system 1813, application programs 1814, and other program modules 1815.
In some embodiments, the basic input/output system 1806 includes a display 1808 for displaying information and an input device 1809, such as a mouse or keyboard, for user input. The display 1808 and the input device 1809 are both connected to the central processing unit 1801 through an input/output controller 1810 connected to the system bus 1805. The basic input/output system 1806 may also include the input/output controller 1810 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus; similarly, the input/output controller 1810 also provides output to a display screen, a printer, or another type of output device.
The mass storage device 1807 is connected to the central processing unit 1801 through a mass storage controller (not shown) connected to the system bus 1805. The mass storage device 1807 and its associated computer-readable media provide non-volatile storage for the computer device 1800. That is, the mass storage device 1807 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM (Compact Disc Read-Only Memory) drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other solid-state memory technology, CD-ROM, DVD (Digital Versatile Disc) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media are not limited to the foregoing. The system memory 1804 and the mass storage device 1807 described above may be collectively referred to as memory.
The computer device 1800 may also operate in accordance with embodiments of the present application by connecting to remote computers over a network, such as the internet. That is, the computer device 1800 may be connected to the network 1812 through the network interface unit 1811 that is coupled to the system bus 1805, or the network interface unit 1811 may be used to connect to other types of networks or remote computer systems (not shown).
The memory also includes a computer program stored in the memory and configured to be executed by the one or more processors to implement the target registration method described above.
In an exemplary embodiment, a computer-readable storage medium is also provided, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the above-mentioned target registration method.
Optionally, the computer-readable storage medium may include: ROM (Read-Only Memory), RAM (Random-Access Memory), SSD (Solid State Drive), an optical disc, or the like. The random-access memory may include ReRAM (Resistive Random-Access Memory) and DRAM (Dynamic Random-Access Memory).
In an exemplary embodiment, a computer program product or a computer program is also provided, which comprises computer instructions, which are stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the target registration method described above.
It should be noted that information (including but not limited to subject device information, subject personal information, etc.), data (including but not limited to data for analysis, stored data, displayed data, etc.), and signals referred to in this application are authorized by the subject or fully authorized by all parties, and the collection, use, and processing of the relevant data need to comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the images referred to in this application (e.g., the image to be registered, the reference image, sample images, etc.) were all obtained with sufficient authorization.
It should be understood that reference to "a plurality" herein means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. In addition, the step numbers described herein only exemplarily show one possible execution sequence among the steps, and in some other embodiments, the steps may also be executed out of the numbering sequence, for example, two steps with different numbers are executed simultaneously, or two steps with different numbers are executed in a reverse order to the order shown in the figure, which is not limited by the embodiment of the present application.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method of target registration, the method comprising:
acquiring an image to be registered and a reference image which contain the same target;
detecting the image to be registered to obtain a pose matrix of the target in the image to be registered, wherein the pose matrix is used for representing the position and the posture of the target in the image;
acquiring an initial registration relation between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; wherein the initial registration relation is used for preliminarily characterizing a transformation relation between the image to be registered and the reference image;
acquiring a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relation and the reference image;
acquiring an optimized registration relation between the image to be registered and the reference image based on the effective area of the depth image corresponding to the image to be registered and the preliminary registration result; wherein the active region of the depth image includes the target;
adjusting the preliminary registration result according to the optimized registration relation to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used to guide the object to process the target.
2. The method according to claim 1, wherein the obtaining an optimized registration relationship between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result comprises:
acquiring point clouds corresponding to the preliminary registration result;
removing abnormal values in the point cloud corresponding to the preliminary registration result to obtain an optimized point cloud;
extracting a target point cloud from the optimized point cloud according to the shooting direction corresponding to the image to be registered, wherein the target point cloud is observed in the shooting direction;
and based on the target point cloud and the point cloud corresponding to the effective area, carrying out iterative optimization to obtain an optimized registration relation between the image to be registered and the reference image.
3. The method according to claim 1, wherein the detecting the image to be registered to obtain a pose matrix of the target in the image to be registered comprises:
carrying out target detection on the image to be registered to obtain a bounding box corresponding to the target;
cropping a captured image including the target from the image to be registered based on the bounding box corresponding to the target;
performing key point detection on the captured image to obtain a target key point corresponding to the target;
detecting the posture of the captured image to obtain a rotation matrix corresponding to the target, wherein the rotation matrix is used for representing the posture of the target;
and obtaining a pose matrix of the target in the image to be registered based on the target key point and the rotation matrix.
4. The method according to claim 3, wherein the cropping a captured image including the target from the image to be registered based on the bounding box corresponding to the target comprises:
enlarging the bounding box to obtain an enlarged bounding box;
and cropping a captured image including the target from the image to be registered according to the enlarged bounding box.
5. The method according to claim 3, wherein the deriving a pose matrix of the target in the image to be registered based on the target keypoints and the rotation matrix comprises:
acquiring depth information of the target key point based on the depth image corresponding to the image to be registered;
acquiring a translation vector of the target key point in a world coordinate system based on the plane coordinate of the target key point in the image to be registered and the depth information of the target key point; the translation vector is used for representing the position of the target key point in the image to be registered;
and constructing and obtaining a pose matrix of the target in the image to be registered based on the translation vector and the rotation matrix.
6. The method according to claim 3, wherein the performing posture detection on the captured image to obtain a rotation matrix corresponding to the target comprises:
detecting the posture of the captured image to obtain an output vector corresponding to the target, wherein the output vector is used for representing the posture of the target in the captured image;
splitting the output vector to obtain a first orthogonal rotation vector and a second orthogonal rotation vector;
converting the first orthogonal rotation vector to obtain a first sub-matrix;
obtaining a second sub-matrix based on the first sub-matrix and the second orthogonal rotation vector;
converting the second orthogonal rotation vector to obtain a third sub-matrix;
obtaining a fourth sub-matrix based on the first sub-matrix and the third sub-matrix;
and obtaining a rotation matrix corresponding to the target based on the first sub-matrix, the third sub-matrix and the fourth sub-matrix.
7. The method according to claim 1, wherein the obtaining an initial registration relationship between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image comprises:
constructing a reference model corresponding to the target based on the reference image;
performing rigid body transformation on the reference model based on the standard data corresponding to the target to obtain a transformed reference model;
acquiring a pose matrix of the target in the reference image based on the transformed reference model;
and dividing the pose matrix of the target in the image to be registered by the pose matrix of the target in the reference image to obtain the initial registration relation between the image to be registered and the reference image.
8. The method of claim 7, wherein the constructing a reference model corresponding to the target based on the reference image comprises:
acquiring a binary segmentation image corresponding to the target based on the surface data of the target in the reference image;
converting the binary segmentation image to obtain a reference model corresponding to the target; wherein the reference model is composed of a plurality of triangular patches.
9. The method according to claim 7, wherein the obtaining a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relationship and the reference image comprises:
adjusting the target in the image to be registered according to the initial registration relation to obtain a preliminary registration model corresponding to the target in the image to be registered;
the adjusting the preliminary registration result according to the optimized registration relationship to obtain an optimized registration result corresponding to the target in the image to be registered includes:
and adjusting the preliminary registration model according to the optimized registration relation to obtain an optimized registration model corresponding to the target in the image to be registered.
10. The method of claim 1, further comprising:
converting the optimized registration result from a world coordinate system to an optimized registration result under an image acquisition device coordinate system;
synchronously displaying the navigation stick and the needle tip in the optimized registration result under the image acquisition device coordinate system according to the positions of the navigation stick and the needle tip in the image to be registered;
wherein the optimized registration result comprises a navigation organ, and the navigation organ is used for assisting the object in processing the target.
11. The method of claim 1, further comprising:
acquiring a preliminary registration result after manual registration;
dividing the effective area of the depth image to obtain a rigid area and a deformation area corresponding to the effective area;
performing iterative optimization to obtain an optimized registration relation between the image to be registered and the reference image based on the point cloud corresponding to the preliminary registration result, the point cloud corresponding to the effective area, the point cloud weight parameter corresponding to the rigid area and the point cloud weight parameter corresponding to the deformed area;
and the point cloud weight parameter corresponding to the rigid area is greater than the point cloud weight parameter corresponding to the deformed area.
12. An apparatus for target registration, the apparatus comprising:
the registration image acquisition module is used for acquiring an image to be registered and a reference image which contain the same target;
a pose matrix acquisition module, configured to detect the image to be registered to obtain a pose matrix of the target in the image to be registered, where the pose matrix is used to represent a position and a posture of the target in the image;
an initial relationship obtaining module, configured to obtain an initial registration relationship between the image to be registered and the reference image based on the pose matrix of the target in the image to be registered and the pose matrix of the target in the reference image; wherein the initial registration relation is used for preliminarily characterizing a transformation relation between the image to be registered and the reference image;
a preliminary result obtaining module, configured to obtain a preliminary registration result corresponding to the target in the image to be registered according to the initial registration relationship and the reference image;
an optimization relationship obtaining module, configured to obtain an optimized registration relationship between the image to be registered and the reference image based on the effective region of the depth image corresponding to the image to be registered and the preliminary registration result; wherein the effective region of the depth image includes the target;
an optimized result obtaining module, configured to adjust the preliminary registration result according to the optimized registration relationship to obtain an optimized registration result corresponding to the target in the image to be registered; wherein the optimized registration result is used to guide the object to process the target.
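(Editorial illustration, not part of the claims.) One natural reading of the initial relationship obtaining module is that the initial registration relationship composes the two pose matrices, mapping the target's pose in the image to be registered onto its pose in the reference image. A hedged sketch, assuming 4x4 pose matrices and this particular composition convention:

    import numpy as np

    def initial_registration(pose_to_register, pose_reference):
        # T maps points posed by pose_to_register into the reference
        # pose frame: T = P_ref @ inv(P_reg) (illustrative convention).
        return pose_reference @ np.linalg.inv(pose_to_register)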
13. A computer device, characterized in that the computer device comprises a processor and a memory, wherein a computer program is stored in the memory and is loaded and executed by the processor to implement the target registration method according to any one of claims 1 to 11.
14. A computer-readable storage medium, wherein a computer program is stored in the storage medium, the computer program being loaded and executed by a processor to implement the target registration method according to any one of claims 1 to 11.
15. A computer program product or computer program, characterized in that it comprises computer instructions stored in a computer-readable storage medium, wherein a processor reads the computer instructions from the storage medium and executes them to implement the target registration method according to any one of claims 1 to 11.
CN202210822686.7A 2022-07-12 2022-07-12 Target registration method, device, equipment, storage medium and program product Pending CN115187550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210822686.7A CN115187550A (en) 2022-07-12 2022-07-12 Target registration method, device, equipment, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210822686.7A CN115187550A (en) 2022-07-12 2022-07-12 Target registration method, device, equipment, storage medium and program product

Publications (1)

Publication Number Publication Date
CN115187550A true CN115187550A (en) 2022-10-14

Family

ID=83519795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210822686.7A Pending CN115187550A (en) 2022-07-12 2022-07-12 Target registration method, device, equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN115187550A (en)

Similar Documents

Publication Publication Date Title
US20180174311A1 (en) Method and system for simultaneous scene parsing and model fusion for endoscopic and laparoscopic navigation
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
JP2020087440A5 (en)
JP7015152B2 (en) Processing equipment, methods and programs related to key point data
EP3789965B1 (en) Method for controlling a display, computer program and mixed reality display device
WO2021052208A1 (en) Auxiliary photographing device for movement disorder disease analysis, control method and apparatus
CN110751681B (en) Augmented reality registration method, device, equipment and storage medium
KR101510312B1 (en) 3D face-modeling device, system and method using Multiple cameras
CN112927362A (en) Map reconstruction method and device, computer readable medium and electronic device
WO2020102584A2 (en) Convolutional neural networks for efficient tissue segmentation
US11443473B2 (en) Systems and methods for generating a skull surface for computer animation
US20230050857A1 (en) Systems and methods for masking a recognized object during an application of a synthetic element to an original image
JP6493885B2 (en) Image alignment apparatus, method of operating image alignment apparatus, and image alignment program
KR20210052270A (en) Method, apparatus and computer program for providing augmented reality based medical information of patient
CN111079470B (en) Method and device for detecting human face living body
CN112686865B (en) 3D view auxiliary detection method, system, device and storage medium
KR102213412B1 (en) Method, apparatus and program for generating a pneumoperitoneum model
CN115187550A (en) Target registration method, device, equipment, storage medium and program product
CN113786229B (en) Auxiliary puncture navigation system based on AR augmented reality
CN110559075B (en) Intraoperative augmented reality registration method and device
US20240074811A1 (en) Systems and methods for visualizing anatomical structure of patient during surgery
CN206863747U (en) A kind of minimally invasive spine surgical training system with true force feedback
CN116958394A (en) Image generation method, device, equipment and storage medium
JP2024522535A (en) Method for correcting facial deformation in facial depth image, imaging device and storage medium
Maekawa et al. Dense 3D organ modeling from a laparoscopic video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination