CN115294281A - Pose estimation method and navigation system for robot-assisted surgery - Google Patents

Pose estimation method and navigation system for robot-assisted surgery Download PDF

Info

Publication number
CN115294281A
Authority
CN
China
Prior art keywords
pose
point
mechanical arm
robot
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210998737.1A
Other languages
Chinese (zh)
Inventor
陈飞
邓露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Second Xiangya Hospital of Central South University
Original Assignee
Second Xiangya Hospital of Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Second Xiangya Hospital of Central South University filed Critical Second Xiangya Hospital of Central South University
Priority to CN202210998737.1A priority Critical patent/CN115294281A/en
Publication of CN115294281A publication Critical patent/CN115294281A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/136 - Segmentation; Edge detection involving thresholding
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 - Proximity, similarity or dissimilarity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10028 - Range image; Depth image; 3D point clouds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20092 - Interactive image processing based on input by user
    • G06T2207/20104 - Interactive definition of region of interest [ROI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30004 - Biomedical image processing
    • G06T2207/30008 - Bone
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30244 - Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of medical informatics and in particular relates to a pose estimation method and a navigation system for robot-assisted surgery. The invention provides a new pose estimation method for robot-assisted surgery: a three-dimensional bone reconstruction model is built from preoperative medical images and matched, in a 3D-3D manner, against the point cloud of the patient's spine surface acquired by a depth camera during the operation, with edge information added so that the features are more distinctive and the matching more accurate. During the screening of multiple candidate poses, the edge matching proposed by the invention effectively and quickly selects the correct pose.

Description

Pose estimation method and navigation system for robot-assisted surgery
Technical Field
The invention belongs to the technical field of medical informatics and in particular relates to a pose estimation method and a navigation system for robot-assisted surgery.
Background
With the continuous development of medical imaging, computer-assisted surgery and computer vision, surgery assisted by a surgical navigation system has become a trend. A surgical navigation system is built from the patient's CT, MRI, US and other imaging data. It enables surgeons to rehearse critical procedures before surgery by combining virtual-reality space with visualization techniques, and during the operation the position of the surgical instrument relative to the lesion can be tracked in real time to assist the doctor. Compared with traditional surgery, a surgical navigation system offers accurate positioning, less operative trauma and lower surgical risk. Within surgical navigation, the design and implementation of the positioning module are particularly important.
Several methods of surgical navigational positioning are now common:
1) Image matching: the medical images of the patient acquired before the operation are reconstructed in three dimensions to obtain a virtual 3D image of the patient; the position of the light source of the scanning device in the coordinate system of a positioning and tracking device is then acquired; during the operation the virtual 3D image is projected through a virtual light source, the relative position between the virtual light source and the virtual 3D image is determined, and the final position information is computed. This method is susceptible to lighting, and the additional hardware (such as the positioning and tracking device) makes it cumbersome to operate.
2) X-ray scanning: first image data obtained by scanning the intraoperative object with an X-ray device and the corresponding motion-trajectory coordinates generated in the navigation space during the scan are acquired, and surgical navigation space registration is then performed from the image data, the motion-trajectory coordinates and a pre-computed spatial transformation matrix between the X-ray device's image data and motion-trajectory coordinates. Although accurate, this method relies on intraoperative X-rays, exposing both physician and patient to radiation.
3) Chip positioning: the CT, magnetic resonance and ultrasound devices of a three-dimensional image reconstruction module each collect a number of two-dimensional images of the patient, which are combined into a first three-dimensional image of the lesion area; several base stations of a three-dimensional positioning error-correction module are placed at target points of the operating room and the CT room, several positioning chips are implanted around a marker position inside the patient's body by puncture, a second three-dimensional image is built through the communication between the positioning chips and the base stations, and the target position is then computed from the images. Although precise, this method requires a lot of hardware and is expensive.
Given these shortcomings of existing surgical navigation, a surgical navigation design that is insensitive to lighting, simple to operate, light on hardware, accurate and real-time is urgently needed.
Disclosure of Invention
In view of the above problems, the present invention provides a new pose estimation method and navigation system for robot-assisted surgery.
The specific technical scheme of the invention is as follows:
the invention provides a pose estimation method for a robot-assisted surgery, which comprises the following steps:
s1: constructing a bone three-dimensional reconstruction model by using a medical image of a patient, and preprocessing the bone three-dimensional reconstruction model in an off-line state;
s2: judging the visibility of the point pair characteristics and the geometric complexity of the point pair characteristics by using a normal vector included angle between the point pairs, extracting all visible point pair characteristics and non-planar point pair characteristics of the bone three-dimensional reconstruction model, and constructing a characteristic hash table by using the extracted point pair characteristics;
s3: acquiring surface point clouds of bones of a patient shot by a depth camera in an online state, and performing point cloud sampling based on edge information;
s4: and performing point pair feature extraction and matching on the surface point cloud and the three-dimensional reconstruction model based on the two voting balls to obtain candidate poses, performing clustering processing on the candidate poses, and screening based on the edge matching degree to obtain result poses.
A method of controlling a surgical robotic arm using a visual servoing method, the method comprising the steps of:
s100: performing coordinate transformation on the bone three-dimensional reconstruction model, a drilling coordinate system, a depth camera coordinate system, a mechanical arm tail end coordinate system and a mechanical arm base coordinate system, and obtaining a difference value from an actual position to a result pose based on a transformation relation from the actual pose to the result pose;
s200: acquiring images or point clouds of one or more cameras for capturing the end effector of the mechanical arm in real time, and generating the position of the end effector of the mechanical arm according to the images or the point clouds;
s300: calculating the movement speed of the mechanical arm end effector based on the difference value from the actual position to the result pose, generating a control command and sending the control command to the mechanical arm end effector;
s400: and judging whether the movement speed of the mechanical arm end effector is 0 or not, if so, sending a motion stopping command to the mechanical arm end effector, and if not, performing step S200.
A navigation system for robot-assisted surgery is characterized by comprising a host and hardware equipment in communication with the host, wherein the hardware equipment comprises a mechanical arm, a depth camera and a drilling instrument, and the depth camera and the drilling instrument are arranged on the mechanical arm; and
a memory storing instructions.
The invention has the following beneficial effects:
the invention provides a new pose estimation method facing robot-assisted surgery, which is characterized in that a bone three-dimensional reconstruction model is established from medical images before an operation and is matched with 3D-3D of point cloud of the spine surface of a patient, which is acquired by a depth camera during the operation, and edge information is added, so that the characteristics are more obvious, the matching accuracy is higher, and in the process of screening a plurality of poses, the edge matching provided by the invention is used for effectively and quickly screening out the correct pose.
Drawings
FIG. 1 is a flow chart of a conventional classical PPF algorithm;
FIG. 2 is a flow chart of a PPF algorithm of the pose estimation method for robot-assisted surgery according to the invention;
FIG. 3 is a flowchart of a pose estimation method for robot-assisted surgery according to the present invention;
FIG. 4 is a flowchart of a sampling method based on edge information according to the present invention;
FIG. 5 is a flow chart of steps S41-S44 in the present invention;
FIG. 6 is a flowchart of steps S45-S49 of the present invention;
FIG. 7 is a classification chart of points in the ROI region in the screening process according to the present invention;
FIG. 8 is a flow chart of a method of controlling a surgical robotic arm using a visual servoing method in accordance with the present invention;
FIG. 9 is a block diagram of a position based servo control of the present invention;
FIGS. 10a-10d are diagrams illustrating the effect of the navigation system for robot-assisted surgery according to the present invention;
FIG. 11 is a flowchart of instructions executed by the navigation system for robot-assisted surgery according to the present invention;
FIG. 12 is a block diagram of a navigation system for robot-assisted surgery in accordance with the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples, which are provided for the purpose of illustrating the general inventive concept and are not intended to limit the scope of the invention.
In 2010, Drost et al. proposed a rigid-body 6D pose estimation method based on point pair feature (PPF) voting, a compromise between local-feature and global-feature methods that achieves a good balance between accuracy and speed. The PPF describes the surface of an object through a four-dimensional feature defined between globally modelled oriented point pairs; these features are used to find the correspondence between scene and model point pairs, several candidate pose transformations are generated from the point pair correspondences, and the candidate poses are finally clustered and ranked to obtain the final hypothesis. The PPF is a low-dimensional feature built from oriented points and their normal vectors and is well suited to objects with rich surface variation, giving it natural advantages for identifying and registering the spine during surgery. Fig. 1 shows the flow chart of the classic PPF algorithm.
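For reference, the sketch below shows the classic four-dimensional PPF descriptor and a simple quantization scheme used to key a hash table. It is a minimal illustration of the Drost-style feature in Python, not code from the patent; the function names and step sizes are assumptions.

```python
# Minimal sketch of the classic point pair feature (PPF) descriptor used by
# Drost-style voting methods. Names and quantization steps are illustrative,
# not taken from the patent.
import numpy as np

def ppf(p1, n1, p2, n2):
    """Return the 4-D feature F = (||d||, ang(n1,d), ang(n2,d), ang(n1,n2))."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist < 1e-9:
        return None
    d_hat = d / dist

    def angle(a, b):
        # numerically safe angle between two unit vectors
        return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

    return np.array([dist, angle(n1, d_hat), angle(n2, d_hat), angle(n1, n2)])

def quantize(f, dist_step=0.01, angle_step=np.deg2rad(12)):
    """Discretize a PPF so it can be used as a hash-table key."""
    return (int(f[0] / dist_step),
            int(f[1] / angle_step),
            int(f[2] / angle_step),
            int(f[3] / angle_step))
```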
The invention improves on this point pair feature framework to obtain a method suitable for matching in spinal surgery. Specifically, it relates to a pose estimation method for robot-assisted surgery, comprising the following steps:
s1: constructing a bone three-dimensional reconstruction model by using a medical image of a patient, and preprocessing the bone three-dimensional reconstruction model in an off-line state; since the present embodiment is directed to a spine operation, the three-dimensional bone reconstruction model is a spine three-dimensional reconstruction model;
s2: judging the visibility of the characteristic of the point pair and the geometric complexity of the characteristic of the point pair by using a normal vector included angle between the point pair, extracting all visible point pair characteristics and non-planar point pair characteristics of the bone three-dimensional reconstruction model, and constructing a characteristic hash table by using the extracted point pair characteristics;
s3: acquiring surface point clouds of bones of a patient shot by a depth camera in an online state, and performing point cloud sampling based on edge information;
s4: and performing point-to-point feature extraction and matching on the surface point cloud and the three-dimensional reconstruction model based on the two voting balls to obtain candidate poses, performing clustering processing on the candidate poses, and screening based on the edge matching degree to obtain result poses.
The invention provides a novel pose estimation method for robot-assisted surgery, in which a three-dimensional bone reconstruction model built from preoperative medical images (the off-line state or off-line stage) is matched, in a 3D-3D manner, against the point cloud of the patient's spine surface acquired by a depth camera during the operation (the on-line state or on-line stage), and edge information is added so that the features are more distinctive and the matching more accurate. During the screening of multiple candidate poses, the proposed edge matching effectively and quickly selects the correct pose (high accuracy and strong real-time performance).
In this embodiment, the specific steps of determining the visibility of the point pair features and the geometric complexity of the point pair features by using the normal vector included angle between the point pairs in step S2 are as follows:
calculating an included angle between normal vectors of the two points, wherein if the included angle is larger than 175 degrees, the point pair characteristics are invisible, otherwise, the point pair characteristics are visible; if the included angle is less than 5 degrees, the point pair characteristic tends to be a plane, otherwise, the geometric complexity of the point pair characteristic is high.
In the offline training phase, all point pair features of the model are extracted and stored in a hash table to create a global model description. Due to self-occlusion and other factors, however, this global description contains redundant point pair features that never appear in the input scene; they not only increase the search time of the voting phase but also increase the matching error. This embodiment therefore decides the visibility of a point pair feature in advance from the angle between the normal vectors of the pair: if that angle exceeds 175 degrees, the pair is considered practically invisible and its point pair feature is not stored. On the other hand, the performance of conventional PPF methods degrades when the object has many repeated features or exhibits degeneracy (for example, large planes), which is not uncommon. Point pair features whose normal vector angle is smaller than 5 degrees are therefore not stored either, so that the algorithm concentrates on point pairs with rich geometric structure. A feature hash table is built from the preprocessed point pair features for subsequent online matching.
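A minimal sketch of this offline filtering and hashing step is given below, reusing the ppf() and quantize() helpers from the earlier sketch. The 175-degree and 5-degree thresholds come from the text; the function name and what is stored per pair are illustrative assumptions rather than the patent's exact implementation.

```python
# Sketch of the offline filtering described above: pairs whose normals differ
# by more than 175 degrees are treated as mutually invisible, pairs differing
# by less than 5 degrees as near-planar, and only the remaining pairs are
# hashed. Reuses ppf() and quantize() from the previous sketch.
import numpy as np
from collections import defaultdict

MAX_ANGLE = np.deg2rad(175.0)   # above this: pair considered invisible
MIN_ANGLE = np.deg2rad(5.0)     # below this: pair considered near-planar

def build_model_hash_table(points, normals):
    """points, normals: (N, 3) arrays from the preprocessed bone model."""
    table = defaultdict(list)
    for i in range(len(points)):
        for j in range(len(points)):
            if i == j:
                continue
            cos_n = np.clip(np.dot(normals[i], normals[j]), -1.0, 1.0)
            angle = np.arccos(cos_n)
            if angle > MAX_ANGLE or angle < MIN_ANGLE:
                continue                        # skip invisible / planar pairs
            f = ppf(points[i], normals[i], points[j], normals[j])
            if f is not None:
                # a full implementation would also store the planar rotation
                # angle used during voting; the pair indices suffice here
                table[quantize(f)].append((i, j))
    return table
```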
In this embodiment, the point cloud sampling based on the edge information in step S3 specifically includes:
and constructing a multi-resolution voxel grid structure according to the model diameter of the bone three-dimensional reconstruction model, and combining the similar points of which the angles between the normal lines are smaller than theta for each voxel unit under the resolution according to the order from thin to thick. In this embodiment, a downsampling strategy with a definite target is adopted, and the edge region is paid more attention to, so as to efficiently extract the features of the complex geometric figure, which is specifically as follows:
in the online training stage, the point pair feature voting method depends on the discrimination effect of the point pair to a great extent, and low-quality or non-discriminative features reduce the speed and the recognition performance of the method. Thus, the characterization performance of a PPF depends on the number of features and the relevance and quality of each individual feature. This makes the matching performance largely dependent on the preprocessing step. In order to greatly highlight the characteristics of the spine, the embodiment designs a sampling method for sampling bones by combining edge contour points and geometric high-curvature characteristic points. Firstly, a multi-resolution voxel grid structure is constructed according to the diameter of a spine three-dimensional reconstruction model, for each voxel unit under each resolution (in the order from thin to thick), similar points with the angle between the normal lines smaller than a threshold value theta are merged, the threshold value theta is correspondingly reduced along with the increase of the resolution, so that a fine geometrical structure is saved, and in the process, contour point clouds except a background large plane are sampled. The sampling method enables the point cloud of the skeleton to be more fully sampled in parts with extremely geometric characteristics, such as a curved surface part, a contour part and the like, and the geometric change of the surface is considered to the maximum extent so as to improve the identification effect of the PPF characteristics, and a specific algorithm is shown in FIG. 4.
As shown in fig. 5, obtaining candidate poses in step S4 by performing point pair feature extraction and matching between the surface point cloud and the three-dimensional reconstruction model based on two voting spheres specifically comprises, in this embodiment:
S41: for the point cloud after sampling, taking every fifth point as a reference point of the point pair features and, as the second point of each point pair feature, the points of the point cloud lying within the model diameter M of the three-dimensional reconstruction model stored in the hash table, so that the distance between a reference point and its corresponding second point is smaller than the model diameter; all reference points and their corresponding second points form the point pairs that participate in the voting;
S42: dividing the point pairs participating in the voting between two voting spheres that receive the votes separately: a small voting sphere with diameter $\sqrt{\mathrm{length}^2+\mathrm{width}^2}$ and a large voting sphere whose diameter equals the object diameter, the accumulator first being filled with the paired votes accepted by the small voting sphere. The object diameter here is the usual term in 3D vision: the model is enclosed in a bounding box whose diagonal is the object diameter, so a sphere of that diameter encloses the model and serves as the large voting sphere. The diameter of the small voting sphere is computed from the same bounding cuboid: with its dimensions ordered height > length > width, the small sphere diameter is $\sqrt{\mathrm{length}^2+\mathrm{width}^2}$. This is a conventional construction and the invention does not limit it specifically;
S43: extracting the peaks from the accumulator, each peak generating a hypothesis of the 3D pose (here, the 3D pose of the scene point cloud of the patient's spine) and a correspondence between the three-dimensional reconstruction model and the surface point cloud, and deriving candidate poses from these hypotheses;
S44: continuing to fill the accumulator with the point pair votes accepted by the large voting sphere but rejected by the small voting sphere, extracting the peaks, generating the corresponding 3D pose hypotheses and correspondences between the three-dimensional reconstruction model and the surface point cloud, and obtaining further candidate poses from these hypotheses.
For the point cloud of the input scene, every fifth sampled scene point is taken as a reference point and the other points as second points in the point pair features. If the distance between two points is greater than the model diameter M, the two points belong to different objects and should be excluded from the point pair features. To improve the efficiency of the matching stage, a selected reference point is only paired with other scene points within the model diameter M; a Kd-tree structure is used to retrieve only the point cloud within the model diameter as the second point of the point pair feature. The point pairs participating in the voting are divided between two voting spheres: a smaller voting sphere with diameter $\sqrt{\mathrm{length}^2+\mathrm{width}^2}$ and a larger voting sphere whose diameter is the object diameter. Because the smaller voting sphere is more likely to lie on the object, the accumulator is first filled with the paired votes accepted by the small sphere and the peaks are extracted from the accumulator, each peak corresponding to a hypothesis of the 3D pose and a correspondence between a model point and a scene point; the accumulator is then filled with votes from pairs accepted by the large sphere but rejected by the small sphere, and peaks are again extracted to generate pose and point-correspondence hypotheses. In this way, peaks that are less contaminated by background clutter are obtained first, while peaks of other configurations can still be obtained.
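A minimal sketch of this two-sphere pairing step is given below. The Hough accumulator itself is abstracted behind a hypothetical vote() callback; the choice of every fifth point follows the text and the small-sphere diameter follows the bounding-box construction described above, while everything else is an illustrative assumption.

```python
# Sketch of the two-sphere pairing step: every fifth sampled scene point is a
# reference point, its neighbours within the model diameter M are found with
# a KD-tree, and each pair is routed to the small sphere or to the remaining
# shell up to the object diameter before voting.
import numpy as np
from scipy.spatial import cKDTree

def pair_and_vote(scene_pts, model_diameter, small_diameter, vote):
    tree = cKDTree(scene_pts)
    for ref in range(0, len(scene_pts), 5):          # every fifth point
        idx = tree.query_ball_point(scene_pts[ref], r=model_diameter)
        small, shell = [], []
        for j in idx:
            if j == ref:
                continue
            d = np.linalg.norm(scene_pts[j] - scene_pts[ref])
            (small if d <= small_diameter else shell).append(j)
        # first fill the accumulator with pairs from the small sphere,
        # then with pairs accepted by the large sphere but not the small one
        for group, tag in ((small, "small"), (shell, "large_only")):
            for j in group:
                vote(ref, j, tag)
```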
As shown in fig. 6, clustering the candidate poses in step S4 specifically comprises, in this embodiment:
S45: framing each candidate pose with a rectangular box that serves as its ROI (region of interest), performing distance clustering on the edge contour point cloud inside the ROI, calculating the distance between the centre of each edge cluster and the centre of the candidate pose, and screening out the point clouds whose distance exceeds a threshold.
In this embodiment, the screening based on the edge matching degree in step S4 specifically comprises:
S46: sorting the clustered candidate poses by vote count and, according to the maximum vote count $\mathrm{Vote}_{\max}$ among the candidate poses, dividing them into two classes: the first class contains the candidate poses with more than $\mathrm{Vote}_{\max}/2$ votes, which are more likely to be the correct pose, and the second class those with fewer than $\mathrm{Vote}_{\max}/2$ votes; the second class contains more candidates than the first;
S47: selecting, from the first class, the 10 candidate poses that are close to the edge and have the most votes and calculating their pose scores. If a pose score is higher than 0.7, that candidate pose is the result pose, the calculation stops and step S48 is performed; if all pose scores are greater than 0.6 and smaller than 0.7, the candidate pose with the highest score is taken as the result pose and step S48 is performed; if all pose scores are smaller than 0.6, step S49 is performed. Within the first class of candidate poses, a Kd-tree is used to check how well each pose fits the edge contour of the point cloud;
S48: generating the transformation between the three-dimensional reconstruction model and the surface point cloud based on the correspondence between the three-dimensional reconstruction model and the surface point cloud;
S49: performing a 3D edge point query for all remaining candidate poses (the query is evaluated with the Score), returning to step S47 to calculate the pose scores of all remaining candidate poses, and judging the pose scores.
The pose Score in step S47 of this embodiment is calculated as follows:

$$\mathrm{Score} = \frac{N_{\mathrm{matching}}}{N_{\mathrm{ROI}}}$$

wherein $N_{\mathrm{ROI}}$ is the number of contour points in the ROI region and $N_{\mathrm{matching}}$ is the number of edge points close to the candidate pose.
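A minimal sketch of this Score computation is shown below, using a KD-tree to count the ROI contour points that lie close to the transformed model edge. The tolerance value and function names are assumptions.

```python
# Minimal sketch of the Score computation: N_ROI is the number of contour
# points inside the candidate's ROI and N_matching the number of those points
# lying within a small tolerance of the transformed model edge.
import numpy as np
from scipy.spatial import cKDTree

def pose_score(model_edge_pts, roi_edge_pts, pose, tol=0.002):
    """pose: 4x4 candidate transform taking model coordinates to scene coordinates."""
    if len(roi_edge_pts) == 0:
        return 0.0
    transformed = (pose[:3, :3] @ model_edge_pts.T).T + pose[:3, 3]
    dists, _ = cKDTree(transformed).query(roi_edge_pts)
    n_matching = int(np.count_nonzero(dists < tol))
    return n_matching / len(roi_edge_pts)       # Score = N_matching / N_ROI
```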
As described above, with two voting spheres two candidate poses can be estimated for each reference point. To merge similar candidate poses, the candidates are first sorted by vote count; all candidate poses are then examined in turn, and if the rotation and translation between two candidate poses are below predefined thresholds they are clustered together. Finally, the poses in each cluster are averaged by quaternion averaging to form a new candidate pose, and the scores of the members are added to form the score of the new candidate pose.
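The sketch below illustrates this clustering and quaternion averaging. The rotation and translation thresholds are illustrative assumptions; the structure (compare against a cluster representative, sum the member scores, average the quaternions with sign alignment) follows the description above.

```python
# Sketch of candidate-pose clustering: poses whose rotations and translations
# differ by less than fixed thresholds are grouped, and each cluster is
# averaged (quaternion mean for rotation, arithmetic mean for translation).
import numpy as np
from scipy.spatial.transform import Rotation as R

def cluster_poses(poses, scores, rot_thresh_deg=12.0, trans_thresh=0.01):
    clusters = []                        # each entry: [summed score, [pose indices]]
    for i, p in enumerate(poses):
        for cl in clusters:
            q = poses[cl[1][0]]          # compare against the cluster representative
            rot_diff = R.from_matrix(q[:3, :3].T @ p[:3, :3]).magnitude()
            if (np.degrees(rot_diff) < rot_thresh_deg and
                    np.linalg.norm(q[:3, 3] - p[:3, 3]) < trans_thresh):
                cl[0] += scores[i]
                cl[1].append(i)
                break
        else:
            clusters.append([scores[i], [i]])

    merged = []
    for total, members in clusters:
        quats = np.array([R.from_matrix(poses[i][:3, :3]).as_quat() for i in members])
        quats[quats @ quats[0] < 0] *= -1             # align quaternion signs
        mean_q = quats.mean(axis=0)
        mean_q /= np.linalg.norm(mean_q)
        pose = np.eye(4)
        pose[:3, :3] = R.from_quat(mean_q).as_matrix()
        pose[:3, 3] = np.mean([poses[i][:3, 3] for i in members], axis=0)
        merged.append((pose, total))                  # (averaged pose, summed score)
    return merged
```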
The score of each pose is approximately the sum of the votes of the clustered candidates. It can be observed that, in the presence of sensor noise and background clutter, the pose with the highest score may not be the optimal one. The application therefore computes a more reliable score through an additional re-scoring process. In most cases 80%-95% of the computation time is spent on pose screening in the scoring stage, so the application provides an edge-contour-based pose screening method with an early-exit strategy.
The edge contour is a very distinct feature of an object and robustly represents its shape. Using the contour information of the point cloud, the correct pose can be selected from the candidate poses with high probability. In the screening method, for each candidate pose its AABB box is computed and used as the ROI, the edge contour points inside the ROI are clustered by distance, and the distance between each edge cluster centre and the candidate pose centre is computed, so that remote and ambiguous contour points (the yellow points in FIG. 7) are screened out. The final score of the candidate pose is given by the formula above: $N_{\mathrm{ROI}}$ counts the contour points in the ROI region, shown as the red and blue points in FIG. 7, and $N_{\mathrm{matching}}$ counts the edge points near the candidate pose, the red points in FIG. 7.
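A sketch of the early-exit screening of steps S46-S49 follows. score_fn is any single-argument callable returning the Score of a candidate pose, for example a partial application of the pose_score() helper sketched above; the 0.7 and 0.6 thresholds and the limit of 10 first-class candidates come from the text, while the rest of the control flow is an illustrative reading of the steps.

```python
# Sketch of the early-exit screening in steps S46-S49.
def screen_candidates(poses, votes, score_fn):
    vote_max = max(votes)
    order = sorted(range(len(poses)), key=lambda i: votes[i], reverse=True)
    first_class = [i for i in order if votes[i] > vote_max / 2]
    scores = {}
    for i in first_class[:10]:                  # 10 highest-voted first-class poses
        scores[i] = score_fn(poses[i])
        if scores[i] > 0.7:                     # early exit: confident result pose
            return poses[i]
    if scores and all(0.6 < s < 0.7 for s in scores.values()):
        return poses[max(scores, key=scores.get)]
    # step S49: score every remaining candidate and keep the best one
    for i in order:
        scores.setdefault(i, score_fn(poses[i]))
    return poses[max(scores, key=scores.get)]
```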
In addition, the pose estimation method also addresses 1) the real-time and accuracy requirements of medical operations and 2) the shadows, heavy occlusion and matching difficulties caused by the spinal structure in surgical navigation scenes. Problem 1) reflects the clinical application background of the surgical navigation system, which places high demands on the real-time performance and accuracy of human-computer interaction; a surgical-instrument tracking algorithm that executes efficiently and produces accurate results is therefore needed, ensuring accuracy while maintaining real-time performance. Problem 2) specifically includes:
The shadow problem: mutual occlusion between shadows directly affects the detection and extraction of the target, so the influence of shadows on data acquisition and detection of the target object must be reduced.
The heavy occlusion problem: when the surgical mechanical arm operates, the instrument easily occludes the target object, and when the viewpoint of the camera fixed on the mechanical arm is close to the target the field of view is small, so that few points of the target area are obtained. Handling heavily occluded target point clouds is therefore an unavoidable requirement for pose estimation in surgical navigation.
The matching difficulty caused by the special structure of the spine: the human skeletal structure has natural symmetry; the main structure of the body is bilaterally symmetric, meaning that the left and right sides tend to be very similar, though not identical. The invention resolves this symmetric ambiguity by computing the edge matching degree.
In another embodiment, a method for controlling a surgical robot using a visual servo method is provided, as shown in fig. 8, the method comprising the steps of:
s100: performing coordinate transformation on the bone three-dimensional reconstruction model, a drilling coordinate system, a depth camera coordinate system, a mechanical arm tail end coordinate system and a mechanical arm base coordinate system, and obtaining a difference value from an actual position to a result pose based on a transformation relation from the actual pose to the result pose; the calculation of the difference value belongs to the conventional technology, such as two formulas disclosed in section 4.2 of the article "image-Scale 3D Object registration dataset", and the application is not specifically limited and is within the scope of the application;
based on the conversion relation from the actual pose to the result pose, namely the conversion relation between the surface point cloud and the three-dimensional reconstruction model;
s200: acquiring images or point clouds of one or more cameras for capturing the end effector of the mechanical arm in real time, and generating the position of the end effector of the mechanical arm according to the images or the point clouds;
s300: and judging whether the difference value from the actual position to the result pose is smaller than a threshold value, if not, calculating the movement speed of the end effector of the mechanical arm, generating a control command and sending the control command to the end effector of the mechanical arm, and performing the step S200, if so, the movement speed of the end effector of the mechanical arm is 0, and sending a command of stopping movement to the end effector of the mechanical arm.
To control the drilling instrument mounted on the mechanical arm (i.e. the drill referred to in this application) so that it drills a hole at the pose we specify, a coordinate transformation is performed. It is to be understood that the transformation relation links the spine model, the fixed drill, the depth camera, the end of the mechanical arm and the base of the mechanical arm; the final result is the transformation between the expected drilling pose and the mechanical arm base. The specific calculation method is as follows:

acquiring the drilling target pose matrix ${}^{\text{model}}T_{\text{hole}}$ in the coordinate system of the bone three-dimensional reconstruction model (this coordinate system is established together with the model when the model is built); the matrix is entered by the physician in advance;

performing hand-eye calibration of the depth camera and the mechanical arm to obtain the transformation matrix ${}^{\text{end}}T_{\text{cam}}$ from the depth camera coordinate system to the coordinate system of the mechanical arm end (a calibration plate is mounted at the end of the mechanical arm);

calibrating the mechanical arm and the drilling tool to obtain the transformation matrix ${}^{\text{end}}T_{\text{drill}}$ from the fixed drilling instrument coordinate system to the mechanical arm end coordinate system; the methods of hand-eye calibration and tool calibration are conventional techniques, and the application does not limit them specifically;

obtaining the transformation matrix ${}^{\text{cam}}T_{\text{model}}$ from the bone three-dimensional reconstruction model coordinate system to the camera coordinate system by the pose estimation method for robot-assisted surgery;

acquiring the pose matrix ${}^{\text{base}}T_{\text{end}}$ of the end effector in the robot base coordinate system through the mechanical arm controller;

calculating the transformation matrix ${}^{\text{drill}}T_{\text{target}}$ from the fixed drilling instrument to the expected drilling instrument pose, expressed in the fixed drilling instrument coordinate system:

$${}^{\text{drill}}T_{\text{target}} = \left({}^{\text{end}}T_{\text{drill}}\right)^{-1}\,{}^{\text{end}}T_{\text{cam}}\,{}^{\text{cam}}T_{\text{model}}\,{}^{\text{model}}T_{\text{hole}}$$

wherein $\left({}^{\text{end}}T_{\text{drill}}\right)^{-1}$ is the inverse matrix of ${}^{\text{end}}T_{\text{drill}}$;

calculating the transformation matrix ${}^{\text{base}}T_{\text{target}}$ from the expected drilling instrument pose to the mechanical arm base coordinate system:

$${}^{\text{base}}T_{\text{target}} = {}^{\text{base}}T_{\text{end}}\,{}^{\text{end}}T_{\text{drill}}\,{}^{\text{drill}}T_{\text{target}}$$
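The sketch below chains these calibrated transforms with NumPy. It is a minimal illustration under the notation reconstructed above; the function and variable names are assumptions, not identifiers from the patent, and all matrices are assumed to be 4x4 homogeneous transforms.

```python
# Sketch of the transform chain described above: from the planned hole in the
# model frame to the desired drill pose in the drill frame and in the base frame.
import numpy as np

def desired_drill_pose_in_base(T_model_hole,   # drilling target in model frame
                               T_cam_model,    # model -> camera (pose estimation)
                               T_end_cam,      # camera -> arm end (hand-eye calib.)
                               T_end_drill,    # drill -> arm end (tool calib.)
                               T_base_end):    # arm end in base frame (controller)
    # drill -> desired drill pose, expressed in the fixed drill frame
    T_drill_target = np.linalg.inv(T_end_drill) @ T_end_cam @ T_cam_model @ T_model_hole
    # desired drill pose expressed in the arm base frame
    T_base_target = T_base_end @ T_end_drill @ T_drill_target
    return T_drill_target, T_base_target
```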
in the present embodiment, a position-based visual servo scheme is used, and as shown in fig. 9, the visual servo scheme is to control the motion of the robot by using visual information extracted from images or point clouds captured by one or more cameras (i.e., depth cameras). The visual servoing is a closed loop system where the visual analysis provides guidance to the robot and the robot motion provides new visual analysis to the camera. The mechanical arm is controlled by a visual servo method based on the position, so that the mechanical arm has strong resistance to calibration noise, the whole system forms a closed loop, the success rate is improved, and the deviation is reduced.
The present embodiment uses a position-based visual servoing scheme. The input is the difference between the detected actual spine pose and the desired spine pose, and the output is a control command in the robot's velocity domain, the aim being to move the robot (here, the mechanical arm and its end effector) quickly to the target pose state. After each command the camera continues to receive feedback of the robot state, forming a closed-loop control system. The closer the true pose is to the desired pose, the lower the velocity of the mechanical arm; when the difference falls below the set threshold, the velocity of the mechanical arm is 0 and the servoing stops. After the pose is calculated, the visual servo drives the mechanical arm to approach the target point step by step according to the visual features, with high overall precision and accuracy.
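A minimal sketch of such a position-based servo loop is shown below. The proportional gain, tolerances and the get_current_pose / send_velocity / stop interfaces are illustrative assumptions; they stand in for whatever camera feedback and robot controller API a real system would provide.

```python
# Sketch of a position-based visual servo loop: the pose error between the
# current and desired drill pose is turned into a proportional velocity
# command, and servoing stops once the error drops below the thresholds.
import numpy as np
from scipy.spatial.transform import Rotation as R

def servo_to_target(T_base_target, get_current_pose, send_velocity, stop,
                    gain=0.5, pos_tol=1e-3, rot_tol_deg=0.5):
    while True:
        T_base_current = get_current_pose()          # from camera / controller feedback
        T_err = np.linalg.inv(T_base_current) @ T_base_target
        pos_err = T_err[:3, 3]
        rot_err = R.from_matrix(T_err[:3, :3]).as_rotvec()
        if (np.linalg.norm(pos_err) < pos_tol and
                np.degrees(np.linalg.norm(rot_err)) < rot_tol_deg):
            stop()                                   # error below threshold: halt
            return
        # proportional control expressed in the current end-effector frame
        send_velocity(gain * pos_err, gain * rot_err)
```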
In the embodiment, the mechanical arm is controlled by using a position-based visual servo method, so that the mechanical arm has stronger resistance to calibration noise, the whole system forms a closed loop, the success rate is improved, and the deviation is reduced.
In a third embodiment, a navigation system for robot-assisted surgery is provided, which comprises a host and hardware equipment in communication with the host, wherein the hardware equipment comprises a mechanical arm 1, a depth camera 2 and a drilling instrument 3, the depth camera and the drilling instrument are arranged on the mechanical arm, and the host comprises at least one processor; and
a memory storing instructions that, when executed by the at least one processor, perform the steps of the method as described above. Referring to fig. 10 (including 10a-10 d) a diagram of the actual performance of the system and fig. 11 a flow chart of the instructions executed by the system.
In the invention, only the depth camera is used for point cloud registration, so the lighting requirements are low and pose estimation can be performed under dim light or partial light occlusion (i.e. it is not affected by lighting); likewise, only the depth camera is used for positioning, cooperating with the mechanical arm, without any other hardware devices, so the operation is simple (little hardware, simple operation). The host is a desktop computer with an Intel i7-7700k processor, on which the above algorithm runs to perform the point cloud space registration task. For the peripheral hardware, as shown in fig. 12, an Azure Kinect DK depth camera is used as the three-dimensional camera, allowing real-time monitoring for surgical navigation. The mechanical arm is a collaborative robot used to imitate a human hand and carry out the main experiments such as surgical puncture. The drilling instrument accepts drill bits of various sizes and can run at different rotating speeds; it is fixed on the mechanical arm, making it convenient to drill into bone.
While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features that may embody particular implementations of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in combination and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
In certain situations, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments.
Particular embodiments of the subject matter have been described. Other implementations are within the scope of the following claims. For example, the activities recited in the claims can be performed in a different order and still achieve desirable results. As one example, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.

Claims (10)

1. A pose estimation method for robot-assisted surgery, the pose estimation method comprising the steps of:
S1: constructing a bone three-dimensional reconstruction model by using a medical image of a patient, and preprocessing the bone three-dimensional reconstruction model in an off-line state;
S2: judging the visibility of the point pair features and the geometric complexity of the point pair features by using the angle between the normal vectors of each point pair, extracting all visible point pair features and non-planar point pair features of the bone three-dimensional reconstruction model, and constructing a feature hash table by using the extracted point pair features;
S3: acquiring surface point clouds of the bones of the patient shot by a depth camera in an online state, and performing point cloud sampling based on edge information;
S4: performing point pair feature extraction and matching on the surface point cloud and the three-dimensional reconstruction model based on two voting spheres to obtain candidate poses, performing clustering processing on the candidate poses, and screening based on the edge matching degree to obtain the result pose.
2. The pose estimation method for robot-assisted surgery according to claim 1, wherein the step S2 of determining the visibility of the point pair features and the geometric complexity of the point pair features using the normal vector angle between the point pairs is specifically as follows:
calculating an included angle between normal vectors of the two points, wherein if the included angle is larger than 175 degrees, the point pair characteristics are invisible, otherwise, the point pair characteristics are visible; if the included angle is less than 5 degrees, the point pair characteristic tends to be a plane, otherwise, the geometric complexity of the point pair characteristic is high.
3. The pose estimation method for robot-assisted surgery according to claim 1, wherein the point cloud sampling based on the edge information in step S3 specifically comprises:
constructing a multi-resolution voxel grid structure according to the model diameter of the bone three-dimensional reconstruction model and, for each voxel unit at each resolution in order from fine to coarse, merging the similar points whose normals differ by an angle smaller than theta.
4. The pose estimation method for robot-assisted surgery according to claim 1, wherein the step S4 of performing point pair feature extraction and matching on the surface point cloud and the three-dimensional reconstruction model based on two voting spheres to obtain candidate poses specifically comprises:
S41: regarding the point cloud after sampling processing, taking every fifth point as a reference point in the point pair features and taking the point cloud within the range of the model diameter M of the three-dimensional reconstruction model stored in the hash table as the second point in the point pair features, wherein the distance between a reference point and the corresponding second point is smaller than the model diameter, and taking all the reference points and the corresponding second points as the point pairs participating in the voting;
S42: dividing the point pairs participating in the voting between two voting spheres to receive the votes respectively, wherein the two voting spheres comprise a small voting sphere with a diameter of $\sqrt{\mathrm{length}^2+\mathrm{width}^2}$ of the model bounding box and a large voting sphere with a diameter equal to the object diameter, and filling the accumulator with the paired votes accepted by the small voting sphere;
S43: extracting peak values in the accumulator, generating a hypothesis corresponding to the 3D pose and a correspondence between the three-dimensional reconstruction model and the surface point cloud, and obtaining candidate poses based on the hypothesis corresponding to the 3D pose;
S44: filling the accumulator with the point pair votes accepted by the large voting sphere and rejected by the small voting sphere, extracting peak values, generating a hypothesis corresponding to the 3D pose and a correspondence between the three-dimensional reconstruction model and the surface point cloud, and obtaining each candidate pose based on the hypothesis corresponding to the 3D pose.
5. The pose estimation method for robot-assisted surgery according to claim 4, wherein the clustering the candidate poses in step S4 specifically comprises:
s45: and framing each candidate pose by using a rectangular frame to serve as an ROI (region of interest), clustering the distance of the edge contour point cloud in the ROI, calculating the distance between the center point of the edge cluster and the center point of the candidate pose, and screening out the point cloud with the distance larger than a threshold value.
6. The pose estimation method for robot-assisted surgery according to claim 5, wherein the screening based on the edge matching degree in step S4 specifically comprises:
S46: sorting the clustered candidate poses by vote count and, according to the maximum vote count $\mathrm{Vote}_{\max}$ among the candidate poses, dividing them into two classes, wherein the first class contains the candidate poses with more than $\mathrm{Vote}_{\max}/2$ votes and the second class those with fewer than $\mathrm{Vote}_{\max}/2$ votes, the number of candidate poses in the second class being greater than in the first;
S47: selecting, from the first class of candidate poses, the 10 candidate poses that are close to the edge and have the most votes and calculating their pose scores; if a pose score is higher than 0.7, that candidate pose is the result pose, the calculation stops and step S48 is performed; if all pose scores are greater than 0.6 and smaller than 0.7, the candidate pose with the highest score is taken as the result pose and step S48 is performed; if all pose scores are smaller than 0.6, step S49 is performed;
s48: generating a conversion relation between the three-dimensional reconstruction model and the surface point cloud based on the corresponding relation between the three-dimensional reconstruction model and the surface point cloud;
s49: and inquiring the 3D edge points of all the remaining candidate poses, returning to the step S47 to calculate pose scores of all the remaining candidate poses, and judging the pose scores.
7. The robot-assisted surgery oriented pose estimation method according to claim 6, wherein the pose Score in step S47 is calculated as follows:

$$\mathrm{Score} = \frac{N_{\mathrm{matching}}}{N_{\mathrm{ROI}}}$$

wherein $N_{\mathrm{ROI}}$ is the number of contour points in the ROI region and $N_{\mathrm{matching}}$ is the number of edge points close to the candidate pose.
8. A method of controlling a surgical robotic arm using a visual servoing method, the method comprising the steps of:
S100: performing coordinate transformation among the bone three-dimensional reconstruction model coordinate system, the drilling coordinate system, the depth camera coordinate system, the mechanical arm end coordinate system and the mechanical arm base coordinate system, and obtaining the difference between the actual pose and the result pose based on the transformation relation from the actual pose to the result pose;
S200: acquiring images or point clouds of one or more cameras capturing the end effector of the mechanical arm in real time, and generating the position of the end effector of the mechanical arm according to the images or the point clouds;
S300: judging whether the difference between the actual pose and the result pose is smaller than a threshold; if not, calculating the movement speed of the mechanical arm end effector, generating a control command, sending it to the mechanical arm end effector and performing step S200; if so, the movement speed of the mechanical arm end effector is 0 and a command to stop motion is sent to the mechanical arm end effector.
9. The method for controlling a surgical robotic arm using a visual servoing method as set forth in claim 8, wherein the coordinate transformation in step S100 is performed by the following steps:

obtaining the drilling target pose matrix ${}^{\text{model}}T_{\text{hole}}$ in the bone three-dimensional reconstruction model coordinate system;

performing hand-eye calibration on the depth camera and the mechanical arm to obtain the transformation matrix ${}^{\text{end}}T_{\text{cam}}$ from the depth camera coordinate system to the mechanical arm end coordinate system;

calibrating the mechanical arm and the drilling instrument to obtain the transformation matrix ${}^{\text{end}}T_{\text{drill}}$ from the fixed drilling instrument coordinate system to the mechanical arm end coordinate system;

obtaining the transformation matrix ${}^{\text{cam}}T_{\text{model}}$ from the bone three-dimensional reconstruction model coordinate system to the camera coordinate system based on the pose estimation method of any one of claims 1-7;

acquiring the pose matrix ${}^{\text{base}}T_{\text{end}}$ of the end effector in the robot base coordinate system through the mechanical arm controller;

calculating the transformation matrix ${}^{\text{drill}}T_{\text{target}}$ from the fixed drilling instrument to the expected drilling instrument pose in the fixed drilling instrument coordinate system, the calculation formula being:

$${}^{\text{drill}}T_{\text{target}} = \left({}^{\text{end}}T_{\text{drill}}\right)^{-1}\,{}^{\text{end}}T_{\text{cam}}\,{}^{\text{cam}}T_{\text{model}}\,{}^{\text{model}}T_{\text{hole}}$$

calculating the transformation matrix ${}^{\text{base}}T_{\text{target}}$ from the expected drilling instrument pose to the mechanical arm base coordinate system, the calculation formula being:

$${}^{\text{base}}T_{\text{target}} = {}^{\text{base}}T_{\text{end}}\,{}^{\text{end}}T_{\text{drill}}\,{}^{\text{drill}}T_{\text{target}}$$
10. a navigation system for robot-assisted surgery, which is characterized by comprising a host and hardware equipment communicated with the host, wherein the hardware equipment comprises a mechanical arm (1), a depth camera (2) and a drilling instrument (3) arranged on the mechanical arm, and the host comprises at least one processor; and
memory storing instructions which, when executed by the at least one processor, carry out the steps of the method according to any one of claims 1 to 9.
CN202210998737.1A 2022-08-19 2022-08-19 Pose estimation method and navigation system for robot-assisted surgery Pending CN115294281A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210998737.1A CN115294281A (en) 2022-08-19 2022-08-19 Pose estimation method and navigation system for robot-assisted surgery

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210998737.1A CN115294281A (en) 2022-08-19 2022-08-19 Pose estimation method and navigation system for robot-assisted surgery

Publications (1)

Publication Number Publication Date
CN115294281A true CN115294281A (en) 2022-11-04

Family

ID=83830818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210998737.1A Pending CN115294281A (en) 2022-08-19 2022-08-19 Pose estimation method and navigation system for robot-assisted surgery

Country Status (1)

Country Link
CN (1) CN115294281A (en)

Similar Documents

Publication Publication Date Title
Bouget et al. Detecting surgical tools by modelling local appearance and global shape
US9687204B2 (en) Method and system for registration of ultrasound and physiological models to X-ray fluoroscopic images
US7764817B2 (en) Method for database guided simultaneous multi slice object detection in three dimensional volumetric data
US9247880B2 (en) Image fusion for interventional guidance
Doignon et al. Segmentation and guidance of multiple rigid objects for intra-operative endoscopic vision
US9155470B2 (en) Method and system for model based fusion on pre-operative computed tomography and intra-operative fluoroscopy using transesophageal echocardiography
Qin et al. Surgical instrument segmentation for endoscopic vision with data fusion of cnn prediction and kinematic pose
US8934693B2 (en) Method and system for intervention planning for transcatheter aortic valve implantation from 3D computed tomography data
CN106340015B (en) A kind of localization method and device of key point
US20150223773A1 (en) Method and Apparatus for Image Fusion Based Planning of C-Arm Angulation for Structural Heart Disease
JPH09508994A (en) Image forming apparatus and method
Su et al. Comparison of 3d surgical tool segmentation procedures with robot kinematics prior
Gonçalves et al. A vision system for robotic ultrasound guided orthopaedic surgery
US9681856B2 (en) Image fusion for interventional guidance
CN115578320A (en) Full-automatic space registration method and system for orthopedic surgery robot
EP2956065B1 (en) Apparatus for image fusion based planning of c-arm angulation for structural heart disease
CN115294281A (en) Pose estimation method and navigation system for robot-assisted surgery
Doignon et al. Model-based 3-D pose estimation and feature tracking for robot assisted surgery with medical imaging
Sun et al. Toward development of 3D surgical mouse paradigm
Giannarou et al. Tissue deformation recovery with gaussian mixture model based structure from motion
Wu et al. Explore on Doctor's Head Orientation Tracking for Patient's Body Surface Projection Under Complex Illumination Conditions
wei Song et al. Model-based segmentation of femoral head and acetabulum from CT images
US11430203B2 (en) Computer-implemented method for registering low dimensional images with a high dimensional image, a method for training an aritificial neural network useful in finding landmarks in low dimensional images, a computer program and a system for registering low dimensional images with a high dimensional image
US20230135757A1 (en) Method and apparatus for generating a combined three-dimensional ultrasound image
Cao et al. An improved multi-resolution 2D/3D registration method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination