CN111216109A - Visual following device and method for clinical treatment and detection - Google Patents

Visual following device and method for clinical treatment and detection

Info

Publication number
CN111216109A
CN111216109A (application number CN201911005988.XA)
Authority
CN
China
Prior art keywords
transformation matrix
manipulator
apriltag
scanner
head
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911005988.XA
Other languages
Chinese (zh)
Inventor
王斐
任百明
梁宸
茹常磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201911005988.XA priority Critical patent/CN111216109A/en
Publication of CN111216109A publication Critical patent/CN111216109A/en
Pending legal-status Critical Current

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/02: Programme-controlled manipulators characterised by movement of the arms, e.g. cartesian coordinate type
    • B25J9/023: Cartesian coordinate type
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J19/00: Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
    • B25J19/02: Sensing devices
    • B25J19/021: Optical sensing devices
    • B25J19/023: Optical sensing devices including video camera means
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/08: Programme-controlled manipulators characterised by modular constructions
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1602: Programme controls characterised by the control system, structure, architecture
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1656: Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664: Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1679: Programme controls characterised by the tasks executed
    • B25J9/1689: Teleoperation
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1694: Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697: Vision controlled systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a visual following device and method for clinical treatment and detection, belonging to the technical fields of computer vision and medical instruments. The visual following device comprises an image acquisition module, a hand-eye calibration module, a central processing unit module, an execution module and a head movement part. The device uses a monocular camera to capture an AprilTag label attached to the side of the head, and obtains the pose transformation matrix of the head scanning region relative to the base of a serial multi-joint cooperative manipulator by running the AprilTag algorithm on the ROS robot operating system and combining it with hand-eye calibration. Finally, inverse kinematics planning is used to drive the serial multi-joint cooperative manipulator to the head scanning region, so that the scanner follows the head as it moves. The invention can be applied in the field of clinical medicine and helps to improve the human-computer interaction capability of medical instruments.

Description

Visual following device and method for clinical treatment and detection
Technical Field
The invention belongs to the technical fields of computer vision and medical instruments, and in particular relates to a visual following device and a visual following method for clinical treatment and detection.
Background
With the growing complexity of clinical applications and the development of imaging, computer graphics, virtual reality and robotics, medical scanners integrate ever more functions and their degree of automation keeps increasing, giving rise to a new generation of medical scanners. Unlike traditional medical instruments, such a scanner can use machine vision to compute changes in a patient's head motion without human intervention and then control a manipulator carrying the scanner to follow the head motion synchronously.
Currently available methods for visual following of medical devices are:
1. Methods based on a communication circuit. Invention patent CN102599925A discloses an automatic following control device for a suspended X-ray machine, comprising a motion control device for the followed party (the chest film clamping mechanism) and a motion control device for the following party (the X-ray tube). The motor control circuits of the two parties are independent of each other and connected through a communication circuit; each circuit independently controls its own motor, the two exchange information in real time over the communication circuit, and the following action is carried out after the information is processed by software in the motor control circuits. However, automatic following is only possible in the up and down directions of movement.
2. Methods based on motion sensors. Invention patent CN104287752A discloses a motion control method and apparatus for a mobile X-ray machine. The electric control device of the mobile X-ray machine uses an existing magnetic Hall sensor with its matching circuit as the sensor unit, a single-chip microcomputer as the core of the data processing unit, and a bidirectional reversible DC motor drive with a DSP as its control core as the motor drive unit, and can meet various clinical application requirements. However, the magnetic sensor suffers from considerable magnetic-field interference in a medical environment, and the dual-motor motion control is also limited in its range of motion.
Disclosure of Invention
The present invention is directed to a visual following device and method for clinical treatment and detection that solves the above problems in the prior art.
The invention provides a method that detects head motion with the AprilTag algorithm using a monocular camera and an AprilTag label, and then controls a serial multi-joint cooperative manipulator (hereinafter referred to as the manipulator) to follow that motion and thereby move the scanner, so that the scanner follows the head during scanning.
Specifically, the visual following device for clinical treatment and detection provided by the invention uses a monocular camera to capture an AprilTag label attached to the side of the subject's head; the AprilTag label moves together with the head, and the transformation matrix P1 of the AprilTag label relative to the monocular camera is obtained through the AprilTag algorithm. A transformation matrix T1 of the monocular camera relative to the manipulator base is then obtained by hand-eye calibration, and a transformation matrix P2 from the manipulator base to the scanner at its end is obtained through the joint tree on the ROS robot operating system. The transformation matrix T2 of the AprilTag label relative to the scanner is set by the scanning program on the ROS robot operating system. Because the monocular camera and the manipulator base are fixed in position, T1 is a fixed matrix; because the scanning program keeps the relative position of the scanner and the AprilTag label unchanged, T2 is also a fixed matrix. Matrix transformation gives the relation T2 = P1 × T1 × P2. When the head moves, P1 changes while T1 and T2 do not; when this change is detected, it is sent to the processing terminal, a new P2 is obtained through the ROS robot operating system, and the manipulator motion is then controlled by inverse kinematics planning on the ROS robot operating system, driving the scanner at the end of the manipulator to move. Repeating this process achieves the visual following effect.
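As a rough illustration of this relation (not part of the patented implementation; the function name and the identity placeholders are assumptions), the following minimal Python sketch solves T2 = P1 × T1 × P2 for the new base-to-scanner transform P2 once an updated P1 is available:

    import numpy as np

    # P1: AprilTag label relative to the monocular camera (updated every frame)
    # T1: monocular camera relative to the manipulator base (fixed after hand-eye calibration)
    # T2: AprilTag label relative to the scanner (fixed by the scanning program)
    def solve_target_pose(P1, T1, T2):
        """Solve T2 = P1 @ T1 @ P2 for the base-to-scanner transform P2."""
        return np.linalg.inv(T1) @ np.linalg.inv(P1) @ T2

    # Identity matrices stand in for real 4 x 4 pose measurements
    P1 = np.eye(4)
    T1 = np.eye(4)
    T2 = np.eye(4)
    P2 = solve_target_pose(P1, T1, T2)  # pose handed to the inverse kinematics planner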
The technical scheme of the invention is as follows:
a visual following device for clinical treatment and detection comprises an image acquisition module, a hand-eye calibration module, a central processing unit module, an execution module and a head movement part; wherein:
the image acquisition module comprises a monocular camera 3 and a bracket 4, wherein the monocular camera 3 is fixed on the bracket 4 and is used for acquiring images of the AprilTag label 1 and transmitting the images acquired in real time to the central processing unit module;
the hand-eye calibration module comprises an AprilTag label 1, and the AprilTag label 1 is fixed on the side of the head of the subject 2 to represent the movement of the head;
the execution module comprises a manipulator 6 and a scanner 5; the scanner 5 is arranged at the tail end of the manipulator 6, and the manipulator 6 is used for driving the scanner 5 at the tail end to move;
the central processor module comprises a computer 9; the monocular camera 3 is connected to the computer 9, and the computer 9 receives the image of the AprilTag label 1 acquired by the monocular camera 3, calculates the transformation matrix of the AprilTag label 1 relative to the monocular camera 3, and from it obtains the transformation matrix of the AprilTag label 1 relative to the manipulator 6; meanwhile, the computer 9 is connected to the manipulator 6 and transmits the transformation matrix of the AprilTag label 1 relative to the base of the manipulator 6 to the manipulator 6 so as to control its motion; the computer 9 is loaded with a scanning program that keeps the relative position of the scanner 5 and the AprilTag label 1 unchanged;
the head motion part is used as a real-time feedback link, and the AprilTag label 1 attached to the side face of the head of the subject 2 is driven to move through head motion, so that closed-loop feedback and visual following are realized.
In another aspect, the present invention provides a visual following method for clinical treatment and detection, comprising the following steps:
step 1, selecting an AprilTag label 1 and attaching it to the scanning area on the side of the head of the subject 2;
step 2, turning on the computer 9 and running the start-up program of the monocular camera 3 and the AprilTag algorithm program;
step 3, acquiring an image of the AprilTag label 1 attached to the side of the head of the subject 2 through the monocular camera 3 and transmitting it to the computer 9;
step 4, running the AprilTag algorithm on the ROS robot operating system on the computer 9 to obtain the transformation matrix P1 of the AprilTag label 1 relative to the monocular camera 3;
step 5, connecting the computer 9 to the manipulator 6 and obtaining the transformation matrix T1 of the monocular camera 3 relative to the base of the manipulator 6 through hand-eye calibration;
step 6, running the MoveIt start-up program of the manipulator 6, reading the transformation matrix of each joint through the joint tree of the manipulator 6 published on the ROS robot operating system, thereby obtaining and storing the transformation matrix P2 from the base of the manipulator 6 to the scanner 5 at its end;
step 7, running the scanning program, and setting, through the scanning program on the ROS robot operating system, the transformation matrix T2 of the AprilTag label 1 relative to the scanner 5, where the transformation matrix T2 is a fixed matrix;
step 8, obtaining the matrix transformation relation T2 = P1 × T1 × P2; when the head of the subject 2 moves, the AprilTag label 1 moves with it, so the transformation matrix P1 changes while the transformation matrices T1 and T2 do not; when this change is detected, it is sent to the central processing unit module, a new transformation matrix P2 is obtained through the ROS robot operating system, and the adaptive change of each joint angle between the base of the manipulator 6 and the scanner 5 is obtained automatically from the inverse kinematics solution; at this point the scanner 5 and the AprilTag label 1 reach a steady equilibrium, and the scanner 5 follows the head motion;
wherein: the transformation matrix P1, the transformation matrix P2, the transformation matrix T1, and the transformation matrix T2 are represented as:
F = \begin{bmatrix} n_x & o_x & a_x & P_x \\ n_y & o_y & a_y & P_y \\ n_z & o_z & a_z & P_z \\ 0 & 0 & 0 & 1 \end{bmatrix}
A coordinate system centered at the origin of the reference coordinate system is represented by three vectors, usually perpendicular to each other, called the unit vectors n, o, a. Each unit vector is represented by its three components along the x, y, z axes of the reference coordinate system in which it lies.
In the formula, the vectors (n_x, o_x, a_x), (n_y, o_y, a_y), (n_z, o_z, a_z) represent the components along the x, y and z axes respectively and form a 3 x 3 matrix representing the orientation; P denotes the position vector, decomposed into position components along the x, y and z axes and written P_x, P_y, P_z, where P_x, P_y, P_z give the corresponding position, i.e. the displacement relative to the origin of the reference coordinate system.
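For illustration only, such a 4 x 4 pose matrix could be assembled as in the following Python sketch; the helper name homogeneous_transform and the example values are assumptions, not details taken from the patent:

    import numpy as np

    def homogeneous_transform(n, o, a, p):
        """Assemble a 4 x 4 pose matrix from unit vectors n, o, a and position p."""
        F = np.eye(4)
        F[:3, 0] = n   # x-axis direction of the described frame
        F[:3, 1] = o   # y-axis direction
        F[:3, 2] = a   # z-axis direction
        F[:3, 3] = p   # origin position P_x, P_y, P_z
        return F

    # Example: a frame rotated 90 degrees about z and shifted 0.1 m along x
    F = homogeneous_transform(n=[0, 1, 0], o=[-1, 0, 0], a=[0, 0, 1], p=[0.1, 0, 0])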
Further, in step 4, the image of the AprilTag label 1 acquired in step 3 is read in frame by frame from the video and preprocessed by converting the color image to a grayscale image; the image is then blurred with Gaussian filtering to smooth out noise; the gradient is then computed, including its direction and magnitude; edge points whose magnitude M exceeds a threshold are selected, the points around each edge point are searched, and adjacent points are clustered according to the edge direction; straight lines are fitted by linear regression and quadrilaterals formed by closed lines are searched for; finally, a homography matrix is computed with the direct linear transformation algorithm and combined with the camera information to obtain the pose information of the AprilTag label 1.
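A hedged sketch of how such a per-frame tag pose (the matrix P1) might be obtained in practice is given below. It assumes the open-source pupil_apriltags package and OpenCV; the tag family, camera intrinsics and tag size are placeholders rather than values disclosed by this patent:

    import cv2
    import numpy as np
    from pupil_apriltags import Detector

    detector = Detector(families="tag36h11")   # tag family is an assumption
    cap = cv2.VideoCapture(0)                  # the monocular camera

    # fx, fy, cx, cy and tag_size are placeholders; use the calibrated intrinsics
    camera_params = (600.0, 600.0, 320.0, 240.0)
    tag_size = 0.05  # metres

    ok, frame = cap.read()
    if ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # color image to grayscale
        detections = detector.detect(gray, estimate_tag_pose=True,
                                     camera_params=camera_params,
                                     tag_size=tag_size)
        for det in detections:
            P1 = np.eye(4)                # tag pose relative to the camera
            P1[:3, :3] = det.pose_R       # 3 x 3 orientation
            P1[:3, 3] = det.pose_t.ravel()  # position components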
Further, in step 5, another AprilTag label 1 is attached to the end of the manipulator 6, and the spatial coordinates of the AprilTag label 1 and of the end of the manipulator 6 are observed through the visual interface RVIZ on the ROS robot operating system to ensure that their coordinates coincide; the pose transformation matrix TB of the end of the manipulator 6 relative to the base of the manipulator 6 is read through the joint tree on the ROS robot operating system, the pose transformation matrix TA of the monocular camera 3 relative to the AprilTag label 1 is obtained with the AprilTag algorithm, and the coordinate transformation matrix T1 of the monocular camera 3 relative to the base of the manipulator 6 is finally obtained by matrix multiplication, T1 = TA × TB; the pose transformation matrices TA and TB take the same form as the transformation matrices P1, P2, T1 and T2.
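A minimal sketch of this calibration step, assuming TA and TB are already available as 4 x 4 NumPy arrays (the identity placeholders and the file name used for storage are illustrative only):

    import numpy as np

    # TA: camera-to-tag pose from the AprilTag algorithm (tag fixed on the manipulator end)
    # TB: pose of the manipulator end relative to the base, read from the joint tree
    TA = np.eye(4)   # placeholder for the measured value
    TB = np.eye(4)   # placeholder for the measured value

    T1 = TA @ TB     # camera pose relative to the manipulator base, per T1 = TA x TB

    np.save("hand_eye_T1.npy", T1)   # calibration is done once, so the result can be stored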
The invention has the beneficial effects that:
the monocular camera is used for collecting the motion of the Apriltag to indirectly judge the motion of the reaction head, the characteristics are simple and effective, no special requirements are required on hardware, no complex algorithm is involved, and the monocular camera is easy to use. The manipulator is a cooperative robot and has protective stopping measures for people, and meanwhile, the manipulator is a 6-degree-of-freedom manipulator and is flexible in motion control. The scanner is mounted at the end of the manipulator, thereby driving the movement of the scanner. The invention can be applied to the medical background and is beneficial to improving the human-computer interaction capability of the medical instrument.
In particular, the apparatus of the present invention, which combines machine vision and a manipulator to realize visual following, has the following advantages over existing methods. First, using a monocular camera and an AprilTag label to detect head motion has natural advantages: (1) the spatial pose of the AprilTag label can be obtained from its two-dimensional image, which greatly reduces the amount of computation and the algorithmic complexity; (2) the AprilTag label is just a small square piece of paper that only needs to be attached to the side of the head, so the user experience is good; (3) the spatial pose of the AprilTag label can still be obtained by the monocular camera under a certain degree of occlusion, giving the method some robustness; (4) the method is little affected by illumination changes and can accurately acquire head motion changes in real time. Second, after the manipulator receives the spatial pose of the AprilTag label, the inverse kinematics package is used to solve for the joint motion, so the motion control is flexible and imposes few limitations. Moreover, the manipulator is a cooperative manipulator with a protective stop function.
Drawings
Fig. 1 is an overall effect diagram of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
Fig. 2 is an overall flowchart of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
Fig. 3 shows the coordinate diagram and pose transformation matrix of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
Fig. 4 is a flowchart of the AprilTag algorithm of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
Fig. 5 illustrates the actual effect of the AprilTag algorithm of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
Fig. 6 is an overall device diagram of the hand-eye calibration of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
Fig. 7 is a flowchart of the visual following of the manipulator in the ROS robot operating system of the visual following device for clinical treatment and detection provided in an embodiment of the present invention.
In the figures: 1, AprilTag label; 2, subject; 3, monocular camera; 4, bracket; 5, scanner; 6, manipulator; 7, network cable; 8, USB transmission line; 9, computer.
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
It is to be understood that the appended drawings are not to scale, but are merely drawn with appropriate simplifications to illustrate various features of the basic principles of the invention. Specific design features of the invention disclosed herein, including, for example, specific dimensions, orientations, locations, and configurations, will be determined in part by the particular intended application and use environment.
In the several figures of the drawings, identical or equivalent components (elements) are referenced with the same reference numerals.
Fig. 1 is an overall effect diagram of the visual following device for clinical treatment and detection provided in an embodiment of the present invention. Fig. 2 is an overall flowchart of the visual following device for clinical treatment and detection provided in an embodiment of the present invention. Referring to fig. 1 and 2, the visual following device for clinical treatment and detection in the present embodiment includes an image acquisition module, a hand-eye calibration module, a central processor module, an execution module, and a head movement part.
The image acquisition module comprises a monocular camera 3 and a bracket 4. The monocular camera 3 is fixed on the bracket 4 and used for acquiring images of the AprilTag label 1 and then transmitting the images acquired in real time to the central processing unit module.
The hand-eye calibration module comprises an AprilTag label 1. The AprilTag label 1 is a tag similar to a two-dimensional code that, combined with the corresponding algorithm, provides positioning and is used for positioning guidance. The AprilTag label 1 is affixed to the side of the head of the subject 2 and represents the movement of the head. In the hand-eye calibration process, the monocular camera 3 is mounted on the bracket 4 rather than at the end of the manipulator 6, i.e. the "eye outside the hand" calibration configuration is used, and the pose transformation matrix between the manipulator 6 and the monocular camera 3 is obtained accordingly. The hand-eye calibration module is the groundwork of the following system; calibration normally needs to be performed only once, after which its result can be stored without repeated calibration.
The central processor module comprises a computer 9 on which the ROS robot operating system is installed. On the one hand, the monocular camera 3 is connected to the computer 9, for example via the USB transmission line 8. The computer 9 receives the image of the AprilTag label 1 acquired by the monocular camera 3 and passes it to the ROS robot operating system. The AprilTag algorithm calculates the transformation matrix of the AprilTag label 1 relative to the monocular camera 3, which is then combined with the hand-eye calibration module to obtain the transformation matrix of the AprilTag label 1 relative to the manipulator 6. On the other hand, the computer 9 is connected to the manipulator 6, for example via the network cable 7, and transmits the obtained transformation matrix of the head scanning area relative to the base of the manipulator 6 to the manipulator 6 within the ROS robot operating system, thereby controlling the movement of the manipulator 6. Finally, the scanning program loaded on the computer 9 keeps the relative position T2 between the scanner 5 and the AprilTag label 1 unchanged, which ensures that T2 is a fixed matrix.
The execution module comprises a manipulator 6 and a scanner 5. The scanner 5 is installed at the end of the manipulator 6. After the manipulator 6 receives the transformation matrix of the AprilTag label 1 relative to the manipulator 6, the angular change of each joint is calculated by the corresponding inverse kinematics plug-in through the MoveIt package under the ROS robot operating system, driving the scanner 5 at the end to move to the corresponding position.
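The following is a hedged sketch of how such a pose command might be issued with the MoveIt Python interface; the planning-group name "manipulator" and the target pose values are assumptions for illustration, not details disclosed by the patent:

    import sys
    import rospy
    import moveit_commander
    from geometry_msgs.msg import Pose

    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("scanner_follow", anonymous=True)

    # "manipulator" is the usual planning-group name in UR-style MoveIt configs (assumption)
    group = moveit_commander.MoveGroupCommander("manipulator")

    target = Pose()                       # pose derived from the P2 matrix (placeholder values)
    target.position.x, target.position.y, target.position.z = 0.4, 0.1, 0.3
    target.orientation.w = 1.0

    group.set_pose_target(target)         # MoveIt's inverse kinematics plug-in solves the joint angles
    group.go(wait=True)                   # plan and execute
    group.stop()
    group.clear_pose_targets()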
The head motion part serves as the real-time feedback link: head motion drives the AprilTag label 1 attached to the side of the head of the subject 2, achieving closed-loop feedback and supporting the visual following effect. Since the AprilTag label 1 is fixed to the side of the head, the motion of the head can be equivalently replaced by the motion of the AprilTag label 1.
A visual following method based on the visual following device for clinical treatment and detection comprises the following steps:
step 1, selecting an AprilTag label 1 and attaching it to the scanning area on the side of the head of the subject 2;
step 2, turning on the computer 9 and running the start-up program of the monocular camera 3 and the AprilTag algorithm program;
step 3, acquiring an image of the AprilTag label 1 attached to the side of the head of the subject 2 through the monocular camera 3 and transmitting it to the computer 9 through the USB transmission line 8;
step 4, running the AprilTag algorithm on the ROS robot operating system on the computer 9 to obtain the pose transformation matrix P1 of the AprilTag label 1 relative to the monocular camera 3;
step 5, connecting the computer 9 to the manipulator 6 through the network cable 7 and obtaining the transformation matrix T1 of the monocular camera 3 relative to the base of the manipulator 6 through hand-eye calibration;
step 6, running the MoveIt start-up program of the manipulator 6, which allows the transformation matrix of each joint to be read through the joint tree of the manipulator 6 published on the ROS robot operating system, thereby obtaining and storing the transformation matrix P2 from the base of the manipulator 6 to the scanner 5 at its end;
step 7, running the scanning program, and setting, through the scanning program on the ROS robot operating system, the transformation matrix T2 of the AprilTag label 1 relative to the scanner 5, where T2 is a fixed matrix;
and step 8, obtaining the matrix transformation relation T2 = P1 × T1 × P2. When the head of the subject 2 moves, the AprilTag label 1 moves with it, so P1 changes while T1 and T2 do not. When this change is detected, it is sent to the central processor module, a new P2 is obtained through the ROS robot operating system, and the adaptive change of each joint angle between the base of the manipulator 6 and the scanner 5 is obtained automatically from the inverse kinematics solution; at this point the scanner 5 and the AprilTag label 1 reach a steady equilibrium, and the scanner 5 follows the head motion. This process is repeated whenever the subject 2 moves the head again.
In (1) of fig. 3, x, y, z denote the reference coordinate system and n, o, a denote the described coordinate system. A vector drawn from the origin of the reference coordinate system to the origin of the described coordinate system represents the position of that coordinate system; P denotes this position vector, expressed by its three components relative to the reference coordinate system. The coordinate system can therefore be represented by three unit vectors giving its directions and a fourth vector giving its position. In this way, the coordinate transformation between different coordinate systems can be represented intuitively by a transformation matrix.
In (2) of fig. 3, the transformation between different poses is written in matrix form: the orientation and the position are expressed in the same matrix, and a scale-factor row is added so that the matrix becomes a 4 x 4 matrix. In the formula, F denotes the transformation matrix; the vectors (n_x, o_x, a_x), (n_y, o_y, a_y), (n_z, o_z, a_z) represent the components along the x, y and z axes respectively and form a 3 x 3 matrix representing the orientation; P denotes the position vector, decomposed into position components along the x, y and z axes and written P_x, P_y, P_z, where P_x, P_y, P_z give the corresponding position, i.e. the displacement relative to the origin of the reference coordinate system. The matrices T2, P1, T1, P2, TA and TB referred to in the calculations above are all transformation matrices with different meanings, so they can all be written in this representation; they share the same format, only the data of their internal components differ. The transformed orientation and position can therefore be computed by matrix multiplication, and the pose transformation matrix P2 can be obtained quickly.
Fig. 4 shows, in flowchart form, how the AprilTag algorithm processes the acquired image of the AprilTag label 1: the image of the AprilTag label 1 is read in frame by frame from the video and preprocessed by converting the color image to a grayscale image; the image is then blurred with Gaussian filtering to smooth out noise; the gradient is then computed, including its direction and magnitude. Edge points whose magnitude M exceeds a threshold are selected, the points around each edge point are searched, and adjacent points are clustered according to the edge direction; straight lines are fitted by linear regression and quadrilaterals formed by closed lines are searched for; finally, a homography matrix is computed with the direct linear transformation algorithm and combined with the camera information to obtain the pose information of the AprilTag label 1.
Fig. 5 illustrates how the AprilTag algorithm processes the acquired AprilTag image. First, (1) in fig. 5 reads in the AprilTag image; the tag detection algorithm computes the gradient at each pixel and obtains its magnitude, as in (2) of fig. 5, and its direction, as in (3) of fig. 5; using a pattern-based method, pixels with similar gradient direction and magnitude are clustered into components, as in (4) of fig. 5; using a weighted least-squares method, as in (5) of fig. 5, a line segment is fitted to the pixels of each component; the direction of a line segment is determined by the gradient direction, so the segment is dark on the left and light on the right. Straight lines in the scene are extracted and square corner points are detected. Finally, square regions and their key corner points are obtained (as shown in (6) of fig. 5); the square regions are mapped by homography into squares and matched against the tag library to decide whether they are tags.
In step 5, "hand-eye calibration" is performed to obtain the transformation matrix T1 of the monocular camera 3 relative to the base of the manipulator 6. This process is carried out before the whole system operates, and the "hand-eye calibration" is achieved with the help of the AprilTag label 1. First, the AprilTag label 1 is attached to the end of the manipulator 6, and the spatial coordinates of the AprilTag label 1 and the end of the manipulator 6 are observed through the visual interface RVIZ on the ROS robot operating system to ensure that their coordinates coincide. The pose transformation matrix TB of the end of the manipulator 6 relative to the base of the manipulator 6 is then read through the joint tree on the ROS robot operating system, the pose transformation matrix TA of the monocular camera 3 relative to the AprilTag label 1 is obtained with the AprilTag algorithm, and the coordinate transformation matrix T1 of the monocular camera 3 relative to the base of the manipulator 6 is finally obtained by matrix multiplication, i.e. T1 = TA × TB.
Fig. 6 shows the overall device diagram of the hand-eye calibration. Referring to fig. 6, the specific calibration process is as follows: first, the manipulator 6 is connected to the ROS robot operating system, and the pose transformation TB of the end of the manipulator 6 relative to the base coordinates of the manipulator 6 is read through the joint tree on the ROS robot operating system. The AprilTag label 1 is then attached to the end of the manipulator 6, ensuring that the label coordinates coincide with the coordinates of the end of the manipulator 6. The pose of the AprilTag label in the image acquired by the monocular camera 3, i.e. TA, is then read; it is obtained by the AprilTag algorithm. With TA and TB, the pose transformation T1 of the monocular camera 3 relative to the base coordinates of the UR5 manipulator 6 is obtained by matrix multiplication, i.e. T1 = TA × TB.
In step 7, the scanning program sets the transformation matrix T2 of the AprilTag label 1 relative to the scanner 5 and keeps it unchanged. Since T2 is preset from medical parameters according to the scanner 5 and the head scanning region, the best scanning effect is achieved at the relative position defined by T2.
In step 8, in the process of calculating the transformation matrix P2 from the base of the manipulator 6 to the scanner 5 at its end: the poses T2, P1, T1 and P2 are all represented as 4 x 4 matrices combining orientation and displacement, so P2 can be obtained by matrix multiplication.
In step 8, in the process of calculating the transformation matrix P2 from the base of the manipulator 6 to the scanner 5 at its end: when the head moves, it drives the AprilTag label 1 on the side of the head, providing real-time feedback. Since the AprilTag algorithm updates the pose of the AprilTag label 1 in real time, the final matrix transformation makes P2 change continuously with time and head movement.
In step 8, the scanner 5 is mounted at the end of the manipulator 6 and moves with the manipulator 6. The scanner 5 can therefore scan the region while following the head in real time, realizing the visual following effect of the scanner 5.
Fig. 7 shows, in flowchart form, how the transformation matrix P1 of the AprilTag label 1 relative to the monocular camera 3 is obtained by the AprilTag algorithm on the ROS robot operating system and, after conversion into the transformation matrix P2 relative to the base of the manipulator 6, P2 is published in the form of a topic (a topic being a communication mode between nodes on the ROS robot operating system). A check is made as to whether the published pose topic has been subscribed to; if not, subscription is attempted again. Another node then subscribes to it and republishes it as a service (a service being another communication mode on the ROS robot operating system), and a further node acts as a service client that subscribes and passes the pose on to MoveIt, the package used for the motion planning of the manipulator 6. Meanwhile, the manipulator 6 communicates with the ROS robot operating system through the network cable 7, receives the change of each joint angle planned by the MoveIt inverse kinematics, and distributes it to each joint of the manipulator 6, so that the manipulator 6 moves to the specified spatial position, thereby moving the scanner 5 at the end of the manipulator 6.
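A minimal sketch of the topic-based hand-off described above, written with rospy; the topic name /scanner_target_pose, the node name and the frame id are assumptions for illustration, and in the full system the subscriber would forward each pose to MoveIt:

    import rospy
    from geometry_msgs.msg import PoseStamped

    def on_pose(msg):
        # In the full system this pose would be handed to MoveIt
        # (e.g. MoveGroupCommander.set_pose_target) for inverse kinematics planning.
        rospy.loginfo("new scanner target: %s", msg.pose.position)

    def main():
        rospy.init_node("pose_follower")
        pub = rospy.Publisher("/scanner_target_pose", PoseStamped, queue_size=1)
        rospy.Subscriber("/scanner_target_pose", PoseStamped, on_pose)

        rate = rospy.Rate(10)   # publish the latest converted pose at 10 Hz
        while not rospy.is_shutdown():
            msg = PoseStamped()
            msg.header.stamp = rospy.Time.now()
            msg.header.frame_id = "base_link"   # manipulator base frame (assumption)
            # msg.pose would be filled from the P2 matrix computed upstream
            pub.publish(msg)
            rate.sleep()

    if __name__ == "__main__":
        main()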
It should be understood that the above description is illustrative of the practice of the invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.

Claims (4)

1. A visual following device for clinical treatment and detection is characterized by comprising an image acquisition module, a hand-eye calibration module, a central processing unit module, an execution module and a head movement part; wherein:
the image acquisition module comprises a monocular camera (3) and a bracket (4), wherein the monocular camera (3) is fixed on the bracket (4) and is used for acquiring an image of the AprilTag label (1) and transmitting the image acquired in real time to the central processing unit module;
the hand-eye calibration module comprises an AprilTag label (1), and the AprilTag label (1) is fixed on the side of the head of the subject (2) to represent the motion of the head;
the execution module comprises a manipulator (6) and a scanner (5); the scanner (5) is arranged at the tail end of the manipulator (6), and the manipulator (6) is used for driving the scanner (5) at the tail end to move;
the central processor module comprises a computer (9); the monocular camera (3) is connected to the computer (9), and the computer (9) receives the image of the AprilTag label (1) acquired by the monocular camera (3), calculates the transformation matrix of the AprilTag label (1) relative to the monocular camera (3), and from it obtains the transformation matrix of the AprilTag label (1) relative to the manipulator (6); meanwhile, the computer (9) is connected to the manipulator (6) and transmits the transformation matrix of the AprilTag label (1) relative to the base of the manipulator (6) to the manipulator (6) so as to control its motion; the computer (9) is loaded with a scanning program that keeps the relative position of the scanner (5) and the AprilTag label (1) unchanged;
the head movement part is used as a real-time feedback link, and head movement drives the AprilTag label (1) attached to the side of the head of the subject (2), so that closed-loop feedback and visual following are realized.
2. A visual following method for clinical treatment and detection, which is characterized by comprising the following steps:
step 1, selecting an AprilTag label (1) and attaching it to the scanning area on the side of the head of a subject (2);
step 2, turning on the computer (9) and running the start-up program of the monocular camera (3) and the AprilTag algorithm program;
step 3, acquiring an image of the AprilTag label (1) attached to the side of the head of the subject (2) through the monocular camera (3) and transmitting it to the computer (9);
step 4, running the AprilTag algorithm on the ROS robot operating system on the computer (9) to obtain a transformation matrix P1 of the AprilTag label (1) relative to the monocular camera (3);
step 5, connecting the computer (9) to the manipulator (6) and obtaining a transformation matrix T1 of the monocular camera (3) relative to the base of the manipulator (6) through hand-eye calibration;
step 6, running the MoveIt start-up program of the manipulator (6), reading the transformation matrix of each joint through the joint tree of the manipulator (6) published on the ROS robot operating system, thereby obtaining and storing a transformation matrix P2 from the base of the manipulator (6) to the scanner (5) at its end;
step 7, running a scanning program, and setting, through the scanning program on the ROS robot operating system, a transformation matrix T2 of the AprilTag label (1) relative to the scanner (5), where the transformation matrix T2 is a fixed matrix;
step 8, obtaining the matrix transformation relation T2 = P1 × T1 × P2; when the head of the subject (2) moves, the AprilTag label (1) moves with it, so the transformation matrix P1 changes while the transformation matrices T1 and T2 do not; when this change is detected, it is sent to the central processor module, a new transformation matrix P2 is obtained through the ROS robot operating system, and the adaptive change of each joint angle between the base of the manipulator (6) and the scanner (5) is obtained automatically from the inverse kinematics solution; at this point the scanner (5) and the AprilTag label (1) reach a steady equilibrium, and the scanner (5) follows the head motion;
wherein: the transformation matrix P1, the transformation matrix P2, the transformation matrix T1, and the transformation matrix T2 are represented as:
F = \begin{bmatrix} n_x & o_x & a_x & P_x \\ n_y & o_y & a_y & P_y \\ n_z & o_z & a_z & P_z \\ 0 & 0 & 0 & 1 \end{bmatrix}
a coordinate system centered at the origin of the reference coordinate system is represented by three vectors, usually perpendicular to each other, called the unit vectors n, o, a; each unit vector is represented by its three components along the x, y, z axes of the reference coordinate system in which it lies; in the formula, the vectors (n_x, o_x, a_x), (n_y, o_y, a_y), (n_z, o_z, a_z) represent the components along the x, y and z axes respectively and form a 3 x 3 matrix representing the orientation; P denotes the position vector, decomposed into position components along the x, y and z axes and written P_x, P_y, P_z, where P_x, P_y, P_z give the corresponding position, i.e. the displacement relative to the origin of the reference coordinate system.
3. The visual following method for clinical treatment and detection according to claim 2, wherein in step 4, the image of the AprilTag label (1) acquired in step 3 is read in frame by frame from the video and preprocessed by converting the color image to a grayscale image; the image is then blurred with Gaussian filtering to smooth out noise; the gradient is then computed, including its direction and magnitude; edge points whose magnitude M exceeds a threshold are selected, the points around each edge point are searched, and adjacent points are clustered according to the edge direction; straight lines are fitted by linear regression and quadrilaterals formed by closed lines are searched for; and a homography matrix is computed with the direct linear transformation algorithm and combined with the camera information to obtain the pose information of the AprilTag label (1).
4. The visual following method for clinical treatment and detection according to claim 2 or 3, wherein in step 5, another AprilTag label (1) is attached to the end of the manipulator (6), and the spatial coordinates of the AprilTag label (1) and of the end of the manipulator (6) are observed through the visual interface RVIZ on the ROS robot operating system to ensure that their coordinates coincide; a pose transformation matrix TB of the end of the manipulator (6) relative to the base of the manipulator (6) is read through the joint tree on the ROS robot operating system, a pose transformation matrix TA of the monocular camera (3) relative to the AprilTag label (1) is obtained with the AprilTag algorithm, and a coordinate transformation matrix T1 of the monocular camera (3) relative to the base of the manipulator (6) is finally obtained by matrix multiplication, T1 = TA × TB; the pose transformation matrices TA and TB take the same form as the transformation matrices P1, P2, T1 and T2.
CN201911005988.XA 2019-10-22 2019-10-22 Visual following device and method for clinical treatment and detection Pending CN111216109A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911005988.XA CN111216109A (en) 2019-10-22 2019-10-22 Visual following device and method for clinical treatment and detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911005988.XA CN111216109A (en) 2019-10-22 2019-10-22 Visual following device and method for clinical treatment and detection

Publications (1)

Publication Number Publication Date
CN111216109A true CN111216109A (en) 2020-06-02

Family

ID=70832120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911005988.XA Pending CN111216109A (en) 2019-10-22 2019-10-22 Visual following device and method for clinical treatment and detection

Country Status (1)

Country Link
CN (1) CN111216109A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114043484A (en) * 2021-11-29 2022-02-15 青岛理工大学 Vision-based simulation learning system and method for industrial robot assembly
CN114260899A (en) * 2021-12-29 2022-04-01 广州极飞科技股份有限公司 Hand-eye calibration method and device, electronic equipment and computer readable storage medium
CN114589698A (en) * 2022-04-07 2022-06-07 北京信息科技大学 Mechanical arm model-free real-time calibration method and device based on multi-target visual measurement and machine learning
CN114872036A (en) * 2022-03-28 2022-08-09 青岛大学 Bionic principle-based robot eye-head cooperative gazing behavior control method

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010172986A (en) * 2009-01-28 2010-08-12 Fuji Electric Holdings Co Ltd Robot vision system and automatic calibration method
CN103230304A (en) * 2013-05-17 2013-08-07 深圳先进技术研究院 Surgical navigation system and method
CN105050527A (en) * 2013-03-15 2015-11-11 圣纳普医疗(巴巴多斯)公司 Intelligent positioning system and methods therefore
CN105082161A (en) * 2015-09-09 2015-11-25 新疆医科大学第一附属医院 Robot vision servo control device of binocular three-dimensional video camera and application method of robot vision servo control device
US9436993B1 (en) * 2015-04-17 2016-09-06 Clear Guide Medical, Inc System and method for fused image based navigation with late marker placement
CN107139178A (en) * 2017-05-10 2017-09-08 哈尔滨工业大学深圳研究生院 A kind of grasping means of unmanned plane and its view-based access control model
CN108344420A (en) * 2018-02-22 2018-07-31 王昕 A kind of transcranial magnetic stimulation intelligent navigation positioning device
CN109240282A (en) * 2018-07-30 2019-01-18 王杰瑞 One kind can manipulate intelligent medical robot
US20190130568A1 (en) * 2017-10-26 2019-05-02 Kirusha Srimohanarajah Apparatus and method for establishing patient registration using 3d scanner and tracking system
CN110300993A (en) * 2019-02-26 2019-10-01 武汉资联虹康科技股份有限公司 A kind of camera system for transcranial magnetic stimulation diagnosis and treatment
CN110339480A (en) * 2019-07-24 2019-10-18 南京伟思医疗科技股份有限公司 A kind of magnetic stimulation system and method based on multi-axis robot vision dynamically track

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010172986A (en) * 2009-01-28 2010-08-12 Fuji Electric Holdings Co Ltd Robot vision system and automatic calibration method
CN105050527A (en) * 2013-03-15 2015-11-11 圣纳普医疗(巴巴多斯)公司 Intelligent positioning system and methods therefore
CN103230304A (en) * 2013-05-17 2013-08-07 深圳先进技术研究院 Surgical navigation system and method
US9436993B1 (en) * 2015-04-17 2016-09-06 Clear Guide Medical, Inc System and method for fused image based navigation with late marker placement
CN105082161A (en) * 2015-09-09 2015-11-25 新疆医科大学第一附属医院 Robot vision servo control device of binocular three-dimensional video camera and application method of robot vision servo control device
CN107139178A (en) * 2017-05-10 2017-09-08 哈尔滨工业大学深圳研究生院 A kind of grasping means of unmanned plane and its view-based access control model
US20190130568A1 (en) * 2017-10-26 2019-05-02 Kirusha Srimohanarajah Apparatus and method for establishing patient registration using 3d scanner and tracking system
CN108344420A (en) * 2018-02-22 2018-07-31 王昕 A kind of transcranial magnetic stimulation intelligent navigation positioning device
CN109240282A (en) * 2018-07-30 2019-01-18 王杰瑞 One kind can manipulate intelligent medical robot
CN110300993A (en) * 2019-02-26 2019-10-01 武汉资联虹康科技股份有限公司 A kind of camera system for transcranial magnetic stimulation diagnosis and treatment
CN110339480A (en) * 2019-07-24 2019-10-18 南京伟思医疗科技股份有限公司 A kind of magnetic stimulation system and method based on multi-axis robot vision dynamically track

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EDWIN OLSON: "AprilTag: A robust and flexible visual fiducial system", 2011 IEEE International Conference on Robotics and Automation *
LI ZHENWEI et al., China University of Mining and Technology Press *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114043484A (en) * 2021-11-29 2022-02-15 青岛理工大学 Vision-based simulation learning system and method for industrial robot assembly
CN114043484B (en) * 2021-11-29 2023-10-03 青岛理工大学 Vision-based imitation learning system and method for industrial robot assembly
CN114260899A (en) * 2021-12-29 2022-04-01 广州极飞科技股份有限公司 Hand-eye calibration method and device, electronic equipment and computer readable storage medium
CN114872036A (en) * 2022-03-28 2022-08-09 青岛大学 Bionic principle-based robot eye-head cooperative gazing behavior control method
CN114872036B (en) * 2022-03-28 2023-07-04 青岛中电绿网新能源有限公司 Robot eye-head cooperative gazing behavior control method based on bionic principle
CN114589698A (en) * 2022-04-07 2022-06-07 北京信息科技大学 Mechanical arm model-free real-time calibration method and device based on multi-target visual measurement and machine learning
CN114589698B (en) * 2022-04-07 2023-06-06 北京信息科技大学 Model-free real-time calibration method and device for mechanical arm based on multi-target vision measurement and machine learning

Similar Documents

Publication Publication Date Title
CN111216109A (en) Visual following device and method for clinical treatment and detection
Rekimoto Matrix: A realtime object identification and registration method for augmented reality
CN111694429A (en) Virtual object driving method and device, electronic equipment and readable storage
JP5211069B2 (en) Human posture estimation and tracking using labeling
KR101964332B1 (en) Method of hand-eye calibration, computer program for executing the method, and robot system.
CN102638653B (en) Automatic face tracing method on basis of Kinect
CN109297413A (en) A kind of large-size cylinder body Structural visual measurement method
CN114127806A (en) System and method for enhancing visual output from a robotic device
JP5699697B2 (en) Robot device, position and orientation detection device, position and orientation detection program, and position and orientation detection method
JP2004094288A (en) Instructed position detecting device and autonomous robot
CN110728739A (en) Virtual human control and interaction method based on video stream
US12005592B2 (en) Creating training data variability in machine learning for object labelling from images
US20230347509A1 (en) Robot control apparatus, robot control method, and program
CN113499094A (en) Heart color ultrasound examination device and method guided by vision and force feedback
CN111241940B (en) Remote control method of robot and human body boundary frame determination method and system
CN113505694A (en) Human-computer interaction method and device based on sight tracking and computer equipment
CN108115671B (en) Double-arm robot control method and system based on 3D vision sensor
Fischer et al. A hybrid tracking method for surgical augmented reality
CN109079777B (en) Manipulator hand-eye coordination operation system
CN112990101B (en) Facial organ positioning method based on machine vision and related equipment
US20220392084A1 (en) Scene perception systems and methods
CN113284184A (en) Robot RGBD visual perception oriented 6D pose estimation method and system
CN116175582A (en) Intelligent mechanical arm control system and control method based on machine vision
CN115514885A (en) Monocular and binocular fusion-based remote augmented reality follow-up perception system and method
CN113822174A (en) Gaze estimation method, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200602

RJ01 Rejection of invention patent application after publication