CN115721422A - Operation method, device, equipment and storage medium for interventional operation

Publication number: CN115721422A
Authority: CN (China)
Prior art keywords: interventional, image, operation instruction, target, instruction
Legal status: Pending
Application number: CN202211329875.7A
Other languages: Chinese (zh)
Inventors:
周小虎
谢晓亮
李浩
刘市祺
奉振球
侯增广
桂美将
项天宇
于喆
黄德兴
Current Assignee: Institute of Automation of Chinese Academy of Science
Original Assignee: Institute of Automation of Chinese Academy of Science
Application filed by: Institute of Automation of Chinese Academy of Science
Priority application: CN202211329875.7A
Publication: CN115721422A (pending)

Abstract

The embodiment of the invention provides an operation method, a device, equipment and a storage medium for interventional operation, wherein the method comprises the following steps: acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image; inputting the instrument image and the user blood vessel image into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on the sample environment image, the first operation instruction corresponding to the sample environment image, the reward value corresponding to the first operation instruction and the sample image at the second moment corresponding to the first operation instruction after the first operation instruction is executed; and performing the operation of the interventional operation according to the target operation instruction. The method of the embodiment of the invention obtains the target operation instruction based on the acquired instrument image of the interventional operation, the blood vessel image of the user and the interventional operation model, realizes the autonomous delivery of the interventional operation instrument, and reduces the workload brought by a large amount of simple repeated operations in the interventional operation.

Description

Operation method, device, equipment and storage medium for interventional operation
Technical Field
The present invention relates to the field of medical technology, and in particular, to an interventional operation method, an interventional operation device, an interventional operation apparatus, and a storage medium.
Background
The interventional operation is a minimally invasive treatment mode carried out by means of modern high-tech: under the guidance of medical imaging equipment, special precise instruments such as catheters and guide wires are introduced into the human body to diagnose and locally treat internal diseases. Applying digital technology to interventional therapy expands the doctor's field of view and, by means of the catheter and guide wire, extends the doctor's hands; the incision of the interventional operation is small, and the approach has the characteristics of requiring no open surgery, causing little trauma, allowing quick recovery and achieving good results.
In the related art, a doctor judges the position of an instrument in a blood vessel according to Digital Subtraction Angiography (DSA) and sends axial movement commands and rotation commands to the master end of an interventional surgical robot, and the slave end of the robot controls the instrument to move in the blood vessel according to these commands. In other words, during the interventional operation the interventional surgical robot can only passively execute the doctor's control instructions; it cannot reduce the workload brought by the large number of simple repeated operations in the interventional operation, so the efficiency of the interventional operation is low.
Disclosure of Invention
In view of the problems in the prior art, embodiments of the present invention provide an operation method, an apparatus, a device, and a storage medium for interventional surgery.
Specifically, the embodiment of the invention provides the following technical scheme:
in a first aspect, an embodiment of the present invention provides a method for operating an interventional procedure, including:
acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image;
inputting the instrument image and the user blood vessel image into a trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
and performing operation of an interventional operation according to the target operation instruction.
Further, the interventional procedure model is trained based on:
acquiring a first time sample environment image of an interventional operation; the first time sample environment image comprises a sample instrument image and a sample user blood vessel image;
inputting the sample instrument image and the sample user blood vessel image into an initial interventional operation model to obtain a first operation instruction;
and training the initial interventional operation model according to the first time sample environment image of the interventional operation, the first operation instruction, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed, so as to obtain the trained interventional operation model.
Further, the training the initial interventional operation model according to the first time sample environment image of the interventional operation, the first operation instruction, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed to obtain a trained interventional operation model, including:
acquiring operation data of a surgical instrument according to the first time sample environment image of the interventional operation, a first operation instruction, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
training the initial interventional operation model according to the operation data of the surgical instrument to obtain a trained interventional operation model; the obtaining of the operational data of the surgical instrument and the training of the initial interventional surgical operation model according to the operational data of the surgical instrument are performed asynchronously.
Further, operating the surgical instrument according to the first operation instruction to obtain a first position of the surgical instrument; determining a reward value corresponding to the first operation instruction according to the distance between the first position and the target position; the target position represents a position in the blood vessel of the user at which an operation needs to be performed; and/or,
determining a target path according to the initial position and the target position of the surgical instrument; determining a reward value corresponding to the first operation instruction according to the target path and the first position; the initial position represents the position of the surgical instrument in the blood vessel of the user at the first moment; the target path represents the path along which the distance between the first position and the target position is shortest; and/or,
determining a reward value corresponding to the first operation instruction based on a first distance between the initial position and the target position and a second distance between the first position and the target position when the first position of the surgical instrument is located on the target path; and/or,
determining a reward value corresponding to the first operation instruction according to the contact force between the surgical instrument and the surgical robot.
Further, the initial interventional procedure model is trained to maximize the following training goal:

$$\mathbb{E}\Big[\sum_{t}\gamma^{t}\big(r_{t}+\alpha\,\mathcal{H}\big(\pi(\cdot\mid o_{t})\big)\big)\Big]$$

wherein r_t represents the reward value of the operation instruction of the surgical instrument corresponding to the t-th environment image; γ represents an attenuation coefficient; α represents an entropy regularization coefficient; H(π(·|o_t)) represents the entropy, which characterizes the degree of randomness of the operation instruction of the surgical instrument; and E denotes the expectation operation.
Further, after acquiring the first time sample environment image of the interventional operation, the method further includes:
and respectively carrying out binarization processing on the instrument image and the user blood vessel image in the first time sample environment image.
In a second aspect, an embodiment of the present invention further provides an operation device for an interventional procedure, including:
the acquisition module is used for acquiring a target image corresponding to the interventional operation; the target image comprises an instrument image and a user blood vessel image;
the processing module is used for inputting the instrument image and the user blood vessel image into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
and the operation module is used for performing the operation of the interventional operation according to the target operation instruction.
In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the operation method of the interventional procedure according to the first aspect.
In a fourth aspect, the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the operating method of the interventional procedure according to the first aspect.
In a fifth aspect, the present invention further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the operation method of the interventional procedure according to the first aspect.
According to the operation method, the device, the equipment and the storage medium for the interventional operation, the target operation instruction is obtained based on the acquired instrument image of the interventional operation, the blood vessel image of the user and the interventional operation model, so that the interventional operation robot can operate the instrument of the interventional operation according to the determined target operation instruction, the autonomous delivery of the interventional operation instrument is realized, the efficiency of the interventional operation and the autonomous level of the blood vessel interventional operation robot are improved, the workload brought by a large number of simple repeated operations in the interventional operation is reduced, and the effect of assisting a doctor to safely implement interventional therapy is realized.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic flow chart diagram of a method of operation of an interventional procedure provided by an embodiment of the present invention;
FIG. 2 is a schematic view of an interventional procedure device provided in accordance with an embodiment of the present invention;
FIG. 3 is a schematic training diagram of an interventional procedure model provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an interventional procedure model provided by an embodiment of the present invention;
FIG. 5 is a flow chart of a method of operation of another interventional procedure provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an operation device for interventional operation provided by an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method provided by the embodiment of the invention can be applied to a medical scene and realizes the autonomous delivery of an interventional surgical instrument.
In the related art, a doctor judges the position of an instrument in a blood vessel according to Digital Subtraction Angiography (DSA) and sends axial movement commands and rotation commands to the master end of an interventional surgical robot, and the slave end of the robot controls the instrument to move in the blood vessel according to these commands. During the interventional operation, the interventional surgical robot can only passively execute the doctor's control instructions; this cannot reduce the workload caused by the large number of simple repeated operations in the interventional operation, so the efficiency of the interventional operation is low.
According to the operation method of the interventional operation, the target operation instruction is obtained based on the acquired instrument image of the interventional operation, the user blood vessel image and the interventional operation model, so that the interventional operation robot can operate the instrument of the interventional operation according to the determined target operation instruction, the autonomous delivery of the interventional operation instrument is realized, the efficiency of the interventional operation and the autonomous level of the blood vessel interventional operation robot are improved, the workload brought by a large number of simple repeated operations in the interventional operation is reduced, and the effect of assisting a doctor to safely implement the interventional treatment is realized.
The technical solution of the present invention is described in detail with specific embodiments in conjunction with fig. 1-7. These several specific embodiments may be combined with each other below, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 1 is a flowchart illustrating an operation method of an interventional procedure according to an embodiment of the present invention. As shown in fig. 1, the method provided by this embodiment includes:
101, acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image;
In particular, vascular interventional surgery is one of the main modes of treating cardiovascular and cerebrovascular diseases. In a vascular interventional operation, a doctor needs, under the guidance of a digital subtraction angiography imaging system, to guide and control interventional instruments such as a catheter and guide wire through the vascular lumen to the lesion position in order to perform treatments such as dissolving thrombus and dilating narrowed blood vessels. Alternatively, with a master-slave vascular interventional surgical robot, the physician operates the master end in a radiation-free control room so that the slave end of the robot delivers the interventional instruments, avoiding high doses of X-ray radiation. In robot-assisted vascular interventional surgery, the doctor judges the position of the instrument in the blood vessel according to the digital subtraction angiography images, sends axial movement instructions and rotation instructions to the master end of the vascular interventional surgical robot, and the slave end of the robot controls the instrument to move in the blood vessel according to these instructions. However, during the interventional operation the vascular interventional surgical robot can only passively execute the doctor's control commands, which cannot reduce the workload caused by the large number of simple repeated operations in the interventional operation and cannot assist an inexperienced doctor to safely perform the interventional operation.
In order to solve the above problems and improve the efficiency of the interventional operation and the autonomy level of the vascular interventional operation robot, in the embodiment of the present invention, an instrument image and a user blood vessel image in the interventional operation are first acquired, and optionally, the instrument image includes images of a catheter and a guide wire in the interventional operation.
102, inputting an instrument image and a user blood vessel image into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, a reward value corresponding to the first operation instruction and a corresponding sample image at a second moment after the first operation instruction is executed;
specifically, after an instrument image and a user blood vessel image in an interventional operation are acquired, the instrument image and the user blood vessel image are input into a trained interventional operation model to obtain a target operation instruction; the interventional operation model is used for determining the operation instruction of the vascular interventional operation robot according to the acquired instrument image and user blood vessel image in the interventional operation. Optionally, the target operation instruction comprises control commands over two degrees of freedom, axial movement and rotation: the axial degree of freedom has two instructions, namely constant-speed advance and constant-speed retreat, while the rotational degree of freedom has five instructions, namely no rotation, clockwise rotation at two speeds and anticlockwise rotation at two speeds, and each control instruction lasts for 0.5 second (a sketch of this discrete action space is given below). The reward value is used for evaluating the quality of the operation instruction, namely the degree to which the operation instruction matches the current environment image and contributes to the current interventional operation, in other words whether the operation instruction can deliver the interventional instrument to the lesion position through the blood vessel lumen. Optionally, the interventional operation model is trained based on the sample environment image, the first operation instruction corresponding to the sample environment image, the reward value corresponding to the first operation instruction, and the second time sample image corresponding to the first operation instruction after execution of the first operation instruction.
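To make the discrete action space above concrete, the following is a minimal sketch. The command names and numeric speed values are illustrative assumptions; the text only fixes two axial instructions, five rotation instructions and the 0.5 s instruction duration, and pairing every axial instruction with every rotation instruction into 10 discrete actions is likewise an assumption.

```python
from itertools import product

# Axial commands: constant-speed advance or retreat (speed magnitudes are assumed placeholders).
AXIAL_COMMANDS = {"advance": +1.0, "retreat": -1.0}  # mm/s, illustrative only

# Rotation commands: no rotation, clockwise at two speeds, anticlockwise at two speeds.
ROTATION_COMMANDS = {"none": 0.0, "cw_slow": -30.0, "cw_fast": -60.0,
                     "ccw_slow": +30.0, "ccw_fast": +60.0}  # deg/s, illustrative only

COMMAND_DURATION_S = 0.5  # each control instruction lasts 0.5 s, as stated in the text

# One possible discretization: every (axial, rotation) pair is a separate action,
# giving 2 x 5 = 10 discrete control instructions.
ACTION_SPACE = list(product(AXIAL_COMMANDS.items(), ROTATION_COMMANDS.items()))

if __name__ == "__main__":
    for idx, ((ax_name, ax_v), (rot_name, rot_v)) in enumerate(ACTION_SPACE):
        print(f"a_{idx}: axial={ax_name} ({ax_v:+.1f} mm/s), "
              f"rotation={rot_name} ({rot_v:+.1f} deg/s), duration={COMMAND_DURATION_S} s")
```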
And 103, performing the operation of the interventional operation according to the target operation instruction.
Specifically, after the instrument image and the user blood vessel image are input into the trained interventional operation model to obtain the target operation instruction, the blood vessel interventional operation robot can operate the instrument of the interventional operation according to the determined target operation instruction. In the embodiment of the invention, the operation instruction of the blood vessel interventional operation robot is determined based on the acquired image information of the instrument and the blood vessel of the user in the interventional operation, so that the blood vessel interventional operation robot can accurately and effectively operate the interventional operation instrument according to the determined operation instruction, the autonomous delivery of the interventional operation instrument is realized, the efficiency of the interventional operation and the autonomous level of the interventional operation robot are improved, the workload caused by a large amount of simple repeated operations in the interventional operation is reduced, and the effect of assisting a doctor to safely implement the interventional treatment is realized.
According to the method, the target operation instruction is obtained based on the acquired interventional operation instrument image, the user blood vessel image and the interventional operation model, so that the interventional operation robot can operate the interventional operation instrument according to the determined target operation instruction, the autonomous delivery of the interventional operation instrument is realized, the interventional operation efficiency and the autonomous level of the blood vessel interventional operation robot are improved, the workload brought by a large number of simple repeated operations in the interventional operation is reduced, and the effect of assisting a doctor in safely carrying out interventional treatment is realized.
In one embodiment, the interventional procedure model is trained based on:
acquiring a first time sample environment image of an interventional operation; the first time sample environment image comprises a sample instrument image and a sample user blood vessel image;
inputting a sample instrument image and a sample user blood vessel image into an initial interventional operation model to obtain a first operation instruction;
training the initial interventional operation model according to the first time sample environment image of the interventional operation, the first operation instruction, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed, and obtaining the trained interventional operation model.
Specifically, in order to enable the interventional operation model to accurately output an operation instruction corresponding to the current environment image, in the embodiment of the invention the model is trained with the sample environment image, the first operation instruction corresponding to the sample environment image, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed. First, the acquired sample instrument image and sample user blood vessel image of the interventional operation are input into the initial interventional operation model, the operation instruction of the surgical instrument corresponding to the current environment image is obtained, and the reward value corresponding to that operation instruction is determined. Optionally, the initial interventional operation model is trained whenever the reward value of the operation instruction determined from the sample instrument image and the sample user blood vessel image is lower than a preset threshold, until the interventional operation robot can achieve autonomous delivery of the interventional instrument according to the operation instructions output by the interventional operation model, i.e. until interventional instruments such as the catheter and guide wire can reach the lesion position through the vascular lumen.
Illustratively, the interventional procedure model training device is schematically illustrated in fig. 2. The interventional operation model training device comprises a simulation operation environment module, an operation data acquisition module (6), a data output display module (7), an operation data storage module (8) and a parameter updating module (9). The simulated operation environment module consists of a blood vessel interventional operation robot (1), a catheter (2), a guide wire (3), a blood vessel model (4) and a camera (5).
The vascular intervention operation robot (1) receives a control instruction from the operation data acquisition module and controls the guide wire (3) to move in the vascular model (4) according to the control instruction; optionally, the blood vessel model (4) is made by a three-dimensional printing technology, and the camera (5) is positioned right above the blood vessel model (4) and is fixed in position.
The operation data acquisition module acquires images from the camera (5), sends control instructions to the vascular interventional operation robot (1) according to the currently learned operation skill, and records the operation data; the operation skill is represented by a neural network, and the operation data acquisition module periodically copies the latest neural network parameters from the parameter updating module to update the operation skill. The operation data storage module is used for storing the operation data recorded by the operation data acquisition module. The parameter updating module is used for learning the vascular interventional operation skill: it updates the neural network parameters representing the operation skill using operation data sampled from the operation data storage module and periodically sends the latest neural network parameters to the operation data acquisition module. The data output display module is used for displaying the images obtained by the simulated operation environment module, the real-time information of the parameter updating module and the like on a computer display. Optionally, the simulated operation environment module is used for simulating a clinical surgical scene and comprises a vascular interventional surgical robot, a three-dimensionally printed vascular model, an interventional instrument and a camera. Optionally, considering that both operation data acquisition and parameter updating are time-consuming, the embodiment of the present invention adopts a distributed deployment, that is, operation data acquisition, operation data storage and parameter updating are executed asynchronously in different processes.
Optionally, the corresponding training method for the operation device for interventional operation is as follows:
step 1, an operation data acquisition module randomly selects a control instruction and sends the control instruction to a robot, records operation data in a simulated operation environment and stores the operation data in an operation data storage module;
and 2, updating the operation skill of the robot by the parameter updating module by using a reinforcement learning method, selecting a control instruction by the operation data acquisition module according to the learned operation skill, sending the control instruction to the robot, recording operation data in the simulated operation environment, and storing the operation data in the operation data storage module. The parameter updating module and the operation data acquisition module run asynchronously;
and 3, repeating the step 2 until the autonomous delivery of the instrument can be completed.
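The following is a minimal sketch of the asynchronous arrangement used in steps 1 to 3: a data-collection worker and a parameter-update worker run concurrently and exchange transitions and parameters through queues. The environment interaction, action selection and update rule are stand-in placeholders, not the actual robot, policy or learning method of this embodiment.

```python
import queue
import random
import threading
import time

data_queue = queue.Queue()   # transitions flow from the collector to the learner
param_queue = queue.Queue()  # updated parameters flow from the learner to the collector

def collection_worker(n_steps: int = 200) -> None:
    """Stand-in for the operation data acquisition module."""
    params = None
    for step in range(n_steps):
        while not param_queue.empty():                 # periodically copy the latest parameters
            params = param_queue.get()
        action = random.randrange(10)                  # placeholder for choosing a control instruction
        data_queue.put((step, action, step + 1, 0.0))  # placeholder (o_t, a_t, o_{t+1}, r_t)
        time.sleep(0.005)                              # stands in for executing the instruction

def update_worker(n_updates: int = 100) -> None:
    """Stand-in for the parameter updating module; runs asynchronously with the collector."""
    replay = []
    for update in range(n_updates):
        while not data_queue.empty():                  # pull newly recorded operation data
            replay.append(data_queue.get())
        if len(replay) >= 8:
            _batch = random.sample(replay, 8)          # this batch would drive one gradient update
            param_queue.put(f"params-after-update-{update}")
        time.sleep(0.01)

if __name__ == "__main__":
    workers = [threading.Thread(target=collection_worker), threading.Thread(target=update_worker)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```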
The embodiment of the invention has the advantages that the operation skill of the minimally invasive vascular intervention operation can be learned without manual supervision and guidance, and appropriate control instructions are selected in different operation stages, so that the autonomous delivery of instruments is realized.
As shown in fig. 3, the operation data acquisition module issues the operation instruction to the simulated operation environment, and the surgical robot operates the surgical instrument according to the operation instruction, yielding the reward value corresponding to the operation instruction; the environment image, the operation instruction corresponding to the environment image, the reward value corresponding to the operation instruction, and the environment image after the operation instruction is executed are then sent to the operation data storage module and used for the training of the interventional operation model.
An exemplary structural diagram of the interventional operation model is shown in fig. 4, and includes an encoder, a decoder, a policy network, and a value function network. The encoder consists of a convolutional neural network, the decoder consists of a deconvolutional neural network, and the policy network and the value function network both consist of fully connected neural networks. The input o, namely the binarized images of the blood vessels and the instruments, is processed by the encoder to obtain a code H(o); the code H(o) is then processed by the decoder, the policy network and the value function network, which respectively output a reconstruction result R(o), a probability π(o) for each action and a value function Q(o). R(o) is a picture of the same size as the input o, and the value of each pixel represents the probability that the corresponding pixel in the input o has the value 1. π(o) and Q(o) are vectors whose dimension equals the number of control instructions, and each dimension respectively represents the probability π(o, a) and the value function Q(o, a) of selecting the corresponding control strategy a under the input o. The training efficiency of the interventional operation model can be improved through the reconstruction task and the distributed deployment, so that the vascular interventional operation skill can be learned in a short time, the advancing and rotating motion of the guide wire can be controlled autonomously, and the workload of doctors is reduced.
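A minimal PyTorch sketch of the structure just described follows, assuming the two binarized images are stacked as a 2-channel 128 x 128 input; the layer counts, channel widths, latent dimension and number of actions are illustrative assumptions rather than values specified by this embodiment.

```python
import torch
import torch.nn as nn

N_ACTIONS = 10      # number of discrete control instructions (assumed)
IMG_CHANNELS = 2    # binarized vessel image stacked with binarized instrument image (assumed)

class Encoder(nn.Module):
    """Convolutional encoder producing the code H(o)."""
    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(IMG_CHANNELS, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(128 * 4 * 4, latent_dim), nn.ReLU(),
        )

    def forward(self, o: torch.Tensor) -> torch.Tensor:
        return self.net(o)

class Decoder(nn.Module):
    """Deconvolutional decoder producing the reconstruction R(o), same size as the input o."""
    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 16 * 16)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, IMG_CHANNELS, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        x = self.fc(h).view(-1, 128, 16, 16)
        return self.net(x)  # per-pixel probability that the corresponding input pixel equals 1

class PolicyHead(nn.Module):
    """Fully connected policy network producing pi(o, a) for each discrete action."""
    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, N_ACTIONS))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.net(h), dim=-1)

class QHead(nn.Module):
    """Fully connected value-function network producing Q(o, a) for each discrete action."""
    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, N_ACTIONS))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)

if __name__ == "__main__":
    o = torch.zeros(1, IMG_CHANNELS, 128, 128)
    h = Encoder()(o)
    print(Decoder()(h).shape, PolicyHead()(h).shape, QHead()(h).shape)
```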
In each update step, the decision network uses n sets of operation data {o_t, a_t, o_{t+1}, r_t} sampled from the database and updates the parameters by the gradient descent method.
The reconstruction loss function J(R) is used to update the encoder and the reconstruction (decoder) network, and is defined as:

$$J(R)=\mathbb{E}_{o_{t}}\big[\mathrm{FL}\big(R(o_{t}),o_{t}\big)\big]$$

wherein FL(·,·) represents the focal loss, with the specific calculation formula

$$\mathrm{FL}\big[R(o_{t}),o_{t}\big]=\mathrm{SUM}\big[-\,\big|R(o_{t})-o_{t}\big|^{\tau}\log\big(1-\big|R(o_{t})-o_{t}\big|\big)\big]$$

where SUM(·) represents a pixel-by-pixel summation.
The value function loss function J(Q) is used to update the encoder and the value function network, and is defined as:

$$J(Q)=\mathbb{E}\Big[\big(Q(o_{t},a_{t})-r_{t}-\gamma V(o_{t+1})\big)^{2}\Big]$$

wherein V(o_{t+1}) is defined as

$$V(o_{t+1})=\sum_{a\in A}\pi(a\mid o_{t+1})\big[\bar{Q}(o_{t+1},a)-\alpha\log\pi(a\mid o_{t+1})\big]$$

wherein the parameters of the target value function network Q̄(·,·) are an exponential moving average of the parameters of the value function network Q(·,·).
The policy loss function J(π) is used to update the policy network, and is defined as:

$$J(\pi)=\mathbb{E}_{o_{t}}\Big[\sum_{a\in A}\pi(a\mid o_{t})\big(\alpha\log\pi(a\mid o_{t})-Q(o_{t},a)\big)\Big]$$
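The loss functions above are reconstructed from the surrounding definitions; the following is a minimal sketch of how they could be computed for one sampled batch, assuming the standard discrete soft actor-critic forms. The tensor names, the focusing exponent tau and the coefficient values are assumptions.

```python
import torch
import torch.nn.functional as F

def focal_reconstruction_loss(recon: torch.Tensor, target: torch.Tensor, tau: float = 2.0) -> torch.Tensor:
    """FL[R(o_t), o_t]: pixel-wise focal term summed over pixels, averaged over the batch."""
    err = (recon - target).abs().clamp(max=1.0 - 1e-6)
    return (err.pow(tau) * -torch.log(1.0 - err)).sum(dim=(1, 2, 3)).mean()

def value_loss(q: torch.Tensor, q_target_next: torch.Tensor, pi_next: torch.Tensor,
               a: torch.Tensor, r: torch.Tensor, gamma: float = 0.99, alpha: float = 0.05) -> torch.Tensor:
    """J(Q): squared TD error towards r_t + gamma * V(o_{t+1}) with an entropy bonus."""
    with torch.no_grad():
        log_pi_next = torch.log(pi_next + 1e-8)
        v_next = (pi_next * (q_target_next - alpha * log_pi_next)).sum(dim=-1)
        td_target = r + gamma * v_next
    q_taken = q.gather(1, a.unsqueeze(1)).squeeze(1)   # Q(o_t, a_t) for the executed action
    return F.mse_loss(q_taken, td_target)

def policy_loss(q: torch.Tensor, pi: torch.Tensor, alpha: float = 0.05) -> torch.Tensor:
    """J(pi): expectation over actions of alpha * log pi(a|o_t) - Q(o_t, a)."""
    log_pi = torch.log(pi + 1e-8)
    return (pi * (alpha * log_pi - q.detach())).sum(dim=-1).mean()

if __name__ == "__main__":
    B, A, H, W = 4, 10, 128, 128
    recon, target = torch.rand(B, 2, H, W), torch.randint(0, 2, (B, 2, H, W)).float()
    q, q_next, pi = torch.randn(B, A), torch.randn(B, A), torch.softmax(torch.randn(B, A), dim=-1)
    a, r = torch.randint(0, A, (B,)), torch.randn(B)
    print(focal_reconstruction_loss(recon, target), value_loss(q, q_next, pi, a, r), policy_loss(q, pi))
```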
according to the method, the initial operation model is trained through the sample environment image, the first operation instruction corresponding to the sample environment image, the reward value corresponding to the first operation instruction and the second moment sample image corresponding to the first operation instruction after execution, namely, the initial intervention operation model is optimized based on the reward value corresponding to the operation instruction output by the initial intervention operation model, so that the initial intervention operation model can accurately output the operation instruction corresponding to the current environment image, the efficiency of the intervention operation is improved, the workload brought by a large number of simple repeated operations in the intervention operation is reduced, and a doctor is assisted to safely implement the intervention treatment.
In an embodiment, training an initial interventional operation model according to a first time sample environment image of an interventional operation, a first operation instruction, a reward value corresponding to the first operation instruction, and a second time sample image corresponding to the first operation instruction after execution of the first operation instruction to obtain the trained interventional operation model includes:
acquiring operation data of a surgical instrument according to a first time sample environment image of an interventional operation, a first operation instruction, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
training the initial interventional operation model according to the operation data of the surgical instrument to obtain a trained interventional operation model; the acquisition of the operational data of the surgical instrument and the training of the initial interventional surgical operation model according to the operational data of the surgical instrument are performed asynchronously.
Specifically, the training of the initial interventional operation model in the embodiment of the present invention is divided into two steps. The first step is the acquisition of operation data; optionally, the operation data of the surgical instrument include the environment image corresponding to the interventional operation, the operation instruction output by the initial interventional operation model for that environment image, the position of the surgical instrument after it is operated according to the operation instruction, and the reward value corresponding to the operation instruction. The second step is the training of the initial interventional operation model, namely training the model with the acquired environment image, the operation instruction output for that image, the position of the surgical instrument after the operation instruction is executed, and the corresponding reward value. Optionally, the two steps can be executed asynchronously, which alleviates the long time consumption of data acquisition and model training and avoids the mutual interference caused by running operation data acquisition and model training simultaneously or in strict alternation; that is, the next round of operation data acquisition does not have to wait until the previously acquired operation data have been fed into the initial interventional operation model for training, which improves the efficiency of model training.
According to the method, the two steps of operation data acquisition and model training are asynchronously executed, so that mutual influence caused by simultaneous and cross operation of the operation data acquisition and the model training is avoided, the problem that time consumption of the data acquisition and the model training is long is solved, and the efficiency of interventional operation model training is effectively improved.
In one embodiment, the surgical instrument is operated according to the first operation instruction, and a first position of the surgical instrument is obtained; a reward value corresponding to the first operation instruction is determined according to the distance between the first position and the target position; the target position represents a position in the blood vessel of the user at which an operation needs to be performed; and/or,
a target path is determined according to the initial position and the target position of the surgical instrument; a reward value corresponding to the first operation instruction is determined according to the target path and the first position; the initial position represents the position of the surgical instrument in the blood vessel of the user at the first moment; the target path represents the path along which the distance between the first position and the target position is shortest; and/or,
a reward value corresponding to the first operation instruction is determined based on a first distance between the initial position and the target position and a second distance between the first position and the target position when the first position of the surgical instrument is located on the target path; and/or,
a reward value corresponding to the first operation instruction is determined according to the contact force between the surgical instrument and the surgical robot.
Specifically, the reward value corresponding to an operation instruction in the embodiment of the present invention is used to evaluate how good the operation instruction is, that is, it evaluates how well the operation instruction matches the current environment image and how much it contributes to the current interventional operation, in other words whether the operation instruction can deliver the interventional instrument to the lesion position through the vascular lumen. Optionally, the reward value of an operation instruction is determined by at least one of the following (a combined sketch is given after this list):
1. A reward for reaching the target: the surgical instrument is operated according to the first operation instruction, and the first position of the surgical instrument is obtained; a reward value corresponding to the first operation instruction is determined according to the distance between the first position and the target position;
optionally, when the distal end of the guidewire is within 5 pixels of the target lesion location, the guidewire is considered to reach the target location and a reward value is obtained.
2. A reward for keeping to the correct delivery route, i.e. the target path is determined based on the initial position and the target position of the surgical instrument, and a reward value corresponding to the first operation instruction is determined according to the target path and the first position;
optionally, the target path is the shortest path from the starting position of the surgical instrument to the target lesion position; if the guide wire deviates from the target path and enters a wrong blood vessel branch, a penalty is given; conversely, if the guide wire leaves the wrong blood vessel branch and returns to the target path, a reward is given and a reward value is obtained.
3. A dense reward for shortening the target distance, that is, in a case where the first position of the surgical instrument is located on the target path, determining a reward value corresponding to the first operation instruction based on a first distance between the initial position and the target position and a second distance between the first position and the target position;
optionally, this reward is only applied while the guide wire is on the target path; the distance of each pixel point along the target path can be obtained in advance, and the reward is set to the number of pixels by which the observed distance decreases.
4. A penalty for the contact force exceeding a safety threshold, i.e. a penalty value corresponding to the first operation instruction is determined according to the contact force between the surgical instrument and the surgical robot, wherein the contact force between the surgical instrument and the surgical robot is estimated from the motor current.
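A minimal sketch combining the four reward components above into a single reward function follows, as referenced before the list. The reward magnitudes, the force threshold and the plain summation of the terms are assumptions; the text only fixes the 5-pixel goal tolerance and the qualitative behaviour of each term, and the dense term here uses a Euclidean pixel distance as a stand-in for the along-path distance.

```python
import numpy as np

# Reward magnitudes and the force threshold are illustrative assumptions; only the
# 5-pixel goal tolerance and the qualitative structure of the terms come from the text.
GOAL_TOLERANCE_PX = 5
R_GOAL = 10.0           # reward for reaching the target lesion position
R_WRONG_BRANCH = -1.0   # penalty for leaving the target path into a wrong branch
R_RETURN = 1.0          # reward for returning to the target path
FORCE_LIMIT = 2.0       # assumed safety threshold for the estimated contact force
R_FORCE_PENALTY = -1.0  # penalty when the contact force exceeds the safety threshold

def reward(tip_pos, prev_tip_pos, target_pos, target_path, contact_force):
    """Sum of the four reward terms (combining them by simple addition is an assumption).

    tip_pos, prev_tip_pos, target_pos: (row, col) pixel coordinates of the guidewire tip.
    target_path: set of (row, col) pixels on the shortest path to the lesion.
    contact_force: contact force estimated, e.g., from the motor current.
    """
    r = 0.0
    # 1. Goal reward: tip within 5 pixels of the target lesion position.
    if np.linalg.norm(np.subtract(tip_pos, target_pos)) <= GOAL_TOLERANCE_PX:
        r += R_GOAL
    # 2. Path reward: penalty for leaving the target path, reward for returning to it.
    on_path = tuple(tip_pos) in target_path
    was_on_path = tuple(prev_tip_pos) in target_path
    if was_on_path and not on_path:
        r += R_WRONG_BRANCH
    elif not was_on_path and on_path:
        r += R_RETURN
    # 3. Dense reward while on the target path: number of pixels by which the distance
    #    to the target shrank (Euclidean stand-in for the along-path distance).
    if on_path and was_on_path:
        d_prev = np.linalg.norm(np.subtract(prev_tip_pos, target_pos))
        d_now = np.linalg.norm(np.subtract(tip_pos, target_pos))
        r += float(round(d_prev - d_now))
    # 4. Safety penalty when the estimated contact force exceeds the threshold.
    if contact_force > FORCE_LIMIT:
        r += R_FORCE_PENALTY
    return r

if __name__ == "__main__":
    path = {(row, 50) for row in range(100)}  # a vertical target path, purely illustrative
    print(reward((40, 50), (45, 50), (0, 50), path, contact_force=0.5))  # prints 5.0
```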
According to the method, the reward value corresponding to the operation instruction is determined according to multiple dimensions such as the position of the operation instrument and the target focus after the operation instruction is executed, whether the operation instrument is on the target path after the operation instruction is executed, whether the distance between the operation instrument and the target focus position on the target path after the operation instruction is executed is reduced, the contact force between the operation instrument and the operation robot and the like, so that the determined reward value can accurately reflect the matching degree of the operation instruction output by the initial interventional operation model and the current environment image and the contribution degree of the operation instruction to the current interventional operation, accurate training of the initial interventional operation model is achieved, the operation instruction of the interventional operation can be accurately output by the trained interventional operation model, and the operation efficiency and accuracy of the interventional operation are improved.
In one embodiment, the initial interventional procedure model is trained to maximize the following training objective:

$$\mathbb{E}\Big[\sum_{t}\gamma^{t}\big(r_{t}+\alpha\,\mathcal{H}\big(\pi(\cdot\mid o_{t})\big)\big)\Big]$$

wherein r_t represents the reward value of the operation instruction of the surgical instrument corresponding to the t-th environment image; γ represents an attenuation coefficient; α represents an entropy regularization coefficient; H(π(·|o_t)) represents the entropy, which characterizes the degree of randomness of the operation instruction of the surgical instrument; and E denotes the expectation operation.
In particular, to enable training of the initial interventional procedure model, the delivery task may be modeled as a Markov decision process, represented by the six-tuple ⟨S, O, A, P, R, γ⟩, wherein S, O and A respectively represent the state space, the observation space and the action space; optionally, these correspond to the position of the surgical instrument, the environment image of the interventional procedure and the operation instruction, respectively. P: S × A × S → [0,1] represents the state transition function, R: S × A → ℝ represents the return function, and γ represents the attenuation coefficient. At time t, an operation instruction (action) a_t ∈ A is selected according to the observed value o_t ∈ O of the acquired environment image; the state then transitions from s_t ∈ S to s_{t+1} ∈ S with state transition probability P(s_t, a_t, s_{t+1}), and the return (reward value) corresponding to the operation instruction is obtained as r_t = R(s_t, a_t). Optionally, the operation data (o_t, a_t, o_{t+1}, r_t) are stored in a database. The training goal of the initial interventional procedure model is to maximize the entropy-regularized cumulative return

$$\mathbb{E}\Big[\sum_{t}\gamma^{t}\big(r_{t}+\alpha\,\mathcal{H}\big(\pi(\cdot\mid o_{t})\big)\big)\Big]$$

wherein H(·) represents the entropy and α represents the entropy regularization coefficient.
According to the method, the initial interventional operation model is trained to maximize the training objective, so that the entropy-regularized cumulative return of the operation instructions corresponding to the environment images of the interventional operation is maximal or larger than a preset threshold; the trained interventional operation model can thus accurately and effectively output operation instructions matched with the environment image, improving the efficiency and accuracy of the interventional operation.
In an embodiment, after acquiring the first time sample environment image of the interventional procedure, the method further includes:
and respectively carrying out binarization processing on the instrument image and the user blood vessel image in the first time sample environment image.
Specifically, after the sample environment image of the interventional operation is obtained, the instrument image and the user blood vessel image in the sample environment image can each be binarized, which improves the ability of the initial interventional operation model to discriminate the environment image, makes the trained interventional operation model more accurate, and yields more accurate operation instructions for the interventional operation. The colors of the blood vessel part and the background part in the user blood vessel image differ, so a binary image of the blood vessel can be obtained by applying a pixel threshold. The instrument image can be obtained by computing the difference between the current image and the initial, instrument-free image; the binarized instrument image then needs a morphological closing operation to eliminate possible breaks in the segmented instrument.
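A minimal OpenCV sketch of this preprocessing follows, assuming grayscale frames from the camera; the threshold values, the kernel size and the use of plain (rather than inverted) binary thresholding are assumptions that depend on the actual imaging contrast.

```python
import cv2
import numpy as np

def binarize_vessel(vessel_frame_gray: np.ndarray, thresh: int = 128) -> np.ndarray:
    """Vessel binarization by a pixel threshold (threshold value and polarity are assumptions)."""
    _, vessel_bin = cv2.threshold(vessel_frame_gray, thresh, 1, cv2.THRESH_BINARY)
    return vessel_bin

def binarize_instrument(current_gray: np.ndarray, initial_gray: np.ndarray,
                        diff_thresh: int = 30, kernel_size: int = 5) -> np.ndarray:
    """Instrument binarization from the difference between the current frame and the
    instrument-free initial frame, followed by morphological closing to remove breaks."""
    diff = cv2.absdiff(current_gray, initial_gray)
    _, instr_bin = cv2.threshold(diff, diff_thresh, 1, cv2.THRESH_BINARY)
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    return cv2.morphologyEx(instr_bin, cv2.MORPH_CLOSE, kernel)

if __name__ == "__main__":
    initial = np.zeros((128, 128), np.uint8)
    current = initial.copy()
    current[60:68, 20:100] = 255  # synthetic bright region standing in for the instrument
    print(int(binarize_instrument(current, initial).sum()))  # 8 * 80 = 640 instrument pixels
```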
According to the method, binarizing the instrument image and the user blood vessel image in the sample environment image helps the initial interventional operation model discriminate the environment image, so that the trained interventional operation model is more accurate, more accurate operation instructions for the interventional operation can be obtained, and the efficiency and accuracy of the interventional operation are improved.
As shown in FIG. 5, the operation data acquisition module receives the observed value o_t of the environment image sent by the simulated operation environment module, selects a control instruction a_t and sends it to the simulated operation environment module, and then receives the reward r_t and the new environment image observation o_{t+1} returned by the simulated operation environment module; these are packed as one piece of operation data (o_t, a_t, o_{t+1}, r_t) and sent to the operation data storage module. In the beginning stage (the first 2000 pieces of operation data), the operation data acquisition module randomly selects control instructions, and afterwards selects control instructions according to the learned operation skill. The operation data storage module stores operation data up to a certain capacity, deletes the oldest operation data when the capacity is full, and preferentially retains the newest operation data. The parameter updating module randomly samples operation data from the operation data storage module, updates the parameters of the neural network representing the operation skill, and periodically sends the neural network parameters to the operation data acquisition module to update the operation skill.
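A minimal sketch of the operation data storage behaviour and the action-selection schedule just described follows. The buffer capacity is an assumption (the text only says the module stores data up to a certain capacity); the 2000-transition random stage follows the text.

```python
import random
from collections import deque

RANDOM_WARMUP_STEPS = 2000  # the first 2000 transitions use randomly selected control instructions
BUFFER_CAPACITY = 100_000   # assumed capacity; the text only says "a certain capacity"

class OperationDataStore:
    """Fixed-capacity store: keeps the newest transitions and drops the oldest when full."""

    def __init__(self, capacity: int = BUFFER_CAPACITY):
        self.buffer = deque(maxlen=capacity)  # deque discards the oldest item automatically

    def add(self, o_t, a_t, o_next, r_t):
        self.buffer.append((o_t, a_t, o_next, r_t))

    def sample(self, batch_size: int):
        return random.sample(list(self.buffer), batch_size)

def select_action(step: int, policy_probs, n_actions: int) -> int:
    """Random instruction during the warmup stage, then sample from the learned policy pi(o)."""
    if step < RANDOM_WARMUP_STEPS:
        return random.randrange(n_actions)
    return random.choices(range(n_actions), weights=policy_probs, k=1)[0]

if __name__ == "__main__":
    store = OperationDataStore(capacity=3)
    for i in range(5):
        store.add(i, 0, i + 1, 0.0)
    print(list(store.buffer))                          # only the 3 newest transitions remain
    print(select_action(0, [0.1] * 10, n_actions=10))  # warmup stage: random instruction index
```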
The following describes the operation device for interventional operation provided by the present invention, and the operation device for interventional operation described below and the operation method for interventional operation described above can be referred to correspondingly.
Fig. 6 is a schematic structural diagram of an operation device for interventional operation provided by the present invention. The operation device for interventional operation provided by the embodiment comprises:
an obtaining module 710, configured to obtain a target image corresponding to an interventional procedure; the target image comprises an instrument image and a user blood vessel image;
the processing module 720 is configured to input the instrument image and the blood vessel image of the user into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, a reward value corresponding to the first operation instruction and a corresponding sample image at a second moment after the first operation instruction is executed;
an operation module 730, configured to perform an operation of an interventional procedure according to the target operation instruction.
Optionally, the processing module 720 is specifically configured to: acquiring a first time sample environment image of an interventional operation; the first time sample environment image comprises a sample instrument image and a sample user blood vessel image;
inputting a sample instrument image and a sample user blood vessel image into an initial interventional operation model to obtain a first operation instruction;
training the initial interventional operation model according to a first time sample environment image of the interventional operation, a first operation instruction, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed, and obtaining the trained interventional operation model.
Optionally, the processing module 720 is specifically configured to: acquiring operation data of a surgical instrument according to a first time sample environment image of an interventional operation, a first operation instruction, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
training the initial interventional operation model according to the operation data of the surgical instrument to obtain a trained interventional operation model; the acquisition of the operational data of the surgical instrument and the training of the initial interventional surgical operation model according to the operational data of the surgical instrument are performed asynchronously.
Optionally, the processing module 720 is specifically configured to: operate the surgical instrument according to the first operation instruction to obtain a first position of the surgical instrument; determine a reward value corresponding to the first operation instruction according to the distance between the first position and the target position; the target position represents a position in the blood vessel of the user at which an operation needs to be performed; and/or,
determine a target path according to the initial position and the target position of the surgical instrument; determine a reward value corresponding to the first operation instruction according to the target path and the first position; the initial position represents the position of the surgical instrument in the blood vessel of the user at the first moment; the target path represents the path along which the distance between the first position and the target position is shortest; and/or,
determine a reward value corresponding to the first operation instruction based on a first distance between the initial position and the target position and a second distance between the first position and the target position when the first position of the surgical instrument is located on the target path; and/or,
determine a reward value corresponding to the first operation instruction according to the contact force between the surgical instrument and the surgical robot.
Optionally, the processing module 720 is specifically configured to: train the initial interventional procedure model to maximize the following training objective:

$$\mathbb{E}\Big[\sum_{t}\gamma^{t}\big(r_{t}+\alpha\,\mathcal{H}\big(\pi(\cdot\mid o_{t})\big)\big)\Big]$$

wherein r_t represents the reward value of the operation instruction of the surgical instrument corresponding to the environment image at time t; γ represents an attenuation coefficient; α represents an entropy regularization coefficient; and H(π(·|o_t)) represents the entropy, which characterizes the degree of randomness of the operation instruction of the surgical instrument.
Optionally, the processing module 720 is specifically configured to: after a first time sample environment image of an interventional operation is obtained, instrument images and user blood vessel images in the first time sample environment image are respectively subjected to binarization processing.
The apparatus of the embodiment of the present invention is configured to perform the method of any of the foregoing method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 7 illustrates a physical structure diagram of an electronic device, which may include: a processor (processor) 810, a communication Interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication Interface 820 and the memory 830 communicate with each other via the communication bus 840. Processor 810 may invoke logic instructions in memory 830 to perform a method of operation of an interventional procedure, the method comprising: acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image; inputting the instrument image and the blood vessel image of the user into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, a reward value corresponding to the first operation instruction and a corresponding sample image at a second moment after the first operation instruction is executed; and carrying out the operation of the interventional operation according to the target operation instruction.
In addition, the logic instructions in the memory 830 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the operation method for an interventional operation provided above, the method comprising: acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image; inputting the instrument image and the user blood vessel image into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on the sample environment image, the first operation instruction corresponding to the sample environment image, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed; and performing the operation of the interventional operation according to the target operation instruction.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the operation method for an interventional operation provided above, the method comprising: acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image; inputting the instrument image and the user blood vessel image into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on the sample environment image, the first operation instruction corresponding to the sample environment image, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed; and performing the operation of the interventional operation according to the target operation instruction.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of operating an interventional procedure, comprising:
acquiring a target image corresponding to an interventional operation; the target image comprises an instrument image and a user blood vessel image;
inputting the instrument image and the user blood vessel image into a trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
and performing the operation of the interventional operation according to the target operation instruction.
2. The method of claim 1, wherein the interventional operation model is trained based on:
acquiring a first time sample environment image of an interventional operation; the first time sample environment image comprises a sample instrument image and a sample user blood vessel image;
inputting the sample instrument image and the sample user blood vessel image into an initial interventional operation model to obtain a first operation instruction;
and training the initial interventional operation model according to the first time sample environment image of the interventional operation, the first operation instruction, the reward value corresponding to the first operation instruction and the second time sample image corresponding to the first operation instruction after the first operation instruction is executed, so as to obtain the trained interventional operation model.
3. The method according to claim 2, wherein the training of the initial interventional operation model according to the first time sample environment image of the interventional operation, the first operation instruction, the reward value corresponding to the first operation instruction, and the second time sample image corresponding to the first operation instruction after the execution of the first operation instruction to obtain the trained interventional operation model comprises:
acquiring operation data of a surgical instrument according to the first time sample environment image of the interventional operation, a first operation instruction, a reward value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
training the initial interventional operation model according to the operation data of the surgical instrument to obtain the trained interventional operation model; the acquiring of the operation data of the surgical instrument and the training of the initial interventional operation model according to the operation data are performed asynchronously (one such asynchronous arrangement is sketched after the claims).
4. The method of operation of an interventional procedure according to claim 2 or 3, further comprising:
operating the surgical instrument according to the first operation instruction to obtain a first position of the surgical instrument, and determining a reward value corresponding to the first operation instruction according to the distance between the first position and the target position, wherein the target position represents a position in a blood vessel of the user that needs to be operated on; and/or,
determining a target path according to the initial position and the target position of the surgical instrument, and determining a reward value corresponding to the first operation instruction according to the target path and the first position, wherein the initial position represents a position of the surgical instrument in the blood vessel of the user corresponding to the first moment, and the target path represents a path along which a distance between the first position and the target position is shortest; and/or,
determining a reward value corresponding to the first operation instruction based on a first distance between the initial position and the target position and a second distance between the first position and the target position, in the case that the first position of the surgical instrument is located on the target path; and/or,
determining a reward value corresponding to the first operation instruction according to the contact force between the surgical instrument and the surgical robot.
5. The method of operation of an interventional procedure according to claim 4, wherein the initial interventional operation model is trained to maximize the following training objective:

J = E[ Σ_t γ^t ( r_t + α·H(π(·|s_t)) ) ]

wherein r_t represents the reward value of the operation instruction of the surgical instrument corresponding to the environment image at the t-th time; γ represents the attenuation coefficient; α represents the entropy regularization coefficient; H(π(·|s_t)) represents the entropy, the entropy representing a degree of randomness of the operation instruction of the surgical instrument; and E represents the expectation.
6. The method of claim 5, wherein after the first time sample environment image of the interventional operation is acquired, the method further comprises:
performing binarization processing on the instrument image and the user blood vessel image in the first time sample environment image, respectively.
7. An interventional surgical manipulation device, comprising:
the acquisition module is used for acquiring a target image corresponding to the interventional operation; the target image comprises an instrument image and a user blood vessel image;
the processing module is used for inputting the instrument image and the user blood vessel image into the trained interventional operation model to obtain a target operation instruction; the interventional operation model is obtained by training based on a sample environment image, a first operation instruction corresponding to the sample environment image, an award value corresponding to the first operation instruction and a second time sample image corresponding to the first operation instruction after the first operation instruction is executed;
and the operation module is used for performing the operation of the interventional operation according to the target operation instruction.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements a method of operation of an interventional procedure as defined in any one of claims 1 to 6.
9. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of operation of an interventional procedure as defined in any one of claims 1 to 6.
10. A computer program product having executable instructions stored thereon, which instructions, when executed by a processor, cause the processor to carry out a method of operation of an interventional procedure as defined in any one of claims 1 to 6.
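Claim 3 above specifies that acquiring the operation data of the surgical instrument and training the initial interventional operation model on that data are performed asynchronously. One minimal way such an arrangement could be organized, with a shared thread-safe buffer and separate collection and training threads, is sketched below; the env, policy and learner interfaces are assumptions introduced only for illustration.

```python
import threading
import queue

# Shared FIFO buffer of transitions (s_t, a_t, r_t, s_{t+1}); a simplified
# stand-in for a replay buffer.
transition_buffer = queue.Queue(maxsize=10000)

def collect_loop(env, policy, stop_event):
    """Collection thread: roll out the policy and push transitions into the buffer."""
    obs = env.reset()
    while not stop_event.is_set():
        action = policy.act(obs)
        next_obs, reward, done = env.step(action)
        transition_buffer.put((obs, action, reward, next_obs))
        obs = env.reset() if done else next_obs

def train_loop(learner, batch_size, stop_event):
    """Training thread: pull transitions from the buffer and update the model."""
    batch = []
    while not stop_event.is_set():
        batch.append(transition_buffer.get())
        if len(batch) >= batch_size:
            learner.update(batch)
            batch = []

# Usage sketch: both loops run concurrently, so data collection never blocks training.
# stop_event = threading.Event()
# threading.Thread(target=collect_loop, args=(env, policy, stop_event), daemon=True).start()
# threading.Thread(target=train_loop, args=(learner, 64, stop_event), daemon=True).start()
```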
CN202211329875.7A 2022-10-27 2022-10-27 Operation method, device, equipment and storage medium for interventional operation Pending CN115721422A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211329875.7A CN115721422A (en) 2022-10-27 2022-10-27 Operation method, device, equipment and storage medium for interventional operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211329875.7A CN115721422A (en) 2022-10-27 2022-10-27 Operation method, device, equipment and storage medium for interventional operation

Publications (1)

Publication Number Publication Date
CN115721422A true CN115721422A (en) 2023-03-03

Family

ID=85294067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211329875.7A Pending CN115721422A (en) 2022-10-27 2022-10-27 Operation method, device, equipment and storage medium for interventional operation

Country Status (1)

Country Link
CN (1) CN115721422A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116597975A (en) * 2023-04-04 2023-08-15 上海神玑医疗科技有限公司 Vascular intervention operation auxiliary system and control method thereof


Similar Documents

Publication Publication Date Title
Chi et al. Collaborative robot-assisted endovascular catheterization with generative adversarial imitation learning
Chi et al. Trajectory optimization of robot-assisted endovascular catheterization with reinforcement learning
CN112802185B (en) Endoscope image three-dimensional reconstruction method and system facing minimally invasive surgery space perception
CN111588464A (en) Operation navigation method and system
CN115721422A (en) Operation method, device, equipment and storage medium for interventional operation
O’Malley et al. Expert surgeons can smoothly control robotic tools with a discrete control interface
CN116723787A (en) Computer program, learning model generation method, and auxiliary device
CN115553925B (en) Endoscope control model training method and device, equipment and storage medium
CN116747017A (en) Cerebral hemorrhage operation planning system and method
JP2021174394A (en) Inference device, medical system and program
CN115422838A (en) Autonomous learning method, apparatus, device and medium for surgical robot
US20220358640A1 (en) Medical image processing device and medical image processing program
CN114022538A (en) Route planning method and device for endoscope, terminal and storage medium
KR102373148B1 (en) Method and device to train operation determination model of medical tool control device
Pore et al. Autonomous navigation for robot-assisted intraluminal and endovascular procedures: A systematic review
Eyberg et al. A ROS2-based Testbed Environment for Endovascular Robotic Systems
KR102473037B1 (en) Method and system for automatically controlling navigation of surgical tool based on reinforcement learning
KR102640314B1 (en) Artificial intelligence surgery system amd method for controlling the same
EP4364685A1 (en) Device and method for matching actual surgery image and 3d-based virtual simulation surgery image on basis of poi definition and phase recognition
CN115590454B (en) Endoscope operation state automatic switching device, endoscope operation state automatic switching equipment and endoscope operation state automatic switching storage medium
KR102600615B1 (en) Apparatus and method for predicting position informaiton according to movement of tool
Elek et al. Autonomous non-technical surgical skill assessment and workload analysis in laparoscopic cholecystectomy training
CN116898583B (en) Deep learning-based intelligent rasping control method and device for orthopedic operation robot
CN115147357A (en) Automatic navigation method, device, equipment and medium for vascular intervention guide wire
KR102426925B1 (en) Method and program for acquiring motion information of a surgical robot using 3d simulation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination