CN116747026B - Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning - Google Patents


Info

Publication number
CN116747026B
CN116747026B (application CN202310656264.1A)
Authority
CN
China
Prior art keywords
mechanical arm
osteotomy
reinforcement learning
strategy
plane
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310656264.1A
Other languages
Chinese (zh)
Other versions
CN116747026A (en)
Inventor
张逸凌
刘星宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Longwood Valley Medtech Co Ltd
Original Assignee
Longwood Valley Medtech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Longwood Valley Medtech Co Ltd filed Critical Longwood Valley Medtech Co Ltd
Priority to CN202310656264.1A priority Critical patent/CN116747026B/en
Publication of CN116747026A publication Critical patent/CN116747026A/en
Application granted granted Critical
Publication of CN116747026B publication Critical patent/CN116747026B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/20 Surgical navigation systems; Devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/30 Surgical robots
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/70 Manipulators specially adapted for use in surgery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00 Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/20 Surgical navigation systems; Devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • A61B2034/2046 Tracking techniques
    • A61B2034/2065 Tracking using image or pattern recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Animal Behavior & Ethology (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Veterinary Medicine (AREA)
  • Robotics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Manipulator (AREA)

Abstract

The application provides a robot intelligent osteotomy method, device, and equipment based on deep reinforcement learning, and a computer-readable storage medium. The robot intelligent osteotomy method based on deep reinforcement learning includes: controlling the mechanical arm to move to the vicinity of the planned osteotomy plane; controlling the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane; and, when osteotomy starts, controlling the mechanical arm to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the path movement strategy is obtained through model training based on a reinforcement learning strategy. Embodiments of the application can improve the efficiency and accuracy of knee joint osteotomy.

Description

Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning
Technical Field
The application belongs to the technical field of deep learning intelligent recognition, and particularly relates to a robot intelligent osteotomy method, device, and equipment based on deep reinforcement learning, and a computer-readable storage medium.
Background
At present, knee joint osteotomy is mainly performed manually by the surgeon based on experience, so its efficiency and accuracy are low.
Therefore, how to improve the efficiency and accuracy of knee joint osteotomy is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiment of the application provides a robot intelligent osteotomy method, device and equipment based on deep reinforcement learning and a computer readable storage medium, which can improve the efficiency and accuracy of knee joint osteotomy.
In a first aspect, an embodiment of the present application provides a robot intelligent osteotomy method based on deep reinforcement learning, including:
controlling the mechanical arm to move to the vicinity of the planned osteotomy plane;
controlling the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane;
when osteotomy starts, controlling the mechanical arm to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy.
Optionally, the reinforcement learning strategy includes:
parameter initialization, where the parameters include environmental parameters and network parameters;
action execution;
reward acquisition; and
network training.
Optionally, data acquisition is performed before parameter initialization and includes:
acquiring osteotomy plane data, the relative coordinates of the mechanical arm's movement during osteotomy, the bone data after knee joint segmentation, and the instantaneous speed of the mechanical arm.
Optionally, the action execution includes:
environment detection and environment interaction, so as to learn state parameters in real time during the osteotomy phase.
Optionally, the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy as follows:
each piece of state information is sequentially input into a long short-term memory (LSTM) network, a recurrent neural network structure; the forget gate selects how much previous information to retain, the input gate stores the effective part of the current information, and the output gate writes the effective information out into the hidden state; the mechanical arm path movement strategy is then obtained through network model training.
Optionally, during model training the batch size is 32 and the initial learning rate is set to 1e-4 with a learning rate decay strategy in which the learning rate is multiplied by 0.9 every 5000 iterations. The optimizer is Adam and the loss function is the mean squared error loss. Every 1000 iterations, one validation pass is performed on the training set and the validation set, and the point at which to stop network training is determined by an early stopping method, yielding the final model.
Optionally, the reward mechanism includes:
the mechanical arm learns the correct strategy from feedback signals obtained by interacting with the environment;
a round of learning ends when the mechanical arm goes out of bounds or fails to reach the destination within a specified number of steps;
the penalty value is set between -1 and 0 and the reward value between 0 and 1: a penalty is given when the mechanical arm goes out of bounds, and a reward is given when it stays within the specified range. To speed up network training, a small negative reward of -0.0002 is given for each movement step of the mechanical arm.
In a second aspect, an embodiment of the present application provides a robotic intelligent osteotomy device based on deep reinforcement learning, the device comprising:
a movement control module, used to control the mechanical arm to move to the vicinity of the planned osteotomy plane;
an adjustment control module, used to control the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane;
an osteotomy plane movement control module, used to control the mechanical arm, when osteotomy starts, to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory storing computer program instructions;
The processor, when executing the computer program instructions, implements the intelligent osteotomy method for a robot based on deep reinforcement learning as in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement a deep reinforcement learning based robotic intelligent osteotomy method as in the first aspect.
The robot intelligent osteotomy method, device, equipment, and computer-readable storage medium based on deep reinforcement learning provided by the application can improve the efficiency and accuracy of knee osteotomy.
The robot intelligent osteotomy method based on deep reinforcement learning includes: controlling the mechanical arm to move to the vicinity of the planned osteotomy plane; controlling the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane; and, when osteotomy starts, controlling the mechanical arm to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy obtained through model training based on a reinforcement learning strategy.
In this way, when osteotomy starts, the mechanical arm is controlled to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, and because that strategy is obtained through model training based on a reinforcement learning strategy, the efficiency and accuracy of knee joint osteotomy can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of a robotic intelligent osteotomy method based on deep reinforcement learning provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a reinforcement learning strategy provided by one embodiment of the present application;
FIG. 3 is a schematic diagram of a long short-term memory (LSTM) network according to an embodiment of the present application;
FIG. 4 is a schematic view of a tibial osteotomy provided in accordance with an embodiment of the present application;
FIG. 5 is a schematic representation of a femoral resection provided in one embodiment of the present application;
FIG. 6 is a schematic structural view of a robotic intelligent osteotomy device based on deep reinforcement learning according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings and the detailed embodiments. It should be understood that the particular embodiments described herein are meant to be illustrative of the application only and not limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the application by showing examples of the application.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
At present, knee joint osteotomy is mainly performed manually by the surgeon based on experience, so its efficiency and accuracy are low.
In order to solve the problems in the prior art, the embodiment of the application provides a method, a device, equipment and a computer-readable storage medium for intelligent osteotomy of a robot based on deep reinforcement learning. The following first describes a robot intelligent osteotomy method based on deep reinforcement learning provided by the embodiment of the application.
Fig. 1 shows a flow diagram of a robot intelligent osteotomy method based on deep reinforcement learning according to an embodiment of the present application. As shown in fig. 1, the robot intelligent osteotomy method based on deep reinforcement learning includes the following steps:
S101, controlling the mechanical arm to move to the vicinity of the planned osteotomy plane;
S102, controlling the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane;
S103, when osteotomy starts, controlling the mechanical arm to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy.
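As a rough sketch, the three steps S101-S103 can be expressed as a single control routine. The controller interface below (`move_near`, `align_blade_to_plane`, `observe`, `step`, `cut_finished`) and the `Plane` container are hypothetical names chosen for illustration, not APIs from the patent:

```python
from dataclasses import dataclass

@dataclass
class Plane:
    point: tuple    # a point on the planned osteotomy plane
    normal: tuple   # unit normal of the plane

def run_osteotomy(arm, plane, policy):
    """S101: approach the planned plane; S102: align the saw blade into it;
    S103: follow the learned path-movement policy within the plane."""
    arm.move_near(plane)             # S101
    arm.align_blade_to_plane(plane)  # S102
    state = arm.observe()
    while not arm.cut_finished():    # S103: in-plane motion from the RL policy
        action = policy(state)
        state = arm.step(action)
    return state
```

The loop delegates all in-plane motion decisions to `policy`, which stands in for the trained reinforcement-learning model described below.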
Reinforcement learning strategy for the mechanical arm's osteotomy movement:
1) To address the limited number of training samples caused by the high cost of collecting mechanical arm samples during reinforcement learning training, the natural trajectories obtained from the arm's interaction with the environment are duplicated and augmented, improving sample efficiency; the environment is modified synchronously while trajectories are duplicated, improving the arm's generalization in complex environments.
2) Expert path-planning experience is used as prior knowledge for designing the reward function, improving the mechanical arm's exploration efficiency during training.
As shown in fig. 2, in one embodiment, the reinforcement learning strategy includes:
parameter initialization, where the parameters include environmental parameters and network parameters;
action execution;
reward acquisition; and
network training.
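The four stages above (parameter initialization, action execution, reward acquisition, network training) can be illustrated with a toy tabular Q-learning loop on a 1-D track. This stands in for the patent's deep network purely to show the cycle; the environment, reward values other than the -0.0002 step cost, and all hyperparameters here are illustrative:

```python
import random

def q_learning_1d(n_states=6, episodes=300, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a 1-D track: start at state 0, goal at n_states-1.
    Actions: 0 = move left, 1 = move right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]       # parameter initialization
    for _ in range(episodes):
        s = 0                                       # reset the environment
        while s != n_states - 1:
            # epsilon-greedy action execution
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
            # reward acquisition: goal reward, small per-step cost otherwise
            r = 1.0 if s2 == n_states - 1 else -0.0002
            # "network training" step: temporal-difference update
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

# Greedy policy after training: 1 means "move right toward the goal".
greedy = [0 if row[0] > row[1] else 1 for row in q_learning_1d()]
```

After training, the greedy policy should move right in every non-terminal state, i.e. head straight for the goal.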
In one embodiment, the action execution includes:
environment detection and environment interaction, so as to learn state parameters in real time during the osteotomy phase.
Optimization points of the reinforcement learning strategy:
1) Movement data of the mechanical arm are collected and input into the neural network, together with the reward function, as feature vectors for training; the optimal action is finally selected according to the exploration strategy and output, leading to the next visual observation.
2) The three stages of action, reward, and training decision are executed iteratively until training is complete.
3) An environment interaction module is added to learn state parameters in real time during the osteotomy phase.
In one embodiment, data acquisition is performed before parameter initialization and includes:
acquiring osteotomy plane data, the relative coordinates of the mechanical arm's movement during osteotomy, the bone data after knee joint segmentation, and the instantaneous speed of the mechanical arm.
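The four acquired quantities could be grouped into one state container before being flattened into the network's input feature vector. A minimal sketch follows; the field names and the flattening order are assumptions, not taken from the patent:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class OsteotomyState:
    """Illustrative container for the four acquired quantities."""
    osteotomy_plane: List[Tuple[float, float, float]]  # points sampled on the planned plane
    arm_rel_coords: Tuple[float, float, float]         # arm's relative coordinates during the cut
    segmented_bone: List[Tuple[float, float, float]]   # bone points after knee-joint segmentation
    arm_speed: float                                   # instantaneous speed of the arm

    def as_feature_vector(self) -> List[float]:
        # Flatten everything into a single feature vector for the network input.
        flat = [c for p in self.osteotomy_plane for c in p]
        flat += list(self.arm_rel_coords)
        flat += [c for p in self.segmented_bone for c in p]
        flat.append(self.arm_speed)
        return flat
```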
Based on the point coordinates on the preoperatively planned prosthesis, a registration transformation is applied: the tibial registration matrix is used on the tibial side, and the femoral registration matrix is used for the osteotomy plane on the femoral side. A plane is then computed from three points, and that plane is the osteotomy plane.
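The "plane from three points" step can be sketched as a cross-product computation. This sketch assumes the registration transform that maps the planned prosthesis points into the robot frame has already been applied to the three input points:

```python
import math

def plane_from_points(p1, p2, p3):
    """Return the plane through three points as (point, unit normal)."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    # Cross product u x v gives a vector normal to the plane.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0:
        raise ValueError("points are collinear; no unique plane")
    return p1, [c / norm for c in n]
```

For example, three points in the z = 0 plane yield the unit normal (0, 0, 1).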
In one embodiment, the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy as follows:
each piece of state information is sequentially input into a long short-term memory (LSTM) network, a recurrent neural network structure; the forget gate selects how much previous information to retain, the input gate stores the effective part of the current information, and the output gate writes the effective information out into the hidden state; the mechanical arm path movement strategy is then obtained through network model training.
Fig. 3 is a schematic diagram of the LSTM structure according to an embodiment of the present application. Because of the complexity of the motion environment, an LSTM network is introduced at the environment-sensing end; the internal structure of the LSTM network is shown in the solid frame. The LSTM is a recurrent neural network structure that can process sequential data. The reinforcement learning network inputs each piece of state information into the LSTM network in turn and selects, through the forget gate, how much previous information to memorize. Next, the effective part of the current information is stored through the input gate. Then, the effective information is output through the output gate and stored in the hidden state. Finally, the mechanical arm movement strategy is obtained through network training.
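The gate behavior described above can be illustrated with one scalar LSTM step. This is a didactic sketch with hand-supplied weights, not the patent's trained network; the weight layout (`w` as a dict of input/hidden/bias triples) is an assumption made for readability:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step on scalar values. `w` maps gate name -> (w_x, w_h, b)."""
    f = sigmoid(w['f'][0] * x + w['f'][1] * h_prev + w['f'][2])    # forget gate: how much old memory to keep
    i = sigmoid(w['i'][0] * x + w['i'][1] * h_prev + w['i'][2])    # input gate: how much new info to store
    g = math.tanh(w['g'][0] * x + w['g'][1] * h_prev + w['g'][2])  # candidate cell content
    c = f * c_prev + i * g                                         # updated cell (long-term) state
    o = sigmoid(w['o'][0] * x + w['o'][1] * h_prev + w['o'][2])    # output gate
    h = o * math.tanh(c)                                           # hidden state written out
    return h, c
```

With all weights and biases at zero, each gate evaluates to 0.5, so the old cell state is halved and the new hidden state is 0.5·tanh(c).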
In one embodiment, during model training the batch size is 32 and the initial learning rate is set to 1e-4 with a learning rate decay strategy in which the learning rate is multiplied by 0.9 every 5000 iterations. The optimizer is Adam and the loss function is the mean squared error loss. Every 1000 iterations, one validation pass is performed on the training set and the validation set, and the point at which to stop network training is determined by an early stopping method, yielding the final model.
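The stated schedule (initial learning rate 1e-4, decay factor 0.9 every 5000 iterations, early stopping driven by a validation check every 1000 iterations) can be sketched in plain Python. The patience value below is an assumption, since the text does not state one:

```python
def learning_rate(step, base_lr=1e-4, decay=0.9, every=5000):
    """Learning rate at a given iteration: multiply by 0.9 every 5000 steps."""
    return base_lr * (decay ** (step // every))

class EarlyStopping:
    """Stop when the validation loss (checked every 1000 iterations) has not
    improved for `patience` consecutive checks."""
    def __init__(self, patience=5):
        self.patience = patience
        self.best = float('inf')
        self.bad_checks = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:       # improvement: reset the counter
            self.best = val_loss
            self.bad_checks = 0
        else:                          # no improvement this check
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

In a full training loop these would sit alongside the Adam optimizer and MSE loss named in the text; they are shown standalone here to keep the sketch dependency-free.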
In one embodiment, the reward mechanism includes:
the mechanical arm learns the correct strategy from feedback signals obtained by interacting with the environment;
a round of learning ends when the mechanical arm goes out of bounds or fails to reach the destination within a specified number of steps;
the penalty value is set between -1 and 0 and the reward value between 0 and 1: a penalty is given when the mechanical arm goes out of bounds, and a reward is given when it stays within the specified range. To speed up network training, a small negative reward of -0.0002 is given for each movement step of the mechanical arm.
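The reward mechanism above maps directly onto a small function. Only the -0.0002 per-step cost is stated explicitly in the text; the exact out-of-bounds and goal values below (-1.0 and 1.0, the endpoints of the stated intervals) are assumptions for illustration:

```python
STEP_COST = -0.0002  # stated per-step negative reward to speed up training

def reward(in_bounds, reached_goal):
    """Sketch of the reward mechanism: out of bounds -> penalty, destination
    reached -> reward, otherwise a small negative per-step cost."""
    if not in_bounds:
        return -1.0       # penalty for going out of bounds (assumed endpoint of [-1, 0])
    if reached_goal:
        return 1.0        # reward at the destination (assumed endpoint of [0, 1])
    return STEP_COST      # small negative reward for each movement step
```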
In one embodiment, a tibial osteotomy diagram and a femoral osteotomy diagram are shown in fig. 4 and 5, respectively.
Fig. 6 is a schematic structural diagram of a deep reinforcement learning-based intelligent osteotomy device of a robot according to an embodiment of the present application, the device includes:
a movement control module 601, used to control the mechanical arm to move to the vicinity of the planned osteotomy plane;
an adjustment control module 602, used to control the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane;
an osteotomy plane movement control module 603, used to control the mechanical arm, when osteotomy starts, to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy.
Fig. 7 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.
The electronic device may include a processor 701 and a memory 702 storing computer program instructions.
In particular, the processor 701 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing embodiments of the present application.
Memory 702 may include mass storage for data or instructions. By way of example, and not limitation, memory 702 may include a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 702 may include removable or non-removable (or fixed) media, where appropriate. The memory 702 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 702 may be a non-volatile solid-state memory.
In one embodiment, memory 702 may be read-only memory (ROM). In one embodiment, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 701 reads and executes the computer program instructions stored in the memory 702 to implement any of the robot intelligent osteotomy methods based on deep reinforcement learning in the above embodiments.
In one example, the electronic device may also include a communication interface 703 and a bus 710. As shown in fig. 7, the processor 701, the memory 702, and the communication interface 703 are connected by a bus 710 and perform communication with each other.
The communication interface 703 is mainly used for implementing communication between each module, device, unit and/or apparatus in the embodiment of the present application.
Bus 710 includes hardware, software, or both that couple components of the electronic device to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or another suitable bus, or a combination of two or more of the above. Bus 710 may include one or more buses, where appropriate. Although embodiments of the application have been described and illustrated with respect to a particular bus, the application contemplates any suitable bus or interconnect.
In addition, in combination with the robot intelligent osteotomy method based on deep reinforcement learning in the above embodiments, embodiments of the application may provide a computer-readable storage medium. The computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by the processor, implement any of the deep reinforcement learning based robot intelligent osteotomy methods of the above embodiments.
It should be understood that the application is not limited to the particular arrangements and instrumentality described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. The method processes of the present application are not limited to the specific steps described and shown, but various changes, modifications and additions, or the order between steps may be made by those skilled in the art after appreciating the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuitry, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio Frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this disclosure describe some methods or systems based on a series of steps or devices. The present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, or may be performed in a different order from the order in the embodiments, or several steps may be performed simultaneously.
Aspects of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to being, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware which performs the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the foregoing, only the specific embodiments of the present application are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present application is not limited thereto, and any equivalent modifications or substitutions can be easily made by those skilled in the art within the technical scope of the present application, and they should be included in the scope of the present application.

Claims (3)

1. A robotic intelligent osteotomy device based on deep reinforcement learning, the device comprising:
a movement control module, used to control the mechanical arm to move to the vicinity of the planned osteotomy plane;
an adjustment control module, used to control the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane;
an osteotomy plane movement control module, used to control the mechanical arm, when osteotomy starts, to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy;
wherein the robot intelligent osteotomy method based on deep reinforcement learning performed by the robot intelligent osteotomy device based on deep reinforcement learning comprises the following steps:
controlling the mechanical arm to move to the vicinity of the planned osteotomy plane;
controlling the mechanical arm so that the saw blade is adjusted into the same plane as the planned osteotomy plane;
when osteotomy starts, controlling the mechanical arm to move within the plane of the planned osteotomy plane according to a preset mechanical arm path movement strategy, where the mechanical arm path movement strategy is obtained through model training based on a reinforcement learning strategy;
wherein the reinforcement learning strategy comprises:
initializing parameters, wherein the parameters include environment parameters and network parameters;
performing actions;
obtaining rewards; and
training the network;
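The four stages above (initialize parameters, perform actions, obtain rewards, train) form a standard reinforcement learning loop. As an illustration only, the following plain-Python sketch runs that cycle with tabular Q-learning on a hypothetical grid environment; the lookup table stands in for the network, and the environment, reward values and hyperparameters are assumptions, not the patented configuration.

```python
import random

def train_policy(grid_w=5, grid_h=5, goal=(4, 4), episodes=500,
                 alpha=0.5, gamma=0.95, eps=0.2, max_steps=50, seed=0):
    """Toy initialize -> act -> reward -> train loop (tabular Q-learning).
    All names and values here are illustrative assumptions."""
    rng = random.Random(seed)
    actions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    q = {}  # parameter initialization: empty value table

    def best(state):
        return max(range(len(actions)), key=lambda a: q.get((state, a), 0.0))

    for _ in range(episodes):
        s = (0, 0)
        for _ in range(max_steps):
            # perform action: epsilon-greedy choice
            a = rng.randrange(len(actions)) if rng.random() < eps else best(s)
            nx, ny = s[0] + actions[a][0], s[1] + actions[a][1]
            if not (0 <= nx < grid_w and 0 <= ny < grid_h):
                r, done, ns = -1.0, True, s        # obtain reward: out of bounds
            elif (nx, ny) == goal:
                r, done, ns = 1.0, True, (nx, ny)  # obtain reward: destination reached
            else:
                r, done, ns = -0.0002, False, (nx, ny)
            # train: temporal-difference update of the value table
            target = r if done else r + gamma * q.get((ns, best(ns)), 0.0)
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (target - old)
            s = ns
            if done:
                break
    return q
```

After enough episodes the greedy policy read off the table reaches the goal; the per-step penalty of -0.0002 mirrors the value stated later in the claim.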
wherein data acquisition is performed before parameter initialization and comprises:
acquiring osteotomy plane data, the relative movement coordinates of the mechanical arm during osteotomy, the bone data after knee joint segmentation, and the instantaneous speed of the mechanical arm;
wherein performing actions comprises:
environment detection and environment interaction, so as to learn the state parameters in real time during the osteotomy stage;
wherein obtaining the mechanical arm path movement strategy through model training based on the reinforcement learning strategy comprises:
sequentially inputting each piece of state information into a long short-term memory (LSTM) network with a recurrent neural network structure, in which a forget gate selects how much of the previous information to retain, an input gate stores the effective part of the current information, and an output gate emits the effective information and saves it into the hidden state; the mechanical arm path movement strategy is obtained through training of this network model;
during model training, the batch size is 32 and the initial learning rate is set to 1e-4 with a learning rate decay strategy in which the learning rate is multiplied by 0.9 every 5000 iterations; the optimizer is Adam and the loss function is the mean square error loss; every 1000 iterations, one round of validation is performed on the training set and the validation set, and the stopping point of network training is determined by early stopping, yielding the final model;
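To make the gating and the training schedule concrete, here is a minimal plain-Python sketch: a scalar LSTM cell showing the forget, input and output gates, plus the stated learning-rate decay rule (multiply by 0.9 every 5000 iterations). The weight layout `w` is a hypothetical illustration, not the patented network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell(x, h_prev, c_prev, w):
    """One scalar LSTM step. `w` maps each gate name to a hypothetical
    (input weight, recurrent weight, bias) triple, for illustration only."""
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])    # forget gate: how much old cell state to keep
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])    # input gate: how much new info to store
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate cell state
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])    # output gate
    c = f * c_prev + i * g   # updated cell state
    h = o * math.tanh(c)     # hidden state carries the effective information
    return h, c

def decayed_lr(initial_lr, iteration, decay_every=5000, gamma=0.9):
    """Learning-rate schedule as stated in the claim: x0.9 every 5000 iterations."""
    return initial_lr * gamma ** (iteration // decay_every)
```

For example, `decayed_lr(1e-4, 12000)` applies the decay twice, giving 8.1e-5; in a framework such as PyTorch the same schedule corresponds to a step decay with step size 5000 and factor 0.9.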
wherein the reward mechanism comprises:
the mechanical arm learns the correct strategy through feedback signals obtained by interacting with the environment;
a round of learning ends when the mechanical arm goes out of bounds or fails to reach the destination within a specified number of steps;
a penalty value is set between -1 and 0 and a reward value between 0 and 1; a penalty is given when the mechanical arm goes out of bounds, and a reward is given when it stays within the specified range; to speed up network training, a small negative reward of -0.0002 is given for each movement step of the mechanical arm.
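The reward mechanism of claim 1 can be sketched as a single step-reward function. The exact out-of-bounds penalty (-1.0), goal reward (+1.0) and goal tolerance used below are assumptions chosen within the ranges the claim states; only the per-step penalty of -0.0002 is given explicitly.

```python
def osteotomy_step_reward(pos, bounds, goal, step, max_steps,
                          goal_tol=1.0, step_penalty=-0.0002):
    """Return (reward, episode_done) for one move on the planned osteotomy
    plane. `bounds` = ((xmin, xmax), (ymin, ymax)); the -1.0 penalty, +1.0
    reward and `goal_tol` are illustrative assumptions within the claimed
    ranges [-1, 0] and [0, 1]."""
    x, y = pos
    (xmin, xmax), (ymin, ymax) = bounds
    if not (xmin <= x <= xmax and ymin <= y <= ymax):
        return -1.0, True                 # out of bounds: penalty, round of learning ends
    if abs(x - goal[0]) <= goal_tol and abs(y - goal[1]) <= goal_tol:
        return 1.0, True                  # destination reached: reward, round ends
    if step + 1 >= max_steps:
        return step_penalty, True         # step budget exhausted: round ends
    return step_penalty, False            # small per-step penalty to speed up training
```

The per-step penalty makes shorter paths accumulate less negative reward, which is the stated rationale for speeding up network training.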
2. An electronic device, the electronic device comprising: a processor and a memory storing computer program instructions;
The processor, when executing the computer program instructions, implements the robotic intelligent osteotomy method based on deep reinforcement learning as recited in claim 1.
3. A computer readable storage medium, wherein computer program instructions are stored on the computer readable storage medium, which when executed by a processor, implement the deep reinforcement learning based robotic intelligent osteotomy method as in claim 1.
CN202310656264.1A 2023-06-05 2023-06-05 Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning Active CN116747026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310656264.1A CN116747026B (en) 2023-06-05 2023-06-05 Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310656264.1A CN116747026B (en) 2023-06-05 2023-06-05 Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN116747026A CN116747026A (en) 2023-09-15
CN116747026B true CN116747026B (en) 2024-06-25

Family

ID=87960024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310656264.1A Active CN116747026B (en) 2023-06-05 2023-06-05 Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN116747026B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117860382B (en) * 2024-01-02 2024-06-25 北京长木谷医疗科技股份有限公司 Navigation surgery mechanical arm vision servo pose prediction PD control method based on LSTM
CN117860380A (en) * 2024-03-11 2024-04-12 北京壹点灵动科技有限公司 Data processing method and device for knee joint replacement, storage medium and electronic equipment

Citations (2)

Publication number Priority date Publication date Assignee Title
CN107028659A (en) * 2017-01-23 2017-08-11 新博医疗技术有限公司 Operation guiding system and air navigation aid under a kind of CT images guiding
CN109191465A (en) * 2018-08-16 2019-01-11 青岛大学附属医院 A kind of system for being determined based on deep learning network, identifying human body or so the first rib cage

Family Cites Families (21)

Publication number Priority date Publication date Assignee Title
US9754221B1 (en) * 2017-03-09 2017-09-05 Alphaics Corporation Processor for implementing reinforcement learning operations
CN109567942B (en) * 2018-10-31 2020-04-14 上海盼研机器人科技有限公司 Craniomaxillofacial surgical robot auxiliary system adopting artificial intelligence technology
CN111035454B (en) * 2019-12-26 2021-09-10 苏州微创畅行机器人有限公司 Readable storage medium and surgical robot
CN111515961B (en) * 2020-06-02 2022-06-21 南京大学 Reinforcement learning reward method suitable for mobile mechanical arm
CN113017829B (en) * 2020-08-22 2023-08-29 张逸凌 Preoperative planning method, system, medium and device for total knee arthroplasty based on deep learning
CN112370163B (en) * 2020-11-11 2022-05-31 上海交通大学医学院附属第九人民医院 Fibula transplantation surgical robot for mandible reconstruction
CN113326872A (en) * 2021-05-19 2021-08-31 广州中国科学院先进技术研究所 Multi-robot trajectory planning method
KR102622932B1 (en) * 2021-06-16 2024-01-10 코넥티브 주식회사 Appartus and method for automated analysis of lower extremity x-ray using deep learning
CA3225127A1 (en) * 2021-07-08 2023-01-12 Riaz Jan Kjell Khan Robot-assisted laser osteotomy
WO2023003912A1 (en) * 2021-07-20 2023-01-26 Carlsmed, Inc. Systems for predicting intraoperative patient mobility and identifying mobility-related surgical steps
CN113633377B (en) * 2021-08-13 2024-02-20 天津大学 Tibia optimization registration system and method for tibia high osteotomy
CN113962927B (en) * 2021-09-01 2022-07-12 北京长木谷医疗科技有限公司 Acetabulum cup position adjusting method and device based on reinforcement learning and storage medium
CN113842213B (en) * 2021-09-03 2022-10-11 北京长木谷医疗科技有限公司 Surgical robot navigation positioning method and system
EP4152339A1 (en) * 2021-09-20 2023-03-22 Universität Zürich Method for determining a surgery plan by means of a reinforcement learning method
WO2023056614A1 (en) * 2021-10-09 2023-04-13 大连理工大学 Method for predicting rotating stall of axial flow compressor on the basis of stacked long short-term memory network
CN114404047B (en) * 2021-12-24 2024-06-14 苏州微创畅行机器人有限公司 Positioning method, system, device, computer equipment and storage medium
CN114246635B (en) * 2021-12-31 2023-06-30 杭州三坛医疗科技有限公司 Osteotomy plane positioning method, system and device
CN114603564B (en) * 2022-04-28 2024-04-12 中国电力科学研究院有限公司 Mechanical arm navigation obstacle avoidance method, system, computer equipment and storage medium
CN114939870B (en) * 2022-05-30 2023-05-09 兰州大学 Model training method and device, strategy optimization method, strategy optimization equipment and medium
CN116100539A (en) * 2022-11-29 2023-05-12 国网安徽省电力有限公司淮南供电公司 Mechanical arm autonomous dynamic obstacle avoidance method and system based on deep reinforcement learning
CN115946133B (en) * 2023-03-16 2023-06-02 季华实验室 Mechanical arm plug-in control method, device, equipment and medium based on reinforcement learning

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN107028659A (en) * 2017-01-23 2017-08-11 新博医疗技术有限公司 Operation guiding system and air navigation aid under a kind of CT images guiding
CN109191465A (en) * 2018-08-16 2019-01-11 青岛大学附属医院 A kind of system for being determined based on deep learning network, identifying human body or so the first rib cage

Also Published As

Publication number Publication date
CN116747026A (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN116747026B (en) Intelligent robot bone cutting method, device and equipment based on deep reinforcement learning
KR101961421B1 (en) Method, controller, and computer program product for controlling a target system by separately training a first and a second recurrent neural network models, which are initially trained using oparational data of source systems
CN107392125A (en) Training method/system, computer-readable recording medium and the terminal of model of mind
CN112632860B (en) Power transmission system model parameter identification method based on reinforcement learning
Karg et al. Learning-based approximation of robust nonlinear predictive control with state estimation applied to a towing kite
EP3418822A1 (en) Control device, control program, and control system
CN103955136B (en) Electromagnetism causes to drive position control method and application thereof
CN116650110B (en) Automatic knee joint prosthesis placement method and device based on deep reinforcement learning
CN116747016A (en) Intelligent surgical robot navigation and positioning system and method
CN116543221A (en) Intelligent detection method, device and equipment for joint pathology and readable storage medium
CN110018722B (en) Machine learning apparatus, system, and method for thermal control
CN116597002B (en) Automatic femoral stem placement method, device and equipment based on deep reinforcement learning
CN111223141A (en) Automatic assembly line work efficiency optimization system and method based on reinforcement learning
CN116898574B (en) Preoperative planning method, system and equipment for artificial intelligent knee joint ligament reconstruction
CN116309636A (en) Knee joint segmentation method, device and equipment based on multi-task neural network model
CN117350992A (en) Multi-task segmentation network metal implant identification method based on self-guiding attention mechanism
CN110287924A (en) A kind of soil parameters classification method based on GRU-RNN model
CN113156961B (en) Driving control model training method, driving control method and related device
CN112734923B (en) Automatic driving three-dimensional virtual scene construction method, device, equipment and storage medium
CN117860382B (en) Navigation surgery mechanical arm vision servo pose prediction PD control method based on LSTM
CN109614999A (en) A kind of data processing method, device, equipment and computer readable storage medium
CN113406574A (en) Online clustering method for multifunctional radar working mode sequence
Beleznay et al. Comparing value-function estimation algorithms in undiscounted problems
Elimelech et al. Introducing PIVOT: Predictive incremental variable ordering tactic for efficient belief space planning
CN115115790B (en) Training method of prediction model, map prediction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant