CN117017355B - Thyroid autonomous scanning system based on multi-modal generation type dialogue - Google Patents


Info

Publication number
CN117017355B
Authority
CN
China
Prior art keywords
component
thyroid
scanning
module
image
Prior art date
Legal status
Active
Application number
CN202311292889.0A
Other languages
Chinese (zh)
Other versions
CN117017355A (en)
Inventor
程栋梁
王晨
刘振
黄琦
张泉
Current Assignee
Hefei Hebin Intelligent Robot Co ltd
Original Assignee
Hefei Hebin Intelligent Robot Co ltd
Priority date
Filing date
Publication date
Application filed by Hefei Hebin Intelligent Robot Co ltd filed Critical Hefei Hebin Intelligent Robot Co ltd
Priority to CN202311292889.0A
Publication of CN117017355A
Application granted
Publication of CN117017355B


Classifications

    • A61B 8/0833: Diagnosis using ultrasonic waves; detecting or locating foreign bodies or organic structures
    • A61B 8/085: Locating body or organic structures, e.g. tumours, calculi, blood vessels, nodules
    • A61B 8/5215: Data or image processing specially adapted for ultrasonic diagnosis; processing of medical diagnostic data
    • A61B 8/54: Control of the diagnostic device
    • G06F 16/90332: Natural language query formulation or dialogue systems
    • G06N 3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N 3/0475: Generative networks
    • G06V 10/26: Segmentation of patterns in the image field, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/764: Image or video recognition using classification, e.g. of video objects
    • G06V 10/82: Image or video recognition using neural networks
    • G06V 20/70: Labelling scene content, e.g. deriving syntactic or semantic representations
    • G06V 2201/03: Recognition of patterns in medical or anatomical images


Abstract

The invention discloses a thyroid autonomous scanning system based on multi-modal generative dialogue, comprising a hardware device and an autonomous scanning unit that controls its actions. The hardware device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment. The autonomous scanning unit runs on the upper computer and comprises an image analysis module, a multi-modal question-answering module, a decision module and a control module. The invention uses a generative multi-modal large language model to achieve fully autonomous thyroid scanning: no manual path or parameter setting is needed during scanning; the initial scanning position is determined by zero-force dragging; the position and posture of the probe are adjusted jointly from the sensor data and the ultrasound image; and behaviour encoding turns the mechanical arm's continuous absolute motion into discrete relative motion, avoiding scan failures caused by calibration precision errors.

Description

Thyroid autonomous scanning system based on multi-modal generative dialogue
Technical Field
The invention belongs to the technical field of medical diagnosis, and in particular relates to a thyroid autonomous scanning system based on multi-modal generative dialogue.
Background
Ultrasound is a non-invasive, safe, radiation-free, real-time imaging examination that is widely used for clinical diagnosis and monitoring. In China, ultrasound examinations are performed billions of times per year, and this massive examination demand is a great challenge for the scarce pool of sonographers, especially in remote areas where medical resources are short. With the development of artificial intelligence technology, robots with autonomous scanning capabilities can help sonographers improve their work efficiency and reduce patient waiting times.
Publication No. CN113855067A describes an autonomous positioning and scanning method that fuses visual images and medical images: it uses the two image sources to collect the coordinates of organs and their surrounding regions, after which the operator must manually set the target name, target parameters, position information and communication target for the system. Because motion planning requires manually set acquisition targets and parameters, and no remote control is possible during scanning, it is not truly autonomous scanning.
Publication Nos. CN114155940A and CN115227404A describe autonomous ultrasonic scanning skill-policy generation methods based on reinforcement learning, which face three difficulties: first, collecting and cleaning the training data; second, an unstable training process in which the network does not converge easily; third, a locally optimal solution that is not necessarily globally optimal. Moreover, such a method is a fully end-to-end network prediction: when an abnormality occurs its cause cannot be traced and the system cannot optimise itself, so one can only wait for human intervention.
Publication No. CN115429327A describes an automatic scanning sequence selection method using positioning marks and preset scanning tracks. Its autonomous scanning is essentially a set of standardised scan-track controls: it cannot handle contingencies caused by differences in the physiological structure of scanned subjects or by small movements, and the scan tracks it produces contain only position changes, with no adjustment of probe posture.
Publication No. CN115670515A describes an autonomous scanning method that locates the thyroid with a depth camera, segments the thyroid position in the image with a neural network to control the probe, and adjusts the probe posture according to a force sensor so that the probe is perpendicular to the skin. In practice, however, a probe perpendicular to the skin does not necessarily yield the best image: thyroid scanning is a joint optimisation of position, posture and force, and scans of both the transverse and longitudinal planes must be completed.
in view of the shortcomings of the above-mentioned published patent, in combination with the technological innovation brought by the generation type artificial intelligence, we propose a thyroid autonomous scanning system based on multi-modal generation type dialogue.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing a thyroid autonomous scanning system based on multi-modal generative dialogue, which achieves fully autonomous thyroid scanning with a generative multi-modal large language model: no manual path or parameter setting is needed during scanning, and the initial scanning position is determined by zero-force dragging.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a thyroid autonomous scanning system based on multimodal generation dialog, comprising:
the system comprises a hardware device and an autonomous scanning unit for controlling the action of the hardware device;
the hardware device includes:
the device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment;
the force sensor is matched with the mechanical arm for use, and the clamping jaw is arranged at the action end of the mechanical arm;
the autonomous scanning unit is built in the upper computer, and comprises:
the system comprises an image analysis module, a multi-mode question-answering module, a decision module and a control module;
the image analysis module is used for processing and identifying the image from the ultrasonic equipment and generating metadata for processing by the decision module;
the multi-modal question-answering module is used for encoding the input text and images, obtaining an answer with a generative dialogue language model, and encoding the answer into robot operations;
the decision module is used for summarizing the results of the image analysis and the answers of the question-answer module, generating a final decision and delivering the final decision to the control module for execution, and recording the results of the image analysis and summarizing the results into a final ultrasonic report;
the control module is responsible for executing the operations generated by the decision module while ensuring that those operations are safe;
the modules interact and communicate appointed data through a data interface, and components in the modules interact all data through sharing storage.
Preferably, the medical ultrasonic equipment is one or more of a three-dimensional ultrasound diagnostic apparatus, a fully digital color Doppler ultrasound apparatus, and a color Doppler ultrasound device.
Preferably, the image analysis module includes:
a thyroid region segmentation component, a thyroid nodule detection component, a thyroid nodule attribute classification component, a thyroid diffuse lesion classification component, and an image quality confidence component;
the thyroid region segmentation component identifies the thyroid region and the trachea region in the ultrasonic image based on deep learning: it performs pixel-level classification with a semantic segmentation network to obtain masks of the thyroid and trachea regions, calculates the centroid coordinates, contour and area of the thyroid region from the masks, and passes them to the decision module;
the thyroid nodule detection component identifies nodules in the thyroid region based on deep learning, uses a target detection network to generate a detection box for the nodules, and uses a multi-target tracking algorithm to obtain a unique identification number for the detection box;
the thyroid nodule attribute classification component performs multi-attribute classification of the nodules based on deep learning, using a classification network with 8 output heads;
the thyroid diffuse lesion classification component performs single-attribute classification of the thyroid region based on deep learning, using a classification network with a single output head;
the image quality confidence component calculates the confidence of each pixel of the ultrasonic image based on a random walk method and transmits the confidence to the decision module.
Preferably, the multi-mode question-answering module includes:
a word vector component, a graph vector component, a multi-modal dialogue question-answering component and an action coding component;
the word vector component encodes the text based on deep learning, and the word vector encodes the input text into a vector with fixed dimension by using a trained language model;
the graph vector component encodes the image based on deep learning: it extracts image features with a convolutional network whose classification layer has been removed and feature layer retained, and reduces them to a vector of fixed dimension;
the multi-modal dialogue question-answering component generates an answer text according to the word vector and the graph vector based on a multi-modal generated dialogue language model;
the action encoding component encodes answers derived from the generated conversational language model into three dimensional probe operations.
Preferably, the decision module includes:
a scanning state management component, a thyroid search component, a thyroid scanning component, a posture adjustment component and an ultrasonic reporting component;
the scanning state management component is responsible for scheduling the scanning state: it judges whether the goal of the current state has been completed and whether another state should be entered, and it does not communicate directly with the control module;
the thyroid searching component judges whether the thyroid has been found from the centroid coordinates and area provided by the thyroid region segmentation component of the image analysis module, and determines whether it is the left or right lobe from the centroid coordinates and the centre of the trachea;
the thyroid gland scanning component is responsible for scanning the cross section and the longitudinal section of left and right thyroid glands, and a thyroid gland rectangular frame is obtained according to the thyroid gland region provided by the thyroid gland region segmentation component of the image analysis module;
the posture adjustment component assembles prompt text from a language template and sends it to the word vector component, sends the confidence map and the ultrasonic image to the graph vector component, and adjusts the posture of the probe according to the outputs of the action encoding component and the image quality confidence component;
the ultrasonic reporting component, keyed by the unique thyroid-nodule identification number from the image analysis module, collects the attributes of each nodule according to the set priority, computes the nodule grade from the collected attributes, and stores a real-time ultrasound image of the nodule.
Preferably, the control module includes:
the system comprises a robot control assembly, a coordinate system conversion assembly and a safety control assembly;
the robot control component is responsible for issuing control instructions to the mechanical arm, uploading the data of each of its sensors, and entering and exiting zero-force dragging;
the coordinate system conversion component converts instructions expressed in the working coordinate system provided by the decision module into instructions in the robot base coordinate system for the control component; it also includes a calibration procedure that ensures the accuracy of the coordinate systems;
the safety control component is responsible for the safety of the mechanical arm's motion, ensuring that the contact force stays between 2 N and 4 N during scanning and filtering out decision-module commands that would cause out-of-range motion.
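As a rough illustration of the safety-control behaviour described above, the sketch below clamps each lateral step and nudges the pressing depth to keep the contact force inside the 2 N to 4 N window. The function name, the 1 mm step limit and the 0.1 mm depth correction are assumptions for illustration, not values from the patent.

```python
def filter_command(force_n, step_mm, f_min=2.0, f_max=4.0, max_step=1.0):
    """Hedged sketch of the safety control component: clamp the lateral
    step and correct the pressing depth so the measured contact force
    stays in the 2 N - 4 N window. Names and the 0.1 mm correction
    step are assumed, not taken from the patent."""
    # Filter out-of-range motion by clamping the lateral step.
    step = max(-max_step, min(max_step, step_mm))
    if force_n < f_min:
        return step, +0.1   # too little contact: press 0.1 mm toward the skin
    if force_n > f_max:
        return step, -0.1   # too much contact: retract 0.1 mm
    return step, 0.0        # force within the window: no depth correction
```

A decision-module command is thus never forwarded verbatim; it is always passed through this clamp first.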
The technical effects and advantages of the invention: compared with the prior art, the thyroid autonomous scanning system based on multi-modal generative dialogue has the following effects:
the generative multi-modal large language model achieves fully autonomous thyroid scanning; no manual path or parameter setting is needed during scanning; the initial scanning position is determined by zero-force dragging;
the position and posture of the probe are adjusted jointly from the sensor data and the ultrasound image; behaviour encoding turns the mechanical arm's continuous absolute motion into discrete relative motion, avoiding scan failures caused by calibration precision errors;
abnormalities during the scanning process are handled automatically through image segmentation of the thyroid region; and a report is generated automatically by analysing diffuse and space-occupying lesions during the scan through image classification and target detection.
Drawings
FIG. 1 is a scan state transition diagram;
FIG. 2 is a block and component flow architecture diagram;
fig. 3 is a schematic diagram of a tool coordinate system and a base coordinate system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a thyroid autonomous scanning system based on multi-modal generative dialogue, as shown in figs. 1-3. The system uses a generative multi-modal large language model to achieve fully autonomous thyroid scanning; no manual path or parameter setting is needed during scanning; the initial scanning position is determined by zero-force dragging;
at the same time, the position and posture of the probe are adjusted jointly from the sensor data and the ultrasound image, and behaviour encoding turns the mechanical arm's continuous absolute motion into discrete relative motion, avoiding scan failures caused by calibration precision errors;
furthermore, the system handles abnormalities during scanning automatically through image segmentation of the thyroid region, and analyses diffuse and space-occupying lesions during the scan through image classification and target detection.
A thyroid autonomous scanning system comprising:
the system comprises a hardware device and an autonomous scanning unit for controlling the action of the hardware device;
the hardware device comprises:
the device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment;
the force sensor is matched with the mechanical arm for use, and the clamping jaw is arranged at the action end of the mechanical arm;
the autonomous scanning unit is arranged in the upper computer, and comprises:
the system comprises an image analysis module, a multi-modal question-answering module, a decision module and a control module. The image analysis module comprises a thyroid region segmentation component, a thyroid nodule detection component, a thyroid nodule attribute classification component, a thyroid diffuse lesion classification component and an image quality confidence component. The multi-modal question-answering module comprises a word vector component, a graph vector component, a multi-modal dialogue question-answering component and an action encoding component. The decision module comprises a scanning state management component, a thyroid search component, a thyroid scanning component, a posture adjustment component and an ultrasonic reporting component. The control module comprises a robot control component, a coordinate system conversion component and a safety control component. The different modules interact and exchange agreed data through data interfaces, while the components within a module exchange all data through shared storage.
The image analysis module is responsible for processing and identifying the image from the ultrasonic equipment and generating metadata for processing by the decision module.
The thyroid region segmentation component identifies the thyroid and trachea regions in the ultrasonic image based on deep learning: it performs pixel-level classification with a semantic segmentation network to obtain masks of the two regions, calculates the centroid coordinates, contour and area of the thyroid region from the masks, and passes them to the decision module. The thyroid nodule detection component identifies nodules in the thyroid region based on deep learning, generates a detection box for each nodule with a target detection network, and assigns each box a unique identification number with a multi-target tracking algorithm.
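The metadata that the segmentation component hands to the decision module can be sketched as follows. This is an illustrative helper, not code from the patent; the function name and the dictionary layout are assumptions.

```python
import numpy as np

def mask_metadata(mask):
    """Illustrative sketch: derive the centroid, area and bounding
    rectangle that the segmentation component passes to the decision
    module from a binary thyroid mask (2-D array of 0s and 1s)."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                       # thyroid not visible in this frame
    return {
        "centroid": (float(xs.mean()), float(ys.mean())),
        "area": int(xs.size),             # region area in pixels
        "bbox": (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())),
    }
```

The `bbox` field corresponds to the thyroid bounding rectangle that the scanning component later sweeps, and `None` signals the search state that the thyroid has not been found.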
The thyroid nodule attribute classification component performs multi-attribute classification of the nodules based on deep learning. The multi-attribute classification network has 8 output heads (for example, a ResNet with 8 fully connected output layers) and identifies 8 attributes of a nodule, totalling 30 categories, as follows:
multi-attribute classification table
TABLE 1
Orientation: vertical; horizontal
Margin: smooth; irregular; blurred; extrathyroidal invasion
Halo: halo present; no halo
Composition: solid; predominantly solid; predominantly cystic; cystic; spongiform
Echo level: hyperechoic; isoechoic; hypoechoic; very hypoechoic; anechoic
Echo texture: uniform; non-uniform
Strong echo: microcalcification; comet-tail artifact; coarse calcification; peripheral calcification; strong echo without range; punctate strong echo of undetermined significance
Posterior echo features: enhancement; attenuation; no change; mixed change
The thyroid diffuse lesion classification component performs single-attribute classification of the thyroid region based on deep learning, using a classification network with a single output head that distinguishes 4 categories: normal thyroid, toxic goiter, subacute thyroiditis, and Hashimoto's thyroiditis.
The image quality confidence component calculates the confidence of each pixel of the ultrasound image based on the random walk method and delivers the confidence to the decision module.
The multi-modal question-answering module is responsible for encoding the input text and images, obtaining an answer with a generative dialogue language model, and encoding the answer into robot operations.
The word vector component encodes the text based on deep learning, and the word vector encodes the input text into a vector with fixed dimension by using a trained language model;
the graph vector component encodes the image based on deep learning: it extracts image features with a convolutional network whose classification layer has been removed and feature layer retained, and reduces them to a vector of fixed dimension;
the multi-modal dialogue question-answering component generates an answer text according to the word vector and the graph vector based on the multi-modal generated dialogue language model;
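The fixed-dimension encodings can be illustrated with simple pooling stand-ins. These are generic placeholders for the trained language model and the convolutional backbone the patent assumes; the function names are illustrative.

```python
import numpy as np

def embed_text(token_vectors):
    """Stand-in for the word vector component: mean-pool per-token
    vectors from a trained language model into one fixed-dimension
    sentence vector."""
    return np.asarray(token_vectors, dtype=float).mean(axis=0)

def embed_image(feature_map):
    """Stand-in for the graph vector component: global-average-pool a
    C x H x W feature map from a convolutional backbone (classification
    layer removed, feature layer retained) into a C-dimensional vector."""
    return np.asarray(feature_map, dtype=float).mean(axis=(1, 2))
```

Both outputs have a dimension fixed by the model, not by the input length or image size, which is what lets them be concatenated and fed to the multi-modal dialogue model.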
the action encoding component encodes answers derived from the generated conversational language model into three-dimensional probe operations based on a 6-output multi-attribute classification network, such as a 6-output-layer multi-layer perceptron. There are 6 operations in total in three dimensions, each operation having 3 categories, as listed below:
operation class table
TABLE 2
The six operations are defined in the tool coordinate system: movement along X, movement along Y, movement along Z, rotation about X, rotation about Y, and rotation about Z. Each operation takes one of three category codes:
1 = positive-direction movement/rotation
0 = no operation
2 = negative-direction movement/rotation
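The decoding of the six 3-way class outputs into a discrete relative motion can be sketched as below. The function is an illustrative assumption; the step sizes match the 1 mm / 1 degree values used during posture adjustment elsewhere in the description.

```python
# Category codes from Table 2: 0 = no operation, 1 = positive, 2 = negative.
STEP = {0: 0.0, 1: 1.0, 2: -1.0}

def decode_action(class_ids, trans_step=1.0, rot_step=1.0):
    """Sketch of the action encoding component's output stage: map the
    six 3-way class outputs (X/Y/Z translation, then X/Y/Z rotation in
    the tool frame) to a discrete relative motion command."""
    assert len(class_ids) == 6
    trans = [STEP[c] * trans_step for c in class_ids[:3]]   # millimetres
    rot = [STEP[c] * rot_step for c in class_ids[3:]]       # degrees
    return trans, rot
```

Because every command is a bounded relative step rather than an absolute target pose, calibration errors cannot accumulate into a scan failure, which is the point of the behaviour encoding.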
The decision module is responsible for summarising the image analysis results and the answers of the question-answering module, generating a final decision for the control module to execute, and recording the image analysis results for inclusion in the final ultrasound report.
The scanning state management component is responsible for scheduling the scanning state: it judges whether the goal of the current state has been completed and whether another state should be entered, and it does not communicate directly with the control module. The scanning process has three states: thyroid search, thyroid scanning and posture adjustment. The goal of the thyroid search state is to find the thyroid region and bring it to the centre of the image; the goal of the thyroid scanning state is to sweep the four boundaries (upper, lower, left and right) of the thyroid region; the goal of the posture adjustment state is for the image confidence map to exceed a threshold, and its trigger condition is the confidence falling below that threshold. The thyroid search is performed twice, once for each of the left and right lobes; when its goal is reached, the thyroid scanning state is entered. The posture adjustment state is entered when its trigger condition is met and exited when its goal is satisfied.
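The state scheduling just described can be sketched as a small transition function. The 0.5 threshold is an assumed placeholder, since the patent gives no numeric value, and the function signature is illustrative.

```python
SEARCH, SCAN, ADJUST = "thyroid_search", "thyroid_scan", "posture_adjust"

def next_state(state, thyroid_centered, scan_done, confidence, threshold=0.5):
    """Minimal sketch of the scan-state scheduler (see Fig. 1): three
    states, with transitions driven by goal completion and the image
    confidence threshold. The 0.5 default is an assumption."""
    if state == SEARCH and thyroid_centered:
        return SCAN                       # search goal met: region centred
    if state == SCAN:
        if confidence < threshold:
            return ADJUST                 # trigger: confidence below threshold
        if scan_done:
            return SEARCH                 # proceed to the other lobe
    if state == ADJUST and confidence >= threshold:
        return SCAN                       # adjustment goal met: exit
    return state
```

Note that the scheduler only decides the state; actual motion commands still go through the decision and control modules.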
The thyroid searching component judges whether the thyroid has been found from the centroid coordinates and area provided by the thyroid region segmentation component of the image analysis module, and determines whether it is the left or right lobe from the centroid coordinates and the centre of the trachea. If the thyroid region is found, the probe's direction of movement is determined from the thyroid centre position and the centre of the image; if it is not found, the probe scans along a fixed Z-shaped route. Based on clinical statistics, the height of the Z-shaped search window is set to 65 mm and its width to 55 mm, with a step of 1 mm per probe move; the generated instruction is passed to the control module for execution.
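A possible generator for the fixed Z-shaped search route is sketched below, using the 65 mm x 55 mm window and 1 mm step from the description. The number of horizontal passes (`rows`) is an assumption, as the patent does not state it.

```python
def z_search_path(height=65.0, width=55.0, step=1.0, rows=3):
    """Sketch of the fixed Z-shaped (zigzag) search route over the
    clinically derived 65 mm x 55 mm window, advancing 1 mm per move.
    The number of horizontal passes (`rows`) is an assumption."""
    path, x, direction = [], 0.0, 1.0
    row_gap = height / (rows - 1)          # vertical spacing between passes
    for r in range(rows):
        y = r * row_gap
        while 0.0 <= x <= width:
            path.append((x, y))            # one probe position per 1 mm move
            x += direction * step
        x = width if direction > 0 else 0.0  # clamp back onto the window edge
        direction *= -1                      # reverse for the next pass
    return path
```

Each waypoint would be emitted as a relative 1 mm move and checked against the segmentation output, so the search stops as soon as the thyroid appears in the image.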
The thyroid scanning component is responsible for scanning the transverse and longitudinal sections of the left and right thyroid lobes. A thyroid bounding rectangle is obtained from the thyroid region provided by the thyroid region segmentation component of the image analysis module, and the probe is first moved so that the upper-left corner of the rectangle lies at the centre of the image. The probe then scans in a boustrophedon (left, down, right) pattern in the tool coordinate system: it sweeps along the X axis, and when the upper-right corner of the rectangle reaches the image centre it moves down once (along the Y axis) and sweeps back. When the lower-right corner of the rectangle reaches the image centre, the scanning state management component is notified that the thyroid scan is complete. The downward step is set to 5 mm and the lateral steps to 1 mm. When a thyroid nodule is found, the probe is rotated 90 degrees about the Z axis of the tool coordinate system to enter a longitudinal-section scan, and posture adjustment is performed.
The posture adjustment component analyzes the confidence map output by the image quality confidence component: it assembles a prompt text from the language template and feeds it to the word vector component, feeds the confidence map and the ultrasonic image to the image vector component, obtains a response text through the multi-modal dialogue question-answering component, and decodes the movement and rotation directions of the probe through the action encoding component, with a movement step of 1 millimeter and a rotation step of 1 degree per action, each submitted to the control module for execution. During posture adjustment, the confidence of the ultrasonic image and the length and width of the nodule detection box are recorded; after 20 adjustment steps the probe is returned to the posture with the highest confidence in the adjustment period, and if that highest confidence is still greater than the threshold, the posture adjustment state is re-entered.
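The 20-step adjust-then-revert logic can be sketched as follows; `score_fn` stands in for the image quality confidence component and the pose objects for probe states, both hypothetical placeholders rather than the system's actual interfaces:

```python
def adjust_pose(candidate_poses, score_fn, threshold=0.8, max_steps=20):
    """Score up to max_steps candidate poses, then return the best-scoring
    pose (the one the probe reverts to) and whether the adjustment state
    should be re-entered (best confidence still above the threshold, per
    the behaviour described above)."""
    scored = [(score_fn(p), p) for p in candidate_poses[:max_steps]]
    best_score, best_pose = max(scored)   # highest confidence in the period
    return best_pose, best_score > threshold
```

In the real system the candidate poses are produced one step at a time by the multi-modal dialogue pipeline rather than given up front; the revert-to-best bookkeeping is the same.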
For each thyroid nodule unique identification number from the image analysis module, the ultrasonic reporting component collects the nodule's attributes according to the set priorities, calculates the nodule grade from the collected attributes (Table 4), and saves a real-time ultrasonic image of the nodule. It then assembles a text from the language template, for example: "nodule found, grade 4B, vertical orientation, smooth margin, halo present, solid, hypoechoic, homogeneous, coarse calcification, posterior echo enhancement", sends the text to the word vector component and the saved nodule ultrasonic image to the image vector component, and generates the conclusion to write the report.
The set priorities are as follows:

Table 3. Priority setting table
Orientation: vertical (1 point) > horizontal (0 points)
Margin: invading beyond the thyroid (1 point) > irregular (1 point) > blurred (1 point) > smooth (0 points)
Halo: halo present (0 points) > halo absent (0 points)
Composition: solid (1 point) > predominantly solid (0 points) > predominantly cystic (0 points) > spongiform (0 points) > cystic (0 points)
Echo: markedly hypoechoic (1 point) > hypoechoic (0 points) > isoechoic (0 points) > hyperechoic (0 points) > anechoic (0 points)
Echo texture: heterogeneous (0 points) > homogeneous (0 points)
Focal strong echo: microcalcification (1 point) > coarse calcification (0 points) > peripheral calcification (0 points) > punctate strong echo (0 points) > comet-tail artifact (-1 point) > no focal strong echo (0 points)
Posterior echo: enhancement (0 points) > attenuation (0 points) > no change (0 points) > mixed change of any two of the above three (0 points)
The nodule grades are as follows:

Table 4. Nodule grading table
Score: -1 | 0 | 1  | 2  | 3  | 4  | 5
Grade:  2 | 3 | 4A | 4B | 4C | 4C | 5
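Tables 3 and 4 can be combined into a small scoring sketch. The English attribute and value names below are illustrative renderings of the table entries, not identifiers from the original system:

```python
# Points per attribute value (Table 3).
ATTRIBUTE_POINTS = {
    "orientation": {"vertical": 1, "horizontal": 0},
    "margin": {"invading_thyroid": 1, "irregular": 1, "blurred": 1, "smooth": 0},
    "halo": {"present": 0, "absent": 0},
    "composition": {"solid": 1, "mostly_solid": 0, "mostly_cystic": 0,
                    "spongiform": 0, "cystic": 0},
    "echo": {"very_hypoechoic": 1, "hypoechoic": 0, "isoechoic": 0,
             "hyperechoic": 0, "anechoic": 0},
    "texture": {"heterogeneous": 0, "homogeneous": 0},
    "echogenic_foci": {"microcalcification": 1, "coarse_calcification": 0,
                       "peripheral_calcification": 0, "punctate": 0,
                       "comet_tail": -1, "none": 0},
    "posterior_echo": {"enhanced": 0, "attenuated": 0, "unchanged": 0,
                       "mixed": 0},
}

# Score-to-grade mapping (Table 4).
GRADE_BY_SCORE = {-1: "2", 0: "3", 1: "4A", 2: "4B", 3: "4C", 4: "4C", 5: "5"}

def grade_nodule(attrs):
    """Sum the points of the collected attribute values and map to a grade."""
    score = sum(ATTRIBUTE_POINTS[k][v] for k, v in attrs.items())
    return score, GRADE_BY_SCORE[score]
```

For the example attribute set in the reporting description (vertical, smooth, halo present, solid, hypoechoic, homogeneous, coarse calcification, posterior enhancement), this yields a score of 2 and grade 4B, matching the text.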
The control module is responsible for executing the operation generated by the decision module, and ensures the safety of the operation.
The robot control assembly is responsible for issuing control commands to the mechanical arm, uploading data from each sensor of the mechanical arm, and entering and exiting zero-force drag mode.
The coordinate system conversion component converts instructions based on the working coordinate system, as provided by the decision module, into instructions in the robot base coordinate system for the control assembly; the component also includes a set of calibration procedures to ensure the accuracy of the coordinate system.
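The working-to-base conversion amounts to applying a calibrated homogeneous transform to each commanded point. The 4x4 matrix below is a made-up example (a 90-degree rotation about Z plus a translation), standing in for the result of the component's calibration procedure:

```python
import math

def transform_point(T, p):
    """Apply a 4x4 homogeneous transform (list of rows) to a 3D point."""
    x, y, z = p
    v = (x, y, z, 1.0)
    return tuple(sum(T[i][j] * v[j] for j in range(4)) for i in range(3))

theta = math.pi / 2  # illustrative tool-to-base rotation about Z
T_base_tool = [
    [math.cos(theta), -math.sin(theta), 0.0, 100.0],
    [math.sin(theta),  math.cos(theta), 0.0,  50.0],
    [0.0,              0.0,             1.0,  30.0],
    [0.0,              0.0,             0.0,   1.0],
]

# A 1 mm step along the tool X axis becomes a step along base Y plus the offset.
p_base = transform_point(T_base_tool, (1.0, 0.0, 0.0))
```

A real implementation would also transform orientations (e.g. as rotation matrices or quaternions), but the same calibrated matrix drives both.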
The safety control component is responsible for guaranteeing the safety of the mechanical arm's motion, ensuring that the contact force remains between 2 N and 4 N during scanning, and filtering decision-module operations that would cause out-of-range motion.
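An illustrative sketch of these two safety checks, assuming a scalar contact-force reading in newtons and a rectangular workspace; the 2 N to 4 N band follows the text, while the workspace bounds, gain, and function names are made up:

```python
FORCE_MIN, FORCE_MAX = 2.0, 4.0                        # safe contact band (N)
WORKSPACE = {"x": (-200.0, 200.0), "y": (-150.0, 150.0), "z": (0.0, 120.0)}

def force_correction(force_n, gain_mm_per_n=0.5):
    """Return a probe height correction in mm: negative (retract) when
    pressing harder than 4 N, positive (advance) when lighter than 2 N,
    and zero inside the safe band."""
    if force_n > FORCE_MAX:
        return (FORCE_MAX - force_n) * gain_mm_per_n
    if force_n < FORCE_MIN:
        return (FORCE_MIN - force_n) * gain_mm_per_n
    return 0.0

def in_workspace(pos):
    """Reject decision-module targets that leave the calibrated workspace."""
    return all(lo <= pos[axis] <= hi for axis, (lo, hi) in WORKSPACE.items())
```

Targets failing `in_workspace` would simply be dropped (filtered) before reaching the robot control assembly, while the force correction runs continuously during contact scanning.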
Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art will understand that the technical solution may still be modified, or some of its features replaced with equivalents; any modification, equivalent substitution, improvement, or the like made within the spirit and principles of the present invention shall be included within the scope of the present invention.

Claims (5)

1. A multimodal generation dialog-based thyroid autonomous scanning system, comprising:
the system comprises a hardware device and an autonomous scanning unit for controlling the action of the hardware device;
the hardware device includes:
the device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment;
the force sensor is matched with the mechanical arm for use, and the clamping jaw is arranged at the action end of the mechanical arm;
the autonomous scanning unit is built in the upper computer, and comprises:
the system comprises an image analysis module, a multi-mode question-answering module, a decision module and a control module;
the image analysis module is used for processing and identifying the image from the ultrasonic equipment and generating metadata for processing by the decision module;
the multi-mode question-answering module is used for encoding the input characters and images, and obtaining answers by using a generated dialogue language model and encoding the answers into robot operations; the multi-modal question-answering module comprises:
a word vector component, a graph vector component, a multi-modal dialogue question-answering component and an action coding component;
the word vector component encodes the text based on deep learning, and the word vector encodes the input text into a vector with fixed dimension by using a trained language model;
the image vector component encodes the image based on deep learning; it extracts image features using a convolutional network with the classification layer removed and the feature layer retained, and reduces them to a vector of fixed dimension;
the multi-modal dialogue question-answering component generates an answer text according to the word vector and the graph vector based on a multi-modal generated dialogue language model;
the action coding component codes answers obtained by the generated dialogue language model into probe operation in three dimensions;
the decision module is used for summarizing the results of the image analysis and the answers of the question-answer module, generating a final decision and delivering the final decision to the control module for execution, and recording the results of the image analysis and summarizing the results into a final ultrasonic report;
the control module is responsible for executing the operation generated by the decision module, so that the safety of the operation is ensured;
the modules exchange agreed data with one another through a data interface, while the components within each module exchange all data through shared storage.
2. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1, wherein: the medical ultrasonic equipment is one or more of an ultrasonic three-dimensional diagnostic apparatus, a fully digital color Doppler ultrasound apparatus, and a color Doppler ultrasound device.
3. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the image analysis module comprises:
a thyroid region segmentation component, a thyroid nodule detection component, a thyroid nodule attribute classification component, a thyroid diffuse lesion classification component, and an image quality confidence component;
the thyroid region segmentation component identifies the thyroid region and the trachea region in an ultrasonic image based on deep learning, performs pixel-level classification using a semantic segmentation network to obtain masks of the thyroid region and the trachea region, calculates the centroid coordinates, contour, and area of the thyroid region from the masks, and passes them to the decision module;
the thyroid nodule detection component identifies nodules in the thyroid region based on deep learning, uses a target detection network to generate a detection box for the nodules, and uses a multi-target tracking algorithm to obtain a unique identification number for the detection box;
the thyroid nodule attribute classification component classifies the nodules based on deep learning with multiple attributes, and the classification network uses a classification network of 8 output layers;
the thyroid diffuse disease classification component classifies thyroid regions with single attribute based on deep learning, and the classification network uses a classification network of 1 output layer;
the image quality confidence component calculates the confidence of each pixel of the ultrasonic image based on a random walk method and transmits the confidence to the decision module.
4. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the decision module comprises:
a scanning state management component, a thyroid search component, a thyroid scanning component, a posture adjustment component and an ultrasonic reporting component;
the scanning state management component is responsible for scheduling the scanning states; it judges whether the goal of the current state has been completed and whether another state needs to be entered, and does not communicate directly with the control module;
the thyroid searching component judges whether the thyroid has been found from the centroid coordinates and area provided by the thyroid region segmentation component of the image analysis module, and distinguishes the left lobe from the right lobe by means of the centroid coordinates and the tracheal center;
the thyroid gland scanning component is responsible for scanning the cross section and the longitudinal section of left and right thyroid glands, and a thyroid gland rectangular frame is obtained according to the thyroid gland region provided by the thyroid gland region segmentation component of the image analysis module;
the posture adjustment component assembles a prompt text from the language template and sends it to the word vector component, sends the confidence map and the ultrasonic image to the image vector component, and adjusts the posture of the probe according to the outputs of the action encoding component and the image quality confidence component;
for each thyroid nodule unique identification number from the image analysis module, the ultrasonic reporting component collects the attributes of the nodule according to the set priorities, calculates the nodule grade from the collected attributes, and saves a real-time ultrasonic image of the nodule.
5. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the control module includes:
the system comprises a robot control assembly, a coordinate system conversion assembly and a safety control assembly;
the robot control assembly is responsible for issuing control instructions to the mechanical arm, uploading data from each sensor of the mechanical arm, and entering and exiting zero-force drag mode;
the coordinate system conversion component converts instructions based on the working coordinate system, as provided by the decision module, into instructions in the robot base coordinate system for the control assembly, and further comprises a set of calibration procedures to ensure the accuracy of the coordinate system;
the safety control component is responsible for guaranteeing the safety of the mechanical arm's motion, ensuring that the contact force remains between 2 N and 4 N during scanning, and filtering decision-module operations that would cause out-of-range motion.
CN202311292889.0A 2023-10-08 2023-10-08 Thyroid autonomous scanning system based on multi-modal generation type dialogue Active CN117017355B (en)

Publications (2)

Publication Number | Publication Date
CN117017355A (en) | 2023-11-10
CN117017355B (en) | 2024-01-12
