CN117017355B - Thyroid autonomous scanning system based on multimodal generative dialogue - Google Patents
Thyroid autonomous scanning system based on multimodal generative dialogue
- Publication number: CN117017355B
- Application number: CN202311292889.0A
- Authority: CN (China)
- Legal status: Active
Classifications
- G06V20/70 — Labelling scene content, e.g. deriving syntactic or semantic representations
- A61B8/08 — Detecting organic movements or changes, e.g. tumours, cysts, swellings
- A61B8/085 — Locating body or organic structures, e.g. tumours, calculi, blood vessels, nodules
- A61B8/5215 — Data or image processing for diagnosis involving processing of medical diagnostic data
- A61B8/54 — Control of the diagnostic device
- G06F16/90332 — Natural language query formulation or dialogue systems
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06N3/0475 — Generative networks
- G06V10/26 — Segmentation of patterns in the image field
- G06V10/764 — Recognition or understanding using classification, e.g. of video objects
- G06V10/82 — Recognition or understanding using neural networks
- G06V2201/03 — Recognition of patterns in medical or anatomical images
Abstract
The invention discloses a thyroid autonomous scanning system based on a multimodal generative dialogue, comprising a hardware device and an autonomous scanning unit that controls the actions of the hardware device. The hardware device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, a display device and medical ultrasonic equipment. The autonomous scanning unit is built into the upper computer and comprises an image analysis module, a multimodal question-answering module, a decision module and a control module. The invention uses a generative multimodal large language model to achieve fully autonomous scanning of the thyroid: no manual path or parameter setting is needed during scanning; the scanning start position is determined by zero-force dragging; the position and posture of the probe are adjusted cooperatively according to the sensor data and the ultrasonic image; and action encoding converts the continuous absolute motion of the mechanical arm into discrete relative motion, avoiding scanning failures caused by calibration precision errors.
Description
Technical Field
The invention belongs to the technical field of medical diagnosis, and particularly relates to a thyroid autonomous scanning system based on a multimodal generative dialogue.
Background
Ultrasound is a non-invasive, safe, radiation-free, real-time imaging examination that is widely used for clinical diagnosis and monitoring. In China, ultrasound examinations are performed billions of times per year, and this massive demand is a great challenge for the scarce pool of sonographers, especially in remote areas where medical resources are limited. With the development of artificial intelligence, robots with autonomous scanning capability can help sonographers improve work efficiency and reduce patient waiting time.
Publication No. CN113855067A describes an autonomous positioning and scanning method that fuses visual images and medical images: it collects the coordinates of organs and their external position areas from the visual and medical images, after which the target name, target parameters, position information and communication target must be set manually for the system. Because motion planning requires manually set acquisition-target parameters and the scan cannot be controlled remotely once started, the method is not truly autonomous scanning.
Publication Nos. CN114155940A and CN115227404A describe autonomous ultrasonic scanning skill strategy generation methods based on reinforcement learning, which face three difficulties: first, the collection and cleaning of training data; second, an unstable training process in which the network does not converge easily; third, the locally optimal solution is not necessarily the globally optimal solution. Moreover, such methods are complete end-to-end network predictions: when an abnormality occurs, its cause cannot be traced back and the system cannot self-optimize, so it can only wait for human intervention.
Publication No. CN115429327A describes an automatic scanning sequence selection method using positioning marks and preset scanning tracks. The autonomous scanning of this method is essentially a set of standardized scanning track controls; it cannot handle contingencies caused by differences in the physiological structures of scanned persons or by small movements, and its scanning track includes only position changes, with no adjustment of the probe posture.
Publication No. CN115670515A describes an autonomous scanning method that locates the thyroid using a depth camera, controls the probe position by segmenting the thyroid in the image with a neural network, and adjusts the probe posture to be perpendicular to the skin according to a force sensor. In practice, however, a probe perpendicular to the skin does not necessarily produce the best image: thyroid scanning is a joint optimization of position, posture and force, and must cover both the transverse and longitudinal planes.
In view of the shortcomings of the above published patents, and drawing on the technological advances brought by generative artificial intelligence, we propose a thyroid autonomous scanning system based on a multimodal generative dialogue.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing a thyroid autonomous scanning system based on a multimodal generative dialogue, which achieves fully autonomous scanning of the thyroid using a generative multimodal large language model; no manual path or parameter setting is needed during scanning, and the scanning start position is determined by zero-force dragging.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a thyroid autonomous scanning system based on multimodal generation dialog, comprising:
the system comprises a hardware device and an autonomous scanning unit for controlling the action of the hardware device;
the hardware device includes:
the device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment;
the force sensor is matched with the mechanical arm for use, and the clamping jaw is arranged at the action end of the mechanical arm;
the autonomous scanning unit is built in the upper computer, and comprises:
the system comprises an image analysis module, a multimodal question-answering module, a decision module and a control module;
the image analysis module is used for processing and identifying images from the ultrasonic equipment and generating metadata for the decision module to process;
the multimodal question-answering module is used for encoding the input text and images, obtaining answers with a generative dialogue language model, and encoding those answers into robot operations;
the decision module is used for summarizing the image analysis results and the answers of the question-answering module, generating a final decision for the control module to execute, and recording the image analysis results so that they can be summarized into the final ultrasound report;
the control module is responsible for executing the operations generated by the decision module while ensuring that they are safe;
the modules exchange agreed data with one another through data interfaces, while the components within each module exchange all data through shared storage.
Preferably, the medical ultrasonic equipment is one or more of an ultrasonic three-dimensional diagnostic apparatus, a fully digital color Doppler ultrasound apparatus and a color Doppler ultrasound scanner.
Preferably, the image analysis module includes:
a thyroid region segmentation component, a thyroid nodule detection component, a thyroid nodule attribute classification component, a thyroid diffuse lesion classification component, and an image quality confidence component;
the thyroid region segmentation component identifies the thyroid region and the trachea region in the ultrasonic image based on deep learning, performs pixel-level classification with a semantic segmentation network to obtain masks of the thyroid and trachea regions, calculates the centroid coordinates, contour and area of the thyroid region from the masks, and passes them to the decision module;
the thyroid nodule detection component identifies nodules in the thyroid region based on deep learning, uses a target detection network to generate a detection box for the nodules, and uses a multi-target tracking algorithm to obtain a unique identification number for the detection box;
the thyroid nodule attribute classification component performs multi-attribute classification of nodules based on deep learning; the classification network has 8 output layers;
the thyroid diffuse lesion classification component performs single-attribute classification of the thyroid region based on deep learning; the classification network has 1 output layer;
the image quality confidence component calculates the confidence of each pixel of the ultrasonic image based on a random walk method and transmits the confidence to the decision module.
Preferably, the multimodal question-answering module includes:
a word vector component, a graph vector component, a multi-modal dialogue question-answering component and an action coding component;
the word vector component encodes text based on deep learning, using a trained language model to encode the input text into a vector of fixed dimension;
the graph vector component encodes images based on deep learning, extracting image features with a convolutional network whose classification layer is removed and whose feature layer is retained, and reducing them to a vector of fixed dimension;
the multi-modal dialogue question-answering component generates answer text from the word vector and the graph vector based on a multimodal generative dialogue language model;
the action encoding component encodes the answers from the generative dialogue language model into three-dimensional probe operations.
Preferably, the decision module includes:
a scanning state management component, a thyroid search component, a thyroid scanning component, a posture adjustment component and an ultrasonic reporting component;
the scanning state management component is responsible for scheduling the scanning states: it judges whether the goal of the current state has been completed and whether another state should be entered, and does not communicate with the control module directly;
the thyroid search component judges whether the thyroid has been found from the centroid coordinates and area provided by the thyroid region segmentation component of the image analysis module, and judges whether it is the left or right lobe from the centroid coordinates and the center of the trachea;
the thyroid scanning component is responsible for scanning the transverse and longitudinal sections of the left and right thyroid lobes, obtaining a thyroid bounding rectangle from the thyroid region provided by the thyroid region segmentation component of the image analysis module;
the posture adjustment component assembles the prompt text from a language template and sends it to the word vector component, sends the confidence map and the ultrasonic image to the graph vector component, and adjusts the posture of the probe according to the outputs of the action encoding component and the image quality confidence component;
the ultrasound reporting component uses the unique thyroid nodule identification numbers from the image analysis module to collect the attributes of each nodule according to the set priority, computes the nodule grading from the collected attributes, and stores a real-time ultrasonic image of each nodule.
Preferably, the control module includes:
the system comprises a robot control assembly, a coordinate system conversion assembly and a safety control assembly;
the robot control assembly is responsible for issuing a control instruction of the mechanical arm, uploading data of each sensor of the mechanical arm, and entering and exiting zero-force dragging;
the coordinate system conversion component converts the instruction based on the working coordinate system provided by the decision module into an instruction of a robot base coordinate system given to the control component, and the component further comprises a set of calibration program for ensuring the accuracy of the coordinate system;
the safety control component is responsible for guaranteeing the action safety of the mechanical arm, ensuring that the contact force is between 2N and 4N during scanning, and the filtering decision module can cause out-of-range operation.
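By way of a purely illustrative, non-limiting sketch (the patent itself provides no code), the 2 N-4 N contact-force rule and the out-of-range filtering of the safety control component could be expressed as follows; the function names, step size and workspace bounds are assumptions of ours, not part of the disclosure:

```python
# Hypothetical sketch of the safety control component's two duties:
# keep contact force in the 2 N - 4 N band, and reject commanded poses
# that would leave an allowed workspace.  All limits are illustrative.

WORKSPACE = {"x": (-0.30, 0.30), "y": (-0.30, 0.30), "z": (0.00, 0.40)}  # meters

def regulate_force(force_n: float, z_step: float = 0.0005) -> float:
    """Return a corrective Z offset: press in if too light, back off if too hard."""
    if force_n < 2.0:
        return -z_step   # move probe toward the skin
    if force_n > 4.0:
        return +z_step   # lift probe away from the skin
    return 0.0           # force already in the safe band

def filter_command(target: dict) -> bool:
    """Accept a commanded pose only if every axis stays inside the workspace."""
    return all(WORKSPACE[axis][0] <= target[axis] <= WORKSPACE[axis][1]
               for axis in WORKSPACE)
```

In such a design the force regulator runs at a higher rate than the decision module, so unsafe decisions are vetoed before reaching the mechanical arm.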
The technical effects and advantages of the invention: compared with the prior art, the thyroid autonomous scanning system based on a multimodal generative dialogue has the following effects:
the generative multimodal large language model achieves fully autonomous scanning of the thyroid; no manual path or parameter setting is needed during scanning; the scanning start position is determined by zero-force dragging;
the position and posture of the probe are adjusted cooperatively according to the sensor data and the ultrasonic image; action encoding converts the continuous absolute motion of the mechanical arm into discrete relative motion, avoiding scanning failures caused by calibration precision errors;
abnormalities during the scanning process are handled automatically through image segmentation of the thyroid region; diffuse and space-occupying lesions are analyzed during scanning through image classification and target detection, and a report is generated automatically.
Drawings
FIG. 1 is a scan state transition diagram;
FIG. 2 is a block and component flow architecture diagram;
fig. 3 is a schematic diagram of a tool coordinate system and a base coordinate system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a thyroid autonomous scanning system based on a multimodal generative dialogue, as shown in figs. 1-3. The system uses a generative multimodal large language model to achieve fully autonomous scanning of the thyroid; no manual path or parameter setting is needed during scanning; the scanning start position is determined by zero-force dragging.
At the same time, the position and posture of the probe are adjusted cooperatively according to the sensor data and the ultrasonic image; action encoding converts the continuous absolute motion of the mechanical arm into discrete relative motion, avoiding scanning failures caused by calibration precision errors.
Furthermore, the system handles abnormalities during scanning automatically through image segmentation of the thyroid region, and analyzes diffuse and space-occupying lesions during scanning through image classification and target detection.
A thyroid autonomous scanning system comprising:
the system comprises a hardware device and an autonomous scanning unit for controlling the action of the hardware device;
the hardware device comprises:
the device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment;
the force sensor is matched with the mechanical arm for use, and the clamping jaw is arranged at the action end of the mechanical arm;
the autonomous scanning unit is arranged in the upper computer, and comprises:
the system comprises an image analysis module, a multi-mode question answering module, a decision module and a control module. The image analysis module comprises a thyroid region segmentation component, a thyroid nodule detection component, a thyroid nodule attribute classification component, a thyroid diffuse lesion classification component and an image quality confidence component. The multi-modal question-answering module comprises a word vector component, a graph vector component, a multi-modal dialogue question-answering component and an action coding component. The decision module comprises a scanning state management component, a thyroid search component, a thyroid scanning component, a posture adjustment component and an ultrasonic reporting component. The control module comprises a robot control component, a coordinate system conversion component and a safety control component. And the different modules interact and communicate appointed data through data interfaces, and the components in the modules interact all the data through sharing storage.
The image analysis module is responsible for processing and identifying the image from the ultrasonic equipment and generating metadata for processing by the decision module.
The thyroid region segmentation component identifies thyroid regions and tracheal regions in the ultrasonic image based on deep learning, performs pixel-level classification by using a segmentation semantic segmentation network to obtain masks of the thyroid regions and the tracheal regions, calculates centroid coordinates, contours and areas of the thyroid regions according to the masks, and gives the centroid coordinates, contours and areas to the decision module. The thyroid nodule detection component identifies nodules in the thyroid region based on deep learning, uses a target detection network to generate a detection box for the nodules, and uses a multi-target tracking algorithm to obtain a unique identification number for the detection box.
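The mask-derived quantities mentioned above (centroid coordinates, contour and area of the thyroid region) can be computed directly from a binary segmentation mask. The following NumPy sketch is purely illustrative; the function name and the 4-neighbor contour definition are our assumptions, not part of the disclosure:

```python
import numpy as np

def mask_geometry(mask: np.ndarray):
    """Given a binary mask (nonzero = thyroid pixel), return the centroid
    (row, col), the area in pixels, and a rough contour: mask pixels with
    at least one zero 4-neighbor."""
    mask = mask.astype(bool)
    ys, xs = np.nonzero(mask)
    area = int(ys.size)
    if area == 0:
        return None, 0, np.empty((0, 2), dtype=int)
    centroid = (float(ys.mean()), float(xs.mean()))
    # Interior pixels have all four 4-neighbors set; the rest of the
    # mask pixels form the boundary (contour).
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    contour = np.argwhere(mask & ~interior)
    return centroid, area, contour
```

The centroid and area feed the thyroid search component's "found / not found" and left/right-lobe decisions described later.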
The thyroid nodule attribute classification component performs multi-attribute classification of nodules based on deep learning. The multi-attribute classification network has 8 output layers, for example a ResNet with 8 fully connected output heads, identifying 8 attributes of a nodule covering 30 categories in total, as follows:
multi-attribute classification table
TABLE 1
| Orientation | Margin | Halo | Composition | Echogenicity | Echo texture | Echogenic foci | Posterior echo features |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Taller than wide | Smooth | Halo present | Solid | Hyperechoic | Homogeneous | Microcalcification | Enhancement |
| Wider than tall | Irregular | No halo | Predominantly solid | Isoechoic | Heterogeneous | Comet-tail artifact | Attenuation |
| | Ill-defined | | Predominantly cystic | Hypoechoic | | Macrocalcification | No change |
| | Extrathyroidal extension | | Cystic | Very hypoechoic | | Peripheral calcification | Mixed change |
| | | | Spongiform | Anechoic | | Echogenic focus without defined extent | |
| | | | | | | Punctate echogenic focus of undetermined significance | |
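The 8-head structure described above (a shared backbone feeding eight attribute-specific output layers) can be sketched in a framework-agnostic way. The head sizes below follow Table 1; the feature dimension and random weights are illustrative stand-ins for a trained backbone and fully connected layers:

```python
import numpy as np

# Classes per attribute head, following Table 1 (orientation, margin,
# halo, composition, echogenicity, echo texture, echogenic foci,
# posterior echo features).
HEAD_SIZES = [2, 4, 2, 5, 5, 2, 6, 4]   # 30 categories in total

FEAT_DIM = 128                           # illustrative feature dimension
rng = np.random.default_rng(0)
# One weight matrix per head, standing in for 8 fully connected layers.
HEADS = [rng.normal(size=(FEAT_DIM, n)) for n in HEAD_SIZES]

def classify_attributes(features: np.ndarray) -> list[int]:
    """Map one backbone feature vector to one class index per attribute."""
    return [int(np.argmax(features @ w)) for w in HEADS]
```

In a real system the backbone would be, for example, a ResNet trained on labeled nodule crops, and each head would carry a softmax loss for its attribute.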
The thyroid diffuse lesion classification component performs single-attribute classification of the thyroid region based on deep learning; the classification network uses a single output layer to distinguish 4 categories: normal thyroid, toxic goiter, subacute thyroiditis and Hashimoto's thyroiditis.
The image quality confidence component calculates the confidence of each pixel of the ultrasound image based on the random walk method and delivers the confidence to the decision module.
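One common formulation of random-walk confidence for ultrasound treats each pixel's confidence as the probability that a random walker started there is absorbed at the transducer (top) row before the bottom row. The sketch below solves this with uniform transition weights via Jacobi relaxation; it is a simplified illustration only — practical formulations weight the walk by image intensity and beam geometry, which the patent does not detail:

```python
import numpy as np

def confidence_map(h: int, w: int, iters: int = 2000) -> np.ndarray:
    """Per-pixel confidence = probability that a 4-neighbor random walk
    reaches the top row (confidence 1) before the bottom row (0),
    with reflecting left/right edges and uniform weights."""
    c = np.zeros((h, w))
    c[0, :] = 1.0            # transducer row: full confidence
    for _ in range(iters):   # Jacobi relaxation of the harmonic equation
        c[1:-1, 1:-1] = 0.25 * (c[:-2, 1:-1] + c[2:, 1:-1] +
                                c[1:-1, :-2] + c[1:-1, 2:])
        c[1:-1, 0] = c[1:-1, 1]     # no flux through the left edge
        c[1:-1, -1] = c[1:-1, -2]   # no flux through the right edge
    return c
```

With uniform weights the confidence decays roughly linearly with depth; the decision module compares such a map against a threshold to trigger posture adjustment.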
The multimodal question-answering module is responsible for encoding the input text and images, obtaining answers with a generative dialogue language model, and encoding those answers into robot operations.
The word vector component encodes text based on deep learning, using a trained language model to encode the input text into a vector of fixed dimension.
The graph vector component encodes images based on deep learning, extracting image features with a convolutional network whose classification layer is removed and whose feature layer is retained, and reducing them to a vector of fixed dimension.
The multi-modal dialogue question-answering component generates answer text from the word vector and the graph vector based on a multimodal generative dialogue language model.
The action encoding component encodes the answers from the generative dialogue language model into three-dimensional probe operations using a 6-output multi-attribute classification network, for example a multi-layer perceptron with 6 output heads. There are 6 operations across the three dimensions, each with 3 categories, as listed below:
Table 2. Operation classes

Tool coordinate system | Move in X direction | Move in Y direction | Move in Z direction | Rotate about X axis | Rotate about Y axis | Rotate about Z axis |
---|---|---|---|---|---|---|
Positive move/rotation | 1 | 1 | 1 | 1 | 1 | 1 |
No operation | 0 | 0 | 0 | 0 | 0 | 0 |
Negative move/rotation | 2 | 2 | 2 | 2 | 2 | 2 |
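Decoding Table 2 from the network output can be sketched as follows; the layout of the logits (six 3-way heads) is an assumption consistent with the 6-output multi-attribute network described above.

```python
import numpy as np

AXES = ["move_x", "move_y", "move_z", "rot_x", "rot_y", "rot_z"]
# Class codes per Table 2: 0 = no operation, 1 = positive move/rotation, 2 = negative.

def decode_action(logits):
    """Decode a hypothetical 18-value network output (6 heads x 3 classes)
    into one Table-2 operation code per tool-frame axis via per-head argmax."""
    logits = np.asarray(logits, dtype=float).reshape(6, 3)
    classes = logits.argmax(axis=1)
    return dict(zip(AXES, classes.tolist()))

# Example: logits favouring +X motion and negative Z rotation, no-op elsewhere
action = decode_action([5, 9, 1,  9, 2, 1,  8, 0, 0,  7, 1, 1,  6, 0, 0,  0, 1, 9])
```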
The decision module is responsible for summarizing the image analysis results and the answers of the question-answering module, generating a final decision that is passed to the control module for execution, and recording the image analysis results for summarization into the final ultrasound report.
The scanning state management component is responsible for scheduling the scanning states: it judges whether the goal of the current state has been completed and whether another state should be entered; it does not communicate directly with the control module. The scanning process has three states: thyroid search, thyroid scanning, and posture adjustment. The goal of the thyroid search state is to find the thyroid region and center it in the image; the goal of the thyroid scanning state is to scan the upper, lower, left, and right boundaries of the thyroid region; the goal of the posture adjustment state is for the image confidence map to exceed a threshold, and its trigger condition is the confidence map falling below that threshold. Thyroid search is performed twice, once for each of the left and right lobes; after the search goal is achieved, the thyroid scanning state is entered. The posture adjustment state is entered whenever its trigger condition is met and exited once its goal is satisfied.
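The three-state scheduler above can be sketched as a small state machine. The state names and transition conditions follow the text; the confidence threshold value and the resume-after-adjustment bookkeeping are illustrative assumptions.

```python
SEARCH, SCAN, ADJUST = "thyroid_search", "thyroid_scan", "pose_adjust"

class ScanScheduler:
    """Minimal sketch of the scan-state scheduler; second-lobe bookkeeping
    and overall termination are omitted for brevity."""
    def __init__(self, thresh=0.6):
        self.state, self.resume, self.thresh = SEARCH, SEARCH, thresh

    def step(self, thyroid_centered, scan_done, confidence):
        if self.state != ADJUST and confidence < self.thresh:
            self.resume, self.state = self.state, ADJUST   # low quality triggers adjustment
        elif self.state == ADJUST and confidence >= self.thresh:
            self.state = self.resume                       # adjustment goal met: resume
        elif self.state == SEARCH and thyroid_centered:
            self.state = SCAN                              # search goal met
        elif self.state == SCAN and scan_done:
            self.state = SEARCH                            # search the other lobe next
        return self.state
```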
The thyroid search component judges whether the thyroid has been found from the centroid coordinates and area provided by the thyroid region segmentation component of the image analysis module, and distinguishes the left lobe from the right lobe using the centroid coordinates and the trachea center. If the thyroid region is found, the probe's direction of movement is determined from the thyroid center and the image center; if not, the probe scans along a fixed Z-shaped (zigzag) route. Based on clinical statistics, the zigzag window height is set to 65 mm and its width to 55 mm, with a step length of 1 mm per probe movement; the generated instruction is passed to the control module for execution.
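Generating the zigzag search waypoints can be sketched as below. The 65 mm height, 55 mm width, and 1 mm step come from the text; reusing the 1 mm step as the row spacing is an assumption, since the patent does not state the row pitch.

```python
def zigzag_waypoints(height_mm=65, width_mm=55, step_mm=1):
    """Probe waypoints for the fixed zigzag search window: sweep across the
    width, drop one row, sweep back, until the window height is covered."""
    pts, y, direction = [], 0, 1
    while y <= height_mm:
        xs = range(0, width_mm + 1, step_mm)
        for x in (xs if direction > 0 else reversed(xs)):
            pts.append((x, y))
        y += step_mm              # row pitch: assumption (re-uses the 1 mm step)
        direction = -direction    # alternate sweep direction each row
    return pts

pts = zigzag_waypoints()
```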
The thyroid scanning component is responsible for scanning the cross sections and longitudinal sections of the left and right thyroid lobes. A thyroid bounding rectangle is obtained from the thyroid region provided by the thyroid region segmentation component of the image analysis module, and the probe is moved so that the upper-left corner of the rectangle is at the center of the image. Scanning then proceeds in a bow-shaped (serpentine) pattern in the tool coordinate system: rightward along the X axis until the upper-right corner reaches the image center, then once downward along the Y axis, then leftward, downward, and rightward in turn. When the lower-right corner of the rectangle reaches the center of the image, the scanning state management component is notified that thyroid scanning is complete. The downward step length is set to 5 mm, and the leftward/rightward step length to 1 mm. When a thyroid nodule is found, the probe is rotated 90 degrees about the Z axis of the tool coordinate system to switch to longitudinal-section scanning, and a posture adjustment is performed.
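The serpentine pass can be sketched geometrically as below. The 1 mm lateral and 5 mm downward steps come from the text; replacing the image-center-based corner triggers with a rectangle in tool-frame millimetres is a simplifying assumption.

```python
def serpentine_scan(rect_w_mm, rect_h_mm, lateral_mm=1, down_mm=5):
    """Sketch of the bow-shaped pass over the thyroid bounding rectangle:
    sweep along X, step 5 mm down along Y, sweep back, until the last row
    is swept. Returns the move list and the final (x, y) position."""
    moves, x, y, d = [], 0, 0, 1
    while True:
        while 0 <= x + d * lateral_mm <= rect_w_mm:
            x += d * lateral_mm
            moves.append(("lateral", d))
        if y + down_mm > rect_h_mm:
            break                          # bottom corner reached: scan complete
        y += down_mm
        moves.append(("down", 1))
        d = -d                             # reverse sweep direction
    return moves, (x, y)

moves, end = serpentine_scan(3, 10)
```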
The posture adjustment component analyzes the confidence map output by the image quality confidence component, assembles a prompt text from the language template and sends it to the word vector component, sends the confidence map and the ultrasound image to the image vector component, obtains a response text through the multi-modal dialogue question-answering component, and obtains the probe's movement and rotation directions through the action encoding component, with a movement step of 1 millimeter and a rotation step of 1 degree per action, each submitted to the control module for execution. During posture adjustment, the confidence of the ultrasound image and the length and width of the nodule detection box are recorded; after 20 adjustment steps the probe is returned to the posture with the highest confidence observed during the adjustment period, and if that highest confidence is still below the threshold, the posture adjustment state is re-entered.
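The best-posture bookkeeping of this loop can be sketched as follows. The 20-step budget comes from the text; the threshold value, the pose representation, and the early-stop-on-success rule are assumptions of the sketch.

```python
def adjust_pose(step_results, max_steps=20, thresh=0.6):
    """step_results: iterable of (pose, mean_confidence) pairs observed while
    the dialogue loop nudges the probe (1 mm / 1 degree per step). Returns
    the best pose seen within the step budget and whether another adjustment
    round is needed (best confidence still below the threshold)."""
    best_pose, best_conf = None, -1.0
    for i, (pose, conf) in enumerate(step_results):
        if i >= max_steps:
            break
        if conf > best_conf:
            best_pose, best_conf = pose, conf
        if conf >= thresh:                 # goal reached: stop adjusting early
            return pose, False
    return best_pose, best_conf < thresh
```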
The ultrasound reporting component, keyed by the thyroid nodule unique identification number from the image analysis module, collects nodule attributes according to the set priorities, calculates the nodule grade from the collected attributes (Table 4), and saves a real-time ultrasound image of the nodule. It assembles a text from the language template, for example: nodule found, grade 4B, vertical orientation, smooth margin, halo present, solid, hypoechoic, homogeneous, macrocalcification, posterior echo enhancement. This text is sent to the word vector component, the saved nodule ultrasound image is sent to the image vector component, and a conclusion is generated to write the report.
The set priorities are as follows:

Table 3. Priority settings
Orientation | Vertical (1 point) > Horizontal (0 points) |
Margin | Extracapsular invasion (1 point) > Irregular (1 point) > Ill-defined (1 point) > Smooth (0 points) |
Halo | Halo present (0 points) > No halo (0 points) |
Composition | Solid (1 point) > Predominantly solid (0 points) > Predominantly cystic (0 points) > Spongiform (0 points) > Cystic (0 points) |
Echogenicity | Markedly hypoechoic (1 point) > Hypoechoic (0 points) > Isoechoic (0 points) > Hyperechoic (0 points) > Anechoic (0 points) |
Echo texture | Heterogeneous (0 points) > Homogeneous (0 points) |
Focal echogenic foci | Microcalcifications (1 point) > Macrocalcifications (0 points) > Peripheral calcification (0 points) > Punctate echogenic foci (0 points) > Comet-tail artifact (-1 point) > No focal echogenic foci (0 points) |
Posterior echo | Enhancement (0 points) > Attenuation (0 points) > No change (0 points); mixed change of any two of the three (0 points) |
The nodule grading is as follows:

Table 4. Nodule grading
Score | -1 | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|---|
Grade | 2 | 3 | 4A | 4B | 4C | 4C | 5 |
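The score-to-grade lookup of Tables 3 and 4 can be sketched as data plus a sum. Only the point-carrying attributes are listed here (the normalized spellings are assumptions); all other attributes score 0 per Table 3.

```python
# Attribute points per Table 3 (unlisted attributes score 0).
POINTS = {
    "vertical": 1,
    "extracapsular_invasion": 1, "irregular_margin": 1, "ill_defined_margin": 1,
    "solid": 1, "markedly_hypoechoic": 1, "microcalcifications": 1,
    "comet_tail_artifact": -1,
}
# Total score -> grade per Table 4.
GRADE = {-1: "2", 0: "3", 1: "4A", 2: "4B", 3: "4C", 4: "4C", 5: "5"}

def grade_nodule(attributes):
    """Sum the per-attribute points and look up the grade, clamping the
    total to the score range covered by Table 4."""
    score = sum(POINTS.get(a, 0) for a in attributes)
    return GRADE[max(-1, min(5, score))]
```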
The control module is responsible for executing the operation generated by the decision module, and ensures the safety of the operation.
The robot control assembly is responsible for issuing a control command of the mechanical arm, uploading data of each sensor of the mechanical arm, and entering and exiting zero-force dragging.
The coordinate system conversion component converts instructions expressed in the working coordinate system, as provided by the decision module, into robot base coordinate system instructions for the control component; the component also includes a calibration procedure to ensure the accuracy of the coordinate systems.
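The frame conversion is the standard homogeneous-transform change of coordinates, sketched below; the transform itself (obtained from the calibration procedure) is assumed given, and the example offset is illustrative.

```python
import numpy as np

def work_to_base(T_base_work, p_work):
    """Convert a point expressed in the working (tool/scan) coordinate system
    into the robot base coordinate system using a 4x4 homogeneous transform
    produced by calibration."""
    p = np.append(np.asarray(p_work, dtype=float), 1.0)   # homogeneous coords
    return (T_base_work @ p)[:3]

# Example: working frame translated by (100, 50, 0) mm relative to the base
T = np.eye(4)
T[:3, 3] = [100.0, 50.0, 0.0]
```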
The safety control component is responsible for guaranteeing the safety of the mechanical arm's actions: it ensures the contact force stays between 2 N and 4 N during scanning, and filters decision module operations that would cause out-of-range motion.
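A rule-based check of this kind can be sketched as follows. The workspace box, the 0.5 mm force-regulation nudge, and the sign convention for pressing toward the patient are assumptions; only the 2-4 N window is from the text.

```python
def safe_command(move_mm, contact_force_n,
                 workspace=((-60, 60), (-60, 60), (-20, 5)),
                 pos=(0.0, 0.0, 0.0)):
    """Reject a decision-module move that would leave an illustrative
    workspace box, and nudge the Z target to keep contact force in the
    2-4 N window (simple sketch; the patent does not specify the controller)."""
    target = [p + d for p, d in zip(pos, move_mm)]
    for t, (lo, hi) in zip(target, workspace):
        if not lo <= t <= hi:
            return None                    # filtered: out-of-range operation
    if contact_force_n > 4.0:
        target[2] -= 0.5                   # back off to reduce contact force
    elif contact_force_n < 2.0:
        target[2] += 0.5                   # press in to regain contact
    return tuple(target)
```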
Finally, it should be noted that the foregoing embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution may be modified, or some of its features replaced by equivalents; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall be included within the scope of the present invention.
Claims (5)
1. A multimodal generation dialog-based thyroid autonomous scanning system, comprising:
the system comprises a hardware device and an autonomous scanning unit for controlling the action of the hardware device;
the hardware device includes:
the device comprises a mechanical arm, a force sensor, a clamping jaw, an upper computer, display equipment and medical ultrasonic equipment;
the force sensor is matched with the mechanical arm for use, and the clamping jaw is arranged at the action end of the mechanical arm;
the autonomous scanning unit is built in the upper computer, and comprises:
the system comprises an image analysis module, a multi-mode question-answering module, a decision module and a control module;
the image analysis module is used for processing and identifying the image from the ultrasonic equipment and generating metadata for processing by the decision module;
the multi-modal question-answering module is used for encoding the input text and images, obtaining answers with a generative dialogue language model, and encoding the answers into robot operations; the multi-modal question-answering module comprises:
a word vector component, a graph vector component, a multi-modal dialogue question-answering component and an action coding component;
the word vector component encodes the text based on deep learning, and the word vector encodes the input text into a vector with fixed dimension by using a trained language model;
the image vector component encodes the image based on deep learning: the image vector extracts image features using a convolutional network with the classification layer removed and the feature layer retained, and reduces them to a fixed-dimension vector;
the multi-modal dialogue question-answering component generates an answer text according to the word vector and the graph vector based on a multi-modal generated dialogue language model;
the action coding component codes answers obtained by the generated dialogue language model into probe operation in three dimensions;
the decision module is used for summarizing the results of the image analysis and the answers of the question-answer module, generating a final decision and delivering the final decision to the control module for execution, and recording the results of the image analysis and summarizing the results into a final ultrasonic report;
the control module is responsible for executing the operation generated by the decision module, so that the safety of the operation is ensured;
the modules interact and communicate appointed data through a data interface, and components in the modules interact all data through sharing storage.
2. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the medical ultrasonic equipment is one or more of an ultrasonic three-dimensional diagnostic apparatus, a full-digital color Doppler ultrasound apparatus and ultrasonic color Doppler.
3. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the image analysis module comprises:
a thyroid region segmentation component, a thyroid nodule detection component, a thyroid nodule attribute classification component, a thyroid diffuse lesion classification component, and an image quality confidence component;
the thyroid region segmentation component identifies the thyroid region and trachea region in an ultrasound image based on deep learning, performs pixel-level classification using a semantic segmentation network to obtain masks of the thyroid and trachea regions, calculates the centroid coordinates, contour, and area of the thyroid region from the masks, and delivers them to the decision module;
the thyroid nodule detection component identifies nodules in the thyroid region based on deep learning, uses a target detection network to generate a detection box for the nodules, and uses a multi-target tracking algorithm to obtain a unique identification number for the detection box;
the thyroid nodule attribute classification component performs multi-attribute classification of nodules based on deep learning, using a classification network with 8 output layers;
the thyroid diffuse lesion classification component performs single-attribute classification of the thyroid region based on deep learning, using a classification network with 1 output layer;
the image quality confidence component calculates the confidence of each pixel of the ultrasonic image based on a random walk method and transmits the confidence to the decision module.
4. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the decision module comprises:
a scanning state management component, a thyroid search component, a thyroid scanning component, a posture adjustment component and an ultrasonic reporting component;
the scanning state management component is responsible for scheduling the scanning states: it judges whether the goal of the current state has been completed and whether another state should be entered, and does not communicate directly with the control module;
the thyroid search component judges whether the thyroid has been found according to the centroid coordinates and area provided by the thyroid region segmentation component of the image analysis module, and distinguishes the left lobe from the right lobe through the centroid coordinates and the trachea center;
the thyroid gland scanning component is responsible for scanning the cross section and the longitudinal section of left and right thyroid glands, and a thyroid gland rectangular frame is obtained according to the thyroid gland region provided by the thyroid gland region segmentation component of the image analysis module;
the posture adjustment component assembles the prompt text from the language template and sends it to the word vector component, sends the confidence map and ultrasound image to the image vector component, and adjusts the probe posture according to the outputs of the action encoding component and the image quality confidence component;
the ultrasound reporting component, keyed by the thyroid nodule unique identification number from the image analysis module, collects nodule attributes according to the set priorities, calculates the nodule grade from the collected attributes, and saves a real-time ultrasound image of the nodule.
5. A multimodal generation dialog-based thyroid autonomous scanning system as claimed in claim 1 wherein: the control module includes:
the system comprises a robot control assembly, a coordinate system conversion assembly and a safety control assembly;
the robot control assembly is responsible for issuing a control instruction of the mechanical arm, uploading data of each sensor of the mechanical arm, and entering and exiting zero-force dragging;
the coordinate system conversion component converts instructions based on the working coordinate system provided by the decision module into robot base coordinate system instructions for the control component, and further comprises a calibration procedure to ensure the accuracy of the coordinate systems;
the safety control component is responsible for guaranteeing the safety of the mechanical arm's actions, ensuring the contact force stays between 2 N and 4 N during scanning, and filtering decision module operations that would cause out-of-range motion.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311292889.0A CN117017355B (en) | 2023-10-08 | 2023-10-08 | Thyroid autonomous scanning system based on multi-modal generation type dialogue |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117017355A CN117017355A (en) | 2023-11-10 |
CN117017355B true CN117017355B (en) | 2024-01-12 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014207642A1 (en) * | 2013-06-28 | 2014-12-31 | Koninklijke Philips N.V. | Ultrasound acquisition feedback guidance to a target view |
CN113855067A (en) * | 2021-08-23 | 2021-12-31 | 谈斯聪 | Visual image and medical image fusion recognition and autonomous positioning scanning method |
CN115089212A (en) * | 2022-05-08 | 2022-09-23 | 中南大学湘雅二医院 | Three-dimensional vision-guided automatic neck ultrasonic scanning method and system for mechanical arm |
CN115153634A (en) * | 2022-07-22 | 2022-10-11 | 中山大学孙逸仙纪念医院 | Intelligent ultrasonic examination and diagnosis method and system |
CN115429327A (en) * | 2022-08-23 | 2022-12-06 | 安徽医科大学第一附属医院 | Thyroid full-automatic ultrasonic intelligent scanning method and platform |
CN115670515A (en) * | 2022-10-26 | 2023-02-03 | 华南理工大学 | Ultrasonic robot thyroid detection system based on deep learning |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3373820A4 (en) * | 2015-11-10 | 2019-06-26 | Exact Imaging Inc. | A system comprising indicator features in high-resolution micro-ultrasound images |
AU2017281281B2 (en) * | 2016-06-20 | 2022-03-10 | Butterfly Network, Inc. | Automated image acquisition for assisting a user to operate an ultrasound device |
US20200194117A1 (en) * | 2018-12-13 | 2020-06-18 | University Of Maryland, College Park | Systems, methods, and media for remote trauma assessment |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||