CN117314932A - Token pyramid-based pancreatic bile duct segmentation method, model and storage medium - Google Patents
- Publication number
- CN117314932A (CN202311169169.5A)
- Authority
- CN
- China
- Prior art keywords
- bile duct
- pancreatic bile
- pancreatic
- scale
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention discloses a token pyramid-based pancreatic bile duct segmentation method, model and storage medium. The method comprises: constructing a pancreatic bile duct data set and performing data augmentation; training a pre-constructed pancreatic bile duct segmentation model; evaluating the model on the pancreatic bile duct data during training; and applying the final pancreatic bile duct segmentation model to new pancreatic bile duct data to obtain the final segmentation result. From the image processing point of view, the invention provides a novel feature pyramid structure that dynamically integrates local and global dependencies and guides the neural network to output scale-aware pancreatic bile duct features more accurately, which further improves the generalization capability of the model and helps doctors cope with the problem of "blind" intubation.
Description
Technical Field
The invention relates to the technical field of deep-learning-based image processing, in particular to a token pyramid-based pancreatic bile duct segmentation method, model and storage medium.
Background
Endoscopic retrograde cholangiopancreatography (ERCP) is an important tool for the treatment of biliary and pancreatic diseases. During ERCP, the endoscope is advanced into the duodenum to locate the duodenal papilla, and a sphincterotome or catheter is used to cannulate the common bile duct or pancreatic duct.
However, ERCP is difficult to perform and takes a long time to learn; a beginner usually has to operate under the supervision and guidance of an experienced doctor. The critical step is intubation, and correct intubation is essential to the success of the procedure. However, anatomical variations of the papilla, the uncertain course of the bile and pancreatic ducts behind the papilla, and the fact that the physician inserts blindly during intubation (only the papilla is visible under direct endoscopic view, and fluoroscopy is switched on for confirmation only after insertion) lead to "blind" intubation problems such as prolonged intubation, repeated intubation and erroneous entry (e.g., the bile duct was intended but the pancreatic duct is entered and injected with contrast agent). These problems are highly correlated with the occurrence of surgical complications.
Disclosure of Invention
The invention aims to provide a token pyramid-based pancreatic bile duct segmentation method, model and storage medium. The network can segment the pancreatic bile duct, thereby providing a basis for the doctor's judgment.
In order to achieve the above purpose, the present invention provides the following technical solutions:
In a first aspect, a token pyramid-based pancreatic bile duct segmentation method includes:
collecting pancreatic bile duct data, and preprocessing the pancreatic bile duct data to obtain a pancreatic bile duct data set;
training a pre-constructed pancreatic bile duct segmentation model by using the pancreatic bile duct data set;
inputting the image of the pancreatic bile duct to be segmented into the trained pancreatic bile duct segmentation model to obtain a pancreatic bile duct segmentation map.
In a preferred embodiment, the pancreatic bile duct data preprocessing comprises three-dimensional labeling of the pancreatic bile duct data, and normalization processing and data augmentation are carried out on the labeled pancreatic bile duct data.
In a preferred embodiment, training a pre-constructed pancreatic bile duct segmentation model with a pancreatic bile duct dataset comprises:
generating a series of local scale tokens from the pancreatic bile duct data set, assimilating all scale tokens to obtain a feature pyramid, and extracting global scale-aware semantics G from the feature pyramid;
generating local attention features from the local scale tokens through a convolution layer and batch normalization;
generating semantic weights from the global scale-aware semantics G by passing it sequentially through upsampling, convolution, batch normalization and a sigmoid layer;
multiplying the local attention feature and the semantic weight element by element to obtain the gated integration feature $F_i$;
integrating the gated integration features $F_i$ of different scales: after the high-scale feature information is adjusted to be consistent with the low-scale feature information, the final pancreatic bile duct segmentation map is obtained through convolution and normalization.
In a preferred embodiment, assimilating all scale tokens to obtain a feature pyramid comprises:
average-pooling the series of local scale tokens $\{T_1, T_2, \ldots, T_N\}$ to the same target size;
using a learnable dynamic filter $L_i$ to modulate the multi-scale features, obtaining the modulated feature $T_i'$ of the i-th layer, calculated as: $T_i' = \mathrm{pool}(T_i) \odot L_i$, where pool is average pooling and $L_i$ is a learnable filter;
adopting cascaded multi-scale dynamic feature aggregation to obtain the feature pyramid Z, calculated as: $Z = \mathrm{Concat}(T_1', T_2', \ldots, T_N')$, where $T_1', T_2', \ldots, T_N'$ are the 1st, 2nd, ..., N-th modulated features and Concat denotes concatenation of the features.
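The assimilation step can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the channel counts, spatial sizes, the per-channel shape chosen for each filter $L_i$, and concatenation along the channel axis are all assumptions made for illustration.

```python
import numpy as np

def avg_pool_to(token, target):
    """Average-pool a (C, D, H, W) token to spatial size `target`.
    Assumes every spatial dimension is an integer multiple of the target."""
    c, d, h, w = token.shape
    td, th, tw = target
    assert d % td == 0 and h % th == 0 and w % tw == 0
    blocks = token.reshape(c, td, d // td, th, h // th, tw, w // tw)
    return blocks.mean(axis=(2, 4, 6))

def token_pyramid(tokens, filters, target):
    """Z = Concat(pool(T_i) * L_i), concatenated along the channel axis (assumption)."""
    modulated = [avg_pool_to(t, target) * l for t, l in zip(tokens, filters)]
    return np.concatenate(modulated, axis=0)

rng = np.random.default_rng(0)
channels = [4, 8, 16, 32]   # hypothetical channel counts for T_1..T_4
sizes = [16, 8, 4, 2]       # hypothetical cubic spatial sizes
tokens = [rng.normal(size=(c, s, s, s)) for c, s in zip(channels, sizes)]
# Each L_i is modeled as one scale per channel; in training it would be learned.
filters = [np.ones((c, 1, 1, 1)) for c in channels]

Z = token_pyramid(tokens, filters, target=(2, 2, 2))
print(Z.shape)  # (60, 2, 2, 2)
```

Pooling every token to the coarsest size keeps the later attention step cheap, since the number of positions in Z is fixed by the target size rather than by the input resolution.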
In a preferred embodiment, the gated integration feature $F_i$ is calculated as:
$F_i' = \mathrm{Conv\_bn}(T_i) \odot \mathrm{Sigmoid}(\mathrm{Conv\_bn}(\mathrm{Upsampling}(G)))$, $F_i = F_i' \oplus \mathrm{Conv\_bn}(\mathrm{Upsampling}(G))$
where $\{T_1, T_2, \ldots, T_N\}$ are the local scale tokens, G is the global semantics, Conv_bn is a 1 x 1 convolution layer followed by batch normalization, Upsampling is upsampling, $F_i'$ is the local feature modulated by the semantic weight, $\odot$ is element-wise multiplication and $\oplus$ is element-wise addition.
In a preferred embodiment, after the final pancreatic bile duct segmentation map is output, the model is trained using a Dice loss and a cross-entropy loss function to obtain the final pancreatic bile duct segmentation model.
In a second aspect, a token pyramid-based pancreatic bile duct segmentation model includes:
the token pyramid Transformer module is used for extracting and concatenating feature layers of different scales and capturing context-related scale-aware information; it comprises a plurality of encoding blocks and a plurality of stacked Transformer blocks, wherein each encoding block comprises a plurality of three-dimensional convolutions and at least one max pooling;
the pancreatic bile duct gating integration module is used for dynamically fusing the local token and the global semantic information through a gating structure;
and the pancreatic bile duct global integration segmentation module is used for effectively fusing segmentation results of decoding blocks with different scales.
In a third aspect, an electronic device comprises a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the pancreatic bile duct segmentation method described above when executing the computer program.
In a fourth aspect, a computer readable storage medium stores a computer program which, when executed by a processor, implements the steps of the pancreatic bile duct segmentation method described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) According to the invention, by introducing the token pyramid method, the features of the pancreatic bile duct data are deeply enriched, and tokens from different scales are fused into input, so that highly enriched scale perception semantic information is obtained.
(2) The invention exploits the Transformer's strength in long-range self-attention to build a strong hierarchical feature system, which is vital for pancreatic bile duct segmentation.
(3) The invention introduces a pancreatic bile duct gating integration module, and precisely controls the information transmission direction through an intelligent gating function. Meanwhile, the method effectively merges the local features and the global features under each scale, thereby skillfully avoiding the loss of feature information.
(4) From the view point of image processing, the invention provides a novel characteristic pyramid structure which can dynamically integrate local and global dependency relations and guide a neural network to more accurately output the pancreatic bile duct characteristics with scale perception. This further enhances the generalization ability of the model, strongly assisting the physician in coping with "blind" cannula challenges.
(5) Once trained, the model quickly produces accurate pancreatic bile duct segmentations, which saves the labor and material resources required for annotating the pancreatic bile duct, greatly improves the learning efficiency for endoscopic retrograde cholangiopancreatography (ERCP), improves the intubation success rate and reduces the incidence of surgical complications.
Drawings
To describe the embodiments of the invention or the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain further drawings from them without inventive effort. In the drawings:
Fig. 1 is a flowchart of a pancreatic bile duct segmentation method according to the present embodiment.
Fig. 2 is a schematic diagram of a token pyramid Transformer module provided in this embodiment.
Fig. 3 is a schematic diagram of a pancreatic bile duct gating integration module according to the present embodiment.
Fig. 4 is a schematic diagram of a pancreatic bile duct global integration segmentation module provided in the present embodiment.
Fig. 5 is a schematic diagram of an electronic device according to the present embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to fig. 1 to 3, and the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by those skilled in the art without making inventive efforts are within the scope of protection of the present invention.
FIG. 1 is a flow chart of a method of pancreatic bile duct segmentation based on a token pyramid; the method comprises the following steps:
step one: collecting pancreatic bile duct data, labeling the data, and obtaining a pancreatic bile duct data set by applying random rotation, random flipping and random contrast enhancement;
step two: constructing a pancreatic bile duct segmentation model of a token pyramid-based Transformer gating network, wherein the model comprises a token pyramid Transformer module, a pancreatic bile duct gating integration module and a pancreatic bile duct global integration segmentation module;
step three: performing automatic segmentation of the pancreatic bile duct using the pancreatic bile duct segmentation model of the token pyramid-based Transformer gating network constructed in steps one and two.
In a preferred embodiment, in step one, the data set includes pancreatic bile duct images and their labels; the three-dimensional labels of the pancreatic bile duct are annotated manually, and the pancreatic bile duct data are read, normalized and augmented, where data augmentation includes random rotation, random flipping and random contrast enhancement.
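The normalization and augmentation described here could look roughly like the following NumPy sketch. The source does not give the normalization scheme, rotation angles, flip probabilities or contrast range, so min-max normalization, 90-degree rotations, a flip probability of 0.5 and a contrast factor in [0.75, 1.25] are all assumptions.

```python
import numpy as np

def normalize(volume):
    """Min-max normalize a volume to [0, 1] (assumed scheme)."""
    v = volume.astype(np.float32)
    lo, hi = v.min(), v.max()
    if hi == lo:
        return np.zeros_like(v)
    return (v - lo) / (hi - lo)

def augment(volume, label, rng):
    """Random rotation, random flip and random contrast change.
    Geometric transforms are applied identically to image and label."""
    # Random rotation by a multiple of 90 degrees in the last two axes.
    k = int(rng.integers(0, 4))
    volume = np.rot90(volume, k, axes=(1, 2))
    label = np.rot90(label, k, axes=(1, 2))
    # Random flip along each axis with probability 0.5.
    for axis in range(3):
        if rng.random() < 0.5:
            volume = np.flip(volume, axis=axis)
            label = np.flip(label, axis=axis)
    # Random contrast: rescale intensities around the mean (image only).
    factor = rng.uniform(0.75, 1.25)
    mean = volume.mean()
    volume = np.clip((volume - mean) * factor + mean, 0.0, 1.0)
    return np.ascontiguousarray(volume), np.ascontiguousarray(label)

rng = np.random.default_rng(0)
image = normalize(rng.normal(size=(8, 16, 16)))   # toy volume
mask = (image > 0.5).astype(np.uint8)             # toy label
aug_image, aug_mask = augment(image, mask, rng)
```

The label is left out of the contrast step on purpose: intensity changes alter the image only, while rotations and flips must move image and label together.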
In a preferred embodiment, in step two, the token pyramid Transformer module extracts and concatenates feature layers of different scales and uses the Transformer structure to capture context-related scale-aware information; the pancreatic bile duct gating integration module dynamically fuses local tokens and global semantic information through a gating structure; the pancreatic bile duct global integration segmentation module effectively fuses the segmentation results of decoding blocks of different scales.
In a preferred embodiment, the specific construction process of the segmentation model is as follows:
Step 1: The pancreatic bile duct data set is preprocessed with random rotation, random flipping, random contrast enhancement and the like to increase the amount of training data, which reduces overfitting and improves the generalization of the model; the data are then fed into the segmentation network.
Step 2: The token pyramid module contains 4 encoding blocks, each containing 2 three-dimensional convolutions and one max pooling operation. The convolution outputs of the 4 blocks are concatenated to form a token pyramid, which yields rich scale-aware semantic information. Specifically, with a pancreatic bile duct image X as input, the token pyramid first passes the image through the 4 encoding blocks and generates a series of scale tokens $\{T_1, T_2, \ldots, T_N\}$, where N is the total number of encoding blocks. The scale tokens are then average-pooled (pool) to the same target size. Because the encoding blocks at different scales contribute differently to the segmentation, a learnable dynamic filter $L_i$ is used to modulate the multi-scale features, and the final feature pyramid Z is obtained through cascaded multi-scale dynamic feature aggregation, calculated as: $T_i' = \mathrm{pool}(T_i) \odot L_i$, $Z = \mathrm{Concat}(T_1', T_2', \ldots, T_N')$.
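To make the multi-scale structure concrete, the short sketch below traces the token shapes through 4 encoding blocks. The source does not state kernel sizes, strides or channel widths, so size-preserving convolutions, a stride-2 max pooling, a hypothetical base width of 16 channels that doubles per block, and a cubic 128 x 128 x 128 input are assumptions.

```python
def encoder_token_shapes(input_shape, num_blocks=4, base_channels=16):
    """Return the assumed (C, D, H, W) shape of each scale token T_1..T_N."""
    shapes = []
    d, h, w = input_shape
    channels = base_channels
    for _ in range(num_blocks):
        # Two 3D convolutions, assumed to keep the spatial size unchanged.
        # One max pooling with stride 2, assumed to halve each spatial dim.
        d, h, w = d // 2, h // 2, w // 2
        shapes.append((channels, d, h, w))
        channels *= 2
    return shapes

shapes = encoder_token_shapes((128, 128, 128))
for i, shape in enumerate(shapes, start=1):
    print(f"T_{i}: {shape}")
```

Under these assumptions each token halves in resolution relative to the previous one, which is why the tokens must be pooled to a common target size before concatenation.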
step 3: feature pyramids are fed into several stacked transfomer blocks to extract a global scale-aware semantic extractor, in the present invention the number of transfomers is 2. The transducer consists of a multi-head attention module, a feed forward network and a recursive connection.
Step 4: The pancreatic bile duct gating integration module takes as input the local scale tokens $\{T_1, T_2, \ldots, T_N\}$ obtained from the token pyramid and the global semantics G obtained from the Transformer. Each local token $T_i$ is passed through a 1 x 1 convolution layer and batch normalization to generate a local attention feature. The global semantics are upsampled, fed to a 1 x 1 convolution layer, and then passed through a batch normalization layer and a sigmoid layer to generate the semantic weight. The semantic weight and the local attention feature are multiplied element by element; the global semantics are separately passed through a 1 x 1 convolution layer and batch normalization to give a global attention feature, which is added element by element to the product. This yields the gated integration feature $F_i$ with local and global dependencies, calculated as:
$F_i' = \mathrm{Conv\_bn}(T_i) \odot \mathrm{Sigmoid}(\mathrm{Conv\_bn}(\mathrm{Upsampling}(G)))$, $F_i = F_i' \oplus \mathrm{Conv\_bn}(\mathrm{Upsampling}(G))$
where Conv_bn is a 1×1×1 convolution layer followed by batch normalization, Upsampling is upsampling, $F_i'$ is the local feature modulated by the semantic weight, $\odot$ is element-wise multiplication and $\oplus$ is element-wise addition.
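The gating described in Step 4 can be sketched in NumPy as below. Several details are assumptions rather than statements of the source: nearest-neighbour upsampling, a batch normalization simplified to per-channel standardization over spatial positions, random weights, and applying the upsampled global semantics to both the gate branch and the additive branch so that shapes match.

```python
import numpy as np

def conv_bn(x, weight, eps=1e-5):
    """1x1x1 convolution (channel mixing) plus a simplified batch norm
    that standardizes each output channel over its spatial positions."""
    y = np.einsum("oc,cdhw->odhw", weight, x)
    mean = y.mean(axis=(1, 2, 3), keepdims=True)
    var = y.var(axis=(1, 2, 3), keepdims=True)
    return (y - mean) / np.sqrt(var + eps)

def upsample_nearest(x, size):
    """Nearest-neighbour upsampling of (C, D, H, W) to `size` (integer factors)."""
    for axis, target in zip((1, 2, 3), size):
        x = np.repeat(x, target // x.shape[axis], axis=axis)
    return x

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_integration(t_i, g, w_local, w_gate, w_global):
    g_up = upsample_nearest(g, t_i.shape[1:])
    local = conv_bn(t_i, w_local)                 # local attention feature
    weight = sigmoid(conv_bn(g_up, w_gate))       # semantic weight in (0, 1)
    f_prime = local * weight                      # element-wise gating
    return f_prime + conv_bn(g_up, w_global)      # add global attention feature

rng = np.random.default_rng(0)
T_i = rng.normal(size=(8, 8, 8, 8))     # toy local token
G = rng.normal(size=(60, 2, 2, 2))      # toy global semantics
out_ch = 16                             # hypothetical output width
F_i = gated_integration(
    T_i, G,
    rng.normal(size=(out_ch, 8)),
    rng.normal(size=(out_ch, 60)),
    rng.normal(size=(out_ch, 60)),
)
print(F_i.shape)  # (16, 8, 8, 8)
```

Because the sigmoid output lies between 0 and 1, the global branch can only attenuate local responses; the additive term is what lets global context contribute where the local feature is weak.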
Step 5: in order to integrate gaps of multi-scale gating integration features, the invention sends gating integration features with different scales into a pancreatic bile duct global integration segmentation module. The module upsamples the high-scale feature information to be consistent with the low-scale feature information, then through 2 3 x 3 convolution layers and batch normalization, and obtaining a final segmentation map.
Step 6: The model is trained on the output segmentation map using a Dice loss and a cross-entropy loss function (the Dice loss measures the similarity between two segmentation samples, and the cross-entropy loss measures the classification accuracy of each pixel in the two samples).
Step 7: At prediction time after training, the pancreatic bile duct segmentation result is obtained automatically from the trained model.
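A combined Dice and cross-entropy loss can be written in a few lines of plain Python. The sketch assumes a binary foreground/background task on flattened per-pixel probabilities and equal weighting of the two terms; the source does not specify the number of classes or the weights.

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """1 - Dice coefficient between predicted probabilities and binary targets."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)

def cross_entropy_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)

def combined_loss(probs, targets, dice_weight=1.0, ce_weight=1.0):
    """Weighted sum of the two terms (equal weights are an assumption)."""
    return (dice_weight * dice_loss(probs, targets)
            + ce_weight * cross_entropy_loss(probs, targets))

targets = [1, 1, 0, 0]
perfect = [1.0, 1.0, 0.0, 0.0]
poor = [0.4, 0.3, 0.6, 0.7]
loss_perfect = combined_loss(perfect, targets)
loss_poor = combined_loss(poor, targets)
```

The Dice term scores region overlap and is less sensitive to the small foreground fraction typical of thin ducts, while the cross-entropy term penalizes every misclassified pixel; summing them is a common way to get both behaviours.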
In a preferred embodiment, the fourth step specifically includes the following steps:
step 41, data collection and augmentation;
step 42, network training;
step 43, new data prediction and model evaluation.
In step 41, the pancreatic bile duct data are augmented using preprocessing methods such as random rotation, random flipping and random contrast enhancement.
In step 42, patches of 128 x 128 are randomly cropped from the augmented pancreatic bile duct images, normalized and fed into the model. The model is trained using an SGD optimizer with an initial learning rate of 0.001 and a weight decay of 2e-4. The ReduceLROnPlateau mechanism is used with a factor of 0.5, a patience and a cooldown of 3, and a minimum learning rate of 1e-8. During training, the batch size is set to 2 and num_workers to 12, and each experiment runs for a total of 120 training iterations. After each iteration, the segmentation model is evaluated; if the current error is smaller than that of the previous iteration, the current segmentation model is saved, and training continues until the maximum number of iterations is reached.
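The learning-rate schedule and the model-saving rule can be simulated without any deep-learning library. The class below is a simplified reduce-on-plateau rule written for illustration; it is not the PyTorch implementation, and the error values are made up. Only the factor 0.5, patience 3, cooldown 3, initial learning rate 0.001 and minimum learning rate 1e-8 come from the text.

```python
class PlateauScheduler:
    """Halve the learning rate when the error stops improving (simplified)."""

    def __init__(self, lr=1e-3, factor=0.5, patience=3, cooldown=3, min_lr=1e-8):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.cooldown = cooldown
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0
        self.cooldown_left = 0

    def step(self, error):
        if error < self.best:
            self.best = error
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.cooldown_left > 0:
            # Ignore stagnation while cooling down after a reduction.
            self.cooldown_left -= 1
            self.bad_epochs = 0
        if self.bad_epochs > self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.cooldown_left = self.cooldown
            self.bad_epochs = 0
        return self.lr

# Hypothetical validation errors, one per training iteration.
errors = [0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7]
scheduler = PlateauScheduler()
saved_at = []
previous_error = float("inf")
for epoch, error in enumerate(errors):
    # Save the model when the error is lower than in the previous iteration.
    if error < previous_error:
        saved_at.append(epoch)
    previous_error = error
    scheduler.step(error)

print(saved_at, scheduler.lr)
```

In this toy run the error stalls at 0.8 for more than three iterations, so the learning rate is halved once; checkpoints are recorded only at the iterations where the error dropped.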
In step 43, a combination of several losses is used as the metric, and the segmentation model with the best evaluation index is saved.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and the numbering should not limit the implementation of the embodiments of the present application in any way.
Figs. 2-4 are schematic diagrams of the token pyramid Transformer module, the pancreatic bile duct gating integration module and the pancreatic bile duct global integration segmentation module provided in this embodiment. The pancreatic bile duct segmentation model includes: the token pyramid Transformer module, which is used for extracting and concatenating feature layers of different scales and capturing context-related scale-aware information, and which comprises a plurality of encoding blocks and a plurality of stacked Transformer blocks, wherein each encoding block comprises a plurality of three-dimensional convolutions and at least one max pooling;
the pancreatic bile duct gating integration module is used for dynamically fusing the local token and the global semantic information through a gating structure; and the pancreatic bile duct global integration segmentation module is used for effectively fusing segmentation results of decoding blocks with different scales.
The embodiment provides a pancreatic bile duct segmentation model of a Transformer gating network based on a token pyramid module, a Transformer module, a pancreatic bile duct gating integration module and a pancreatic bile duct global integration segmentation module. The token pyramid quickly generates pyramid features from multi-scale features, which yields scale-aware semantics and provides more discriminative three-dimensional features for the pancreatic bile duct. The Transformer module takes the scale-aware semantic information as input, learns long-range information and improves the ability to capture global information about the pancreatic bile duct. The pancreatic bile duct gating integration module provides gating information according to the segmentation target and controls the flow of information so that information useful for segmentation reaches the segmentation module. The pancreatic bile duct global integration segmentation module outputs segmentations at different scales and integrates the segmentation information step by step to improve the generalization of the model.
Fig. 5 is a schematic diagram of an electronic device according to the present embodiment; the electronic device comprises a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the pancreatic bile duct segmentation method; alternatively, the processor may perform the functions of the modules/units in the above-described apparatus embodiments when executing the computer program.
The processor may be a central processing unit (Central Processing Unit, CPU) or other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
The memory may be an internal storage unit of the electronic device, for example, a hard disk or a memory of the electronic device. The memory may also be an external storage device of the electronic device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. The memory may also include both internal storage units and external storage devices of the electronic device. The memory is used to store computer programs and other programs and data required by the electronic device.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other manners as well. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (9)
1. A pancreatic bile duct segmentation method based on a token pyramid, characterized by comprising the following steps:
collecting pancreatic bile duct data, and preprocessing the pancreatic bile duct data to obtain a pancreatic bile duct data set;
training a pre-constructed pancreatic bile duct segmentation model by using a pancreatic bile duct data set;
inputting the image of the pancreatic bile duct to be segmented into a trained pancreatic bile duct segmentation model to obtain a pancreatic bile duct segmentation map.
2. The method of claim 1, wherein the pre-processing of the pancreatic bile duct data comprises three-dimensional labeling of the pancreatic bile duct data, normalizing the labeled pancreatic bile duct data and augmenting the data.
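The normalization and augmentation steps named in claim 2 can be illustrated with a minimal NumPy sketch. The claim does not specify the schemes used, so z-score normalization and random axis flips are assumptions made here for illustration, and the function names `normalize_volume` and `augment_flip` are hypothetical.

```python
import numpy as np

def normalize_volume(volume):
    """Z-score normalize a 3D volume (assumed normalization scheme)."""
    volume = volume.astype(np.float32)
    std = volume.std()
    return (volume - volume.mean()) / (std if std > 0 else 1.0)

def augment_flip(volume, label, rng):
    """Randomly flip the image and its label together along each axis
    (assumed augmentation; the claim only says the data is augmented)."""
    for axis in range(volume.ndim):
        if rng.random() < 0.5:
            volume = np.flip(volume, axis=axis)
            label = np.flip(label, axis=axis)
    return volume, label

rng = np.random.default_rng(0)
# Synthetic stand-in for a labeled 3D scan: (depth, height, width).
volume = rng.integers(0, 255, size=(8, 16, 16)).astype(np.float32)
label = (volume > 128).astype(np.uint8)

norm = normalize_volume(volume)
aug_volume, aug_label = augment_flip(norm, label, rng)
```

Applying the same flips to the image and the label keeps the three-dimensional annotation aligned with the voxels it describes.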
3. The method of pancreatic bile duct segmentation according to claim 1, wherein training the pre-constructed pancreatic bile duct segmentation model with a pancreatic bile duct dataset comprises:
generating a series of local scale tokens through a pancreatic bile duct data set, assimilating all scale tokens to obtain a feature pyramid, and extracting global scale perception semantics G from the feature pyramid;
the local scale token generates local attention features through a convolution layer and batch normalization;
the global scale perception semantic G sequentially generates semantic weights after up-sampling, convolution, batch normalization and sigmoid layers;
the local attention feature and the semantic weight are multiplied element by element to obtain a gating integration feature $F_i$;
integrating the gating integration features $F_i$ of different scales, and, after the high-scale feature information and the low-scale feature information are adjusted to be consistent, obtaining a final pancreatic bile duct data segmentation map through convolution and normalization.
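The gating step of claim 3 (convolution and batch normalization on the local token, up-sampling, convolution, batch normalization and sigmoid on the global semantics, then element-wise multiplication) can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the channel-first `(C, D, H, W)` layout, nearest-neighbour up-sampling, the random weights, and the single-sample stand-in for batch normalization are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def conv_bn(x, weight, eps=1e-5):
    """1x1x1 convolution (pure channel mixing) followed by a simplified
    batch normalization: per-channel standardization over one sample."""
    y = np.einsum("oc,cdhw->odhw", weight, x)
    mean = y.mean(axis=(1, 2, 3), keepdims=True)
    var = y.var(axis=(1, 2, 3), keepdims=True)
    return (y - mean) / np.sqrt(var + eps)

def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of the three spatial axes."""
    for axis in (1, 2, 3):
        x = np.repeat(x, factor, axis=axis)
    return x

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_integration(token, global_sem, w_local, w_global):
    """Multiply the local attention feature by the semantic weight."""
    local_attention = conv_bn(token, w_local)
    factor = token.shape[1] // global_sem.shape[1]
    semantic_weight = sigmoid(
        conv_bn(upsample_nearest(global_sem, factor), w_global))
    return local_attention * semantic_weight

rng = np.random.default_rng(0)
token = rng.standard_normal((4, 8, 8, 8))       # local scale token T_i
global_sem = rng.standard_normal((6, 2, 2, 2))  # global semantics G (coarser)
w_local = rng.standard_normal((5, 4))           # maps 4 -> 5 channels
w_global = rng.standard_normal((5, 6))          # maps 6 -> 5 channels

feature = gated_integration(token, global_sem, w_local, w_global)
print(feature.shape)  # (5, 8, 8, 8)
```

Because the sigmoid output lies strictly between 0 and 1, the semantic weight can only attenuate the local attention feature, which is what makes it act as a gate.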
4. The method of pancreatic bile duct segmentation according to claim 3, wherein assimilating all scale tokens to obtain a feature pyramid comprises:
average pooling a series of local scale tokens $\{T_1, T_2, \ldots, T_N\}$ to the same target size;
using a dynamic filter $L_i$ to modulate the multi-scale features to obtain the modulated feature $T_i'$ of the i-th layer, calculated as: $T_i' = \mathrm{pool}(T_i) \odot L_i$, wherein pool is average pooling and $L_i$ is a learnable filter;
aggregating the multi-scale dynamic features by concatenation to obtain a feature pyramid $Z$, calculated as: $Z = \mathrm{concat}(T_1', T_2', \ldots, T_N')$, wherein $T_1', T_2', \ldots, T_N'$ are the modulated features of the 1st, 2nd, …, N-th layers.
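The three steps of claim 4 (average pooling to a common size, modulation by a learnable filter, concatenation) can be sketched in NumPy as below. The channel-first layout, the requirement that each spatial size divide evenly by the target size, the shape chosen for each filter $L_i$, and concatenation along the channel axis are assumptions for illustration; the filters are fixed to ones here, whereas in training they would be learned.

```python
import numpy as np

def avg_pool_to(x, target):
    """Average-pool a (C, D, H, W) array to spatial size `target`.
    Each spatial size must be an integer multiple of the target size."""
    c, d, h, w = x.shape
    td, th, tw = target
    blocks = x.reshape(c, td, d // td, th, h // th, tw, w // tw)
    return blocks.mean(axis=(2, 4, 6))

def build_pyramid(tokens, filters, target):
    """Z = concat(T_1', ..., T_N') with T_i' = pool(T_i) * L_i."""
    modulated = [avg_pool_to(t, target) * f for t, f in zip(tokens, filters)]
    return np.concatenate(modulated, axis=0)  # concatenate along channels

rng = np.random.default_rng(0)
tokens = [
    rng.standard_normal((2, 16, 16, 16)),  # T_1, finest scale
    rng.standard_normal((4, 8, 8, 8)),     # T_2
    rng.standard_normal((8, 4, 4, 4)),     # T_3, coarsest scale
]
target = (2, 2, 2)
# Stand-ins for the learnable filters L_i, one per scale.
filters = [np.ones((t.shape[0],) + target) for t in tokens]

pyramid = build_pyramid(tokens, filters, target)
print(pyramid.shape)  # (14, 2, 2, 2)
```

Pooling every token to the same spatial size is what allows tokens with different channel counts to be concatenated into a single pyramid.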
5. The method of claim 3, wherein the gating integration feature $F_i$ is calculated as follows:
wherein $\{T_1, T_2, \ldots, T_N\}$ are the local scale tokens, $G$ is the global semantics, Conv_bn is a 1×1×1 convolution layer followed by batch normalization, Upsampling is an up-sampling operation, $F_i'$ is the local feature modulated by the semantic weight, $\odot$ is element-wise multiplication, and $\oplus$ is element-wise addition.
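The formula itself does not survive in this text. A hedged reconstruction from the symbol definitions above and the steps of claim 3 is given below. The first line follows directly from claim 3 (local attention feature multiplied by the sigmoid-gated semantic weight). The second line is an assumption: the claim text defines an element-wise addition but does not say what is added, so the additive term shown is only one plausible reading.

```latex
\begin{aligned}
F_i' &= \mathrm{Conv\_bn}(T_i) \odot
        \mathrm{Sigmoid}\bigl(\mathrm{Conv\_bn}(\mathrm{Upsampling}(G))\bigr), \\
F_i  &= F_i' \oplus \mathrm{Conv\_bn}(\mathrm{Upsampling}(G))
        \quad \text{(assumed additive term)}.
\end{aligned}
```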
6. The method of claim 3, wherein, after the final pancreatic bile duct data segmentation map is output, the model is trained with a Dice loss and a cross-entropy loss function to obtain the final pancreatic bile duct segmentation model.
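A combined training objective of the kind described in claim 6 can be sketched as follows, assuming the first loss named in the claim is the Dice loss that is commonly paired with cross-entropy in medical image segmentation. The binary (duct versus background) setting, the equal default weights, and the function names are further assumptions; the claim does not state how the two terms are weighted.

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P*T| / (|P| + |T|)."""
    intersection = np.sum(prob * target)
    return float(1.0 - (2.0 * intersection + eps)
                 / (np.sum(prob) + np.sum(target) + eps))

def cross_entropy_loss(prob, target, eps=1e-7):
    """Mean binary cross-entropy over all voxels."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(prob)
                          + (1.0 - target) * np.log(1.0 - prob)))

def combined_loss(prob, target, dice_weight=1.0, ce_weight=1.0):
    """Weighted sum of the two losses (weights are hypothetical)."""
    return (dice_weight * dice_loss(prob, target)
            + ce_weight * cross_entropy_loss(prob, target))

# Toy ground truth: a 2x2x2 foreground cube inside a 4x4x4 volume.
target = np.zeros((4, 4, 4))
target[1:3, 1:3, 1:3] = 1.0

good = np.where(target == 1.0, 0.9, 0.1)  # confident, mostly correct
bad = np.full_like(target, 0.5)           # uninformative prediction

print(combined_loss(good, target) < combined_loss(bad, target))  # True
```

The Dice term counters the class imbalance typical of thin duct structures, while the cross-entropy term supplies a per-voxel gradient.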
7. A segmentation model for the token pyramid-based pancreatic bile duct segmentation method according to any of claims 1-6, comprising:
the token pyramid Transformer module is used for extracting and concatenating feature layers of different scales and capturing context-related scale perception information, and comprises a plurality of encoding blocks and a plurality of stacked Transformer blocks, wherein each encoding block comprises a plurality of three-dimensional convolutions and at least one max pooling;
the pancreatic bile duct gating integration module is used for dynamically fusing the local token and the global semantic information through a gating structure;
and the pancreatic bile duct global integration segmentation module is used for effectively fusing segmentation results of decoding blocks with different scales.
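The role of the global integration segmentation module, fusing outputs of different scales into one segmentation map, can be sketched as below. This is an illustrative assumption rather than the patented design: scales are aligned by nearest-neighbour up-sampling of cubic volumes, fused by channel concatenation and a 1×1×1 convolution with random weights, and the "normalization" of claim 3 is taken to be a softmax over two classes.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of the three spatial axes."""
    for axis in (1, 2, 3):
        x = np.repeat(x, factor, axis=axis)
    return x

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_integration(features, weight):
    """Align all scales to the finest one, concatenate along channels,
    mix with a 1x1x1 convolution, and normalize to class probabilities."""
    full = features[0].shape[1]  # finest spatial size (cubic volumes assumed)
    aligned = [upsample_nearest(f, full // f.shape[1]) for f in features]
    stacked = np.concatenate(aligned, axis=0)
    logits = np.einsum("oc,cdhw->odhw", weight, stacked)
    return softmax(logits, axis=0)

rng = np.random.default_rng(0)
features = [
    rng.standard_normal((4, 8, 8, 8)),  # finest-scale gated feature
    rng.standard_normal((4, 4, 4, 4)),
    rng.standard_normal((4, 2, 2, 2)),  # coarsest-scale gated feature
]
weight = rng.standard_normal((2, 12))   # 12 stacked channels -> 2 classes

seg = global_integration(features, weight)
mask = seg.argmax(axis=0)               # per-voxel class label
print(seg.shape, mask.shape)  # (2, 8, 8, 8) (8, 8, 8)
```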
8. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311169169.5A CN117314932B (en) | 2023-09-12 | 2023-09-12 | Token pyramid-based pancreatic bile duct segmentation method, model and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311169169.5A CN117314932B (en) | 2023-09-12 | 2023-09-12 | Token pyramid-based pancreatic bile duct segmentation method, model and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117314932A (en) | 2023-12-29 |
CN117314932B (en) | 2024-06-07 |
Family
ID=89287500
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311169169.5A Active CN117314932B (en) | 2023-09-12 | 2023-09-12 | Token pyramid-based pancreatic bile duct segmentation method, model and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117314932B (en) |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130123934A1 (en) * | 2011-05-13 | 2013-05-16 | Riad Azar | Grooved pancreatic stent |
CN109063710A (en) * | 2018-08-09 | 2018-12-21 | 成都信息工程大学 | Based on the pyramidal 3D CNN nasopharyngeal carcinoma dividing method of Analysis On Multi-scale Features |
CN109325534A (en) * | 2018-09-22 | 2019-02-12 | 天津大学 | A kind of semantic segmentation method based on two-way multi-Scale Pyramid |
WO2022051344A1 (en) * | 2020-09-01 | 2022-03-10 | The Research Foundation For The State University Of New York | System and method for virtual pancreatography pipeline |
CN113762395A (en) * | 2021-09-09 | 2021-12-07 | 深圳大学 | Pancreatic bile duct type ampulla carcinoma classification model generation method and image classification method |
US20230148847A1 (en) * | 2021-11-18 | 2023-05-18 | Olympus Corporation | Information processing system, medical system and cannulation method |
CN114693930A (en) * | 2022-03-31 | 2022-07-01 | 福州大学 | Example segmentation method and system based on multi-scale features and context attention |
CN115239637A (en) * | 2022-06-28 | 2022-10-25 | 中国科学院深圳先进技术研究院 | Automatic segmentation method, system, terminal and storage medium for CT pancreatic tumors |
CN116012320A (en) * | 2022-12-26 | 2023-04-25 | 南开大学 | Image segmentation method for small irregular pancreatic tumors based on deep learning |
CN116229461A (en) * | 2023-01-31 | 2023-06-06 | 西南大学 | Indoor scene image real-time semantic segmentation method based on multi-scale refinement |
CN115797931A (en) * | 2023-02-13 | 2023-03-14 | 山东锋士信息技术有限公司 | Remote sensing image semantic segmentation method based on double-branch feature fusion |
CN116433903A (en) * | 2023-04-03 | 2023-07-14 | 南昌智能新能源汽车研究院 | Instance segmentation model construction method, system, electronic equipment and storage medium |
CN116363368A (en) * | 2023-04-23 | 2023-06-30 | 云南电网有限责任公司电力科学研究院 | Image semantic segmentation method and device based on convolutional neural network |
Non-Patent Citations (2)
Title |
---|
MARTIN GARANCE ET AL.: "Instruments Segmentation in X-ray Fluoroscopic Images for Endoscopic Retrograde Cholangio Pancreatography", 《STUDIES IN HEALTH TECHNOLOGY AND INFORMATICS》, vol. 294, 25 May 2022 (2022-05-25) * |
CHEN ZHITAO: "Bile duct and gallstone image segmentation based on multi-scale context feature learning", 《Modern Computer》, vol. 27, no. 25, 5 September 2021 (2021-09-05) * |
Also Published As
Publication number | Publication date |
---|---|
CN117314932B (en) | 2024-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11580646B2 (en) | Medical image segmentation method based on U-Net | |
CN109471895B (en) | Electronic medical record phenotype extraction and phenotype name normalization method and system | |
CN109711463B (en) | Attention-based important object detection method | |
Kang et al. | Depth-adaptive deep neural network for semantic segmentation | |
CN111316281A (en) | Semantic classification of numerical data in natural language context based on machine learning | |
CN112418209B (en) | Character recognition method and device, computer equipment and storage medium | |
US11935213B2 (en) | Laparoscopic image smoke removal method based on generative adversarial network | |
CN110163117B (en) | Pedestrian re-identification method based on self-excitation discriminant feature learning | |
Oktay | Human identification with dental panoramic radiographic images | |
CN112163092B (en) | Entity and relation extraction method, system, device and medium | |
CN112613471B (en) | Face living body detection method, device and computer readable storage medium | |
CN114358203A (en) | Training method and device for image description sentence generation module and electronic equipment | |
CN114299082A (en) | New coronary pneumonia CT image segmentation method, device and storage medium | |
CN114821770B (en) | Cross-modal pedestrian re-identification method, system, medium and device from text to image | |
CN109493931B (en) | Medical record file encoding method, server and computer readable storage medium | |
CN117314932B (en) | Token pyramid-based pancreatic bile duct segmentation method, model and storage medium | |
Lin et al. | Contrastive pre-training and linear interaction attention-based transformer for universal medical reports generation | |
CN117012370A (en) | Multi-mode disease auxiliary reasoning system, method, terminal and storage medium | |
CN114881038B (en) | Chinese entity and relation extraction method and device based on span and attention mechanism | |
CN116450829A (en) | Medical text classification method, device, equipment and medium | |
CN116109980A (en) | Action recognition method based on video text matching | |
CN111971686A (en) | Automatic generation of training data sets for object recognition | |
US11587345B2 (en) | Image identification device, method for performing semantic segmentation, and storage medium | |
CN115455969A (en) | Medical text named entity recognition method, device, equipment and storage medium | |
CN115359492A (en) | Text image matching model training method, picture labeling method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |