CN117314932A - Token pyramid-based pancreatic bile duct segmentation method, model and storage medium

Info

Publication number
CN117314932A
Authority
CN
China
Prior art keywords
bile duct
pancreatic bile
pancreatic
scale
segmentation
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311169169.5A
Other languages
Chinese (zh)
Other versions
CN117314932B (en)
Inventor
曾宪晖
蒋卫丽
袁湘蕾
李佳文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West China No 4 Hospital Of Sichuan University West China Occupational Hospital Of Sichuan University
Original Assignee
West China No 4 Hospital Of Sichuan University West China Occupational Hospital Of Sichuan University
Application filed by West China No 4 Hospital Of Sichuan University West China Occupational Hospital Of Sichuan University
Priority to CN202311169169.5A
Publication of CN117314932A
Application granted
Publication of CN117314932B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a token pyramid-based pancreatic bile duct segmentation method, model and storage medium. The method comprises: constructing a pancreatic bile duct data set and augmenting the data; training a pre-constructed pancreatic bile duct segmentation model; evaluating the model on the pancreatic bile duct data during training; and applying the final pancreatic bile duct segmentation model to new pancreatic bile duct data to obtain the final segmentation result. From the image processing point of view, the invention provides a novel feature pyramid structure that dynamically integrates local and global dependencies and guides the neural network to output scale-aware pancreatic bile duct features more accurately, which further improves the generalization ability of the model and helps physicians cope with the problem of blind intubation.

Description

Token pyramid-based pancreatic bile duct segmentation method, model and storage medium
Technical Field
The invention relates to the technical field of deep-learning-based image processing, and in particular to a token pyramid-based pancreatic bile duct segmentation method, model and storage medium.
Background
Endoscopic retrograde cholangiopancreatography (ERCP) is an important tool for the treatment of biliary and pancreatic diseases. During ERCP, the endoscope is advanced into the duodenum to locate the duodenal papilla, and a sphincterotome or catheter is used to perform intubation of the common bile duct or pancreatic duct.
However, ERCP is difficult to perform and takes a long time to learn, and beginners usually need to operate under the supervision and guidance of an experienced physician. The critical step is intubation: correct intubation is essential to the success of the procedure. However, anatomical variation of the papilla, uncertainty in the course of the bile and pancreatic ducts behind the papilla, and the fact that the physician inserts blindly during intubation (only the papilla is visible under direct endoscopic view, and fluoroscopy is switched on for confirmation only after insertion) lead to "blind" intubation problems such as prolonged intubation, repeated intubation, and entry into the wrong duct (for example, entering the pancreatic duct and injecting contrast agent when the bile duct was the target). These problems are highly correlated with the occurrence of surgical complications.
Disclosure of Invention
The invention aims to provide a token pyramid-based pancreatic bile duct segmentation method, model and storage medium. The network segments the pancreatic bile duct and thereby provides physicians with a basis for judgment.
In order to achieve the above purpose, the present invention provides the following technical solutions:
In a first aspect, a token pyramid-based pancreatic bile duct segmentation method comprises:
collecting pancreatic bile duct data, and preprocessing the pancreatic bile duct data to obtain a pancreatic bile duct data set;
training a pre-constructed pancreatic bile duct segmentation model by using the pancreatic bile duct data set;
inputting an image of the pancreatic bile duct to be segmented into the trained pancreatic bile duct segmentation model to obtain a pancreatic bile duct segmentation map.
In a preferred embodiment, the pancreatic bile duct data preprocessing comprises three-dimensional labeling of the pancreatic bile duct data, and normalization processing and data augmentation are carried out on the labeled pancreatic bile duct data.
In a preferred embodiment, training a pre-constructed pancreatic bile duct segmentation model with a pancreatic bile duct dataset comprises:
generating a series of local scale tokens through a pancreatic bile duct data set, assimilating all scale tokens to obtain a feature pyramid, and extracting global scale perception semantics G from the feature pyramid;
the local scale token generates local attention features through a convolution layer and batch normalization;
the global scale perception semantic G sequentially generates semantic weights after up-sampling, convolution, batch normalization and sigmoid layers;
the local attention feature and the semantic weight are multiplied element by element to obtain a gating integration feature $F_i$;
the gating integration features $F_i$ of different scales are integrated, and after the high-scale feature information and the low-scale feature information are adjusted to be consistent, the final pancreatic bile duct segmentation map is obtained through convolution and normalization.
In a preferred embodiment, assimilating all scale tokens to obtain a feature pyramid comprises:
average-pooling a series of local scale tokens $\{T_1, T_2, \ldots, T_N\}$ to the same target size;
modulating the multi-scale features with a dynamic filter $L_i$ to obtain the modulated feature $T_i'$ of the $i$-th layer, calculated as $T_i' = \mathrm{pool}(T_i) \odot L_i$, where pool is average pooling and $L_i$ is a learnable filter;
obtaining the feature pyramid $Z$ by cascaded multi-scale dynamic feature aggregation, calculated as $Z = \mathrm{Concat}(T_1', T_2', \ldots, T_N')$, where $T_1', T_2', \ldots, T_N'$ are the 1st, 2nd, ..., $N$-th modulated features and Concat denotes concatenation of the features.
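The pooling, modulation and concatenation steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the patented implementation: the function names, the (C, D, H, W) token layout, and the choice of one learnable weight per channel for each $L_i$ are assumptions, since the source does not state the shape of the filter.

```python
import numpy as np

def avg_pool_to(token, target):
    """Average-pool a (C, D, H, W) token to spatial size `target` (sizes must divide evenly)."""
    c, d, h, w = token.shape
    td, th, tw = target
    assert d % td == 0 and h % th == 0 and w % tw == 0
    # Split each spatial axis into (output index, block offset) and average over the offsets.
    blocks = token.reshape(c, td, d // td, th, h // th, tw, w // tw)
    return blocks.mean(axis=(2, 4, 6))

def token_pyramid(tokens, filters, target):
    """Compute Z = Concat(T'_1, ..., T'_N) with T'_i = pool(T_i) * L_i (element-wise)."""
    # Each filter is assumed to broadcast against its pooled token, e.g. shape (C_i, 1, 1, 1).
    modulated = [avg_pool_to(t, target) * l for t, l in zip(tokens, filters)]
    return np.concatenate(modulated, axis=0)  # concatenate along the channel axis
```

With two tokens of 2 and 4 channels pooled to a 2 x 2 x 2 grid, the resulting pyramid has 6 channels at that common resolution.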
In a preferred embodiment, the gating integration feature $F_i$ is computed from the local scale tokens $\{T_1, T_2, \ldots, T_N\}$ and the global semantics $G$ using the following operations: Conv_bn, a 1 × 1 × 1 convolution layer followed by batch normalization; Upsampling, an up-sampling operation; element-wise multiplication, which yields $F_i'$, the local feature modulated by the semantic weight; and element-wise addition.
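The formula for $F_i$ is not reproduced in this text. Read together with Step 4 of the detailed description, the operations suggest the form below. This is a reconstruction from that prose, not the patent's own rendering: the intermediate symbol $W_i$ for the semantic weight, the use of $\oplus$ for element-wise addition, and the placement of Upsampling before Conv_bn in the last term are assumptions.

```latex
\begin{aligned}
W_i  &= \mathrm{Sigmoid}\big(\mathrm{Conv\_bn}(\mathrm{Upsampling}(G))\big) \\
F_i' &= \mathrm{Conv\_bn}(T_i) \odot W_i \\
F_i  &= F_i' \oplus \mathrm{Conv\_bn}(\mathrm{Upsampling}(G))
\end{aligned}
```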
In a preferred embodiment, after the final pancreatic bile duct segmentation map is output, the model is trained with a Dice loss and a cross-entropy loss function to obtain the final pancreatic bile duct segmentation model.
In a second aspect, a token pyramid-based pancreatic bile duct segmentation model includes:
the token pyramid Transformer module is used for extracting and concatenating feature layers of different scales and capturing context-related scale-aware information; it comprises a plurality of encoding blocks and a plurality of stacked Transformer blocks, wherein each encoding block comprises a plurality of three-dimensional convolutions and at least one max pooling operation;
the pancreatic bile duct gating integration module is used for dynamically fusing the local token and the global semantic information through a gating structure;
and the pancreatic bile duct global integration segmentation module is used for effectively fusing segmentation results of decoding blocks with different scales.
In a third aspect, an electronic device comprises a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the pancreatic bile duct segmentation method described above when executing the computer program.
In a fourth aspect, a computer readable storage medium stores a computer program which, when executed by a processor, implements the steps of the pancreatic bile duct segmentation method described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) According to the invention, by introducing the token pyramid method, the features of the pancreatic bile duct data are deeply enriched, and tokens from different scales are fused into input, so that highly enriched scale perception semantic information is obtained.
(2) The invention exploits the outstanding long-range self-attention capability of the Transformer to construct a strong hierarchical feature representation, which plays a vital role in pancreatic bile duct segmentation.
(3) The invention introduces a pancreatic bile duct gating integration module, and precisely controls the information transmission direction through an intelligent gating function. Meanwhile, the method effectively merges the local features and the global features under each scale, thereby skillfully avoiding the loss of feature information.
(4) From the image processing point of view, the invention provides a novel feature pyramid structure that dynamically integrates local and global dependencies and guides the neural network to output scale-aware pancreatic bile duct features more accurately. This further improves the generalization ability of the model and helps physicians cope with the challenge of "blind" intubation.
(5) Once training is complete, the model segments the pancreatic bile duct quickly and accurately. This saves the manpower and material resources required for labeling the pancreatic bile duct, improves the efficiency of learning endoscopic retrograde cholangiopancreatography (ERCP), raises the success rate of intubation, and reduces the incidence of surgical complications.
Drawings
To describe the embodiments of the invention or the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain further drawings from them without inventive effort.
Fig. 1 is a flowchart of the pancreatic bile duct segmentation method provided in this embodiment.
Fig. 2 is a schematic diagram of the token pyramid Transformer module provided in this embodiment.
Fig. 3 is a schematic diagram of a pancreatic bile duct gating integration module according to the present embodiment.
Fig. 4 is a schematic diagram of a pancreatic bile duct global integration segmentation module provided in the present embodiment.
Fig. 5 is a schematic diagram of an electronic device according to the present embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to fig. 1 to 3, and the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by those skilled in the art without making inventive efforts are within the scope of protection of the present invention.
FIG. 1 is a flow chart of the token pyramid-based pancreatic bile duct segmentation method. The method comprises the following steps:
Step one: collecting pancreatic bile duct data, labeling the data, and obtaining a pancreatic bile duct data set by applying random rotation, random flipping and random contrast enhancement;
Step two: constructing a pancreatic bile duct segmentation model of a token pyramid-based Transformer gating network, wherein the model comprises a token pyramid Transformer module, a pancreatic bile duct gating integration module and a pancreatic bile duct global integration segmentation module;
Step three: realizing automatic segmentation of the pancreatic bile duct by using the pancreatic bile duct segmentation model of the token pyramid-based Transformer gating network constructed in steps one to two.
In a preferred embodiment, in step one, the data set includes the pancreatic bile duct images and their labels. The three-dimensional labels of the pancreatic bile duct are annotated manually, and the pancreatic bile duct data are read, normalized and augmented; the data augmentation includes random rotation, random flipping and random contrast enhancement.
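A minimal NumPy sketch of the normalization and the three augmentations is given below. The source does not specify the normalization scheme, the rotation angles, or the contrast range, so zero-mean/unit-variance normalization, rotations by multiples of 90 degrees, and a contrast factor in [0.75, 1.25] are all assumptions, as are the function names.

```python
import numpy as np

def normalize(volume):
    """Zero-mean, unit-variance normalization (one common choice, assumed here)."""
    volume = volume.astype(np.float32)
    return (volume - volume.mean()) / (volume.std() + 1e-8)

def augment(volume, label, rng):
    """Apply the same random rotation and flips to image and label; contrast to the image only."""
    # Random rotation by a multiple of 90 degrees in a randomly chosen plane.
    k = int(rng.integers(0, 4))
    axes = [(0, 1), (0, 2), (1, 2)][int(rng.integers(0, 3))]
    volume, label = np.rot90(volume, k, axes), np.rot90(label, k, axes)
    # Random flip along each spatial axis with probability 0.5.
    for axis in range(3):
        if rng.random() < 0.5:
            volume, label = np.flip(volume, axis), np.flip(label, axis)
    # Random contrast: scale deviations from the mean intensity.
    factor = rng.uniform(0.75, 1.25)
    mean = volume.mean()
    volume = (volume - mean) * factor + mean
    return np.ascontiguousarray(volume), np.ascontiguousarray(label)
```

The geometric transforms are applied identically to the image and its label so that the annotation stays aligned with the anatomy; the contrast change touches only the image intensities.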
In a preferred embodiment, in step two, the token pyramid Transformer module extracts and concatenates feature layers of different scales and uses the Transformer structure to capture context-related scale-aware information; the pancreatic bile duct gating integration module dynamically fuses the local tokens and the global semantic information through a gating structure; and the pancreatic bile duct global integration segmentation module fuses the segmentation results of decoding blocks of different scales.
In a preferred embodiment, the specific construction process of the segmentation model is as follows:
Step 1: The pancreatic bile duct data set is preprocessed with random rotation, random flipping, random contrast enhancement and the like to increase the amount of training data, which reduces the model's tendency to overfit and improves its generalization; the data are then fed into the segmentation network.
Step 2: The token pyramid module contains 4 encoding blocks, each of which contains 2 three-dimensional convolutions and one max pooling operation. The convolution outputs of the 4 blocks are concatenated to form a token pyramid, yielding rich scale-aware semantic information. Specifically, with the pancreatic bile duct image $X$ as input, the token pyramid first passes the image through the 4 encoding blocks and generates a series of scale tokens $\{T_1, T_2, \ldots, T_N\}$, where $N$ is the total number of encoding blocks. The scale tokens $\{T_1, T_2, \ldots, T_N\}$ are then average-pooled (pool) to the same target size. Because the encoding blocks of different scales contribute differently to the segmentation, a learnable dynamic filter $L_i$ is used to modulate the multi-scale features, and the final feature pyramid $Z$ is obtained by cascaded multi-scale dynamic feature aggregation. The calculation formulas are as follows: $T_i' = \mathrm{pool}(T_i) \odot L_i$ and $Z = \mathrm{Concat}(T_1', T_2', \ldots, T_N')$.
Step 3: The feature pyramid is fed into several stacked Transformer blocks to extract the global scale-aware semantics $G$; in the present invention the number of Transformer blocks is 2. Each Transformer block consists of a multi-head attention module, a feed-forward network and residual connections.
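The structure of one such block can be sketched in NumPy as follows. This is an illustrative simplification under stated assumptions: the connection around each sub-layer is taken to be the standard residual (skip) connection, layer normalization and biases are omitted, the feed-forward network uses ReLU, and all parameter names (`wq`, `wk`, `wv`, `wo`, `w1`, `w2`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, wq, wk, wv, wo, heads):
    """Scaled dot-product self-attention. x: (tokens, dim); all weights: (dim, dim)."""
    n, d = x.shape
    hd = d // heads

    def split(m):
        return m.reshape(n, heads, hd).transpose(1, 0, 2)  # (heads, tokens, head_dim)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(hd))  # (heads, tokens, tokens)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, d)       # merge heads back
    return out @ wo

def transformer_block(x, p, heads=2):
    """Attention and feed-forward sub-layers, each wrapped in a residual connection."""
    x = x + multi_head_attention(x, p["wq"], p["wk"], p["wv"], p["wo"], heads)
    hidden = np.maximum(x @ p["w1"], 0.0)  # feed-forward network with ReLU
    return x + hidden @ p["w2"]
```

Stacking two blocks, as the embodiment describes, amounts to calling `transformer_block` twice on the flattened feature pyramid tokens, each time with its own parameter set.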
Step 4: The pancreatic bile duct gating integration module takes as input the local scale tokens $\{T_1, T_2, \ldots, T_N\}$ obtained from the token pyramid and the global semantics $G$ obtained from the Transformer. Each local token $T_i$ passes through a 1 × 1 × 1 convolution layer and batch normalization to generate local attention features. The global semantics are up-sampled and input to a 1 × 1 × 1 convolution layer, and the semantic weights are then generated through a batch normalization layer and a sigmoid layer. The semantic weights and the local attention features are multiplied element by element to give $F_i'$, the local feature modulated by the semantic weights. The global semantics are also passed through a 1 × 1 × 1 convolution layer and batch normalization to give global attention features, which are added to $F_i'$ element by element. The result is the gating integration feature $F_i$, which carries both local and global dependencies.
Here Conv_bn denotes a 1 × 1 × 1 convolution layer followed by batch normalization, and Upsampling denotes the up-sampling operation.
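The data flow of the gating integration module can be sketched in NumPy as below. Several simplifications are assumptions made for brevity: nearest-neighbour up-sampling with integer factors, per-sample normalization over the spatial axes standing in for batch normalization, a 1 × 1 × 1 convolution written as channel mixing, and hypothetical weight names (`w_local`, `w_gate`, `w_global`).

```python
import numpy as np

def conv1x1x1_bn(x, weight, eps=1e-5):
    """1x1x1 convolution (channel mixing) followed by per-channel normalization.

    x: (C_in, D, H, W); weight: (C_out, C_in).
    """
    y = np.einsum("oc,cdhw->odhw", weight, x)
    mean = y.mean(axis=(1, 2, 3), keepdims=True)
    var = y.var(axis=(1, 2, 3), keepdims=True)
    return (y - mean) / np.sqrt(var + eps)

def upsample_nearest(x, size):
    """Nearest-neighbour up-sampling of (C, D, H, W) to spatial `size` (integer factors only)."""
    for axis, target in zip((1, 2, 3), size):
        x = np.repeat(x, target // x.shape[axis], axis=axis)
    return x

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_integration(t_i, g, w_local, w_gate, w_global):
    """Fuse a local token t_i with the global semantics g through a sigmoid gate."""
    g_up = upsample_nearest(g, t_i.shape[1:])
    local = conv1x1x1_bn(t_i, w_local)              # local attention features
    gate = sigmoid(conv1x1x1_bn(g_up, w_gate))      # semantic weights in (0, 1)
    f_prime = local * gate                          # element-wise multiplication
    return f_prime + conv1x1x1_bn(g_up, w_global)   # element-wise addition
```

The sigmoid gate scales each local activation by a value between 0 and 1 derived from the global semantics, and the additive branch injects the global features directly, which is how the output comes to depend on both sources.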
Step 5: To bridge the gaps between the multi-scale gating integration features, the invention feeds the gating integration features of different scales into the pancreatic bile duct global integration segmentation module. The module up-samples the high-scale feature information to be consistent with the low-scale feature information and then applies two 3 × 3 convolution layers and batch normalization to obtain the final segmentation map.
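The alignment part of this step can be sketched as follows. The source does not say how the aligned features are combined before the convolutions, so channel-wise concatenation is an assumption here, as are nearest-neighbour up-sampling and the function names; the subsequent convolution and batch normalization layers are left out.

```python
import numpy as np

def upsample_nearest(x, size):
    """Nearest-neighbour up-sampling of (C, D, H, W) to spatial `size` (integer factors only)."""
    for axis, target in zip((1, 2, 3), size):
        x = np.repeat(x, target // x.shape[axis], axis=axis)
    return x

def align_and_fuse(features):
    """Up-sample every (C, D, H, W) feature to the largest spatial size, then concatenate channels."""
    target = max((f.shape[1:] for f in features), key=lambda s: s[0] * s[1] * s[2])
    return np.concatenate([upsample_nearest(f, target) for f in features], axis=0)
```

Three gating integration features with 2, 3 and 1 channels at resolutions 8, 4 and 2 would yield a fused tensor with 6 channels at resolution 8, ready for the convolution layers.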
Step 7: the input segmentation map is modeled with a dice loss sparsity and a cross entropy loss function (dice loss function is the similarity between two segmentation samples, and cross entropy loss function is the accuracy of classification of each pixel point in the two samples).
Step 7: on the model prediction after training, the pancreatic bile duct segmentation result is automatically obtained according to the pre-trained model.
In a preferred embodiment, the fourth step specifically includes the following steps:
step 41, data collection and augmentation;
step 42, network training;
step 43, new data prediction and model evaluation.
In step 41, the pancreatic bile duct data are augmented with preprocessing methods such as random rotation, random flipping and random contrast enhancement.
In step 42, patches of 128 x 128 are randomly cropped from the augmented pancreatic bile duct images, normalized, and fed into the model. The model is trained with an SGD optimizer with an initial learning rate of 0.001 and a weight decay of 2e-4. The invention uses the ReduceLROnPlateau mechanism with a factor of 0.5, a patience and a cooldown of 3 each, and a minimum learning rate of 1e-8. During training, the batch size is set to 2 and num_workers to 12, and each experiment runs for a total of 120 training iterations. After each iteration, the segmentation result is evaluated; if the current error is smaller than the error of the previous iteration, the current segmentation model is saved, and training then continues until the maximum number of iterations is reached.
In step 43, a combination of several losses is used as the measure, and the segmentation model with the best evaluation index is saved.
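The learning-rate schedule and model-saving rule of step 42 can be illustrated with the pure-Python sketch below. It is a simplified re-implementation of reduce-on-plateau logic using the stated hyperparameters, not a library's code: improvement is tested by strict comparison with the best error so far (no relative threshold), and the class and attribute names are hypothetical.

```python
class PlateauSchedule:
    """Simplified reduce-on-plateau rule (strict improvement, no threshold)."""

    def __init__(self, lr=1e-3, factor=0.5, patience=3, cooldown=3, min_lr=1e-8):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.cooldown, self.min_lr = cooldown, min_lr
        self.best = float("inf")
        self.bad_epochs = 0
        self.cooldown_left = 0

    def step(self, val_error):
        """Update the state after one iteration; return True if this is a new best error."""
        improved = val_error < self.best
        if improved:
            self.best, self.bad_epochs = val_error, 0
        else:
            self.bad_epochs += 1
        if self.cooldown_left > 0:           # ignore bad iterations while cooling down
            self.cooldown_left -= 1
            self.bad_epochs = 0
        if self.bad_epochs > self.patience:  # plateau detected: scale the learning rate
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.cooldown_left = self.cooldown
            self.bad_epochs = 0
        return improved
```

The returned flag can drive checkpoint saving; note that it compares against the best error so far, a common variant of the "smaller than the previous iteration" rule in the text. If the framework is PyTorch, which the source does not state, the corresponding calls would be `torch.optim.SGD(model.parameters(), lr=0.001, weight_decay=2e-4)` and `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=3, cooldown=3, min_lr=1e-8)`.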
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and the numbering should not limit the implementation of the embodiments of the present application in any way.
Figs. 2-4 are schematic diagrams of the token pyramid Transformer module, the pancreatic bile duct gating integration module, and the pancreatic bile duct global integration segmentation module provided in this embodiment. The pancreatic bile duct segmentation model includes: the token pyramid Transformer module, used for extracting and concatenating feature layers of different scales and capturing context-related scale-aware information, which comprises a plurality of encoding blocks and a plurality of stacked Transformer blocks, wherein each encoding block comprises a plurality of three-dimensional convolutions and at least one max pooling operation;
the pancreatic bile duct gating integration module is used for dynamically fusing the local token and the global semantic information through a gating structure; and the pancreatic bile duct global integration segmentation module is used for effectively fusing segmentation results of decoding blocks with different scales.
This embodiment provides a pancreatic bile duct segmentation model of a Transformer gating network based on a token pyramid module, a Transformer module, a pancreatic bile duct gating integration module and a pancreatic bile duct global integration segmentation module. The token pyramid quickly generates pyramid features from multi-scale features, thereby obtaining scale-aware semantics and providing more discriminative three-dimensional features for the pancreatic bile duct. The Transformer module takes the scale-aware semantic information as input and learns long-range information, improving the capture of global information about the pancreatic bile duct. The pancreatic bile duct gating integration module provides gating information according to the segmentation target and controls the flow of information so that information useful for segmentation flows to the segmentation module. The pancreatic bile duct global integration segmentation module produces segmentation outputs at different scales and integrates the segmentation information step by step to improve the generalization of the model.
Fig. 5 is a schematic diagram of an electronic device according to the present embodiment; the electronic device comprises a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the pancreatic bile duct segmentation method; alternatively, the processor may perform the functions of the modules/units in the above-described apparatus embodiments when executing the computer program.
The processor may be a central processing unit (Central Processing Unit, CPU) or other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
The memory may be an internal storage unit of the electronic device, for example, a hard disk or a memory of the electronic device. The memory may also be an external storage device of the electronic device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. The memory may also include both internal storage units and external storage devices of the electronic device. The memory is used to store computer programs and other programs and data required by the electronic device.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other manners as well. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A pancreatic bile duct segmentation method based on a token pyramid, characterized by comprising the following steps:
collecting pancreatic bile duct data, and preprocessing the pancreatic bile duct data to obtain a pancreatic bile duct data set;
training a pre-constructed pancreatic bile duct segmentation model by using the pancreatic bile duct data set;
inputting an image of the pancreatic bile duct to be segmented into the trained pancreatic bile duct segmentation model to obtain a pancreatic bile duct segmentation map.
2. The method of claim 1, wherein the pre-processing of the pancreatic bile duct data comprises three-dimensional labeling of the pancreatic bile duct data, normalizing the labeled pancreatic bile duct data and augmenting the data.
3. The method of pancreatic bile duct segmentation according to claim 1, wherein training the pre-constructed pancreatic bile duct segmentation model with a pancreatic bile duct dataset comprises:
generating a series of local scale tokens from the pancreatic bile duct data set, assimilating all scale tokens to obtain a feature pyramid, and extracting global scale-aware semantics $G$ from the feature pyramid;
generating local attention features from the local scale tokens through a convolution layer and batch normalization;
generating semantic weights from the global scale-aware semantics $G$ by passing it sequentially through up-sampling, convolution, batch normalization and sigmoid layers;
multiplying the local attention features and the semantic weights element by element to obtain gating integration features $F_i$;
integrating the gating integration features $F_i$ of different scales and, after the high-scale and low-scale feature information is adjusted to be consistent, obtaining the final pancreatic bile duct data segmentation map through convolution and normalization.
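The gating step recited in claim 3 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the nearest-neighbour up-sampling, the use of single-sample statistics in place of trained batch-normalization parameters, and all tensor shapes are assumptions made for illustration. Only the element-wise multiplication stated in claim 3 is modeled; the element-wise addition mentioned in claim 5 is left out because its operands are not given in this text.

```python
import numpy as np

def conv_bn(x, weight, eps=1e-5):
    """1x1x1 convolution followed by per-channel normalization.

    x: (C_in, D, H, W); weight: (C_out, C_in).
    Assumption: statistics are taken over this single sample and no
    learned scale/shift is applied, as a stand-in for batch normalization.
    """
    y = np.einsum("oc,cdhw->odhw", weight, x)      # 1x1x1 conv = channel mixing
    mean = y.mean(axis=(1, 2, 3), keepdims=True)
    var = y.var(axis=(1, 2, 3), keepdims=True)
    return (y - mean) / np.sqrt(var + eps)

def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of the three spatial axes (assumed mode)."""
    for axis in (1, 2, 3):
        x = np.repeat(x, factor, axis=axis)
    return x

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_feature(token, g, w_local, w_global, factor):
    """Local attention feature times semantic weight, element by element."""
    local = conv_bn(token, w_local)                               # conv + BN
    weight = sigmoid(conv_bn(upsample_nearest(g, factor), w_global))
    return local * weight

rng = np.random.default_rng(0)
token = rng.standard_normal((4, 4, 4, 4))   # hypothetical local scale token T_i
g = rng.standard_normal((14, 2, 2, 2))      # hypothetical global semantics G
w_local = rng.standard_normal((6, 4))
w_global = rng.standard_normal((6, 14))
f_i = gated_feature(token, g, w_local, w_global, factor=2)
print(f_i.shape)  # (6, 4, 4, 4)
```

Because the semantic weight lies strictly between 0 and 1, it can only attenuate the local attention feature, which is what makes the structure act as a gate.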
4. The method of pancreatic bile duct segmentation according to claim 3, wherein assimilating all scale tokens to obtain a feature pyramid comprises:
average-pooling the series of local scale tokens $\{T_1, T_2, \ldots, T_N\}$ to the same target size;
using dynamic filters $L_i$ to modulate the multi-scale features to obtain the modulated feature $T_i'$ of the $i$-th layer, calculated as $T_i' = \mathrm{pool}(T_i) \odot L_i$, where $\mathrm{pool}$ is average pooling and $L_i$ is a learnable filter;
obtaining the feature pyramid $Z$ by concatenating the multi-scale dynamic features, calculated as $Z = \mathrm{concat}(T_1', T_2', \ldots, T_N')$, where $T_1', T_2', \ldots, T_N'$ are the modulated features of the 1st, 2nd, ..., $N$-th layers.
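The token assimilation of claim 4 (pooling, modulation, concatenation) can be sketched in NumPy as follows. This is a hedged illustration only: the helper names, the per-channel shape chosen for each filter L_i, the restriction to sizes that divide evenly, and concatenation along the channel axis are all assumptions not stated in the claim.

```python
import numpy as np

def avg_pool_to(x, target):
    """Average-pool a (C, D, H, W) array down to spatial size `target`.

    Assumption: every spatial dimension is an integer multiple of its target.
    """
    c, d, h, w = x.shape
    td, th, tw = target
    assert d % td == 0 and h % th == 0 and w % tw == 0
    blocks = x.reshape(c, td, d // td, th, h // th, tw, w // tw)
    return blocks.mean(axis=(2, 4, 6))

def build_pyramid(tokens, filters, target):
    """Z = concat(pool(T_1) * L_1, ..., pool(T_N) * L_N) over channels."""
    modulated = [avg_pool_to(t, target) * l for t, l in zip(tokens, filters)]
    return np.concatenate(modulated, axis=0)

rng = np.random.default_rng(0)
# Hypothetical tokens from three scales with 2, 4 and 8 channels.
tokens = [
    rng.standard_normal((2, 8, 8, 8)),
    rng.standard_normal((4, 4, 4, 4)),
    rng.standard_normal((8, 2, 2, 2)),
]
# Assumed filter shape: one learnable scale per channel, broadcast spatially.
filters = [rng.standard_normal((t.shape[0], 1, 1, 1)) for t in tokens]
z = build_pyramid(tokens, filters, target=(2, 2, 2))
print(z.shape)  # (14, 2, 2, 2)
```

Pooling every token to one target size is what allows the concatenation; the channel count of Z is the sum of the channel counts of the individual tokens.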
5. The method of claim 3, wherein the gating integration feature $F_i$ is calculated by the following formula: [formula not reproduced in this text]
wherein $\{T_1, T_2, \ldots, T_N\}$ are the local scale tokens, $G$ is the global semantics, Conv_bn is a 1×1×1 convolution layer followed by batch normalization, upsampling is up-sampling, $F_i'$ is the local feature modulated by the semantic weight, and the multiplication and the addition in the formula are both performed element by element.
6. The method of claim 3, wherein, after the final pancreatic bile duct data segmentation map is output, the model is trained with a Dice loss coefficient and a cross-entropy loss function to obtain the final pancreatic bile duct segmentation model.
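A combined training loss of the kind recited in claim 6 can be sketched as below, assuming the first term is the Dice loss commonly used in medical image segmentation and that the segmentation is binary. The function names, the equal weighting of the two terms, and the smoothing constants are assumptions for illustration; the claim does not specify them.

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P ∩ T| / (|P| + |T|), smoothed by eps."""
    inter = np.sum(prob * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(prob) + np.sum(target) + eps)

def cross_entropy_loss(prob, target, eps=1e-7):
    """Binary cross-entropy averaged over voxels; prob is clipped for stability."""
    p = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))

def combined_loss(prob, target, w_dice=1.0, w_ce=1.0):
    """Weighted sum of the two terms (equal weights are an assumption)."""
    return w_dice * dice_loss(prob, target) + w_ce * cross_entropy_loss(prob, target)

# Hypothetical per-voxel foreground probabilities and ground-truth labels.
target = np.array([0.0, 1.0, 1.0, 0.0])
prob = np.array([0.1, 0.8, 0.7, 0.2])
print(combined_loss(prob, target))
```

The Dice term measures overlap over the whole volume and is therefore less sensitive to the small fraction of duct voxels, while the cross-entropy term supplies a per-voxel gradient; combining them is a common way to handle thin, sparse structures.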
7. A segmentation model for the token pyramid-based pancreatic bile duct segmentation method according to any one of claims 1-6, comprising:
a token pyramid Transformer module, used for extracting and concatenating feature layers of different scales and capturing context-related scale-aware information, and comprising a plurality of encoding blocks and a plurality of stacked Transformer blocks, wherein each encoding block comprises a plurality of three-dimensional convolutions and at least one max pooling;
a pancreatic bile duct gating integration module, used for dynamically fusing the local tokens and the global semantic information through a gating structure;
and a pancreatic bile duct global integration segmentation module, used for effectively fusing the segmentation results of decoding blocks of different scales.
8. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
CN202311169169.5A 2023-09-12 2023-09-12 Token pyramid-based pancreatic bile duct segmentation method, model and storage medium Active CN117314932B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311169169.5A CN117314932B (en) 2023-09-12 2023-09-12 Token pyramid-based pancreatic bile duct segmentation method, model and storage medium

Publications (2)

Publication Number Publication Date
CN117314932A (en) 2023-12-29
CN117314932B (en) 2024-06-07

Family

ID=89287500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311169169.5A Active CN117314932B (en) 2023-09-12 2023-09-12 Token pyramid-based pancreatic bile duct segmentation method, model and storage medium

Country Status (1)

Country Link
CN (1) CN117314932B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130123934A1 (en) * 2011-05-13 2013-05-16 Riad Azar Grooved pancreatic stent
CN109063710A (en) * 2018-08-09 2018-12-21 成都信息工程大学 Based on the pyramidal 3D CNN nasopharyngeal carcinoma dividing method of Analysis On Multi-scale Features
CN109325534A (en) * 2018-09-22 2019-02-12 天津大学 A kind of semantic segmentation method based on two-way multi-Scale Pyramid
CN113762395A (en) * 2021-09-09 2021-12-07 深圳大学 Pancreatic bile duct type ampulla carcinoma classification model generation method and image classification method
WO2022051344A1 (en) * 2020-09-01 2022-03-10 The Research Foundation For The State University Of New York System and method for virtual pancreatography pipeline
CN114693930A (en) * 2022-03-31 2022-07-01 福州大学 Example segmentation method and system based on multi-scale features and context attention
CN115239637A (en) * 2022-06-28 2022-10-25 中国科学院深圳先进技术研究院 Automatic segmentation method, system, terminal and storage medium for CT pancreatic tumors
CN115797931A (en) * 2023-02-13 2023-03-14 山东锋士信息技术有限公司 Remote sensing image semantic segmentation method based on double-branch feature fusion
CN116012320A (en) * 2022-12-26 2023-04-25 南开大学 Image segmentation method for small irregular pancreatic tumors based on deep learning
US20230148847A1 (en) * 2021-11-18 2023-05-18 Olympus Corporation Information processing system, medical system and cannulation method
CN116229461A (en) * 2023-01-31 2023-06-06 西南大学 Indoor scene image real-time semantic segmentation method based on multi-scale refinement
CN116363368A (en) * 2023-04-23 2023-06-30 云南电网有限责任公司电力科学研究院 Image semantic segmentation method and device based on convolutional neural network
CN116433903A (en) * 2023-04-03 2023-07-14 南昌智能新能源汽车研究院 Instance segmentation model construction method, system, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MARTIN GARANCE ET AL.: "Instruments Segmentation in X-ray Fluoroscopic Images for Endoscopic Retrograde Cholangio Pancreatography", 《STUDIES IN HEALTH TECHNOLOGY AND INFORMATICS》, vol. 294, 25 May 2022 (2022-05-25) *
陈芝涛 (Chen Zhitao): "Bile duct and gallstone image segmentation based on multi-scale context feature learning", 《现代计算机》 (Modern Computer), vol. 27, no. 25, 5 September 2021 (2021-09-05) *

Also Published As

Publication number Publication date
CN117314932B (en) 2024-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant