CN112734748B - Image segmentation system for hepatobiliary and biliary calculi

Publication number: CN112734748B (application published as CN112734748A)
Application number: CN202110083764.1A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 蔡念, 陈芝涛, 罗智浩, 何兆泉, 王平, 王晗, 陈梅云
Applicant and current assignee: Guangdong University of Technology
Priority/filing date: 2021-01-21; published 2021-04-30 (CN112734748A); granted 2022-05-17 (CN112734748B)
Legal status: Active
Prior art keywords: feature map, context, encoder, module, decoder

Classifications

    • G06T 7/0012: Biomedical image inspection (G06T 7/00 Image analysis)
    • G06T 7/11: Region-based segmentation (G06T 7/10 Segmentation; edge detection)
    • G06N 3/045: Combinations of networks (G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/048: Activation functions
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08: Learning methods
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, corners; connectivity analysis
    • G06T 2207/10081: Computed x-ray tomography [CT]
    • G06T 2207/20081: Training; learning
    • G06T 2207/20221: Image fusion; image merging
    • G06T 2207/30056: Liver; hepatic


Abstract

The application discloses an image segmentation system for hepatobiliary ducts and biliary calculi. In the first encoder-decoder module, feature learning is performed on small target regions in the down-sampling stage and feature fusion in the up-sampling stage, while low-level multi-scale time domain context features are extracted during down-sampling for subsequent layer-by-layer feature fusion. In the second encoder-decoder module, the low-level multi-scale time domain context features and the high-level bidirectional time domain context features are fused in the up-sampling stage to produce a predicted sequence map at the original resolution, yielding more accurate position and classification information. The segmentation network model is trained with an improved loss function that increases attention to the time domain context information among the sequence image slices, making the model better suited to processing sequence images. This solves the technical problem that the prior art cannot achieve both accuracy and efficiency when segmenting hepatobiliary ducts and biliary calculi in CT images of hepatolithiasis.

Description

Image segmentation system for hepatobiliary and biliary calculi
Technical Field
The application relates to the technical field of medical image segmentation, in particular to an image segmentation system for hepatobiliary and biliary calculi.
Background
Hepatolithiasis is a common, frequently occurring disease in hepatobiliary surgery with a particularly high incidence in East Asia. Its surgical treatment presents many difficulties and challenges, among them the difficulty of removing the stones themselves. Stone removal relies mainly on minimally invasive surgery, which in turn depends largely on preoperative analysis of the patient's CT scan slices. The most important step is segmenting the bile ducts and the gallstones in the CT images so that their specific distribution is known and surgical intervention is facilitated.
At present, CT images of hepatolithiasis are segmented mainly with fully convolutional neural networks, such as U-Net and M-Net, or with 3D convolutional networks.
However, neither U-Net nor M-Net can process a 3D medical image directly; the 3D volume must be cut into multiple 2D slices that are fed to the network separately. This discards the correlations between slices and loses the spatial structure information in the CT data, easily causing under-segmentation and inaccurate delineation of the hepatobiliary ducts and bile duct stones in the CT images. A 3D convolutional network, on the other hand, operates directly on the voxels of the CT volume, so the number of parameters in the whole network model is extremely large, the computational load grows sharply, and the segmentation efficiency for hepatobiliary ducts and biliary calculi drops markedly.
Disclosure of Invention
The embodiments of the present application provide an image segmentation system for hepatobiliary ducts and biliary calculi, solving the technical problem that the prior art cannot achieve both accuracy and efficiency when segmenting hepatobiliary ducts and biliary calculi in CT images of hepatolithiasis.
In view of the above, the present application provides, in a first aspect, an image segmentation system for hepatobiliary ducts and biliary calculi, the system comprising:
a segmentation network model, wherein the segmentation network model includes: a first encoder-decoder module, a second encoder-decoder module, and an optimization module;
the first encoder-decoder module is configured to:
extract features from each image of a CT sequence to obtain a plurality of first multi-scale feature maps and a high-level feature map; relearn the context information features of the first multi-scale feature maps to generate multi-scale context feature maps; perform time domain context information learning on the high-level feature map to generate a high-level bidirectional context feature map; and splice the multi-scale context feature maps with the high-level bidirectional context feature map to generate a first context fusion feature map, wherein the CT sequence images are consecutive, adjacent CT images of hepatolithiasis;
the second encoder-decoder module is configured to:
splice the multi-scale context feature maps with the first context fusion feature map to generate a second context fusion feature map; perform time domain context information learning on the second context fusion feature map to generate a high-level bidirectional context fusion feature map; and, after connecting the multi-scale context feature maps with the high-level bidirectional context fusion feature map, output a sequence probability map;
the optimization module is configured to:
calculate, based on a preset loss function, the loss function value between the sequence probability map and its corresponding label sequence map, and judge whether the loss function value is smaller than a preset value; if so, the optimal segmentation network model is obtained; otherwise, the feature parameters of the segmentation network model are updated according to the loss function value and the first encoder-decoder module and the second encoder-decoder module are triggered again.
Optionally, the first encoder-decoder module specifically includes: the device comprises a first encoder, a ConvLSTM module, a first BiConvLSTM module and a first decoder;
the first encoder is configured to: extract features of the small target regions in each image of the CT sequence to obtain a plurality of first multi-scale feature maps and a high-level feature map, send the first multi-scale feature maps to the ConvLSTM module, and send the high-level feature map to the first BiConvLSTM module;
the ConvLSTM module is configured to: relearn the context information features of the first multi-scale feature maps to generate multi-scale context feature maps, and send the multi-scale context feature maps to the first decoder and the second encoder-decoder module;
the first BiConvLSTM module is configured to: perform two passes of time domain context information learning in opposite directions over the high-level feature map to generate a high-level bidirectional context feature map, and send the high-level bidirectional context feature map to the first decoder;
the first decoder is configured to: skip-splice the multi-scale context feature maps with the high-level bidirectional context feature map to generate a first context fusion feature map, and send the first context fusion feature map to the second encoder-decoder module.
Optionally, the second encoder-decoder module specifically includes: a second encoder, a second BiConvLSTM module, and a second decoder;
the second encoder is configured to: skip-splice the multi-scale context feature maps with the first context fusion feature map to generate a second context fusion feature map, and send the second context fusion feature map to the second BiConvLSTM module;
the second BiConvLSTM module is configured to: perform two passes of time domain context information learning in opposite directions over the second context fusion feature map to generate a high-level bidirectional context fusion feature map, and send the high-level bidirectional context fusion feature map to the second decoder;
the second decoder is configured to: after connecting the multi-scale context feature maps with the high-level bidirectional context fusion feature map, output a sequence probability map through an activation function.
Optionally, the activation function is a Sigmoid function.
Optionally, the system further comprises: an input module;
the input module is configured to: control the sequence number of the consecutive, adjacent CT sequence images of hepatolithiasis so that the CT sequence images are input to the first encoder-decoder module.
Optionally, the first encoder is composed of several first convolution layers and a first pooling layer, wherein the size of the Gaussian convolution kernel of the first convolution layers is 1 × 1.
Optionally, the first decoder is composed of several first feature fusion layers, second convolution layers, and an upsampling layer, wherein the size of the Gaussian convolution kernel of the second convolution layers is 3 × 3.
Optionally, the second encoder is composed of several second feature fusion layers, third convolution layers, and a second pooling layer, wherein the size of the sparse convolution kernel of the third convolution layers is 5 × 5.
Optionally, the second decoder is composed of several feature fusion layers, fourth convolution layers, and an up-convolution layer, wherein the size of the sparse convolution kernel of the fourth convolution layers is 7 × 7.
Optionally, the preset loss function is:
$$\mathcal{L} = -\,w \cdot \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{K} 1\{(y_i = j)\cap(p_{ij} \le t_j)\}\,\log p_{ij}$$
wherein the right side of the preset loss function is divided into two parts:
the first part is a common bootstrapped cross-entropy loss function, where N and K respectively denote the number of image pixels and the number of pixel classes; the condition $y_i = j$ states that pixel $i$ belongs to class $j$, $p_{ij}$ denotes the predicted probability that the $i$-th pixel belongs to the $j$-th class, and $t_j$ is a threshold with value range (0, 1]; when the condition $(y_i = j)\cap(p_{ij} \le t_j)$ holds, $1\{(y_i = j)\cap(p_{ij} \le t_j)\}$ equals 1, and otherwise equals 0;
the second part is the relevance-based weighting value, namely $w$:
$$w = \frac{1}{n-1}\sum_{k=1,\;k\neq s}^{n} c(x_s, x_k)$$
wherein n denotes the number of slice frames of the input sequence images (a positive integer), s denotes the slice position at the current moment (s ≤ n), and c denotes the similarity between two different image slices, for example the common cosine similarity or Euclidean-distance similarity.
The technical solution of the present application has the following advantages:
The application provides an image segmentation system for hepatobiliary ducts and biliary calculi. Different image slices of the same case carry closely related information, and the application exploits exactly this property by designing a novel time domain context information correlation mechanism, consisting mainly of a first encoder-decoder module for low-level multi-scale time domain context information, a second encoder-decoder module for high-level bidirectional time domain context information, and an improved loss function based on the correlation between sequence image slices. In the first encoder-decoder module, feature learning is performed on small target regions in the down-sampling stage and feature fusion in the up-sampling stage, and low-level multi-scale time domain context features are extracted during down-sampling for subsequent layer-by-layer feature fusion. In the second encoder-decoder module, the low-level multi-scale time domain context features and the high-level bidirectional time domain context features are fused in the up-sampling stage to obtain a predicted sequence map at the original resolution, which enlarges the receptive field and yields more accurate position and classification information. Finally, the segmentation network model is trained with the improved loss function, which increases attention to the time domain context information among sequence image slices and applies a numerical weight analysis to its degree of relevance, making the model better suited to processing sequence images. Compared with existing hepatobiliary duct and calculus segmentation networks, the segmentation system achieves higher segmentation accuracy and efficiency, thereby solving the technical problem that the prior art cannot achieve both accuracy and efficiency when segmenting hepatobiliary ducts and biliary calculi in CT images of hepatolithiasis.
Drawings
Fig. 1 is a system architecture diagram of the image segmentation system for hepatobiliary and biliary calculi provided in an embodiment of the present application;
Fig. 2 is a block diagram of the ConvLSTM module provided in an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a system architecture diagram of an image segmentation system for hepatobiliary and biliary calculi according to an embodiment of the present application.
An image segmentation system for hepatobiliary ducts and biliary calculi provided in an embodiment of the present application includes a segmentation network model, which comprises a first encoder-decoder module, a second encoder-decoder module, and an optimization module.
To improve the segmentation accuracy for gallstones and hepatobiliary ducts and to effectively exploit the context information among CT image slices, the present application provides a segmentation approach based on deep sequence learning and designs a novel time domain context information correlation mechanism, namely the segmentation network model.
It should be noted that the segmentation network model is composed of two encoders and two decoders, and the optimization module is a virtual functional module within the segmentation network model that computes the loss value based on the improved loss function. As shown in Fig. 1, the arrows in different gray levels represent up-convolution, pooling, sparse convolution, unidirectional convolution, bidirectional convolution, feature map fusion, and so on.
The specific working process of each module in the segmentation network model is as follows:
the first encoder-decoder module is to:
respectively extracting the features of each image of the CT sequence image to obtain a plurality of first multi-scale feature maps and high-level feature maps; the context information features of the first multi-scale feature map are subjected to learning processing again to generate a multi-scale context feature map; performing time domain context information learning on the high-level feature map to generate a high-level bidirectional context feature map; splicing the multi-scale context feature graph and the high-level bidirectional context feature graph to generate a first context fusion feature graph; wherein, the CT sequence images are continuous and adjacent CT sequence images of hepatobiliary lithiasis.
The second encoder-decoder module is configured to:
splice the multi-scale context feature maps with the first context fusion feature map to generate a second context fusion feature map; perform time domain context information learning on the second context fusion feature map to generate a high-level bidirectional context fusion feature map; and, after connecting the multi-scale context feature maps with the high-level bidirectional context fusion feature map, output a sequence probability map.
The optimization module is configured to:
calculate, based on a preset loss function, the loss function value between the sequence probability map and its corresponding label sequence map, and judge whether the loss function value is smaller than a preset value; if so, the optimal segmentation network model is obtained; otherwise, the feature parameters of the segmentation network model are updated according to the loss function value and the first encoder-decoder module and the second encoder-decoder module are triggered again.
It should be noted that updating the feature parameters of the segmentation network model is completed automatically by the model itself: each time the loss function value is calculated, the goal of the segmentation network model is to reduce that value, and an optimizer such as Adam computes progressively better parameter values and actively updates the model parameters.
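For illustration only, this update cycle can be sketched as follows in PyTorch; the model, data, learning rate, and stopping threshold below are stand-ins, not values taken from the patent:

```python
import torch
import torch.nn as nn

# Stand-ins so the loop runs; the real model is the two encoder-decoder
# segmentation network described in this application.
model = nn.Conv2d(1, 2, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
preset_value = 0.05  # illustrative preset value for the loss threshold

for step in range(1000):
    ct = torch.randn(4, 1, 64, 64)             # dummy CT slices
    labels = torch.randint(0, 2, (4, 64, 64))  # dummy label sequence maps
    probs = model(ct)                          # sequence probability map
    loss = nn.functional.cross_entropy(probs, labels)
    if loss.item() < preset_value:             # loss below preset value:
        break                                  # optimal model obtained
    optimizer.zero_grad()
    loss.backward()                            # loss value drives the update
    optimizer.step()                           # Adam updates the feature parameters
```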
For sequence-image tasks, a common loss function focuses only on the information inside each 2D image slice and gives the same weight to image context information at different slice distances, which does not match the distribution rule of context information across sequence image slices. To remedy this deficiency of existing loss functions, a relevance-weighted bootstrapped cross-entropy function, namely the preset loss function of the present application, is designed so as to be better suited to sequence tasks. The loss function value between the sequence probability map and its corresponding label sequence map is calculated with this preset loss function to optimize the model until the value falls below a preset value set by those skilled in the art, thereby obtaining the optimal segmentation network model for segmenting the hepatobiliary ducts and biliary calculi in the images.
Wherein the preset loss function is:
$$\mathcal{L} = -\,w \cdot \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{K} 1\{(y_i = j)\cap(p_{ij} \le t_j)\}\,\log p_{ij}$$
wherein the right side of the preset loss function is divided into two parts:
the first part is a common bootstrapped cross-entropy loss function, where N and K respectively denote the number of image pixels and the number of pixel classes; the condition $y_i = j$ states that pixel $i$ belongs to class $j$, $p_{ij}$ denotes the predicted probability that the $i$-th pixel belongs to the $j$-th class, and $t_j$ is a threshold with value range (0, 1]; when the condition $(y_i = j)\cap(p_{ij} \le t_j)$ holds, $1\{(y_i = j)\cap(p_{ij} \le t_j)\}$ equals 1, and otherwise equals 0;
the second part is the relevance-based weighting value, namely $w$:
$$w = \frac{1}{n-1}\sum_{k=1,\;k\neq s}^{n} c(x_s, x_k)$$
wherein n denotes the number of slice frames of the input sequence images (a positive integer), s denotes the slice position at the current moment (s ≤ n), and c denotes the similarity between two different image slices, for example the common cosine similarity or Euclidean-distance similarity.
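A minimal sketch of this preset loss function follows, assuming cosine similarity for c and a mean over the other slices as the aggregation inside w (the patent leaves both choices open); tensor shapes, a single shared threshold t, and all names are illustrative:

```python
import torch
import torch.nn.functional as F

def relevance_weighted_bootstrap_ce(probs, labels, slices, s, t=0.5):
    """probs: (P, K) per-pixel class probabilities for the current slice;
    labels: (P,) ground-truth class indices; slices: (n, H, W) input
    sequence; s: current slice position; t: bootstrap threshold in (0, 1]."""
    n = slices.shape[0]
    # Relevance weight w: mean cosine similarity c between the current
    # slice and every other slice in the sequence (assumed aggregation).
    cur = slices[s].flatten().unsqueeze(0)
    others = torch.stack([slices[k].flatten() for k in range(n) if k != s])
    w = F.cosine_similarity(others, cur, dim=1).mean()
    # Bootstrapped cross entropy: only pixels whose true-class probability
    # is still at or below the threshold t contribute to the loss.
    p_true = probs[torch.arange(probs.shape[0]), labels]
    mask = (p_true <= t).float()
    ce = -(mask * torch.log(p_true.clamp_min(1e-8))).sum() / probs.shape[0]
    return w * ce
```

Here probs would be the per-pixel Sigmoid/Softmax output of the network flattened to (H·W, K) for one slice of the sequence.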
The application thus provides an image segmentation system for hepatobiliary ducts and biliary calculi. Different image slices of the same case carry closely related information, and the application exploits exactly this property by designing a novel time domain context information correlation mechanism, consisting mainly of a first encoder-decoder module for low-level multi-scale time domain context information, a second encoder-decoder module for high-level bidirectional time domain context information, and an improved loss function based on the correlation between sequence image slices. In the first encoder-decoder module, feature learning is performed on small target regions in the down-sampling stage and feature fusion in the up-sampling stage, and low-level multi-scale time domain context features are extracted during down-sampling for subsequent layer-by-layer feature fusion. In the second encoder-decoder module, the low-level multi-scale time domain context features and the high-level bidirectional time domain context features are fused in the up-sampling stage to obtain a predicted sequence map at the original resolution, which enlarges the receptive field and yields more accurate position and classification information. The segmentation network model is trained with the improved loss function, which increases attention to the time domain context information among sequence image slices and applies a numerical weight analysis to its degree of relevance, making the model better suited to processing sequence images. Compared with existing hepatobiliary duct and calculus segmentation networks, the segmentation system achieves higher segmentation accuracy and efficiency, thereby solving the technical problem that the prior art cannot achieve both accuracy and efficiency when segmenting hepatobiliary ducts and biliary calculi in CT images of hepatolithiasis.
Further, the first encoder-decoder module in the present application specifically includes: a first encoder, a ConvLSTM module, a first BiConvLSTM module, and a first decoder.
Since a CT scan is a series of 2D images, ConvLSTM is well suited to 2D image sequence data. Compared with the standard LSTM, ConvLSTM replaces matrix multiplication with convolution, which is very effective for sequence image prediction and helps the network learn the time domain context information in sequence images. The structure of ConvLSTM is shown in Fig. 2, and its expressions are as follows:
$$i_t = \sigma(x_t * W_{xi} + h_{t-1} * W_{hi} + b_i)$$
$$f_t = \sigma(x_t * W_{xf} + h_{t-1} * W_{hf} + b_f)$$
$$c_t = c_{t-1} \odot f_t + i_t \odot \tanh(x_t * W_{xc} + h_{t-1} * W_{hc} + b_c)$$
$$o_t = \sigma(x_t * W_{xo} + h_{t-1} * W_{ho} + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$
wherein $*$ denotes the convolution operation and $\odot$ denotes the Hadamard product, i.e. element-wise multiplication of corresponding matrix entries; $W$ are the network parameters and $b$ are the bias terms. A ConvLSTM has three gates in total: the input gate $i_t$, the forget gate $f_t$, and the output gate $o_t$; $x_t$, $c_t$, and $h_t$ are the input, the cell state, and the hidden state at time $t$.
The operation of each specific block in the first encoder-decoder is as follows:
the first encoder is for: respectively extracting features of small target areas in each image of the CT sequence image to obtain a plurality of first multi-scale feature maps and high-level feature maps, sending the first multi-scale feature maps to a ConvLSTM module, and sending the high-level feature maps to a first BiConvLSTM module.
It should be noted that, the first encoder of the embodiment of the present application includes several first convolution layers and a first pooling layer, where the size of the gaussian convolution kernel of the first convolution layer is 1 × 1.
The ConvLSTM module is configured to: relearn the context information features of the first multi-scale feature maps to generate multi-scale context feature maps, and send the multi-scale context feature maps to the first decoder and the second encoder-decoder module.
It should be noted that the extraction of inter-slice information features from the CT sequence relies on precisely these properties of ConvLSTM. Since the differences between image slice frames vary with scale, the application applies ConvLSTM at different scales to obtain multi-scale time domain context information. The information rich in edge details and the multi-scale time domain context information then flow layer by layer and are fused with subsequent feature information, as sketched below.
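A sketch of this multi-scale use of ConvLSTM, with one cell per encoder scale; the channel counts are assumptions, and ConvLSTMCell is the hypothetical cell sketched earlier:

```python
import torch

cells = [ConvLSTMCell(c, c) for c in (64, 128, 256)]  # one cell per scale

def multiscale_context(skip_seqs):
    """skip_seqs[k] is a list over time of (B, C_k, H_k, W_k) skip feature
    maps from encoder scale k; returns multi-scale context feature maps."""
    out = []
    for cell, frames in zip(cells, skip_seqs):
        b, _, h, w = frames[0].shape
        h_t = frames[0].new_zeros(b, cell.hidden_dim, h, w)
        c_t = torch.zeros_like(h_t)
        ctx = []
        for x_t in frames:               # learn temporal context at this scale
            h_t, c_t = cell(x_t, (h_t, c_t))
            ctx.append(h_t)
        out.append(ctx)                  # flows on to layer-by-layer fusion
    return out
```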
The first BiConvLSTM module is configured to: perform two passes of time domain context information learning in opposite directions over the high-level feature map to generate a high-level bidirectional context feature map, and send the high-level bidirectional context feature map to the first decoder.
It should be noted that the higher-level feature maps in a network are generally considered to contain the high-level semantic information of the image, since what they capture is often the global features of the image; the high-level feature semantic information is therefore very rich. The application uses BiConvLSTM, composed of two ConvLSTMs running in opposite directions, which allows the network model to learn more time domain context information. The results of the forward and backward ConvLSTM passes are concatenated, and the tanh function is then applied to obtain the corresponding result.
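The bidirectional variant can then be sketched by running the cell once in each direction over the slice sequence, concatenating the two hidden-state sequences, and applying tanh, as described above (again reusing the hypothetical ConvLSTMCell):

```python
import torch
import torch.nn as nn

class BiConvLSTM(nn.Module):
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.fwd = ConvLSTMCell(in_dim, hidden_dim)   # forward-direction pass
        self.bwd = ConvLSTMCell(in_dim, hidden_dim)   # backward-direction pass

    def _run(self, cell, frames):
        b, _, h, w = frames[0].shape
        h_t = frames[0].new_zeros(b, cell.hidden_dim, h, w)
        c_t = torch.zeros_like(h_t)
        states = []
        for x_t in frames:
            h_t, c_t = cell(x_t, (h_t, c_t))
            states.append(h_t)
        return states

    def forward(self, x):                              # x: (B, T, C, H, W)
        frames = list(x.unbind(dim=1))
        f = self._run(self.fwd, frames)                # t = 1..T
        b = self._run(self.bwd, frames[::-1])[::-1]    # t = T..1, re-reversed
        # concatenate the two directions per time step, then squash with tanh
        return torch.tanh(torch.stack(
            [torch.cat(pair, dim=1) for pair in zip(f, b)], dim=1))
```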
The first decoder is configured to: skip-splice the multi-scale context feature maps with the high-level bidirectional context feature map to generate a first context fusion feature map, and send the first context fusion feature map to the second encoder-decoder module.
It should be noted that the first decoder in the embodiment of the present application is composed of several first feature fusion layers, second convolution layers, and an upsampling layer, where the size of the Gaussian convolution kernel of the second convolution layers is 3 × 3.
Further, the second encoder-decoder module of the present application specifically includes: a second encoder, a second BiConvLSTM module, and a second decoder.
The operation of each module in the second encoder-decoder is as follows:
the second encoder is for: and carrying out skip splicing on the multi-scale context feature map and the first context fusion feature map to generate a second context fusion feature map, and sending the second context fusion feature map to a second BiConvLSTM module.
It should be noted that the second encoder according to the embodiment of the present application is composed of a plurality of second feature fusion layers, a third convolution layer, and a second pooling layer, where the size of the sparse convolution kernel of the third convolution layer is 5 × 5.
The second BiConvLSTM module is used for: and performing two times of time domain context information learning in opposite directions on the second context fusion characteristic graph to generate a high-level bidirectional context fusion characteristic graph, and sending the high-level bidirectional context fusion characteristic graph to a second decoder.
The second decoder is for: and after connecting the multi-scale context feature map and the high-level bidirectional context fusion feature map, outputting a sequence probability map by an activation function.
It should be noted that the second decoder in the embodiment of the present application is composed of several feature fusion layers, a fourth convolution layer, and an upper convolution layer, where the size of the sparse convolution kernel of the fourth convolution layer is 7 × 7. Meanwhile, in the embodiment of the present application, the activation function is set as a Sigmoid function, and a person skilled in the art can set the activation function according to an actual situation, which is not limited herein.
Furthermore, since the segmentation network model must control how many CT sequence images are input at a time during segmentation, the image segmentation system for hepatobiliary ducts and biliary calculi is also provided with an input module. The input module is configured to: control the sequence number of the consecutive, adjacent CT sequence images of hepatolithiasis that are input to the first encoder-decoder module, as sketched below.
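As an illustration of what the input module does (the helper name and the non-overlapping window policy are assumptions), the CT volume is cut into consecutive, adjacent n-slice sequences before entering the first encoder-decoder module:

```python
import torch

def sequence_windows(volume, n):
    """Split a CT volume (D, H, W) into consecutive, adjacent n-slice
    sequences; n is the controlled sequence number (illustrative helper)."""
    return [volume[i:i + n] for i in range(0, volume.shape[0] - n + 1, n)]

vol = torch.randn(60, 512, 512)    # e.g. a 60-slice CT volume
seqs = sequence_windows(vol, 5)    # twelve sequences of 5 adjacent slices
```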
It can be understood that the overall idea of the image segmentation system for hepatobiliary ducts and biliary calculi of the present application is as follows: CT sequence images of hepatolithiasis are used as training data for the novel segmentation network model with the time domain context information correlation mechanism designed in this application; loss values are calculated with the improved loss function and the model parameters are updated until parameters satisfying the preset condition are obtained, thereby yielding the optimal segmentation network model for segmenting the hepatobiliary ducts and biliary calculi in hepatolithiasis images.
The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. An image segmentation system for hepatobiliary and biliary calculi, comprising: a segmentation network model, wherein the segmentation network model includes: a first encoder-decoder module, a second encoder-decoder module, and an optimization module;
the first encoder-decoder module is configured to:
extract features from each image of a CT sequence to obtain a plurality of first multi-scale feature maps and a high-level feature map; relearn the context information features of the first multi-scale feature maps to generate multi-scale context feature maps; perform time domain context information learning on the high-level feature map to generate a high-level bidirectional context feature map; and splice the multi-scale context feature maps with the high-level bidirectional context feature map to generate a first context fusion feature map, wherein the CT sequence images are consecutive, adjacent CT images of hepatolithiasis;
the second encoder-decoder module is configured to:
splice the multi-scale context feature maps with the first context fusion feature map to generate a second context fusion feature map; perform time domain context information learning on the second context fusion feature map to generate a high-level bidirectional context fusion feature map; and, after connecting the multi-scale context feature maps with the high-level bidirectional context fusion feature map, output a sequence probability map;
the optimization module is configured to:
calculate, based on a preset loss function, the loss function value between the sequence probability map and its corresponding label sequence map, and judge whether the loss function value is smaller than a preset value; if so, the optimal segmentation network model is obtained; otherwise, the feature parameters of the segmentation network model are updated according to the loss function value and the first encoder-decoder module and the second encoder-decoder module are triggered again.
2. The system of claim 1, wherein the first encoder-decoder module comprises: a first encoder, a ConvLSTM module, a first BiConvLSTM module, and a first decoder;
the first encoder is configured to: extract features of a target region in each image of the CT sequence to obtain a plurality of first multi-scale feature maps and a high-level feature map, send the first multi-scale feature maps to the ConvLSTM module, and send the high-level feature map to the first BiConvLSTM module;
the ConvLSTM module is configured to: relearn the context information features of the first multi-scale feature maps to generate multi-scale context feature maps, and send the multi-scale context feature maps to the first decoder and the second encoder-decoder module;
the first BiConvLSTM module is configured to: perform two passes of time domain context information learning in opposite directions over the high-level feature map to generate a high-level bidirectional context feature map, and send the high-level bidirectional context feature map to the first decoder;
the first decoder is configured to: skip-splice the multi-scale context feature maps with the high-level bidirectional context feature map to generate a first context fusion feature map, and send the first context fusion feature map to the second encoder-decoder module.
3. The system of claim 2, wherein the second encoder-decoder module comprises: a second encoder, a second BiConvLSTM module, and a second decoder;
the second encoder is configured to: skip-splice the multi-scale context feature maps with the first context fusion feature map to generate a second context fusion feature map, and send the second context fusion feature map to the second BiConvLSTM module;
the second BiConvLSTM module is configured to: perform two passes of time domain context information learning in opposite directions over the second context fusion feature map to generate a high-level bidirectional context fusion feature map, and send the high-level bidirectional context fusion feature map to the second decoder;
the second decoder is configured to: after connecting the multi-scale context feature maps with the high-level bidirectional context fusion feature map, output a sequence probability map through an activation function.
4. The system of claim 3, wherein the activation function is a Sigmoid function.
5. The system of claim 1, further comprising: an input module;
the input module is configured to: control the sequence number of the consecutive, adjacent CT sequence images of hepatolithiasis so that the CT sequence images are input to the first encoder-decoder module.
6. The system of claim 2, wherein the first encoder is composed of several first convolution layers and a first pooling layer, wherein the size of the Gaussian convolution kernel of the first convolution layers is 1 × 1.
7. The system of claim 2, wherein the first decoder is composed of several first feature fusion layers, second convolution layers, and an upsampling layer, wherein the size of the Gaussian convolution kernel of the second convolution layers is 3 × 3.
8. The system of claim 3, wherein the second encoder is composed of several second feature fusion layers, third convolution layers, and a second pooling layer, wherein the size of the sparse convolution kernel of the third convolution layers is 5 × 5.
9. The system of claim 3, wherein the second decoder is composed of several feature fusion layers, fourth convolution layers, and an up-convolution layer, wherein the size of the sparse convolution kernel of the fourth convolution layers is 7 × 7.
10. The system of claim 1, wherein the preset loss function is:
$$\mathcal{L} = -\,w \cdot \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{K} 1\{(y_i = j)\cap(p_{ij} \le t_j)\}\,\log p_{ij}$$
wherein the right side of the preset loss function is divided into two parts:
the first part is a common bootstrapped cross-entropy loss function, where N and K respectively denote the number of image pixels and the number of pixel classes; the condition $y_i = j$ states that pixel $i$ belongs to class $j$, with $y_i$ the true label value; $p_{ij}$ denotes the probability, predicted by the segmentation network model, that the $i$-th pixel belongs to the $j$-th class; $t_j$ is a threshold with value range (0, 1]; when the condition $(y_i = j)\cap(p_{ij} \le t_j)$ holds, $1\{(y_i = j)\cap(p_{ij} \le t_j)\}$ equals 1, and otherwise equals 0;
the second part is the relevance-based weighting value, namely $w$:
$$w = \frac{1}{n-1}\sum_{k=1,\;k\neq s}^{n} c(x_s, x_k)$$
wherein n denotes the number of slice frames of the input sequence images and is a positive integer, s denotes the slice position at the current moment with s ≤ n, and c denotes the similarity between two different image slices.
CN202110083764.1A, filed 2021-01-21 (priority 2021-01-21), granted as CN112734748B (en), Active: Image segmentation system for hepatobiliary and biliary calculi

Priority Applications (1)

CN202110083764.1A (granted as CN112734748B): priority date 2021-01-21, filing date 2021-01-21, title "Image segmentation system for hepatobiliary and biliary calculi"

Applications Claiming Priority (1)

CN202110083764.1A (granted as CN112734748B): priority date 2021-01-21, filing date 2021-01-21, title "Image segmentation system for hepatobiliary and biliary calculi"

Publications (2)

CN112734748A (en): published 2021-04-30
CN112734748B (en): published 2022-05-17

Family

Family ID: 75594848

Family Applications (1)

CN202110083764.1A (granted as CN112734748B, Active): priority date 2021-01-21, filing date 2021-01-21, title "Image segmentation system for hepatobiliary and biliary calculi"

Country Status (1)

CN: CN112734748B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113327254A (en) * 2021-05-27 2021-08-31 北京深睿博联科技有限责任公司 Image segmentation method and system based on U-type network
CN113378929B (en) * 2021-06-11 2022-08-30 武汉大学 Pulmonary nodule growth prediction method and computer equipment
CN116188786B (en) * 2023-05-04 2023-08-01 潍坊医学院附属医院 Image segmentation system for hepatic duct and biliary tract calculus


Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US11170504B2 (en) * 2019-05-02 2021-11-09 Keyamed Na, Inc. Method and system for intracerebral hemorrhage detection and segmentation based on a multi-task fully convolutional network
CN111950467B (en) * 2020-08-14 2021-06-25 清华大学 Fusion network lane line detection method based on attention mechanism and terminal equipment

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN110310280A (en) * 2019-07-10 2019-10-08 广东工业大学 Hepatic duct and the image-recognizing method of calculus, system, equipment and storage medium
CN111012377A (en) * 2019-12-06 2020-04-17 北京安德医智科技有限公司 Echocardiogram heart parameter calculation and myocardial strain measurement method and device
CN111523410A (en) * 2020-04-09 2020-08-11 哈尔滨工业大学 Video saliency target detection method based on attention mechanism
CN111832393A (en) * 2020-05-29 2020-10-27 东南大学 Video target detection method and device based on deep learning

Non-Patent Citations (3)

Title
Reza Azad et al., "Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions," arXiv:1909.00166v1 [eess.IV], 31 Aug 2019, pp. 1-10. *
Xiaorui Fu et al., "M-Net: A Novel U-Net With Multi-Stream Feature Fusion and Multi-Scale Dilated Convolutions for Bile Ducts and Hepatolith Segmentation," IEEE Access, 11 Oct 2019, pp. 148645-148657. *
Sun Mingjian et al., "Automatic 3D region segmentation of liver CT images based on a novel deep fully convolutional network" (in Chinese), Chinese Journal of Biomedical Engineering, vol. 37, no. 4, 31 Aug 2018, pp. 385-393. *

Also Published As

CN112734748A (en): published 2021-04-30

Similar Documents

Publication Publication Date Title
CN110378381B (en) Object detection method, device and computer storage medium
CN112734748B (en) Image segmentation system for hepatobiliary and biliary calculi
CN111429460B (en) Image segmentation method, image segmentation model training method, device and storage medium
CN111292330A (en) Image semantic segmentation method and device based on coder and decoder
US20230043026A1 (en) Learning-based active surface model for medical image segmentation
CN112330684B (en) Object segmentation method and device, computer equipment and storage medium
AU2021354030B2 (en) Processing images using self-attention based neural networks
WO2021164280A1 (en) Three-dimensional edge detection method and apparatus, storage medium and computer device
CN114266794B (en) Pathological section image cancer region segmentation system based on full convolution neural network
Wazir et al. HistoSeg: Quick attention with multi-loss function for multi-structure segmentation in digital histology images
CN114418030A (en) Image classification method, and training method and device of image classification model
CN113313810A (en) 6D attitude parameter calculation method for transparent object
CN115578589A (en) Unsupervised echocardiography section identification method
CN114612902A (en) Image semantic segmentation method, device, equipment, storage medium and program product
CN111626134A (en) Dense crowd counting method, system and terminal based on hidden density distribution
CN114565628B (en) Image segmentation method and system based on boundary perception attention
CN113706544A (en) Medical image segmentation method based on complete attention convolution neural network
CN116363081A (en) Placenta implantation MRI sign detection classification method and device based on deep neural network
CN113538530B (en) Ear medical image segmentation method and device, electronic equipment and storage medium
KR20230056300A (en) A residual learning based multi-scale parallel convolutions system for liver tumor detection and the method thereof
Adegun et al. Deep convolutional network-based framework for melanoma lesion detection and segmentation
CN116597263A (en) Training method and related device for image synthesis model
CN113554656B (en) Optical remote sensing image example segmentation method and device based on graph neural network
CN113327221A (en) Image synthesis method and device fusing ROI (region of interest), electronic equipment and medium
Khajuria et al. A comparison of deep reinforcement learning and deep learning for complex image analysis

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant