CN115496720A - Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment - Google Patents


Info

Publication number
CN115496720A
CN115496720A
Authority
CN
China
Prior art keywords
image
gastrointestinal cancer
segmentation
vit
model
Prior art date
Legal status
Pending
Application number
CN202211147264.0A
Other languages
Chinese (zh)
Inventor
唐智贤
王雪
李镇
胡家祺
张艳
Current Assignee
Shanghai University of Medicine and Health Sciences
Original Assignee
Shanghai University of Medicine and Health Sciences
Priority date
Filing date
Publication date
Application filed by Shanghai University of Medicine and Health Sciences filed Critical Shanghai University of Medicine and Health Sciences
Priority to CN202211147264.0A priority Critical patent/CN115496720A/en
Publication of CN115496720A publication Critical patent/CN115496720A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10056Microscopic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30024Cell structures in vitro; Tissue sections in vitro
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30028Colon; Small intestine
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30092Stomach; Gastric
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion


Abstract

The invention belongs to the technical field of medical image processing, and particularly relates to a gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model, and related equipment. The method comprises the following steps: acquiring a gastrointestinal cancer pathological image to be segmented and preprocessing it; and performing cell nucleus segmentation on the preprocessed gastrointestinal cancer pathological image with a preset nucleus segmentation model based on a ViT backbone, which generates two segmentation predictions, the sum of the two predictions being taken as the nucleus segmentation result. Addressing the uneven staining of pathological images, the irregular and overlapping morphology of gastrointestinal cancer cell nuclei, and the insufficient segmentation accuracy of previous methods, the proposed method can automatically, efficiently and accurately detect and segment nucleus regions in gastrointestinal cancer pathological images.

Description

Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model and related equipment.
Background
Gastrointestinal cancer is one of the most common malignant tumors in the world. Because its early symptoms are mostly inconspicuous, most patients are already at an intermediate or advanced stage by the time obvious symptoms appear. Pathological images are one of the gold standards for diagnosing gastrointestinal cancer; however, analyzing them manually is time-consuming.
Digital pathological image analysis can provide a quantitative, objective basis for pathological diagnosis and is an important aid for pathologists in making objective, accurate and reliable diagnoses. Among its tasks, localizing cell nuclei in pathological images is fundamental to cancer detection, tumor grading, survival prediction and related analyses. Rapid and accurate segmentation of nuclei in pathological images therefore has important clinical significance for the differential diagnosis of gastrointestinal cancer. However, nucleus segmentation in pathological images generally faces the following difficulties: first, the cell images are complex, usually containing a large number of cells at different growth stages with different nuclear morphologies; second, the images are unevenly stained and inconsistently illuminated, so inter-image differences are large; third, because tissue cells are distributed in space, nuclei overlap in the image with high probability.
Introducing artificial-intelligence methods into pathological image analysis to achieve rapid and accurate nucleus segmentation can reduce the difficulty and time cost of analyzing pathological images, improve diagnostic efficiency and accuracy, and thus better serve patients.
At present, researchers have proposed a number of nucleus segmentation algorithms, which can be grouped into the following categories:
First, traditional machine learning methods. Random forests and SVMs have received considerable attention because they perform well in cell and nucleus segmentation. Unlike these traditional classification models, K-means uses clustering and can also effectively segment individual nucleus regions. However, traditional machine learning relies on hand-crafted features, requires complex design, has limited representational capability, and its segmentation results are unsatisfactory.
Second, nucleus segmentation methods based on convolutional neural networks. Convolutional neural networks opened up a new approach to segmentation: compared with other network models, the parameter-sharing property of the convolution operation greatly reduces the number of parameters to be optimized and improves training efficiency and model scalability. For example, Zhang et al. used deep learning models to segment the nuclei of pathological images. In addition, many CNN variants, such as MDC-Net and GCN, have been successfully applied to nucleus segmentation and detection in pathological images, and some methods have introduced an attention mechanism (a cell nucleus segmentation method based on attention learning, CN112446892A) to improve segmentation accuracy. Since U-Net was proposed in 2015, its improved variants have also achieved good results on the nucleus segmentation task. For example, Micro-Net improves on U-Net's original loss function and achieves better nucleus segmentation, while HoVer-Net extracts multi-scale features of nuclei to perform segmentation.
Third, nucleus segmentation based on the Transformer mechanism. The Transformer, a model proposed in 2017 for Seq2Seq tasks, brought improvements in tasks such as machine translation, and in recent years it has gradually been extended to the field of image processing. Its core is the introduction of self-attention and multi-head attention mechanisms, which allow a model to capture long-range dependencies among features. For example, MedT introduces a gated attention mechanism, adding four gates on top of axial attention to address the difficulty of learning positional bias on small-scale datasets. Similarly, Zhang et al. proposed MBT-Net for nucleus segmentation, whose network likewise uses the Transformer and multi-branch concepts.
Segmentation algorithms based on deep learning have the advantages of high segmentation speed and high accuracy; however, current methods still have the following problems. First, as the number of network layers increases, CNN-based methods lack long-range dependence among features, so their segmentation accuracy is insufficient, and nuclei that are numerous, overlapping, or small in area are especially difficult to recognize. Second, in encoder-decoder models, features at different levels contribute differently, and an efficient mechanism is needed to fuse them. Third, different pathological sections are stained to different degrees, which affects the segmentation result.
Disclosure of Invention
Aiming at the limited characterization capability and inefficient feature fusion of existing deep-learning-based nucleus segmentation algorithms for gastrointestinal cancer pathological images, the invention provides a gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model, and related equipment, with the goal of improving the accuracy and efficiency of nucleus segmentation in gastrointestinal cancer pathological images and applying the method to the clinical auxiliary diagnosis of gastrointestinal cancer.
A gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model comprises the following steps:
acquiring a gastrointestinal cancer pathological image to be segmented, and preprocessing the gastrointestinal cancer pathological image to be segmented;
and performing nucleus segmentation on the preprocessed gastrointestinal cancer pathological image by adopting a preset nucleus segmentation model based on a ViT backbone to generate two segmentation predictions, the sum of the two segmentation predictions being taken as the nucleus segmentation result.
Preferably, before performing the nucleus segmentation on the preprocessed gastrointestinal cancer pathological image by using the preset nucleus segmentation model based on the ViT backbone, the method further includes training the nucleus segmentation model:
acquiring gastrointestinal cancer pathological data, marking the cell nuclei in the tissue images to obtain a plurality of marked gastrointestinal cancer pathological images, and establishing a gastrointestinal cancer pathology special-disease database;
preprocessing the gastrointestinal cancer pathology image after marking;
constructing a cell nucleus segmentation model based on a ViT backbone and training it with the preprocessed gastrointestinal cancer pathological images; during training, a loss function is calculated from the two segmentation predictions generated by the model and a preset mask, and the sum of the two segmentation predictions is then fed back to the segmentation model as the final adjustment data.
Preferably, the cell nucleus segmentation model comprises an encoder for feature capture and a decoder for aggregating different hierarchical features;
the encoder adopts a deformable self-attention mechanism, and through the encoder's computation four levels of features f_i (i = 1, 2, 3, 4) from different depths are extracted from the gastrointestinal cancer pathological image, where f_1 is a low-order feature and f_2, f_3 and f_4 are high-order features;
the decoder aggregates the low-order feature and the high-order features into two segmentation predictions.
Preferably, the decoder includes:
an Aggregation Module, which stacks the high-order features of each layer together and, after convolution, obtains the result of high-order feature aggregation, namely an aggregated feature T_1, from which a segmentation prediction P_1 is generated;
a parallel pooling attention convolution module (ParNet Attention), which generates feature maps at three levels from the low-order feature f_1 through pooling layers, convolves and fuses the feature maps of each level, and finally passes through an average pooling layer and a fully connected layer to obtain a low-order feature T_2;
a Similarity Aggregation Module (SAM), which fuses the feature T_1 with the feature T_2 and generates another segmentation prediction P_2.
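Purely as an illustrative sketch of the data flow through this dual-prediction decoder (not the claimed implementation): placeholder arrays stand in for the four encoder features, nearest-neighbour upsampling and a channel mean stand in for the convolutional fusion, and all spatial sizes and channel widths are assumptions.

```python
import numpy as np

# Hypothetical shapes for the four encoder features of a 512 x 512 input
# (the names f1..f4, T1, T2, P1, P2 follow the text; the values are placeholders).
H = W = 512
f1 = np.random.rand(H // 4,  W // 4,  96)    # low-order feature
f2 = np.random.rand(H // 8,  W // 8,  192)   # high-order features
f3 = np.random.rand(H // 16, W // 16, 384)
f4 = np.random.rand(H // 32, W // 32, 768)

def upsample(x, size):
    """Nearest-neighbour upsampling to a common spatial size (stand-in for interpolation)."""
    ry, rx = size[0] // x.shape[0], size[1] // x.shape[1]
    return np.repeat(np.repeat(x, ry, axis=0), rx, axis=1)

# Aggregation Module: bring the high-order features to a common size and stack them;
# a real implementation would fuse them with convolutions rather than concatenation.
size = (H // 8, W // 8)
T1 = np.concatenate([upsample(f, size) for f in (f2, f3, f4)], axis=-1)
P1 = T1.mean(axis=-1)                  # stand-in for the conv head producing prediction P1

# SAM: fuse T1 with the low-order branch output T2 into a second prediction P2.
T2 = f1[::2, ::2, :1].squeeze(-1)      # stand-in for the ParNet-attention branch, resized
P2 = P1 + T2                           # stand-in for similarity aggregation

final = P1 + P2                        # the two predictions are summed, as in the method
print(T1.shape, final.shape)
```

The sketch only tracks array shapes; it shows how one prediction comes from the aggregated high-order features and the other from their fusion with the low-order branch.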
Preferably, the preprocessing of the gastrointestinal cancer pathological image to be segmented, and of the marked gastrointestinal cancer pathological image, adopts size normalization and color normalization.
Preferably, the size normalization processing means cuts the gastrointestinal cancer pathology image into a predetermined size.
Preferably, the color normalization processing method includes:
setting the gastrointestinal cancer pathological image as an image s, and setting a preset template image as an image t;
converting, by the following formulas (1) and (2), the optical density V_s of the image s into W_s H_s, and the optical density V_t of the image t into W_t H_t, where W_s is the color appearance matrix of the image s, H_s is the stain density matrix of the image s, W_t is the color appearance matrix of the image t, and H_t is the stain density matrix of the image t;

V = log(I_0 / I) (1)

V = WH (2)

where V denotes the optical density of the image, I_0 denotes the incident illumination intensity, I denotes the original RGB image, W denotes the color appearance matrix of the image, and H denotes the stain density matrix of the image;

registering the stain density matrix of the image s against the stain density matrix of the image t by the following formula (3) to obtain the registered stain density matrix H_s^norm of the image s;

H_s^norm(j,:) = ( RM(H_t(j,:)) / RM(H_s(j,:)) ) · H_s(j,:) (3)

where H_s(j,:) is the j-th row of the stain density matrix of the image s before registration, H_t(j,:) is the j-th row of the stain density matrix of the image t, j indexes the r stains, and RM(·) computes the robust pseudo-maximum (the 99th percentile) of each row vector;

multiplying the color appearance matrix of the image t by the registered stain density matrix of the image s according to the following formula (4) to obtain the color-normalized image s;

s^norm = W_t H_s^norm (4)

where s^norm is the color-normalized image s, W_t is the color appearance matrix of the image t before registration, and H_s^norm is the registered stain density matrix of the image s.
A gastrointestinal cancer pathology image segmentation apparatus based on a ViT mechanism model, comprising:
the preprocessing module is used for acquiring a gastrointestinal cancer pathological image to be segmented and preprocessing the gastrointestinal cancer pathological image to be segmented;
and the cell nucleus segmentation module is used for performing cell nucleus segmentation on the preprocessed gastrointestinal cancer pathological image by adopting a preset nucleus segmentation model based on a ViT backbone to generate two segmentation predictions, the sum of the two segmentation predictions being used as the cell nucleus segmentation result.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the above-described method for gastrointestinal cancer pathology image segmentation based on a ViT mechanism model.
A storage medium having stored thereon computer readable instructions which, when executed by one or more processors, cause the one or more processors to carry out the steps of the above-described method for gastrointestinal cancer pathology image segmentation based on a ViT mechanism model.
The positive effects of the invention are as follows. The invention adopts a gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model, and related equipment, with the following advantages:
1. Addressing the uneven staining of pathological images, the irregular and overlapping morphology of gastrointestinal cancer cell nuclei, and the insufficient segmentation accuracy of previous methods, the method can train the model with a small training dataset, and the trained model can automatically, efficiently and accurately detect and segment nucleus regions in gastrointestinal cancer pathological images, providing useful support for tasks such as computer-aided diagnosis and radiomics analysis.
2. The gastrointestinal cancer pathological image is encoded with a vision Transformer backbone with deformable attention; image feature information at different levels is extracted in order from shallow to deep, and the fine deep-layer features are aggregated and encoded, so that inaccurate, coarse estimates can be refined into a more accurate edge prediction map, effectively improving the segmentation performance of the model.
3. A non-local operation is introduced in the graph convolution domain to realize similarity aggregation, so that high-order and low-order features can be combined more fully.
4. Color normalization is applied, via a color homogenization technique, to images with different color distributions and staining depths, so that the color information distribution of the images is kept as consistent as possible.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a model structure diagram of the nuclear segmentation model of the present invention;
FIG. 3 is a schematic diagram of a deformable self-attention module according to the present invention.
Detailed Description
In order to make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further described below with reference to the specific drawings.
Referring to fig. 1, a gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model comprises the following steps:
s1, gastrointestinal cancer pathological image preprocessing: and acquiring a gastrointestinal cancer pathological image to be segmented, and preprocessing the gastrointestinal cancer pathological image to be segmented.
The pathological image of the gastrointestinal cancer is derived from a pathological section of the gastrointestinal cancer to be segmented, and is digitized by using a gastrointestinal cancer pathological section scanner, so that the pathological image of the gastrointestinal cancer to be segmented is finally obtained.
Because gastrointestinal cancer pathological images may come from machines of different models, differing parameters can lead to differences in image size, and inconsistent staining of the pathological sections can lead to differences in image color; the gastrointestinal cancer pathological image therefore needs to be preprocessed before the subsequent segmentation step.
In some embodiments, size normalization and color normalization processing is used when preprocessing the gastrointestinal cancer pathology image to be segmented.
In some embodiments, the size normalization process is performed by cropping the gastrointestinal cancer pathology image to a predetermined size. For example, a gastrointestinal cancer pathology image to be segmented is cropped to a size of 512 by 512.
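A minimal sketch of this size-normalization step, assuming a center crop to the 512 × 512 size mentioned above (the zero-padding of undersized images is an added assumption, not stated in the text):

```python
import numpy as np

def center_crop(img: np.ndarray, size: int = 512) -> np.ndarray:
    """Center-crop an H x W x C image to size x size; pads small images with zeros first."""
    h, w = img.shape[:2]
    if h < size or w < size:  # assumed handling of undersized inputs
        pad_h, pad_w = max(size - h, 0), max(size - w, 0)
        img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)))
        h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

slide_region = np.zeros((600, 700, 3), dtype=np.uint8)  # toy stand-in for a pathology image
patch = center_crop(slide_region)
print(patch.shape)  # (512, 512, 3)
```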
In some embodiments, the color normalization processing method may be a color normalization processing method in the prior art, or may also include the following steps:
and setting the gastrointestinal cancer pathological image as an image s and the preset template image as an image t. The image t is a preset image, and all gastrointestinal cancer pathological images to be segmented need to be color-normalized to the color of the image t.
Firstly, converting a gastrointestinal cancer pathological image or a template image to be segmented, namely an original RGB image I into optical density V, and obtaining two dye matrixes by adopting non-negative matrix decomposition, wherein the two dye matrixes are expressed as a formula (1) and a formula (2);
Figure BDA0003851443290000061
V=WH (2)
wherein V represents the optical density of the image, I 0 Representing the incident illumination intensity, I representing the original RGB image, W representing the color appearance matrix of the image, and H representing the color density matrix of the image;
the optical density V of the image s is determined using the equations (1) and (2) s Conversion to W s H s The optical density V of the image t t Conversion to W t H t Wherein W is s Is a color appearance matrix of the image s, H s Is a density matrix of staining of the image s, W t Is a color appearance matrix of the image t, H t A staining density matrix for image t;
secondly, the image s may be obtained by registering the staining density matrix of the image s and the staining density matrix of the image t based on a structure-preserving color migration (SPCN) algorithm to obtain a registered staining density matrix of the image s
Figure BDA0003851443290000062
The registration adopts the following formula (3);
Figure BDA0003851443290000063
wherein H s (j,: is a staining density matrix before registration of the image s, H t (j,: is a stain density matrix before registration of image t, j is r stain indices, RM (-) is used to calculate a Robust Pseudo-Maximum (Robust Pseudo Maximum) at 99% for each row vector;
finally, multiplying the color appearance matrix of the image t before registration by the dyeing density matrix of the image s after registration to obtain an image s after color normalization, wherein the color appearance matrix is expressed by the following formula (4);
Figure BDA0003851443290000064
wherein the content of the first and second substances,
Figure BDA0003851443290000065
for the image s after the color normalization,W t for the color appearance matrix of the image t before registration,
Figure BDA0003851443290000066
the density matrix is stained for the registered image s.
The invention uses a color homogenization technique to color-normalize gastrointestinal cancer pathological images with different color distributions and staining depths, so that their color information distributions are kept as consistent as possible, which is beneficial to the subsequent segmentation accuracy of the model.
S2, gastrointestinal cancer pathological image segmentation: performing nucleus segmentation on the preprocessed gastrointestinal cancer pathological image with a preset nucleus segmentation model based on a ViT backbone to generate two segmentation predictions, and taking the sum of the two segmentation predictions as the nucleus segmentation result.
Addressing the uneven staining of gastrointestinal cancer pathological images, the irregular and overlapping morphology of gastrointestinal cancer cell nuclei, and the insufficient segmentation accuracy of prior-art methods, the invention introduces a nucleus segmentation model based on a ViT backbone to segment the nuclei of the preprocessed gastrointestinal cancer pathological image. It can automatically, efficiently and accurately detect and segment the nucleus regions in the image; the result of the nucleus segmentation is the set of segmented nucleus regions.
The nucleus segmentation model based on the ViT backbone converts the 2D image into a 1D patch sequence so that the Transformer modules can process it. Specifically, the ViT stacks multiple Transformer modules to process sequences of non-overlapping image patches, yielding a convolution-free image classification model. Compared with CNN models, Transformer-based models have a larger receptive field, are good at modeling long-range dependencies, and achieve excellent performance given large amounts of training data and model parameters.
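The 2D-to-1D conversion described above can be sketched as follows; the 16 × 16 patch size and the toy image size are assumptions for illustration, not values taken from the patent:

```python
import numpy as np

def patchify(img: np.ndarray, p: int = 16) -> np.ndarray:
    """Split an H x W x C image into a 1D sequence of flattened, non-overlapping p x p patches."""
    h, w, c = img.shape
    assert h % p == 0 and w % p == 0, "image size must be divisible by the patch size"
    x = img.reshape(h // p, p, w // p, p, c)      # (nh, p, nw, p, c)
    x = x.transpose(0, 2, 1, 3, 4)                # (nh, nw, p, p, c)
    return x.reshape(-1, p * p * c)               # (num_patches, patch_dim)

img = np.arange(64 * 64 * 3).reshape(64, 64, 3).astype(float)
tokens = patchify(img, p=16)
print(tokens.shape)  # (16, 768)
```

Each row of `tokens` is one patch, ready to be linearly projected and fed to the stacked Transformer modules.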
In some embodiments, prior to performing a nuclear segmentation on the preprocessed gastrointestinal cancer pathology image using the nuclear segmentation model, the present invention also trains the nuclear segmentation model:
s201, gastrointestinal cancer pathological data are obtained, cell nucleuses in the tissue images are marked, a plurality of marked gastrointestinal cancer pathological images are obtained, and a gastrointestinal cancer pathological special disease database is established.
The gastrointestinal cancer pathological data come from multiple devices at multiple clinical hospitals. Gastrointestinal cancer pathological sections are selected, digitized with a pathological section scanner, named with the existing ImageViewerG software, and desensitized, yielding a plurality of gastrointestinal cancer pathological images. The cell nuclei of the gastrointestinal cancer pathological images are then annotated with the existing Labelme toolkit, for training and testing the nucleus segmentation model. The marked gastrointestinal cancer pathological images are stored together as the gastrointestinal cancer pathology special-disease database.
S202, preprocessing the marked gastrointestinal cancer pathological image.
Since gastrointestinal cancer pathological images may come from different models of machines, the difference in parameters may result in a difference in size of the image, and a difference in color of the image due to inconsistency in staining conditions of pathological sections. Therefore, the present invention also preprocesses the gastrointestinal cancer pathological images in the gastrointestinal cancer pathological special disease database, and the preprocessing process is the same as the preprocessing process of the gastrointestinal cancer pathological images to be segmented in step S1, that is, the size normalization and color normalization processing modes are also adopted, which are not described herein again.
S203, constructing a cell nucleus segmentation model based on a ViT backbone and training it with the preprocessed gastrointestinal cancer pathological images; during training, a loss function is calculated from the two segmentation predictions generated by the model and a preset mask, and the sum of the two segmentation predictions is then fed back to the segmentation model as the final adjustment data.
When training the nucleus segmentation model, the invention measures the quality of the model's predictions by computing a loss function against the mask, so as to adjust and optimize the model.
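A toy sketch of this training objective. The patent does not name the loss function, so binary cross-entropy is an assumed example; the two predictions and the mask are random stand-ins. What it shows is the structure: each prediction is scored against the mask, and their sum serves as the final output.

```python
import numpy as np

def bce(pred_logits: np.ndarray, mask: np.ndarray) -> float:
    """Binary cross-entropy between a predicted probability map and the ground-truth mask."""
    p = 1.0 / (1.0 + np.exp(-pred_logits))       # sigmoid
    eps = 1e-7
    return float(-np.mean(mask * np.log(p + eps) + (1 - mask) * np.log(1 - p + eps)))

# Toy stand-ins: two segmentation predictions (logits) and a binary nucleus mask.
rng = np.random.default_rng(0)
P1, P2 = rng.normal(size=(64, 64)), rng.normal(size=(64, 64))
mask = (rng.random((64, 64)) > 0.5).astype(float)

# During training, each prediction is scored against the mask...
loss = bce(P1, mask) + bce(P2, mask)
# ...and the summed prediction serves as the final segmentation output.
final_pred = ((P1 + P2) > 0).astype(float)
print(final_pred.shape)
```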
In some embodiments, referring to fig. 2, the nucleus segmentation model based on the ViT backbone includes an encoder for feature capture and a decoder for aggregating features of different levels. In fig. 2, A) the DAT Encoder is the encoder, while B) ParNet Attention, C) the Aggregation Module and D) the SAM module constitute the decoder.
The encoder is built from a Deformable Attention Transformer (DAT). Through the encoder's computation, four levels of features fi (i = 1, 2, 3, 4) from different depths are extracted from the gastrointestinal cancer pathology image, where f1 is a low-order feature and f2, f3 and f4 are high-order features. The decoder aggregates the low-order and high-order features into two segmentation predictions, P1 and P2, respectively.
Specifically, the input image of shape H × W × 3 (the gastrointestinal cancer pathology image of the invention) first undergoes a 4 × 4 non-overlapping convolutional embedding followed by a normalization layer, producing an H/4 × W/4 × C patch embedding. To build a hierarchical feature pyramid, the backbone network comprises 4 stages with progressively increasing stride. Between every 2 consecutive stages, a non-overlapping 2 × 2 convolution with stride = 2 down-samples the feature map, halving the spatial size and doubling the number of channels.
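The shape arithmetic of this embedding pyramid can be sketched in NumPy as follows. Random weights stand in for the learned convolutions, and the 224 × 224 input and channel width 96 are illustrative choices, not values fixed by the patent:

```python
import numpy as np

def patch_embed(x, patch, out_dim, rng):
    """Non-overlapping patch embedding: each patch×patch block is flattened
    and linearly projected to out_dim channels (random stand-in weights)."""
    H, W, C = x.shape
    h, w = H // patch, W // patch
    blocks = x.reshape(h, patch, w, patch, C).transpose(0, 2, 1, 3, 4)
    blocks = blocks.reshape(h, w, patch * patch * C)
    W_proj = rng.standard_normal((patch * patch * C, out_dim)) * 0.02
    return blocks @ W_proj

rng = np.random.default_rng(0)
x = rng.standard_normal((224, 224, 3))        # H × W × 3 input image
x = patch_embed(x, 4, 96, rng)                # stage 1: H/4 × W/4 × C
shapes = [x.shape]
for _ in range(3):                            # stages 2-4: halve space, double channels
    x = patch_embed(x, 2, x.shape[-1] * 2, rng)
    shapes.append(x.shape)
print(shapes)
```

Each stage therefore halves the spatial size and doubles the channels, giving the four pyramid levels that supply f1 through f4.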
In the first two stages of the DAT, the keys and values have a large spatial size, which greatly increases the computational overhead of the dot products and bilinear interpolation in Deformable Attention. Therefore, to trade off model capacity against computational burden, the feature map is first processed by a window-based local attention mechanism (Local Attention) to aggregate information locally, and then by a shifted-window attention mechanism (Shift Window Attention) to obtain a better early-stage representation.
In the third and fourth stages of the DAT, the feature maps are likewise first processed by the window-based local attention mechanism to aggregate information locally; the global relationships between the locally enhanced tokens are then modeled by a Deformable Attention Block. This alternating design of attention blocks with local and global receptive fields helps the model learn strong representations.
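The windowed scoping these stages rely on can be illustrated with a minimal sketch: it shows only the token grouping for local attention and the cyclic shift used by shifted-window attention, not the attention computation itself. The 8 × 8 map and window size 4 are arbitrary illustrative values:

```python
import numpy as np

def window_partition(x, w):
    """Split an H×W×C map into non-overlapping w×w windows; local attention
    is then computed independently within each window."""
    H, W, C = x.shape
    return x.reshape(H // w, w, W // w, w, C).transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def shift(x, w):
    """Cyclic shift by w//2 so the next windowed pass mixes information
    across the previous window borders."""
    return np.roll(x, (-(w // 2), -(w // 2)), axis=(0, 1))

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8, 4))
wins = window_partition(x, 4)                  # 4 windows of 16 tokens each
wins_shifted = window_partition(shift(x, 4), 4)
print(wins.shape, wins_shifted.shape)
```

Attention within each window costs only w² × w² per window rather than (HW)², which is why the early, high-resolution stages prefer this scoping.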
Referring to FIG. 3, for the deformable self-attention module, given an input feature map x ∈ R^(H×W×C), a uniform grid of points p is generated as references. Specifically, the grid size is down-sampled from the input feature map size by a factor r, the values of the reference points are linearly spaced 2D coordinates, and they are then normalized to the range [−1, +1] according to the grid shape. To obtain an offset for each reference point, the feature map is linearly projected to the queries q, which are fed into an offset sub-network θ_offset to obtain the offsets Δp = θ_offset(q). The queries q, deformed keys k̃ and deformed values ṽ are then computed as:

q = x W_q,  k̃ = x̃ W_k,  ṽ = x̃ W_v

with Δp = θ_offset(q) and x̃ = φ(x; p + Δp),

where φ(·; ·) denotes bilinear sampling of the feature map x at the deformed points p + Δp.
Multi-head attention is then performed on q, k̃ and ṽ, taking the positional offsets into account; the features of the heads are concatenated and projected to give the final output.
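A single-head toy version of this deformable attention can be written in NumPy as follows. One reference point per pixel, a tanh "offset network" and random projection weights stand in for the learned components; the real DAT uses a down-sampled reference grid, offset range scaling and multiple heads:

```python
import numpy as np

def bilinear_sample(feat, pts):
    """Sample feat (H×W×C) at continuous (y, x) points by bilinear interpolation."""
    H, W, _ = feat.shape
    y = np.clip(pts[:, 0], 0, H - 1)
    x = np.clip(pts[:, 1], 0, W - 1)
    y0 = np.floor(y).astype(int); x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, H - 1); x1 = np.minimum(x0 + 1, W - 1)
    wy = (y - y0)[:, None]; wx = (x - x0)[:, None]
    return (feat[y0, x0] * (1 - wy) * (1 - wx) + feat[y0, x1] * (1 - wy) * wx
            + feat[y1, x0] * wy * (1 - wx) + feat[y1, x1] * wy * wx)

def deformable_attention(x, Wq, Wk, Wv, offset_net):
    """Single-head deformable attention: offsets dp = theta_offset(q) shift a
    uniform reference grid p; keys/values come from the sampled x_tilde."""
    H, W, C = x.shape
    q = x.reshape(-1, C) @ Wq
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    p = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # reference grid
    x_tilde = bilinear_sample(x, p + offset_net(q))               # deformed features
    k, v = x_tilde @ Wk, x_tilde @ Wv
    a = q @ k.T / np.sqrt(C)
    a = np.exp(a - a.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)                             # softmax rows
    return (a @ v).reshape(H, W, C)

rng = np.random.default_rng(1)
C = 8
x = rng.standard_normal((6, 6, C))
Wq, Wk, Wv = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))
out = deformable_attention(x, Wq, Wk, Wv, lambda q: np.tanh(q[:, :2]))  # toy offset net
print(out.shape)
```

The key point is that the sampling locations are data-dependent: the offsets are predicted from the queries, so the keys and values are drawn from shifted, informative positions rather than a fixed grid.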
Through the encoder described above, the features f1, f2, f3 and f4 from 4 different depths can be extracted by the A) DAT Encoder portion in fig. 2.
In some embodiments, referring to fig. 2, the decoder comprises 3 sub-modules: an Aggregation Module, a parallel pooled attention convolution module (ParNet Attention) and a Similarity Aggregation Module (SAM).
The aggregation module superimposes the high-order features of all levels and, after convolution, obtains the result of high-order feature aggregation, namely the aggregated feature T1, from which a segmentation prediction P1 is generated. As shown in fig. 2, in a specific implementation the high-order feature f2 passes through max pooling and aggregation module 1-1 in turn to give a first aggregation result. The high-order feature f3 is max-pooled, aggregated with the max-pooled f2 by aggregation module 2-1, and then aggregated with the first aggregation result by aggregation module 2-2 to give a second aggregation result. The high-order feature f4 is max-pooled, aggregated with the max-pooled f2 and the max-pooled f3 and with the first and second aggregation results by aggregation modules 3-1 and 3-2 to give a third aggregation result, which is convolved twice to give the aggregated feature T1 and the segmentation prediction P1.
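One plausible reading of this cascade can be sketched in NumPy as follows. Element-wise products stand in for the learned aggregation modules, a channel sum stands in for the final convolutions, and the feature shapes are illustrative:

```python
import numpy as np

def maxpool2(x):
    """Non-overlapping 2×2 max pooling on an H×W×C map."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

def aggregate(a, b):
    """Toy stand-in for a learned aggregation module (element-wise fusion)."""
    return a * b

rng = np.random.default_rng(2)
f2 = rng.standard_normal((28, 28, 8))   # hypothetical high-order features
f3 = rng.standard_normal((14, 14, 8))
f4 = rng.standard_normal((7, 7, 8))

r1 = aggregate(maxpool2(f2), f3)        # max-pooled f2 aggregated at f3 scale
r2 = aggregate(maxpool2(r1), f4)        # pooled again and aggregated at f4 scale
T1 = r2.sum(axis=-1, keepdims=True)     # convolution stand-in -> aggregated feature T1
P1 = T1[..., 0]                         # segmentation prediction P1
print(T1.shape, P1.shape)
```

Pooling the shallower high-order maps down to each deeper scale lets every aggregation step combine maps of matching spatial size.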
The parallel pooled attention convolution module generates feature maps at three levels from the low-order feature f1 via pooling layers, convolves and fuses the feature maps of each level, successively mining complementary regions and details, and finally applies an average pooling layer and a fully connected layer to obtain the low-order feature T2. As shown in fig. 2, in a specific implementation the low-order feature f1 is divided into three levels of feature maps using max pooling (Maxpooling) twice, namely a first-level, a second-level and a third-level feature map. The first-level feature map passes through two consecutive parallel network modules (ParNet Block) and is then max-pooled once more to give a first-level result. The second-level feature map passes through three consecutive parallel network modules and is fused (Fusion) with the first-level result to give a second-level result. The third-level feature map is max-pooled once more, passes through three consecutive parallel network modules, and is fused with the second-level result to give a third-level result. The third-level result is processed by average pooling and a fully connected layer (Avgpooling + FCN) to obtain the low-order feature T2.
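A toy NumPy sketch of this three-level flow is given below. A ReLU stands in for each learned ParNet Block, addition stands in for fusion, the feature shapes and the pooling used to match sizes are guesses at the figure, and the fully connected weights are random:

```python
import numpy as np

def maxpool2(x):
    """Non-overlapping 2×2 max pooling."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

def parnet_block(x):
    """Stand-in for a learned ParNet Block (here just a ReLU)."""
    return np.maximum(x, 0)

def fuse(a, b):
    """Stand-in fusion of two same-size maps."""
    return a + b

rng = np.random.default_rng(3)
f1 = rng.standard_normal((32, 32, 8))        # hypothetical low-order feature f1
lv1 = f1                                     # three levels via two max poolings
lv2 = maxpool2(lv1)
lv3 = maxpool2(lv2)

r1 = maxpool2(parnet_block(parnet_block(lv1)))                 # 2 blocks, then pool
r2 = fuse(parnet_block(parnet_block(parnet_block(lv2))), r1)   # 3 blocks, fuse with r1
r3 = fuse(parnet_block(parnet_block(parnet_block(maxpool2(lv3)))),
          maxpool2(maxpool2(r2)))            # sizes matched by extra pooling in this toy
T2 = r3.mean(axis=(0, 1)) @ (rng.standard_normal((8, 8)) * 0.1)  # Avgpooling + FC
print(r3.shape, T2.shape)
```

The three pooled branches see the same low-order feature at different receptive fields, and fusing them downward lets later levels refine the complementary details found earlier.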
The similarity aggregation module fuses the feature T1 and the feature T2 and generates the other segmentation prediction P2. The similarity aggregation module is a non-local operation that can use global attention to inject detailed appearance features into high-level semantic features. The invention introduces a non-local operation in the graph-convolution domain to realize similarity aggregation, so that the high-order and low-order features can be combined more fully.
Given the feature map T1 containing high-level semantic information and the feature map T2 with rich appearance details, the two are fused by self-attention to give the final Q, K and V. Specifically, this is realized with formulas of the standard non-local form:

Q = W_q(T1),  K = W_k(T2),  V = W_v(T2)

Y′ = softmax(Q Kᵀ / √d) V

Z = T1 + W_z(Y′)

where W_q, W_k, W_v and W_z are learned projections, d is the channel dimension, Y′ is additionally refined in the graph-convolution domain before the residual, and Z is the fused feature from which the segmentation prediction P2 is generated.
The above formulas are prior art, and their detailed meaning is not described here.
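Since the detailed prior-art formulas are not reproduced, the sketch below uses a plain non-local attention consistent with the residual form Z = T1 + Wz(Y′): queries from the semantic map T1, keys and values from the detail map T2, random stand-in weights, and the graph-convolution refinement omitted:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def similarity_aggregation(T1, T2, Wq, Wk, Wv, Wz):
    """Non-local fusion: queries from the semantic map T1, keys/values from
    the detail map T2, residual output Z = T1 + Wz(Y')."""
    H, W, C = T1.shape
    Q = T1.reshape(-1, C) @ Wq
    K = T2.reshape(-1, C) @ Wk
    V = T2.reshape(-1, C) @ Wv
    Y = softmax(Q @ K.T / np.sqrt(C)) @ V
    return T1 + (Y @ Wz).reshape(H, W, C)

rng = np.random.default_rng(2)
C = 8
T1 = rng.standard_normal((7, 7, C))   # high-level semantic feature
T2 = rng.standard_normal((7, 7, C))   # appearance-detail feature
Wq, Wk, Wv, Wz = (rng.standard_normal((C, C)) * 0.1 for _ in range(4))
Z = similarity_aggregation(T1, T2, Wq, Wk, Wv, Wz)
P2 = Z.sum(axis=-1)                   # 1×1-conv stand-in -> segmentation prediction P2
print(Z.shape, P2.shape)
```

Because every T1 position attends over all of T2, fine appearance detail can be injected globally into the semantic map, which is what the residual form preserves.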
For the segmentation predictions P1 and P2: while training the model, a loss function is calculated between each of P1 and P2 and the preset mask, and the sum P1 + P2 is passed on as the final adjustment data. After the model is trained, during normal segmentation the P1 and P2 obtained by the decoder are directly added, and the resulting prediction is the cell nucleus segmentation result.
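This training/inference split can be sketched as follows. The patent does not name the loss function, so a soft Dice loss is assumed here purely for illustration:

```python
import numpy as np

def dice_loss(pred, mask, eps=1e-6):
    """Soft Dice loss between a sigmoid-activated prediction map and a
    binary mask (an assumed loss choice, not specified by the patent)."""
    p = 1.0 / (1.0 + np.exp(-pred))
    inter = (p * mask).sum()
    return 1.0 - (2.0 * inter + eps) / (p.sum() + mask.sum() + eps)

rng = np.random.default_rng(4)
P1, P2 = rng.standard_normal((2, 16, 16))         # the two decoder predictions
mask = (rng.random((16, 16)) > 0.5).astype(float) # preset binary nucleus mask

loss = dice_loss(P1, mask) + dice_loss(P2, mask)  # each prediction vs. the mask
final = ((P1 + P2) > 0).astype(float)             # inference: add, then threshold
print(round(float(loss), 3), final.shape)
```

Supervising both predictions separately while fusing them by addition lets the coarse and refined branches correct each other's errors at inference time.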
By encoding the gastrointestinal cancer pathology image with a visual Transformer backbone with deformable attention, image feature information of different levels is extracted in turn from shallow to deep, and the fine deep feature data are aggregated and decoded, so that inaccurate, coarse estimates can be refined into a more accurate edge prediction map, effectively improving the segmentation performance of the model.
In some embodiments, the invention may further comprise a presentation step S3: displaying, storing and evaluating the cell nucleus segmentation result.
In some embodiments, the present invention employs the Mean Absolute Error (MAE), the DICE coefficient, the intersection over union (IoU), the sensitivity (Sen) and the specificity (Spec) as objective evaluation indicators. Several gastrointestinal cancer pathology images can be randomly drawn from the gastrointestinal cancer pathology special disease database as a test set, and the segmentation accuracy is verified using steps S1 to S2 of the method.
The comparative examples are the prior-art U-net, R2Unet and Att-Unet models; step S2 of the present invention is replaced by each prior-art model in turn to obtain its segmentation results.
Comparing the present invention with the above comparative examples under the same objective evaluation indicators (MAE, DICE coefficient, IoU, Sen and Spec) gives the segmentation accuracy in the following table:
Metric    U-net     R2Unet    Att-Unet   Invention
MAE       0.1559    0.1708    0.1376     0.0765
DICE      0.6127    0.5842    0.6084     0.8080
IoU       0.4576    0.4245    0.4566     0.6806
Sen       0.7159    0.6941    0.6525     0.8438
Spec      0.8462    0.8513    0.8853     0.9398
The results show that the method of the invention performs best, with an MAE of 0.0765, a DICE coefficient of 0.8080, an IoU of 0.6806, a Sen of 0.8438 and a Spec of 0.9398.
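The evaluation indicators above can be computed from a binary prediction and its ground-truth mask as follows (toy 2 × 4 masks for illustration; on real data the values would be averaged over the test set):

```python
import numpy as np

def metrics(pred, gt):
    """MAE, DICE, IoU, sensitivity and specificity for binary masks."""
    tp = ((pred == 1) & (gt == 1)).sum()
    tn = ((pred == 0) & (gt == 0)).sum()
    fp = ((pred == 1) & (gt == 0)).sum()
    fn = ((pred == 0) & (gt == 1)).sum()
    return {
        "MAE": np.abs(pred - gt).mean(),      # mean absolute error
        "DICE": 2 * tp / (2 * tp + fp + fn),  # overlap coefficient
        "IoU": tp / (tp + fp + fn),           # intersection over union
        "Sen": tp / (tp + fn),                # sensitivity (recall)
        "Spec": tn / (tn + fp),               # specificity
    }

pred = np.array([[1, 1, 0, 0],
                 [1, 0, 0, 0]])
gt   = np.array([[1, 0, 0, 0],
                 [1, 1, 0, 0]])
m = metrics(pred, gt)
print(m)
```

Here tp = 2, fp = 1, fn = 1 and tn = 4, so MAE = 0.25, DICE = 2/3, IoU = 0.5, Sen = 2/3 and Spec = 0.8.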
In some embodiments, the invention provides a gastrointestinal cancer pathological image segmentation device based on a ViT mechanism model, comprising:
the preprocessing module is used for acquiring a gastrointestinal cancer pathological image to be segmented and preprocessing the gastrointestinal cancer pathological image to be segmented;
and the cell nucleus segmentation module is used for performing cell nucleus segmentation on the preprocessed gastrointestinal cancer pathological image by adopting a preset cell nucleus segmentation model based on a ViT backbone to generate two segmentation predictions, a prediction result obtained by adding the two segmentation predictions being used as the cell nucleus segmentation result.
In some embodiments, the present invention provides a computer device comprising a memory and a processor, wherein the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, cause the processor to execute the steps of the above-mentioned gastrointestinal cancer pathology image segmentation method based on the ViT mechanism model.
In some embodiments, the present invention proposes a storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above-described embodiments of a ViT mechanism model-based gastrointestinal cancer pathology image segmentation method.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: read Only Memory (ROM), random Access Memory (RAM), magnetic or optical disks, and the like. The storage medium may be a nonvolatile storage medium.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications are within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. A gastrointestinal cancer pathological image segmentation method based on a ViT mechanism model is characterized by comprising the following steps:
acquiring a gastrointestinal cancer pathological image to be segmented, and preprocessing the gastrointestinal cancer pathological image to be segmented;
and performing cell nucleus segmentation on the preprocessed gastrointestinal cancer pathological image by adopting a preset cell nucleus segmentation model based on a ViT backbone to generate two segmentation predictions, and taking a prediction result obtained by adding the two segmentation predictions as a cell nucleus segmentation result.
2. The method for segmenting the gastrointestinal cancer pathological image based on the ViT mechanism model according to claim 1, wherein, before the cell nucleus segmentation model based on the ViT backbone segments the preprocessed gastrointestinal cancer pathological image, the method further comprises training the cell nucleus segmentation model:
acquiring gastrointestinal cancer pathological data, marking cell nuclei in the tissue images to obtain a plurality of marked gastrointestinal cancer pathological images, and establishing a gastrointestinal cancer pathology special disease database;
preprocessing the gastrointestinal cancer pathology image after marking;
and constructing a cell nucleus segmentation model based on a ViT backbone, training the cell nucleus segmentation model with the preprocessed gastrointestinal cancer pathological images, wherein during training a loss function is calculated between each of the two segmentation predictions generated by the cell nucleus segmentation model and a preset mask, and the two segmentation predictions are then added and transmitted to the cell nucleus segmentation model as final adjustment data.
3. The method of gastrointestinal cancer pathology image segmentation based on a ViT mechanism model of claim 1 or 2, wherein the nucleus segmentation model comprises an encoder for feature capture and a decoder for aggregating different hierarchical features;
the encoder adopts a deformable self-attention mechanism, and through the computation of the encoder, four levels of features fi (i = 1, 2, 3, 4) from different depths are extracted from the gastrointestinal cancer pathological image, wherein f1 is a low-order feature and f2, f3 and f4 are high-order features;
the decoder aggregates the low-order features and the high-order features into two split predictions.
4. The method of ViT mechanistic model-based gastrointestinal cancer pathology image segmentation of claim 3, wherein the decoder comprises:
an aggregation module, which superimposes the high-order features of each level and obtains, after convolution, the result of high-order feature aggregation, namely an aggregated feature T1, from which a segmentation prediction P1 is generated;
a parallel pooled attention convolution module, which generates feature maps of three levels from the low-order feature f1 through pooling layers, fuses the feature maps of each level after convolution, and finally applies an average pooling layer and a fully connected layer to obtain a low-order feature T2; and
a similarity aggregation module, which fuses the feature T1 and the feature T2 and generates another segmentation prediction P2.
5. The method for gastrointestinal cancer pathology image segmentation based on the ViT mechanism model according to claim 1 or 2, wherein size normalization and color normalization are used for preprocessing the gastrointestinal cancer pathology image to be segmented or preprocessing the gastrointestinal cancer pathology image after labeling.
6. The method of ViT mechanism model-based segmentation of gastrointestinal cancer pathology image of claim 5, wherein the size normalization process is to crop the gastrointestinal cancer pathology image to a preset size.
7. The method of ViT mechanistic model-based segmentation of gastrointestinal cancer pathology images of claim 5, wherein the color normalization process comprises:
setting the gastrointestinal cancer pathological image as an image s, and setting a preset template image as an image t;
converting the optical density V_s of the image s into W_s H_s and the optical density V_t of the image t into W_t H_t using the following equations (1) and (2), wherein W_s is a color appearance matrix of the image s, H_s is a staining density matrix of the image s, W_t is a color appearance matrix of the image t, and H_t is a staining density matrix of the image t;

V = log(I_0 / I)  (1)

V = W H  (2)

wherein V represents the optical density of the image, I_0 represents the incident illumination intensity, I represents the original RGB image, W represents the color appearance matrix of the image, and H represents the staining density matrix of the image;

registering the staining density matrix of the image s against the staining density matrix of the image t using the following equation (3), giving the registered staining density matrix Ĥ_s of the image s:

Ĥ_s(j,:) = H_s(j,:) · RM(H_t(j,:)) / RM(H_s(j,:))  (3)

wherein H_s(j,:) is the staining density matrix of the image s before registration, H_t(j,:) is the staining density matrix of the image t before registration, j indexes the r stains, and RM(·) computes a robust pseudo-maximum (Robust Pseudo Maximum) at the 99% level of each row vector;

multiplying the color appearance matrix of the image t before registration by the registered staining density matrix of the image s using the following equation (4), giving the color-normalized image s:

Î_s = I_0 exp(−W_t Ĥ_s)  (4)

wherein Î_s is the color-normalized image s, W_t is the color appearance matrix of the image t before registration, and Ĥ_s is the registered staining density matrix of the image s.
8. A gastrointestinal cancer pathological image segmentation device based on a ViT mechanism model is characterized by comprising:
the preprocessing module is used for acquiring a gastrointestinal cancer pathological image to be segmented and preprocessing the gastrointestinal cancer pathological image to be segmented;
and the cell nucleus segmentation module is used for performing cell nucleus segmentation on the preprocessed gastrointestinal cancer pathological image by adopting a preset cell nucleus segmentation model based on a ViT backbone to generate two segmentation predictions, a prediction result obtained by adding the two segmentation predictions being used as the cell nucleus segmentation result.
9. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to carry out the steps of the ViT mechanism model-based gastrointestinal cancer pathology image segmentation method according to any one of claims 1 to 7.
10. A storage medium having computer readable instructions stored thereon which, when executed by one or more processors, cause the one or more processors to perform the steps of the ViT mechanism model-based gastrointestinal cancer pathology image segmentation method of any one of claims 1 to 7.
CN202211147264.0A 2022-09-19 2022-09-19 Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment Pending CN115496720A (en)

Publications (1)

Publication Number Publication Date
CN115496720A true CN115496720A (en) 2022-12-20


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342627A (en) * 2023-05-23 2023-06-27 山东大学 Intestinal epithelial metaplasia area image segmentation system based on multi-instance learning
CN116452931A (en) * 2023-04-11 2023-07-18 北京科技大学 Hierarchical sensitive image feature aggregation method
CN117612221A (en) * 2024-01-24 2024-02-27 齐鲁工业大学(山东省科学院) OCTA image blood vessel extraction method combined with attention shift



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination