CN116597285B - Pulmonary tissue pathology image processing model, construction method and image processing method - Google Patents
- Publication number
- CN116597285B (application number CN202310868244.0A)
- Authority
- CN
- China
- Prior art keywords
- mask
- encoder
- image
- layer
- model
- Prior art date
- Legal status: Active (the status listed is an assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention belongs to the technical field of medical image processing and specifically relates to a lung tissue pathology image processing model, a construction method, and an image processing method. A data set of lung tissue pathology images is obtained, preprocessed, and divided proportionally into a training data set and a test data set. An upstream task is established in which each image in the training data set is proportionally fused with another image to generate new samples, which are used to train a mask self-encoder (masked autoencoder) upstream model. A downstream task is then established that uses the upstream-trained mask self-encoder model, and its performance is evaluated on the test data set. The invention addresses the model complexity caused by the costly convolution operations and self-attention mechanisms of current mainstream algorithms, as well as their long training times, yielding an efficient and simplified auxiliary diagnosis system.
Description
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a lung tissue pathology image processing model, a construction method and an image processing method.
Background
Lung cancer, one of the most prevalent cancers in the world, poses a serious threat to human life and health. Statistically, lung cancer is the most common malignancy in men and the second most common in women. In recent years, the incidence of lung squamous cell carcinoma has been declining and accounts for about 30%-40% of lung cancers, while the incidence of lung adenocarcinoma has been rising and accounts for about 40%-55%. Lung adenocarcinoma is therefore the major subtype of lung cancer; it can be further divided into invasive lung adenocarcinoma and micro-invasive lung adenocarcinoma.
The most reliable scientific basis for a pathologist to diagnose lung cancer is the histopathology examination: the specific subtype of lung cancer is judged by observing the size, shape, position, and other characteristics of cells in the pathology image. Accurate pathological diagnosis is therefore of great importance for the treatment of patients. However, pathological diagnosis carries a heavy workload and is a complex task for pathologists; it is not only time-consuming and laborious but also inevitably prone to misdiagnosis.
With the rapid development of artificial intelligence, deep learning has been widely applied in different fields. A growing body of research demonstrates the reliability of deep learning algorithms in medical image analysis, especially in lung cancer histopathology image recognition tasks. Deep learning algorithms can therefore be used to relieve the heavy workload of pathologists and reduce missed diagnoses and misdiagnoses of lung cancer.
However, the algorithms used in previous deep learning studies on lung cancer histopathology images are mostly the convolutional neural network (CNN), which captures local information in the image, and the Vision Transformer (ViT), which captures global information. Both the convolution operation in the CNN and the self-attention mechanism in the ViT are computationally expensive, which makes the resulting models complex and hinders their deployment in an auxiliary diagnosis system.
In view of this dilemma, self-supervised learning is a viable solution. Unsupervised feature learning based on deep learning has been favored in recent years for low-level tasks such as cell and lung tumor detection and classification. Self-supervised learning can learn informative data representations from unlabeled data and has been successfully applied to image classification tasks.
Currently, many studies demonstrate the effectiveness of self-supervised learning approaches in various medical image tasks, such as medical image classification, detection, and segmentation. Some self-supervised learning methods have also been proposed for histopathology images, but the subtle yet important nuances in histopathology are difficult to distinguish, and existing self-supervised algorithms fail to extract higher-level semantic information. There is thus still a lack of effective self-supervised learning methods to extract visual representations from histopathology images and accomplish the related tasks.
Disclosure of Invention
The invention aims to solve the technical problems of the computationally complex convolution operations and self-attention mechanisms in the current mainstream algorithms (the convolutional neural network and the Vision Transformer), as well as the complexity of diagnosis systems caused by the long training times of these algorithms.
The present invention is achieved as follows:
a method for constructing a lung tissue pathology image processing model comprises the following steps:
acquiring a data set of a lung tissue pathological image, preprocessing the data set, and dividing the data set into a training data set and a testing data set according to a proportion;
establishing an upstream task, establishing a mask self-encoder upstream model in the upstream task, generating a new sample after proportional fusion and addition of each image in a training data set, training the new sample, calculating a loss value between an image sample reconstructed by the mask self-encoder upstream model and an original image sample, and increasing training rounds to minimize the loss value to obtain a trained mask self-encoder upstream model;
and establishing a downstream task, utilizing a mask self-encoder upstream model trained by the upstream task, and evaluating the performance of the mask self-encoder upstream model by adopting a test data set.
Further, the mask self-encoder upstream model comprises a mask mixing layer, an encoder layer, and a decoder layer. The mask mixing layer proportionally fuses each image in the training data set with another image to generate a new sample image, divides the new sample image into masked tiles and unmasked tiles according to a set proportion, and inputs the unmasked tiles into the encoder layer to obtain linear output blocks; the linear output blocks and the masked tiles, with position encodings added, are input into the decoder layer;
the encoder layer encodes the unmasked tiles output by the mask mixing layer through a linear projection to obtain the linear output blocks, adds position encodings to the linear output blocks, and orders the masked tiles and the linear output blocks to obtain a two-dimensional list;
the decoder layer, from the unmasked tiles output by the mask mixing layer and the two-dimensional list output by the encoder layer, restores and reconstructs all the input tiles by learning the features of the unmasked tiles, obtaining a reconstructed two-dimensional list; the reconstructed two-dimensional list is reassembled into a decoded restored image, the loss value between the restored image and the original image is calculated, training rounds are added, and the encoder-layer and decoder-layer steps are repeated until the loss value converges.
Further, proportionally dividing the new sample image into masked and unmasked tiles includes: dividing the image into a number of square tiles, partitioning the tiles into masked and unmasked tiles according to the set mask rate, performing the masking operation, and sequentially extracting the unmasked tiles into a column vector.
Further, the ordering of the masked and unmasked tiles includes: sorting them in the original tiling order and replacing each unmasked tile with its linear output block to obtain the two-dimensional list.
Further, the fusing includes: proportionally mixing each image in the training data set with another randomly selected image.
Further, the mask self-encoder downstream model comprises:
a Transformer encoder, which uses the encoder layer trained in the mask self-encoder upstream model, outputs the linear projections of the tiles as feature vectors, and adds position encodings to the feature vectors;
a learnable classification token, which aggregates the position-encoded feature vectors output by the Transformer encoder;
and a multilayer perceptron (MLP) head, which completes the classification of the image according to the features gathered by the classification token.
A lung tissue pathology image processing model, the model comprising:
a data acquisition module, which preprocesses a data set of lung tissue pathology images and divides it proportionally into a training data set and a test data set;
a mask self-encoder model, obtained by proportionally fusing each image in the training data set with another image to generate new samples and training on the new samples;
and a downstream task module, which utilizes the mask self-encoder upstream model trained in the upstream task and evaluates its performance using the test data set.
Further, the mask self-encoder upstream model comprises a mask mixing layer, an encoder layer, and a decoder layer. The mask mixing layer proportionally fuses each image in the training data set with another image to generate a new sample image, divides the new sample image into masked tiles and unmasked tiles according to a set proportion, and inputs the unmasked tiles into the encoder layer to obtain linear output blocks; the linear output blocks and the masked tiles, with position encodings added, are input into the decoder layer;
the encoder layer encodes the unmasked tiles output by the mask mixing layer through a linear projection to obtain the linear output blocks, adds position encodings to the linear output blocks, and orders the masked tiles and the linear output blocks to obtain a two-dimensional list;
the decoder layer, from the unmasked tiles output by the mask mixing layer and the two-dimensional list output by the encoder layer, restores and reconstructs all the input tiles by learning the features of the unmasked tiles, obtaining a reconstructed two-dimensional list; the reconstructed two-dimensional list is reassembled into a decoded restored image, the loss value between the restored image and the original image is calculated, training rounds are added, and the encoder-layer and decoder-layer steps are repeated until the loss value converges.
Further, the mask self-encoder downstream model comprises:
a Transformer encoder, which uses the encoder layer trained in the mask self-encoder upstream model, outputs the linear projections of the tiles as feature vectors, and adds position encodings to the feature vectors;
a learnable classification token, which aggregates the position-encoded feature vectors output by the Transformer encoder;
and a multilayer perceptron (MLP) head, which completes the classification of the image according to the features gathered by the classification token.
A method for processing a lung tissue pathology image, comprising:
tiling the images of the input test data set;
outputting the linear projections of the tiles as feature vectors and adding position encodings to the feature vectors;
summarizing the position-encoded feature vectors;
and completing the classification of the image according to the feature summary.
Compared with the prior art, the invention has the beneficial effects that:
the mask self-encoder model is adopted, the mask self-encoder can be prevented from only reading limited image information, the rapid processing and classification of a large amount of data are realized, the mask self-encoder model is easy to deploy in an auxiliary diagnosis system, and the classification of pathological pictures is realized.
The invention is suitable for the task of classifying histopathological images. And combining a mixup image enhancement technology and MAE self-supervision learning under the condition of a small amount of annotation data to fully mine advanced semantic information in the pathological image field. By mixing the images between two random samples, the overfitting is reduced and the generalization and robustness of the neural network are improved. A hybrid self-supervision visual characterization learning framework is constructed for the histopathological image, and the model is helped to deeply mine high-level semantic information of the pathological image. The method MixMAE has superiority in classification task, and in addition, the expansion experiment proves that the model has the capability of identifying other cancer pathological images.
Drawings
FIG. 1 is a flow chart of an upstream task in a method for constructing a model for processing pathological images of lung tissue according to an embodiment of the present invention;
fig. 2 is a flowchart of a downstream task in a method for constructing a lung tissue pathology image processing model, or a lung tissue pathology image processing method according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1 in combination with fig. 2, a method for constructing a lung tissue pathology image processing model includes:
Acquiring a data set of lung tissue pathology images, preprocessing the data set, and dividing it proportionally into a training data set and a test data set. The data set here comprises available pulmonary histopathology data mixed with the lung histopathology data in the public LC25000 data set to form a lung pathology data set. Preprocessing refers to data preprocessing of the mixed data set: adjusting the images to a uniform size and performing data enhancement.
The mixed lung pathology data set contains five classes of data: lung squamous cell carcinoma, lung adenocarcinoma, invasive lung adenocarcinoma, micro-invasive lung adenocarcinoma, and normal lung tissue, all resized to a uniform 224×224 pixels. Data enhancement is applied to the mixed data set (mirror flipping, rotation, scaling, height shifting, and width shifting), expanding the data volume to five times the original so that the model is trained more fully.
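The five-fold enhancement described above can be sketched with plain array operations. The snippet below is a minimal numpy illustration; the 16-pixel shift amount, the crude nearest-neighbor zoom standing in for "scaling", and the function name `augment_five_fold` are our assumptions, not details fixed by the patent (a real pipeline would use a library such as torchvision).

```python
import numpy as np

def augment_five_fold(img):
    """Return the original image plus four simple variants (mirror flip,
    rotation, crude scaling, height shift), expanding the set five-fold.
    These are simplistic numpy stand-ins for real augmentation transforms."""
    return [
        img,                                                          # original
        np.fliplr(img),                                               # mirror flip
        np.rot90(img),                                                # 90-degree rotation
        np.repeat(np.repeat(img[::2, ::2], 2, axis=0), 2, axis=1),    # crude 0.5x zoom-out, upsampled back
        np.roll(img, 16, axis=0),                                     # height shift (width shift is analogous)
    ]

dataset = [np.random.rand(224, 224, 3) for _ in range(4)]
augmented = [variant for img in dataset for variant in augment_five_fold(img)]
```

All variants keep the uniform 224×224×3 shape, so the enhanced set can be fed to the same model input.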
Establishing an upstream task: a mask self-encoder upstream model is established in the upstream task; each image in the training data set is proportionally fused with another image to generate new samples, and the model is trained on the new samples to obtain a trained mask self-encoder upstream model;
The mask self-encoder upstream model performs an image mixing operation on the preprocessed training-set pathology image data, masks tiles according to the mask rate, inputs the unmasked tiles into the encoder to extract features, inputs these together with all masked tiles into the decoder, and finally generates the target image;
Specifically, the mask self-encoder upstream model comprises a mask mixing layer, an encoder layer, and a decoder layer. The mask mixing layer proportionally fuses each image in the training data set with another image to generate a new sample image, divides it into masked tiles and unmasked tiles according to a set proportion, and inputs the unmasked tiles into the encoder layer to obtain linear output blocks; the linear output blocks and the masked tiles, with position encodings added, are input into the decoder layer, which outputs reconstructed image tiles; the loss value between the original and reconstructed images is calculated, training rounds are added, and the encoder-layer and decoder-layer steps are repeated until the loss value converges;
proportionally dividing the new sample image into masked and unmasked tiles includes: the method comprises the steps of dividing an image into a plurality of square cut blocks, dividing the cut blocks into mask cut blocks and non-mask cut blocks according to the setting of a mask rate, performing masking operation, and sequentially extracting the non-mask cut blocks into a column vector.
The encoder layer encodes the unmasked tiles output by the mask mixing layer through a linear projection to obtain linear output blocks, adds position encodings to the linear output blocks, and orders the masked tiles and the linear output blocks to obtain a two-dimensional list. The ordering follows the original tiling sequence, with each unmasked tile replaced by its linear output block.
The decoder layer, from the unmasked tiles output by the mask mixing layer and the two-dimensional list output by the encoder layer, restores and reconstructs all the input tiles by learning the features of the unmasked tiles, obtaining a reconstructed two-dimensional list; this list is reassembled into a decoded restored image, the loss value between the restored image and the original image is calculated, training rounds are added, and the encoder-layer and decoder-layer steps are repeated to minimize the loss value.
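Assembling the ordered list that the decoder consumes (encoder outputs back at their original positions, masked positions filled, position encodings added) can be sketched as below. The zero mask token and the sinusoidal encoding formula are illustrative assumptions; MAE uses a learnable mask token and learned or sinusoidal position embeddings.

```python
import numpy as np

def assemble_full_sequence(encoded, visible_idx, n_total):
    """Place encoder outputs back at their original tile positions, fill the
    masked positions with a shared mask token (zeros here, as a stand-in for
    a learnable token), and add a sinusoidal position encoding so the decoder
    knows where each tile sits in the image."""
    dim = encoded.shape[1]
    seq = np.zeros((n_total, dim))
    seq[visible_idx] = encoded
    pos = np.arange(n_total)[:, None] / (10000.0 ** (np.arange(dim)[None, :] / dim))
    return seq + np.where(np.arange(dim) % 2 == 0, np.sin(pos), np.cos(pos))

encoded = np.ones((49, 64))            # 49 visible tiles encoded as 64-dim vectors
visible_idx = np.arange(0, 196, 4)     # their positions in the original tiling order
seq = assemble_full_sequence(encoded, visible_idx, n_total=196)
```

The result is the full-length, order-preserving sequence ("two-dimensional list") of 196 vectors that the decoder reconstructs into image tiles.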
And establishing a downstream task, utilizing a mask self-encoder upstream model trained by the upstream task, and evaluating the performance of the mask self-encoder upstream model by adopting a test data set.
In the downstream task, the model learned in the upstream task is saved and loaded into a ViT model; a classification test is performed using the image information learned upstream, and the result is evaluated.
A mask self-encoder downstream model comprises:
a Transformer encoder, which uses the encoder layer trained in the mask self-encoder upstream model, outputs the linear projections of the tiles as feature vectors, and adds position encodings to the feature vectors;
a learnable classification token, which aggregates the position-encoded feature vectors output by the Transformer encoder;
and a multilayer perceptron (MLP) head, which completes the classification of the image according to the features gathered by the classification token.
In an embodiment, pathology data contains a large amount of local information. To enable the model to capture it fully, an image mixing operation is applied to the input pathology images, improving the model's learning efficiency on limited data;
Step S1: the mask self-encoder upstream model performs pixel-level image mixing of each picture in the training set with another randomly chosen picture in the training set, with a fixed mixing ratio. After the image mixing operation, the number of training-set images is unchanged;
Step S2: each mixed 3×224×224 image is divided into 14×14 tiles of size 16×16; the tiles are then masked according to the mask rate, and the masked tiles become grayscale tiles in the final model input;
Step S3: the visible (unmasked) tiles in the final model input are arranged into a column vector in their initial order and input into the encoder, which outputs encoded blocks containing the image information;
Step S4: the encoded blocks of step S3 and the masked tiles of step S2 are arranged in order into a column vector as the decoder input. The decoder reconstructs the images of the encoded and masked tiles from the information contained in the encoded blocks, producing the decoder output;
Step S5: steps S3 and S4 are repeated so that the loss between the decoder output and the original image decreases, achieving the model training objective.
In this embodiment, the performance of the upstream model is evaluated through the downstream task: the upstream model itself is assessed with a training-loss indicator, while overall model performance is evaluated with four metrics: accuracy, precision, specificity, and sensitivity.
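The four evaluation metrics can be computed from a confusion matrix as sketched below; `binary_metrics` is an illustrative helper (for the five-class task these would be computed per class, one-vs-rest, and averaged).

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, specificity, and sensitivity from the binary
    confusion matrix counts (TP, TN, FP, FN)."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "precision":   tp / (tp + fp),      # of predicted positives, how many are real
        "specificity": tn / (tn + fp),      # true-negative rate
        "sensitivity": tp / (tp + fn),      # true-positive rate (recall)
    }

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 1, 0])
m = binary_metrics(y_true, y_pred)
```

With this toy prediction (one false negative, one false positive) all four metrics come out to 0.75.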
An embodiment of the invention provides a lung tissue pathology image processing model constructed in the above manner, comprising:
a data acquisition module, which preprocesses a data set of lung tissue pathology images and divides it proportionally into a training data set and a test data set;
a mask self-encoder model, obtained by proportionally fusing each image in the training data set with another image to generate new samples and training on the new samples;
and a downstream task module, which utilizes the mask self-encoder upstream model trained in the upstream task and evaluates its performance using the test data set.
The mask self-encoder upstream model comprises a mask mixing layer, an encoder layer, and a decoder layer. The mask mixing layer proportionally fuses each image in the training data set with another image to generate a new sample image, divides it into masked tiles and unmasked tiles according to a set proportion, and inputs the unmasked tiles into the encoder layer to obtain linear output blocks; the linear output blocks and the masked tiles, with position encodings added, are input into the decoder layer, which outputs reconstructed image tiles; the loss value between the original and reconstructed images is calculated, training rounds are added, and the encoder-layer and decoder-layer steps are repeated until the loss value converges;
The encoder layer uses the Vision Transformer architecture but operates only on the unmasked tiles. It encodes them by linear projection to obtain linear output blocks, adds position encodings, and orders the masked tiles and the linear output blocks to obtain a two-dimensional list;
The decoder layer uses a Transformer architecture and takes as input the set of all tiles of the picture, both masked and unmasked. From the unmasked tiles output by the mask mixing layer and the two-dimensional list output by the encoder layer, it restores and reconstructs all tiles by learning the features of the unmasked tiles, obtaining a reconstructed two-dimensional list; this list is reassembled into a decoded restored image, the loss between the restored image and the original image is calculated, training rounds are added, and the encoder-layer and decoder-layer steps are repeated to minimize the loss.
The masked-autoencoder downstream model also uses the Vision Transformer architecture; through upstream pre-training, the pre-trained upstream model is applied to the downstream classification task. It comprises the following:
a Transformer encoder, which adopts the encoder layer trained in the masked-autoencoder upstream model, outputs the linear projections of the patches as feature vectors, and adds position encodings to the feature vectors;
a learnable classifier, which summarizes the position-encoded feature vectors output by the Transformer encoder;
and a multi-layer perceptron head, which completes the classification of the image according to the summarized features output by the learnable classifier.
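A minimal NumPy sketch of these downstream components, under stated assumptions: mean pooling stands in for the learnable classifier's feature summarization, a small two-layer network stands in for the perceptron head, and all weights are hypothetical inputs rather than values from the patent:

```python
import numpy as np

def classify_from_tokens(patch_tokens, W1, b1, W2, b2):
    """Summarize position-encoded feature vectors and classify with a
    two-layer perceptron head; returns the index of the top class."""
    summary = patch_tokens.mean(axis=0)           # feature summarization (stand-in)
    hidden = np.maximum(0.0, summary @ W1 + b1)   # hidden layer with ReLU
    logits = hidden @ W2 + b2                     # class scores
    return int(np.argmax(logits))
```

In practice the summarization would be a learnable class token attended over by the Transformer encoder; mean pooling is used here only to keep the sketch self-contained.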
In an embodiment of the invention, the masked-autoencoder downstream model classifies images through the following steps: dividing the images of the input test data set into patches;
outputting the linear projections of the patches as feature vectors, and adding position encodings to the feature vectors;
summarizing the features of the position-encoded feature vectors;
and completing the classification of the images according to the feature summary.
The foregoing description of preferred embodiments is not intended to limit the invention; any modifications, equivalent substitutions and improvements made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (7)
1. A method for constructing a lung histopathology image processing model, comprising the following steps:
acquiring a data set of lung histopathology images, preprocessing the data set, and dividing it proportionally into a training data set and a test data set;
establishing an upstream task: building a masked-autoencoder upstream model within the upstream task, generating new samples by proportionally fusing the images in the training data set, training on the new samples, calculating a loss value between the image samples reconstructed by the masked-autoencoder upstream model and the original image samples, and increasing the number of training epochs until the loss value is minimized, to obtain a trained masked-autoencoder upstream model;
establishing a downstream task: using the masked-autoencoder upstream model trained in the upstream task, and evaluating the performance of the masked-autoencoder upstream model on the test data set;
wherein the masked-autoencoder upstream model comprises a mask mixing layer, an encoder layer and a decoder layer; the mask mixing layer proportionally fuses pairs of images in the training data set to generate new sample images, each new sample image is divided proportionally into masked patches and unmasked patches, the unmasked patches are input to the encoder layer to output linearly projected tokens, and the projected tokens and the mask tokens, with position encodings added, are input to the decoder layer;
the encoder layer encodes the unmasked patches output by the mask mixing layer by linear projection to obtain linearly projected tokens, adds position encodings to the tokens, and orders the mask tokens and the projected tokens to obtain a two-dimensional list;
the decoder layer restores and reconstructs all the patches input to the encoder layer from the unmasked patches output by the mask mixing layer and the two-dimensional list output by the encoder layer, by learning the features of the unmasked patches, to obtain a reconstructed two-dimensional list; the reconstructed two-dimensional list is reassembled into a decoded restored image, a loss value between the restored image and the original image is calculated, the number of training epochs is increased, and the encoder-layer and decoder-layer steps are repeated until the loss value converges.
2. The method for constructing a lung histopathology image processing model according to claim 1, wherein dividing the new sample image proportionally into masked patches and unmasked patches comprises: dividing the image into a plurality of square patches, dividing the patches into masked patches and unmasked patches according to a preset mask ratio, performing the masking operation, and sequentially flattening the unmasked patches into column vectors.
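As an illustrative sketch of the patch split described above (NumPy assumed as implementation language; the choice of which patches to mask is random, as is common masked-autoencoder practice but not fixed by the claim):

```python
import numpy as np

def split_by_mask_rate(patches, mask_rate, rng=None):
    """Divide square patches into masked / unmasked sets by the mask rate,
    and flatten each unmasked patch, in sequence, into a column vector."""
    rng = np.random.default_rng(rng)
    n = len(patches)
    masked = np.zeros(n, dtype=bool)
    masked[rng.choice(n, int(n * mask_rate), replace=False)] = True
    columns = [p.reshape(-1, 1) for p, m in zip(patches, masked) if not m]
    return masked, columns
```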
3. The method for constructing a lung histopathology image processing model according to claim 1, wherein the ordering of the mask tokens and the linearly projected tokens comprises: sorting according to the original patch order, and replacing the unmasked patches with the linearly projected tokens to obtain the two-dimensional list.
4. The method for constructing a lung histopathology image processing model according to claim 1, wherein the fusing comprises: proportionally blending each image in the training data set with any other image.
5. The method for constructing a lung histopathology image processing model according to claim 1, wherein the masked-autoencoder downstream model comprises:
a Transformer encoder, which adopts the encoder layer trained in the masked-autoencoder upstream model, outputs the linear projections of the patches as feature vectors, and adds position encodings to the feature vectors;
a learnable classifier, which summarizes the position-encoded feature vectors output by the Transformer encoder;
and a multi-layer perceptron head, which completes the classification of the image according to the summarized features output by the learnable classifier.
6. A lung histopathology image processing model, comprising:
a data acquisition module for preprocessing a data set of lung histopathology images and dividing it proportionally into a training data set and a test data set;
a masked-autoencoder model, obtained by generating new samples through proportional fusion of the images in the training data set and training on the new samples;
wherein a downstream task is established using the masked-autoencoder upstream model trained in the upstream task, and the performance of the masked-autoencoder upstream model is evaluated on the test data set;
the masked-autoencoder upstream model comprises a mask mixing layer, an encoder layer and a decoder layer; the mask mixing layer proportionally fuses pairs of images in the training data set to generate new sample images, each new sample image is divided proportionally into masked patches and unmasked patches, the unmasked patches are input to the encoder layer to output linearly projected tokens, and the projected tokens and the mask tokens, with position encodings added, are input to the decoder layer;
the encoder layer encodes the unmasked patches output by the mask mixing layer by linear projection to obtain linearly projected tokens, adds position encodings to the tokens, and orders the mask tokens and the projected tokens to obtain a two-dimensional list;
the decoder layer restores and reconstructs all the patches input to the encoder layer from the unmasked patches output by the mask mixing layer and the two-dimensional list output by the encoder layer, by learning the features of the unmasked patches, to obtain a reconstructed two-dimensional list; the reconstructed two-dimensional list is reassembled into a decoded restored image, a loss value between the restored image and the original image is calculated, the number of training epochs is increased, and the encoder-layer and decoder-layer steps are repeated until the loss value converges.
7. The lung histopathology image processing model according to claim 6, wherein the masked-autoencoder downstream model comprises:
a Transformer encoder, which adopts the encoder layer trained in the masked-autoencoder upstream model, outputs the linear projections of the patches as feature vectors, and adds position encodings to the feature vectors;
a learnable classifier, which summarizes the position-encoded feature vectors output by the Transformer encoder;
and a multi-layer perceptron head, which completes the classification of the image according to the summarized features output by the learnable classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310868244.0A CN116597285B (en) | 2023-07-17 | 2023-07-17 | Pulmonary tissue pathology image processing model, construction method and image processing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116597285A CN116597285A (en) | 2023-08-15 |
CN116597285B true CN116597285B (en) | 2023-09-22 |
Family
ID=87601195
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310868244.0A Active CN116597285B (en) | 2023-07-17 | 2023-07-17 | Pulmonary tissue pathology image processing model, construction method and image processing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116597285B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117173543B (en) * | 2023-11-02 | 2024-02-02 | 天津大学 | Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109300128A (en) * | 2018-09-29 | 2019-02-01 | 聚时科技(上海)有限公司 | The transfer learning image processing method of structure is implied based on convolutional Neural net |
CN109345508A (en) * | 2018-08-31 | 2019-02-15 | 北京航空航天大学 | A kind of Assessing Standards For Skeletal method based on two stages neural network |
CN112150568A (en) * | 2020-09-16 | 2020-12-29 | 浙江大学 | Magnetic resonance fingerprint imaging reconstruction method based on Transformer model |
CN116030306A (en) * | 2023-02-08 | 2023-04-28 | 吉林大学 | Pulmonary tissue pathology image type auxiliary classification method based on multilayer perceptron |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10333547B2 (en) * | 2012-08-13 | 2019-06-25 | Gurulogic Microsystems Oy | Encoder and method for encoding input data using a plurality of different transformations or combinations of transformations |
US11087165B2 (en) * | 2018-11-29 | 2021-08-10 | Nec Corporation | Method and system for contextualizing automatic image segmentation and regression |
- 2023-07-17: CN application CN202310868244.0A filed; patent CN116597285B granted (Active)
Non-Patent Citations (1)
Title |
---|
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers; Jihao Liu et al.; arXiv:2205.13137v4 [cs.CV]; Sections 1-3 and 5 of the main text *
Also Published As
Publication number | Publication date |
---|---|
CN116597285A (en) | 2023-08-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111369565B (en) | Digital pathological image segmentation and classification method based on graph convolution network | |
Feng et al. | Residual learning for salient object detection | |
Zhou et al. | Cross-level feature aggregation network for polyp segmentation | |
CN114119638A (en) | Medical image segmentation method integrating multi-scale features and attention mechanism | |
CN111311563A (en) | Image tampering detection method based on multi-domain feature fusion | |
CN113989662B (en) | Remote sensing image fine-grained target identification method based on self-supervision mechanism | |
CN113034505B (en) | Glandular cell image segmentation method and glandular cell image segmentation device based on edge perception network | |
CN116597285B (en) | Pulmonary tissue pathology image processing model, construction method and image processing method | |
EP4276684A1 (en) | Capsule endoscope image recognition method based on deep learning, and device and medium | |
CN115018824A (en) | Colonoscope polyp image segmentation method based on CNN and Transformer fusion | |
CN110930378B (en) | Emphysema image processing method and system based on low data demand | |
Wazir et al. | HistoSeg: Quick attention with multi-loss function for multi-structure segmentation in digital histology images | |
CN111444844A (en) | Liquid-based cell artificial intelligence detection method based on variational self-encoder | |
Wang et al. | FaceFormer: Aggregating global and local representation for face hallucination | |
CN112750132A (en) | White blood cell image segmentation method based on dual-path network and channel attention | |
CN110782427A (en) | Magnetic resonance brain tumor automatic segmentation method based on separable cavity convolution | |
Jiang et al. | Forest-CD: Forest change detection network based on VHR images | |
Li et al. | Image segmentation based on improved unet | |
CN115862120A (en) | Separable variation self-encoder decoupled face action unit identification method and equipment | |
CN116012395A (en) | Multi-scale fusion smoke segmentation method based on depth separable convolution | |
CN111027440A (en) | Crowd abnormal behavior detection device and method based on neural network | |
CN114511798A (en) | Transformer-based driver distraction detection method and device | |
CN117934824A (en) | Target region segmentation method and system for ultrasonic image and electronic equipment | |
CN116935044B (en) | Endoscopic polyp segmentation method with multi-scale guidance and multi-level supervision | |
Sabnam et al. | Application of generative adversarial networks in image, face reconstruction and medical imaging: challenges and the current progress |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||