CN117522693A - Method and system for enhancing machine vision of medical images using super resolution techniques - Google Patents
- Publication number: CN117522693A
- Application number: CN202311580384.4A
- Authority: CN (China)
- Prior art keywords: resolution, image, super, network, loss
- Legal status: Pending (assumed by Google Patents; not a legal conclusion)
Classifications
- G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06N3/045 — Neural networks; combinations of networks
- G06N3/084 — Learning methods; backpropagation, e.g. using gradient descent
- G06V10/764 — Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/82 — Image or video recognition using pattern recognition or machine learning, using neural networks
Abstract
The invention discloses a method and system for enhancing machine vision of medical images using super-resolution technology, comprising the following steps: step S1, producing low-resolution/high-resolution image pairs; step S2, training a classifier with the medical images in the pairs; step S3, designing a super-resolution network on the basis of a classical network; step S4, training and optimizing the super-resolution network to obtain a pre-trained super-resolution network; step S5, designing an image super-resolution model and fine-tuning the parameters of the super-resolution network; and step S6, testing the image super-resolution model. With this method and system, the results of the machine vision task can be used to drive medical image super-resolution toward more targeted improvement of reconstructed image quality, solving the prior-art problem that medical image super-resolution considers only human visual quality while neglecting its influence on machine vision tasks.
Description
Technical Field
The present invention relates to the field of image recognition, and more particularly, to a method and system for enhancing machine vision of medical images using super resolution techniques.
Background
Medical imaging refers to the use of medical imaging equipment to obtain detailed information about a patient's internal organs or tissues without invading the body. These images contain rich physiological, anatomical and pathological information and are of great significance for diagnosing patients, formulating treatment regimens and conducting medical research. In particular, high-quality medical images provide more accurate detail, markedly improve the accuracy of pathological diagnosis, allow lesion locations to be identified precisely, and deepen understanding of the specific condition of the affected region. Early applications of super-resolution technology were mainly to natural images. With the continuous progress of deep learning methods in natural-image super-resolution, classical networks such as EDSR, SRGAN, RCAN and SwinIR have also begun to be applied to medical images and have demonstrated great application potential.
Using convolution-based deep learning methods, medical image super-resolution has achieved remarkable performance gains, excelling both in perceived visual quality and in objective evaluation metrics. In the medical field, however, the goal of image processing is not merely improved image quality; what matters more is the image's contribution to the precise diagnostic task. As a result, current deep-learning-based medical image super-resolution still faces several challenges:
1. most current techniques still focus too heavily on reconstructing high-quality images while neglecting whether these images actually help improve the performance of machine vision tasks (specific medical diagnostic analyses such as classification, segmentation and detection);
2. medical images and natural images differ significantly in image characteristics, imaging modalities, signal-to-noise ratio and so on, so these differences must be addressed explicitly when designing a network;
3. because of increasingly complex network designs and the smaller scale of public medical image datasets, models with large numbers of parameters may converge slowly during training, which not only affects model performance but also reduces experimental efficiency.
Disclosure of Invention
In order to overcome these technical defects, the technical problems to be solved by the invention are as follows:
1. To address the lack of attention to machine-vision task performance in medical image super-resolution, this patent provides a two-stage medical image super-resolution framework design for machine vision enhancement, which strengthens the machine vision effect while preserving the human visual quality of the reconstructed image, thereby improving machine-vision task performance;
2. To address the differences between medical and natural images, this patent proposes fine-tuning and framework design on top of a super-resolution network pre-trained on medical images, together with an adaptive multi-scale information extraction design, so that the model learns medical image characteristics in a more targeted way;
3. To address the slow convergence of training models with many parameters, this patent provides an improved multi-sample contrastive learning strategy, which also helps improve the reconstruction quality of the super-resolution network.
To this end, one aspect of the present invention provides a method of enhancing machine vision of medical images using super resolution techniques, comprising the steps of:
step 1: producing low-resolution/high-resolution image pairs, each consisting of a low-resolution image, a high-resolution image and a label, the label being 0 or 1, comprising the following steps:
step 1.1: acquiring the original medical images and the corresponding health status, and marking the two classes, healthy and diseased, as 0 and 1 respectively to serve as the real labels;
step 1.2: cropping the original medical image to retain the key information useful for the machine vision task and using it as the high-resolution image, obtaining the low-resolution image with bicubic downsampling at a given scale, and combining with the real label to form a "low-resolution image, high-resolution image, label" pair, recorded as image pair A;
step 1.3: taking the original medical image as the high-resolution image, obtaining the low-resolution image with bicubic downsampling at a given scale, and combining with the real label to form a "low-resolution image, high-resolution image, label" pair, recorded as image pair B;
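As an illustrative sketch (not part of the claimed method), the pair construction of steps 1.1-1.3 can be outlined as follows. Average pooling stands in for the bicubic downsampling named in the patent, and all function names are hypothetical:

```python
import numpy as np

def downsample(img, scale):
    """Average-pool downsampling; a simple stand-in for the bicubic
    downsampling used in steps 1.2/1.3 (bicubic weights omitted)."""
    h, w = img.shape
    h2, w2 = h - h % scale, w - w % scale   # crop to a multiple of scale
    img = img[:h2, :w2]
    return img.reshape(h2 // scale, scale, w2 // scale, scale).mean(axis=(1, 3))

def make_pair(hr, label, scale=4):
    """Build one (low-res, high-res, label) triple; label 0 = healthy, 1 = diseased."""
    return downsample(hr, scale), hr, label

# e.g. a 224x224 cropped image (as in image pair A) at scale x4
hr = np.random.rand(224, 224)
lr, hr_out, y = make_pair(hr, label=1)
```

Image pair B would be built the same way from the uncropped original image.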
step 2: training a classifier with the medical images in image pair A, the classifier predicting the content label of an input image and outputting the disease probability;
step 3: designing a super-resolution network on the basis of a classical network;
step 3.1: adding a multi-scale information extraction module at the network head to extract features, preserving and enhancing the local and global information of the input image;
step 3.2: at the network tail, concatenating the super-resolution image generated by the network with the input low-resolution image along the channel dimension, obtaining the two dynamic-normalization parameters from two convolution branches respectively, and using these parameters to correct the information distribution of the super-resolution image to obtain the final super-resolution image;
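The tail-end correction of step 3.2 can be sketched as follows, assuming the patent's two branches reduce to toy 1x1 convolutions (all weights and names here are illustrative placeholders, not the patent's actual layers):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """Toy 1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def dynamic_norm_tail(sr, lr_up, w_gamma, w_beta):
    """Step 3.2 sketch: concatenate the SR output with the (upsampled)
    LR input along the channel axis, derive a per-pixel scale and shift
    from two branches, and use them to modulate the SR image."""
    x = np.concatenate([sr, lr_up], axis=0)   # channel concatenation
    gamma = 1.0 + conv1x1(x, w_gamma)         # scale branch
    beta = conv1x1(x, w_beta)                 # shift branch
    return gamma * sr + beta

c = 1
sr = rng.random((c, 32, 32))
lr_up = rng.random((c, 32, 32))
w_gamma = rng.standard_normal((c, 2 * c)) * 0.01
w_beta = rng.standard_normal((c, 2 * c)) * 0.01
out = dynamic_norm_tail(sr, lr_up, w_gamma, w_beta)
```

With zero branch weights the correction is the identity, which is why the scale branch is initialized around 1.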
step 4: inputting the low-resolution images from image pairs A and B into the super-resolution network for training, computing the super-resolution network loss from the generated super-resolution images and the corresponding high-resolution images, and optimizing the super-resolution network to obtain a pre-trained super-resolution network, comprising the following steps:
step 4.1: inputting the low-resolution images of image pair A into the super-resolution network for processing to obtain the corresponding super-resolution images, in which the key image information is retained;
step 4.2: cutting the low-resolution images in image pair B into image blocks of suitable size, applying data-augmentation operations, and feeding them into the super-resolution network in batches to obtain the corresponding super-resolution images;
step 4.3: computing the super-resolution network loss from the super-resolution images generated from image pairs A and B and the corresponding high-resolution images, and using this loss to optimize the network, obtaining the pre-trained super-resolution network;
step 5: designing an image super-resolution model comprising the super-resolution network pre-trained in step 4 and the classifier pre-trained in step 2; inputting the low-resolution images from image pairs A and B into the model, computing the total loss from the generated super-resolution images, the corresponding high-resolution images, the predicted classes and the real labels, and fine-tuning the super-resolution network parameters, comprising the following steps:
step 5.1: inputting the low-resolution images of image pair A into the super-resolution network to obtain the corresponding super-resolution images SRA, feeding SRA into the classifier, and computing the classification loss from the predicted and real labels;
step 5.2: cutting image pair B into image blocks, applying data-augmentation operations, feeding them into the super-resolution network, and computing the super-resolution network loss from the super-resolution images and the corresponding high-resolution images;
step 5.3: using the high-resolution images in image pair A as positive samples, the SRA generated in the current round as anchor samples, and the SRA generated a fixed number of rounds earlier as negative samples; extracting high-frequency information from each with a wavelet transform, passing it through each convolution layer of the classifier to obtain the corresponding feature representations, retaining each layer's features for the corresponding sample in the process; and computing the multi-sample contrastive loss among the positive, anchor and negative sample features;
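The contrastive term of step 5.3 can be sketched as a distance ratio in a high-frequency feature space. This is a minimal sketch under stated assumptions: first differences stand in for the wavelet transform, raw high-frequency maps stand in for the classifier's per-layer features, and the ratio form is one common way to pull the anchor toward the positive and away from negatives; the patent's exact loss is not specified here:

```python
import numpy as np

def high_freq(img):
    """Crude high-frequency extraction via first differences, standing in
    for the wavelet transform of step 5.3."""
    return np.concatenate([np.abs(np.diff(img, axis=0)).ravel(),
                           np.abs(np.diff(img, axis=1)).ravel()])

def multi_sample_contrastive_loss(anchor, positive, negatives, eps=1e-8):
    """Pull the anchor (current SR output) toward the positive (HR image)
    and away from the negatives (SR outputs from earlier rounds)."""
    fa, fp = high_freq(anchor), high_freq(positive)
    d_pos = np.abs(fa - fp).mean()
    d_neg = np.mean([np.abs(fa - high_freq(n)).mean() for n in negatives])
    return d_pos / (d_neg + eps)

hr = np.random.rand(16, 16)
loss = multi_sample_contrastive_loss(hr * 0.5, hr, [np.zeros((16, 16))])
```

A perfect reconstruction (anchor equal to the positive) drives the numerator, and hence the loss, to zero.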
step 5.4: optimizing the super-resolution network using a weighted total loss of the classification loss, the super-resolution network loss and the multi-sample contrastive loss;
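The weighted combination of step 5.4 is a simple linear blend; the weights below are illustrative placeholders, as the patent does not specify their values:

```python
def total_loss(cls_loss, sr_loss, contrast_loss, weights=(1.0, 1.0, 0.1)):
    """Weighted total loss of step 5.4: classification + SR network
    + multi-sample contrastive terms. Weights are assumed, not specified."""
    w_cls, w_sr, w_con = weights
    return w_cls * cls_loss + w_sr * sr_loss + w_con * contrast_loss
```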
step 6: in the test stage, inputting the low-resolution test images of image pair A into the image super-resolution model; the first-stage super-resolution network produces the super-resolution image, which is then fed into the second-stage classifier to obtain the predicted class.
Further, during the training of step 2, the low-resolution images and the corresponding labels are input into the classifier, a cross-entropy loss function is computed from the predicted and real labels, the classifier parameters are optimized with the Adam algorithm, and the resulting binary classifier is saved as the pre-trained classifier.
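The cross-entropy objective above can be written out directly; this sketch computes the loss only (the Adam update itself is omitted), and the logit values are arbitrary examples:

```python
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy over softmax probabilities for the two-class
    (healthy/diseased) classifier of step 2."""
    z = logits - logits.max()            # numerically stabilized softmax
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

# e.g. logits favoring class 0 (healthy) while the true label is 0
loss = cross_entropy(np.array([2.0, 0.5]), label=0)
```

The loss shrinks as the predicted probability of the true class approaches 1.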
Furthermore, in step 3.1 the multi-scale information extraction module is an Inception module from GoogLeNet; a residual connection with learnable parameters performs a weighted fusion, in the channel dimension, of the input image and the features produced by the module to obtain the extracted features.
Further, when training the super-resolution network in step 4, the super-resolution network loss is computed from the high-resolution images in the image pairs and the generated super-resolution images, the network parameters are optimized with the Adam algorithm, and the pre-trained super-resolution network is saved.
Further, the super-resolution network loss in step 5.2 includes an L1 loss and a perceptual loss, but not a contrastive loss.
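The L1-plus-perceptual composition can be sketched as below. The feature arrays stand in for activations from a pretrained encoder, and the 0.1 perceptual weight is an illustrative placeholder, not a value given by the patent:

```python
import numpy as np

def l1_loss(sr, hr):
    """Mean absolute pixel difference."""
    return np.abs(sr - hr).mean()

def perceptual_loss(feat_sr, feat_hr):
    """Mean squared distance in a feature space (e.g. features from a
    pretrained encoder); the inputs here are stand-ins for such features."""
    return ((feat_sr - feat_hr) ** 2).mean()

def sr_network_loss(sr, hr, feat_sr, feat_hr, w_perc=0.1):
    """Step 5.2 loss sketch: L1 + weighted perceptual, no contrastive term."""
    return l1_loss(sr, hr) + w_perc * perceptual_loss(feat_sr, feat_hr)
```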
Another aspect of the invention provides a system for enhancing machine vision of medical images using super-resolution techniques, comprising a preparation module, a classifier module, a super-resolution network module, a network training module, a model training module, and a test module, wherein:
the preparation module is used to produce low-resolution/high-resolution image pairs, each consisting of a low-resolution image, a high-resolution image and a label, the label being 0 or 1, and comprises the following components:
the raw data component takes the original medical image as the high-resolution image, produces a low-resolution image by downsampling, and marks the two classes, healthy and diseased, as 0 and 1 respectively to serve as the real labels;
the image pair A component crops the original medical image to retain the key information useful for the machine vision task, uses the cropped image as the high-resolution image, obtains the low-resolution image with bicubic downsampling at a given scale, and combines it with the real label to form image pair A of "low-resolution image, high-resolution image, label";
the image pair B component takes the original medical image as the high-resolution image, obtains the low-resolution image with bicubic downsampling at a given scale, and combines it with the real label to form image pair B of "low-resolution image, high-resolution image, label";
the classifier module trains a classifier on the medical images in image pair A; the classifier predicts the content label of an input image, and the output is the disease probability;
the super-resolution network module is used to design the super-resolution network and comprises the following components:
the head component adds a multi-scale information extraction module at the network head to extract features, preserving and enhancing the local and global information of the input image;
the tail component, at the network tail, concatenates the super-resolution image generated by the network with the input low-resolution image along the channel dimension, obtains the two dynamic-normalization parameters from two convolution branches respectively, and uses these parameters to correct the information distribution of the super-resolution image, producing the final super-resolution image;
the network training module inputs the low-resolution images from image pairs A and B into the super-resolution network for training, computes the super-resolution network loss from the generated super-resolution images and the corresponding high-resolution images, and optimizes the network to obtain a pre-trained super-resolution network; it comprises the following components:
the processing A component inputs the low-resolution images of image pair A into the super-resolution network for processing to obtain the corresponding super-resolution images, in which the key image information is retained;
the processing B component cuts the low-resolution images in image pair B into patches of suitable size, applies data-augmentation operations, and feeds them into the super-resolution network in batches to obtain the corresponding super-resolution images;
the computing component computes the super-resolution network loss from the super-resolution images generated from image pairs A and B and the corresponding high-resolution images, and uses this loss to optimize the network, obtaining the pre-trained super-resolution network;
the model training module designs the image super-resolution model, comprising the super-resolution network pre-trained in the network training module and the classifier pre-trained in the classifier module; it inputs the low-resolution images from image pairs A and B into the model, computes the total loss from the generated super-resolution images, the corresponding high-resolution images, the predicted classes and the real labels, and fine-tunes the super-resolution network parameters; it comprises the following components:
the classification loss component inputs the low-resolution images of image pair A into the super-resolution network to obtain the corresponding super-resolution images SRA, feeds SRA into the second-stage classifier, and computes the classification loss from the predicted and real labels;
the network loss component cuts image pair B into image blocks, applies data-augmentation operations, feeds them into the super-resolution network, and computes the super-resolution network loss from the super-resolution images and the corresponding high-resolution images;
the contrastive loss component uses the high-resolution images in image pair A as positive samples, the SRA generated in the current round as anchor samples, and the SRA generated a fixed number of rounds earlier as negative samples; it extracts high-frequency information from each with a wavelet transform, passes it through each convolution layer of the classifier to obtain the corresponding feature representations, retaining each layer's features for the corresponding sample, and computes the multi-sample contrastive loss among the positive, anchor and negative sample features;
the optimization component optimizes the super-resolution network using a weighted total loss of the classification loss, the super-resolution network loss and the multi-sample contrastive loss;
the testing module inputs the low-resolution test images of image pair A into the image super-resolution model; the first-stage super-resolution network produces the super-resolution image, which is fed into the second-stage classifier to obtain the predicted class.
Further, during training in the classifier module, the low-resolution images and the corresponding labels are input into the classifier, the cross-entropy loss function is computed from the predicted and real labels, the classifier parameters are optimized with the Adam algorithm, and the resulting binary classifier is saved as the pre-trained classifier.
Furthermore, the multi-scale information extraction module in the head component is an Inception module from GoogLeNet; a residual connection with learnable parameters performs a weighted fusion, in the channel dimension, of the input image and the features produced by the module to obtain the extracted features.
Further, when the network training module trains the super-resolution network, the super-resolution network loss is computed from the high-resolution images in the image pairs and the generated super-resolution images, the network parameters are optimized with the Adam algorithm, and the pre-trained network is finally saved.
Further, in the network loss component, the super-resolution network loss includes an L1 loss and a perceptual loss, but not a contrastive loss.
After adopting the above technical scheme, the invention has the following beneficial effects compared with the prior art:
1) By adopting the two-stage medical image super-resolution framework design, medical image super-resolution can take machine-vision task performance into account, giving medical diagnostic analysis a targeted way to improve reconstructed image quality and solving the technical problem that medical image super-resolution lacks a means of reconstructing images according to their actual application effect;
2) By fine-tuning with the downstream task on top of the pre-trained super-resolution network, combined with the adaptive multi-scale information extraction design, the model can learn medical image characteristics in a more targeted way, solving the potential mismatch that arises when a natural-image super-resolution model is migrated to a medical-image task because of the differences between the two domains;
3) By adopting the improved multi-sample contrastive learning strategy, the super-resolution image can approximate the high-resolution image faster and better during optimization, improving training convergence speed and reconstruction quality and solving the possible slow-convergence and low-efficiency problems of training high-performance models with many parameters;
In summary, the invention combines technologies 1-3 above, using the results of the machine vision task to drive medical image super-resolution toward more targeted improvement of reconstructed image quality, and overcoming the prior-art problem that medical image super-resolution considers only human visual effect while neglecting its influence on machine vision tasks.
Drawings
FIG. 1 is a flow chart of a method for enhancing machine vision of medical images using super resolution techniques.
Fig. 2 shows an X-ray image of the lung, (a) shows a high resolution image, and (b) shows a low resolution image.
FIG. 3 is a flowchart of method step 2 for enhancing machine vision of medical images using super resolution techniques.
Fig. 4 is a multi-scale information extraction design.
FIG. 5 is a flowchart of method step 4 for enhancing machine vision of medical images using super resolution techniques.
Fig. 6 is a diagram of an image super-resolution model structure.
FIG. 7 is a flowchart of method step 5 for enhancing machine vision of medical images using super resolution techniques.
FIG. 8 is a flowchart of method step 6 for enhancing machine vision of medical images using super resolution techniques.
Detailed Description
Advantages of the invention are further illustrated in the following description, taken in conjunction with the accompanying drawings and detailed description. It is to be understood by persons skilled in the art that the following detailed description is illustrative and not restrictive, and that this invention is not limited to the details given herein.
As shown in the flow chart of fig. 1, the method comprises the following steps:
step 1: using the lung X-ray image data for classification to produce low-resolution/high-resolution image pairs usable for the image super-resolution task, each consisting of a low-resolution image, a high-resolution image and a label, the label being 0 or 1;
step 1.1: in the lung X-ray image data, the lungs lie approximately in the central region of each image, and the images vary in size; the lung X-ray images, tagged with the corresponding patient numbers, are divided into two classes, "healthy" and "pneumonia"; to simplify the experiments, the labels of the two classes are marked as 0 and 1 respectively (0 for healthy, 1 for pneumonia). The labels are used in the machine vision task (this design takes medical image classification as its example) to judge whether the patient in a given image has pneumonia, and the corresponding "lung X-ray image, label" pairs form the data for the machine vision task. As shown in fig. 2, the original lung X-ray image is taken as the high-resolution image, and a low-resolution counterpart at a given scale is produced by downsampling; the corresponding low-resolution and high-resolution images form the data for the medical image super-resolution task. Finally, "low-resolution image, high-resolution image, label" triples make up the image pairs used in the invention;
step 1.2: to minimize the effect of image deformation on machine-vision performance (again taking medical image classification as the example), the original lung X-ray image is scaled proportionally by its short side and then cropped to the equal width and height (224 × 224) expected by the classifier input, so that the image retains only the key lung information useful for the machine vision task; this image is then used as the high-resolution image, a low-resolution image is obtained with bicubic downsampling at a given scale, and the corresponding 0-1 label is attached to form a "low-resolution image, high-resolution image, label" pair, recorded as image pair A;
step 1.3: meanwhile, since the cut image only retains the region conducive to the machine vision task, the incomplete image information can degrade the performance of the image super-resolution task; therefore the original lung X-ray image is used as a high-resolution image, a low-resolution image is obtained by a bicubic downsampling method according to a given scale, and the corresponding 0-1 label is combined to form a "low-resolution image, high-resolution image, label" image pair, recorded as image pair B;
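The data preparation of steps 1.1-1.3 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: nearest-neighbour scaling and block averaging stand in for the unspecified scaling and the bicubic downsampling, and all function names are hypothetical.

```python
import numpy as np

def short_side_scale_and_center_crop(img, size=224):
    """Scale so the short side equals `size` (nearest-neighbour for brevity),
    then center-crop to size x size, keeping the centrally located lung region."""
    h, w = img.shape
    scale = size / min(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    rows = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    scaled = img[rows][:, cols]
    top, left = (nh - size) // 2, (nw - size) // 2
    return scaled[top:top + size, left:left + size]

def downsample(img, scale):
    """Stand-in for bicubic downsampling: average over scale x scale blocks.
    A real pipeline would use the bicubic kernel the patent specifies."""
    h, w = img.shape
    h, w = h - h % scale, w - w % scale
    return img[:h, :w].reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def make_pairs(original, label, scale=4):
    """Build image pair A (cropped original) and image pair B (full original),
    each as a (low-resolution, high-resolution, label) triple."""
    hr_a = short_side_scale_and_center_crop(original)
    pair_a = (downsample(hr_a, scale), hr_a, label)
    pair_b = (downsample(original, scale), original, label)
    return pair_a, pair_b
```

For a 300 × 280 X-ray at scale 4, pair A yields a 224 × 224 high-resolution image with a 56 × 56 low-resolution counterpart, while pair B keeps the full original.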
step 2: training a classifier (this design takes medical image classification as an example of a machine vision task) with the lung X-ray images in image pair A (the image pair subjected to size cutting and downsampling); the classifier predicts whether the input image shows a healthy lung or a lung suffering from pneumonia, and outputs the pneumonia illness probability;
as shown in fig. 3, a schematic flow chart of step 2: in the training process, a low-resolution image and its corresponding label are input into the classifier, the probabilities of the image on the two classes "normal" and "pneumonia" are predicted, a cross entropy loss function is calculated from the predicted label and the real label, the classifier parameters are optimized with the Adam algorithm, a two-class classifier is obtained through training, and the training result is saved as a pre-trained classifier model;
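The classifier training of step 2 reduces to minimizing a cross-entropy loss over the 0-1 labels. A toy sketch with a linear model in place of the CNN classifier and plain gradient descent in place of Adam; all names and dimensions are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean cross-entropy; labels are 0 ("health") or 1 ("pneumonia")."""
    return float(-np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean())

# Toy linear classifier standing in for the CNN of step 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))       # 8 flattened low-resolution images
y = rng.integers(0, 2, size=8)     # real 0/1 labels
W = np.zeros((16, 2))
for _ in range(200):
    p = softmax(X @ W)
    W -= 0.5 * (X.T @ (p - np.eye(2)[y]) / len(y))
loss = cross_entropy(softmax(X @ W), y)
```

The loss drops well below the ln 2 ≈ 0.693 of a uniform predictor as the two-class decision boundary is learned.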
step 3: designing the first-stage super-resolution network on the basis of a classical network, such as RCAN (Residual Channel Attention Network, a very deep residual network with channel attention);
step 3.1: adding a multi-scale information extraction design at the network head; this design uses the Inception module of GoogLeNet and uses a residual connection to perform a weighted fusion, with learnable parameters in the channel dimension, of the input image and the features obtained by the module, yielding extracted features that retain and strengthen both the local and the global information of the input image; the design of this LMIFB (Learnable Multi-scale Information Fusion Block) is shown in fig. 4 of the attached drawings; the features are then fed into the network for the subsequent operations;
step 3.2: connecting a super-resolution image generated by a network with an input low-resolution image in a channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
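A reduced sketch of the head and tail designs of steps 3.1 and 3.2. The Inception branches and the two convolution branches are collapsed into plain channel-wise arithmetic so the fusion and the dynamic normalization are visible; a real implementation would use convolutional layers, and all names are hypothetical.

```python
import numpy as np

def lmifb(x, branch_feats, alpha):
    """Learnable Multi-scale Information Fusion Block (LMIFB), reduced to its
    fusion step: weight each multi-scale branch feature per channel with
    learnable parameters `alpha` and add it to the input as a residual.
    `branch_feats` stands in for the Inception-style parallel convolutions."""
    out = x.copy()
    for a, f in zip(alpha, branch_feats):
        out = out + a[:, None, None] * f  # per-channel weighted fusion
    return out

def dynamic_norm_tail(sr, lr_upsampled, w_gamma, w_beta):
    """Tail of the network: concatenate the SR output and the (upsampled) LR
    input on the channel axis, derive two dynamic-normalization parameters
    from two branches (1x1 'convolutions' reduced to channel mixing), and
    correct the SR information distribution."""
    cat = np.concatenate([sr, lr_upsampled], axis=0)  # (2C, H, W)
    gamma = np.tensordot(w_gamma, cat, axes=1)        # (C, H, W)
    beta = np.tensordot(w_beta, cat, axes=1)
    return gamma * sr + beta
```

The `gamma * sr + beta` correction is what lets the low-resolution input re-anchor the distribution of the generated super-resolution image.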
step 4: inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by using the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network;
step 4.1: the low-resolution image of image pair A (the image pair subjected to size cutting and downsampling) is passed through the super-resolution network to obtain the corresponding super-resolution image, in which the key image information is largely preserved;
step 4.2: when the low-resolution images of image pair B (the image pair in which the original image is directly downsampled) are input into the super-resolution network, they are, following the conventional data preprocessing of super-resolution networks, cut into patches of a suitable size (generally 48 or a multiple of 48 on a side) and, after data enhancement operation processing, input into the super-resolution network in batches to obtain the corresponding super-resolution images;
step 4.3: calculating super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, optimizing the super-resolution network by using the loss, and storing the training result as a pre-trained super-resolution network;
Fig. 5 shows a flow chart of step 4.
Two sets of image pairs A, B are made from the lung X-ray images: the super-resolution images that image pair A (the image pair subjected to size cutting and downsampling) generates after the super-resolution network is trained are input into the downstream task model, while image pair B (the image pair in which the original image is directly downsampled) is used exclusively for training the super-resolution network. The image pairs A, B are input into the super-resolution network; since image pair B is not resized to a uniform size, its images must be cut into patches (image blocks) and processed with data enhancement operations before being input into the super-resolution network. In the process of training the super-resolution network, the super-resolution network loss is calculated from the high-resolution image in each image pair and the generated super-resolution image, the network parameters are optimized with the Adam algorithm, and finally the pre-trained network is saved.
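The pretraining data flow of step 4 can be sketched as follows: aligned LR/HR patch cropping with the standard flip/rotate augmentation (step 4.2), plus the pixel reconstruction term of the super-resolution network loss. A simplified sketch under the assumption of single-channel images; function names are hypothetical.

```python
import numpy as np

def random_patches(lr, hr, scale=4, patch=48, n=4, rng=None):
    """Crop n aligned LR/HR patch pairs (patch side 48 or a multiple of 48,
    per step 4.2), with random horizontal flips and 90-degree rotations as
    the data enhancement operations."""
    rng = rng if rng is not None else np.random.default_rng()
    out = []
    for _ in range(n):
        y = rng.integers(0, lr.shape[0] - patch + 1)
        x = rng.integers(0, lr.shape[1] - patch + 1)
        lp = lr[y:y + patch, x:x + patch]
        hp = hr[y * scale:(y + patch) * scale, x * scale:(x + patch) * scale]
        if rng.random() < 0.5:              # horizontal flip
            lp, hp = lp[:, ::-1], hp[:, ::-1]
        k = int(rng.integers(0, 4))         # 90-degree rotations
        out.append((np.rot90(lp, k), np.rot90(hp, k)))
    return out

def l1_loss(sr, hr):
    """Pixel reconstruction term of the super-resolution network loss."""
    return float(np.abs(sr - hr).mean())
```

Image pair A skips this cropping entirely (its images already share a uniform 224 × 224 size), while image pair B goes through it before every batch.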
Step 5: the whole super-resolution model is designed as a two-stage medical image super-resolution framework comprising a first-stage super-resolution network and a second-stage classifier; at the beginning of training both stages use pre-trained networks, the first stage using the super-resolution network pre-trained in step 4 and the second stage using the classifier pre-trained in step 2; the low-resolution images of image pairs A, B are input into the model, the total loss is calculated from the generated super-resolution image, the corresponding high-resolution image, the predicted category, and the real label, and then only the parameters of the super-resolution network are fine-tuned while the classifier parameters are not updated; the classifier is used to calculate the classification loss that optimizes the super-resolution network; the model design is shown in fig. 6;
step 5.1: the low-resolution image of image pair A (the image pair subjected to size cutting and downsampling) is not cut into patches and undergoes no data enhancement operations; it is input directly into the super-resolution network to obtain the corresponding super-resolution image SRA, in which the key image information is largely preserved, so the SRA is input into the second-stage classifier and a classification loss is calculated from the predicted label and the real label;
step 5.2: image pair B (the image pair in which the original image is directly downsampled) is cut into patches and, after data enhancement operation processing, input into the super-resolution network, but does not pass through the second-stage classifier; only the super-resolution network loss (comprising the L1 loss and the perceptual loss, but not the contrast loss) is calculated from the super-resolution image and the corresponding high-resolution image;
step 5.3: in the optimization process, the high-resolution image in image pair A (the image pair subjected to size cutting and downsampling) is used as the positive sample, the SRA (the super-resolution image generated from image pair A) of the current epoch (round) is used as the anchor sample, and the SRA from an earlier epoch, separated by a fixed number of epochs, is used as the negative sample; high-frequency information is extracted from each by wavelet transformation and passed through each convolutional layer of the classifier to obtain the corresponding feature representations, the feature representation of every layer being retained for each sample, and finally the multi-sample contrast loss among the positive sample features, anchor sample features, and negative sample features is calculated;
step 5.4: optimizing the super-resolution network with the weighted total loss of the super-resolution network loss, the multi-sample contrast loss, and the classification loss;
Fig. 7 shows a flow chart of step 5.
In the training stage, the super-resolution network (SR network) is fine-tuned on the basis of the pre-trained network while the classifier is the pre-trained classifier with fixed parameters; image pairs A, B are input, image pair B (the image pair in which the original image is directly downsampled) is cut into patches, the low-resolution (LR) patches are input, after data enhancement operations, into the super-resolution network, and the super-resolution network Loss (SR Loss, which includes the improved multi-sample contrast Loss) is calculated from the generated super-resolution (SR) image and the corresponding high-resolution image; the LR image of image pair A directly generates an SR image through the SR network, the category is then predicted by the classifier, the classification Loss (Cls Loss) is calculated from the predicted label and the real label, and finally the parameters of the super-resolution network are fine-tuned and optimized with the weighted total Loss of the SR Loss and the classification Loss.
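The fine-tuning losses of step 5 can be sketched as follows. This sketch compares one-level Haar wavelet high-frequency sub-bands directly, omitting the pass through the classifier's convolutional layers that the patent uses for feature extraction; the loss weights are illustrative, not taken from the patent, and all names are hypothetical.

```python
import numpy as np

def haar_high_freq(img):
    """One-level Haar wavelet decomposition; return the three high-frequency
    sub-bands (LH, HL, HH) used to compare samples in step 5.3."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return np.stack([lh, hl, hh])

def contrast_loss(anchor, positives, negatives, eps=1e-8):
    """Multi-sample contrast loss on high-frequency features: pull the anchor
    (current-epoch SRA) toward the positives (HR images of image pair A) and
    away from the negatives (SRA from a fixed number of epochs earlier)."""
    fa = haar_high_freq(anchor)
    num = np.mean([np.abs(fa - haar_high_freq(p)).mean() for p in positives])
    den = np.mean([np.abs(fa - haar_high_freq(n)).mean() for n in negatives])
    return float(num / (den + eps))

def total_loss(sr_loss, cls_loss, con_loss, w=(1.0, 0.1, 0.1)):
    """Weighted total loss of step 5.4; the weights w are illustrative."""
    return w[0] * sr_loss + w[1] * cls_loss + w[2] * con_loss
```

The loss is zero when the anchor already matches the positive in its high-frequency bands, and grows as the anchor drifts from the high-resolution reference relative to the stale negative.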
Step 6: in the test stage, the low-resolution images reserved for testing in image pair A (the image pair subjected to size cutting and downsampling) are input into the image super-resolution model; a super-resolution image is obtained by the first-stage network, and the prediction category is then obtained from the super-resolution image by the second-stage classifier.
As shown in fig. 8, a flow chart of step 6: in the test stage, the lung X-ray image is input into the super-resolution model, a super-resolution image is obtained by the first-stage network, and the prediction category is obtained by the second-stage classifier.
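The two-stage test pipeline of step 6 is a simple composition of the two trained stages. In this sketch, nearest-neighbour upscaling and a mean-threshold rule are trivial hypothetical stand-ins for the trained super-resolution network and classifier.

```python
import numpy as np

def predict(lr_image, sr_network, classifier):
    """Test-stage pipeline of step 6: first-stage super-resolution network,
    then second-stage classifier; returns 0 (health) or 1 (pneumonia)."""
    sr = sr_network(lr_image)
    return int(np.argmax(classifier(sr)))

# Trivial stand-ins for the two trained stages (hypothetical):
def upscale(img):
    # nearest-neighbour x4 in place of the trained SR network
    return img.repeat(4, axis=0).repeat(4, axis=1)

def classify(img):
    # mean-threshold rule in place of the trained classifier
    return np.array([1.0, 0.0]) if img.mean() < 0.5 else np.array([0.0, 1.0])
```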
It should be noted that the above embodiments of the present invention are preferred embodiments and do not limit the invention in any way; any person skilled in the art may use the technical content disclosed above to change or modify it into equivalent effective embodiments without departing from the technical scope of the present invention, and any modification or equivalent change of the above embodiments according to the technical substance of the present invention still falls within the scope of the technical solution of the present invention.
Claims (10)
1. A method for enhancing machine vision of medical images using super-resolution techniques, comprising the steps of:
step 1: producing a low-resolution-high-resolution image pair, wherein the image pair consists of a low-resolution image, a high-resolution image and a label, the label is 0 or 1, and the method comprises the following steps of:
step 1.1: acquiring a medical image original image and a corresponding health condition, and marking labels of two types of health and diseases as 0 and 1 respectively as real labels;
step 1.2: cutting the original medical image so as to retain the key information helpful for the machine vision task, taking the cut image as a high-resolution image, obtaining a low-resolution image by a bicubic downsampling method according to a given scale, and combining the real label to form a "low-resolution image, high-resolution image, label" image pair A;
step 1.3: taking the original medical image as a high-resolution image, obtaining a low-resolution image by a bicubic downsampling method according to a given scale, and combining the real label to form a "low-resolution image, high-resolution image, label" image pair, recorded as image pair B;
step 2: training a classifier with the medical images in image pair A, wherein the classifier is used for predicting the label of the input image content, and the output result is the disease probability;
step 3: designing a super-resolution network on the basis of a classical network;
step 3.1: adding a multi-scale information extraction module to extract characteristics at the network head, and reserving and enhancing local information and whole information of an input image;
step 3.2: connecting a super-resolution image generated by a network with an input low-resolution image in a channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
step 4: inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by using the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network, wherein the method comprises the following steps of:
step 4.1: inputting the low-resolution image of image pair A into the super-resolution network to obtain a corresponding super-resolution image in which the key image information is preserved;
step 4.2: cutting the low-resolution image in image pair B into image blocks of suitable size, and inputting them into the super-resolution network in batches after data enhancement operation processing to obtain a corresponding super-resolution image;
step 4.3: calculating super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, and optimizing the super-resolution network by using the loss to obtain a pre-trained super-resolution network;
step 5: designing an image super-resolution model, wherein the image super-resolution model comprises a super-resolution network pre-trained in the step 4 and a classifier pre-trained in the step 2; inputting the low-resolution image of the image pair A, B into a model, calculating total loss by the generated super-resolution image, the corresponding high-resolution image, the predicted category and the real label, and fine-tuning the parameters of the super-resolution network, wherein the method comprises the following steps of:
step 5.1: inputting the low-resolution image of the image pair A into a super-resolution network to obtain a corresponding super-resolution image SRA, inputting the SRA into a classifier, and calculating classification loss by using a prediction label and a real label;
step 5.2: the image pair B is cut into image blocks, the image blocks are input into a super-resolution network after data enhancement operation processing, and super-resolution network loss is calculated by utilizing the super-resolution image and a corresponding high-resolution image;
step 5.3: using the high-resolution image in image pair A as the positive sample, the SRA generated in the current round as the anchor sample, and the SRA generated a fixed number of rounds earlier as the negative sample, extracting high-frequency information from each by wavelet transformation, passing it through each convolutional layer of the classifier to obtain the corresponding feature representations, retaining the feature representation of every layer for each sample in the process, and calculating the multi-sample contrast loss among the positive sample features, anchor sample features, and negative sample features;
step 5.4: optimizing the super-resolution network using a weighted total loss of the classification loss, the super-resolution network loss, and the multi-sample contrast loss;
step 6: in the test stage, the low-resolution image used for testing in the image pair A is input into an image super-resolution model, a super-resolution image is obtained by the super-resolution network in the first stage, and the super-resolution image is input into the classifier in the second stage to obtain a prediction category.
2. The method for enhancing machine vision of medical images using super-resolution techniques according to claim 1, wherein in the training process in step 2, low-resolution images and the corresponding labels are input into the classifier, a cross entropy loss function is calculated using the predicted labels and the real labels, the classifier parameters are optimized by the Adam algorithm to obtain a two-class classifier, and the classifier is stored as a pre-trained classifier.
3. The method for enhancing machine vision of medical images using super-resolution techniques according to claim 1, wherein the multi-scale information extraction module in step 3.1 is an Inception module from GoogLeNet, and the input image and the features obtained by the module are weighted and fused in the channel dimension with learnable parameters through a residual connection, so as to obtain the extracted features.
4. The method for enhancing machine vision of medical images using super resolution technology according to claim 1, wherein in the training of super resolution network in step 4, super resolution network loss is calculated by using the high resolution image in the image pair and the generated super resolution image, network parameters are optimized by Adam algorithm, and pre-trained super resolution network is saved.
5. The method of claim 1, wherein the super-resolution network loss in step 5.2 comprises an L1 loss and a perceptual loss, but does not comprise a contrast loss.
6. A system for enhancing machine vision of medical images using super-resolution techniques, comprising a preparation module, a classifier module, a super-resolution network module, a network training module, a model training module, and a test module, wherein:
the preparation module is used for manufacturing a low resolution-high resolution image pair, the image pair consists of a low resolution image, a high resolution image and a label, wherein the label is 0 or 1, and the preparation module comprises the following components:
the original data component takes the original medical image as a high-resolution image, produces a low-resolution image by a downsampling method, and marks the labels of the two classes "health" and "disease" as 0 and 1 respectively as the real labels;
the image pair A component is used for cutting the original medical image to retain the key information beneficial to the machine vision task, taking the cut image as a high-resolution image, obtaining a low-resolution image by a bicubic downsampling method according to a given scale, and combining the real label to form an image pair A of "low-resolution image, high-resolution image, label";
the image pair B component takes the original image of the medical image as a high-resolution image, obtains a low-resolution image by using a bicubic downsampling method according to a given scale, and combines a real label to form an image pair of 'low-resolution image, high-resolution image and label' as an image pair B;
the classifier module is used for training a classifier for the medical image in the image pair A, the classifier is used for predicting the content label of the input image, and the output result is the illness probability;
the super-resolution network module is used for designing a super-resolution network and comprises the following components:
the head component is used for adding a multi-scale information extraction module to the network head to extract characteristics, and reserving and enhancing local information and overall information of an input image;
the tail component is used for connecting the super-resolution image generated by the network and the input low-resolution image in the channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
the network training module is used for inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by utilizing the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network, and comprises the following components:
the processing A component inputs the low-resolution image of image pair A into the super-resolution network to obtain a corresponding super-resolution image in which the key image information is preserved;
the processing component B is used for cutting the low-resolution image in the image pair B into a patch with a proper size, and inputting the patch into a super-resolution network in batches after data enhancement operation processing to obtain a corresponding super-resolution image;
the computing component is used for computing super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, and optimizing the super-resolution network by using the loss to obtain a pre-trained super-resolution network;
the model training module is used for designing an image super-resolution model and comprises a super-resolution network pre-trained in the network training module and a classifier pre-trained in the classifier module; inputting the low-resolution image of the image pair A, B into a model, calculating total loss by the generated super-resolution image, the corresponding high-resolution image, the prediction category and the real label, and fine-tuning parameters of the super-resolution network, wherein the method comprises the following components:
the classification loss component inputs the low-resolution image of the image pair A into a super-resolution network to obtain a corresponding super-resolution image SRA, inputs the SRA into a classifier of a second stage, and calculates classification loss by using a prediction label and a real label;
the network loss component is used for cutting image pair B into image blocks, inputting them into the super-resolution network after data enhancement operation processing, and calculating the super-resolution network loss using the super-resolution image and the corresponding high-resolution image;
the contrast loss component uses the high-resolution image in image pair A as the positive sample, the SRA generated in the current round as the anchor sample, and the SRA generated a fixed number of rounds earlier as the negative sample; high-frequency information is extracted from each by wavelet transformation and then passed through each convolutional layer of the classifier to obtain the corresponding feature representations, the feature representation of every layer being retained for each sample, and the multi-sample contrast loss among the positive sample features, anchor sample features, and negative sample features is calculated;
an optimization component that optimizes the super-resolution network using a weighted total loss of the classification loss, the super-resolution network loss, and the multi-sample contrast loss;
the testing module is used for inputting the low-resolution image used for testing in the image pair A into the image super-resolution model, obtaining the super-resolution image by the super-resolution network in the first stage, and inputting the super-resolution image into the classifier in the second stage to obtain the prediction category.
7. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein the classifier module inputs low-resolution images and the corresponding labels into the classifier during training, calculates a cross entropy loss function using the predicted and real labels, optimizes the classifier parameters by the Adam algorithm to obtain a two-class classifier, and saves it as a pre-trained classifier.
8. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein the multi-scale information extraction module in the head component is an Inception module from GoogLeNet, and the input image and the features obtained by the module are weighted and fused in the channel dimension with learnable parameters through a residual connection, so as to obtain the extracted features.
9. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein in the training of the super-resolution network in the network training module, the super-resolution network loss is calculated using the high-resolution image in each image pair and the generated super-resolution image, the network parameters are optimized by the Adam algorithm, and finally the pre-trained network is saved.
10. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein in the network loss component the super-resolution network loss comprises an L1 loss and a perceptual loss, but does not comprise a contrast loss.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311580384.4A CN117522693A (en) | 2023-11-24 | 2023-11-24 | Method and system for enhancing machine vision of medical images using super resolution techniques |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117522693A true CN117522693A (en) | 2024-02-06 |
Family
ID=89766038
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 