CN117522693A - Method and system for enhancing machine vision of medical images using super resolution techniques - Google Patents


Info

Publication number
CN117522693A
Authority
CN
China
Prior art keywords
resolution
image
super
network
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311580384.4A
Other languages
Chinese (zh)
Inventor
黄飞跃
朱立峰
柏志安
孟煜祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd
Original Assignee
Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd filed Critical Ruinjin Hospital Affiliated to Shanghai Jiaotong University School of Medicine Co Ltd
Priority to CN202311580384.4A priority Critical patent/CN117522693A/en
Publication of CN117522693A publication Critical patent/CN117522693A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks


Abstract

The invention discloses a method and a system for enhancing the machine vision of medical images using super-resolution technology, comprising the following steps: S1, constructing low-resolution/high-resolution image pairs; S2, training a classifier on the medical images in the image pairs; S3, designing a super-resolution network on the basis of a classical network; S4, training and optimizing the super-resolution network to obtain a pre-trained super-resolution network; S5, designing an image super-resolution model and fine-tuning the parameters of the super-resolution network; and S6, testing the image super-resolution model. With this method and system, the results of the machine vision task can be exploited to drive medical-image super-resolution toward more targeted improvement of reconstructed-image quality, solving the prior-art problem that medical-image super-resolution considers only the human visual effect of the result while neglecting its influence on machine vision tasks.

Description

Method and system for enhancing machine vision of medical images using super resolution techniques
Technical Field
The present invention relates to the field of image recognition, and more particularly, to a method and system for enhancing machine vision of medical images using super resolution techniques.
Background
Medical imaging refers to the use of medical imaging equipment to obtain detailed information about a patient's internal organs or tissues without invading the body. These images contain rich physiological, anatomical and pathological information and are of great significance for diagnosing a patient's condition, formulating treatment regimens and conducting medical research. In particular, high-quality medical images provide more accurate detail, markedly improve the accuracy of pathological diagnosis, allow lesion locations to be identified precisely and support a deeper understanding of the affected region. Early applications of super-resolution technology were mainly to natural images. With the continuing progress of deep-learning methods in natural-image super-resolution, classical networks such as EDSR, SRGAN, RCAN and SwinIR have also begun to be applied to medical images and have demonstrated great application potential.
Using convolution-based deep-learning methods, medical-image super-resolution has achieved remarkable performance gains, excelling both in human visual quality and in objective evaluation metrics. In the medical field, however, the goal of image processing is not merely an improvement in image quality; what matters more is the contribution of the images to the actual diagnostic task. As a result, current deep-learning-based medical-image super-resolution still faces several challenges:
1. most current techniques still focus on reconstructing high-quality images while neglecting whether these images actually help improve the performance of machine vision tasks (specific medical diagnostic analyses such as classification, segmentation and detection);
2. medical images and natural images differ significantly in image characteristics, imaging modalities, signal-to-noise ratio and so on, so network designs should account for these differences in a more targeted way;
3. with increasingly complex network designs and the small scale of public medical-image datasets, training models with large numbers of parameters may converge slowly, which not only affects model performance but also reduces experimental efficiency.
Disclosure of Invention
In order to overcome these technical defects, the technical problems to be solved by the invention are as follows:
1. addressing the lack of attention to machine-vision-task performance in medical-image super-resolution, this patent provides a two-stage medical-image super-resolution framework design for machine vision enhancement, which strengthens the machine-vision effect while preserving the human visual quality of the reconstructed image, thereby improving machine-vision-task performance;
2. addressing the differences between medical and natural images, this patent proposes fine-tuning and framework design on top of a super-resolution network pre-trained on medical images, together with an adaptive multi-scale information extraction design, so that the model learns medical-image characteristics in a more targeted way;
3. addressing the slow convergence of training models with many parameters, this patent provides an improved multi-sample contrastive learning strategy, which also helps improve the reconstruction quality of the super-resolution network.
To this end, one aspect of the present invention provides a method of enhancing machine vision of medical images using super resolution techniques, comprising the steps of:
step 1: producing a low-resolution-high-resolution image pair, wherein the image pair consists of a low-resolution image, a high-resolution image and a label, the label is 0 or 1, and the method comprises the following steps of:
step 1.1: acquiring a medical image original image and a corresponding health condition, and marking labels of two types of health and diseases as 0 and 1 respectively as real labels;
step 1.2: cropping the original medical image to retain the key information helpful for the machine vision task; taking the cropped image as the high-resolution image, obtaining a low-resolution image by bicubic downsampling at a given scale, and combining them with the real label to form a "low-resolution image, high-resolution image, label" image pair, denoted image pair A;
step 1.3: taking the original medical image as the high-resolution image, obtaining a low-resolution image by bicubic downsampling at a given scale, and combining them with the real label to form a "low-resolution image, high-resolution image, label" image pair, denoted image pair B;
step 2: training a classifier for predicting content labels of the input image by using the medical image in the image pair A, and outputting a result as a disease probability;
step 3: designing a super-resolution network on the basis of a classical network;
step 3.1: adding a multi-scale information extraction module to extract characteristics at the network head, and reserving and enhancing local information and whole information of an input image;
step 3.2: connecting a super-resolution image generated by a network with an input low-resolution image in a channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
step 4: inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by using the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network, wherein the method comprises the following steps of:
step 4.1: inputting the low-resolution image of the image pair A into a super-resolution network for post-processing to obtain a corresponding super-resolution image, wherein key information in the image is reserved;
step 4.2: cropping the low-resolution images in image pair B into image blocks of suitable size, applying data augmentation, and feeding the blocks into the super-resolution network in batches to obtain the corresponding super-resolution images;
step 4.3: calculating super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, and optimizing the super-resolution network by using the loss to obtain a pre-trained super-resolution network;
step 5: designing an image super-resolution model, wherein the image super-resolution model comprises a super-resolution network pre-trained in the step 4 and a classifier pre-trained in the step 2; inputting the low-resolution image of the image pair A, B into a model, calculating total loss by the generated super-resolution image, the corresponding high-resolution image, the predicted category and the real label, and fine-tuning the parameters of the super-resolution network, wherein the method comprises the following steps of:
step 5.1: inputting the low-resolution image of the image pair A into a super-resolution network to obtain a corresponding super-resolution image SRA, inputting the SRA into a classifier, and calculating classification loss by using a prediction label and a real label;
step 5.2: cropping image pair B into image blocks, applying data augmentation before input to the super-resolution network, and calculating the super-resolution network loss using the super-resolution images and the corresponding high-resolution images;
step 5.3: using the high-resolution image in image pair A as the positive sample, the SRA generated in the current round as the anchor sample, and SRA outputs saved from earlier rounds (at a fixed round interval) as negative samples; extracting high-frequency information from each by wavelet transform, converting it through each convolution layer of the classifier into corresponding feature representations while retaining the per-layer features of each sample, and calculating the multi-sample contrastive loss among the positive-sample, anchor-sample and negative-sample features;
step 5.4: optimizing the super-resolution network using a weighted total of the classification loss, the super-resolution network loss and the multi-sample contrastive loss;
step 6: in the test stage, the low-resolution image used for testing in the image pair A is input into an image super-resolution model, a super-resolution image is obtained by the super-resolution network in the first stage, and the super-resolution image is input into the classifier in the second stage to obtain a prediction category.
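The two-stage test pipeline of step 6 can be sketched as follows. This is a minimal illustration only: `sr_net` and `classifier` are placeholder stubs standing in for the pre-trained super-resolution network and classifier, not the patent's actual models.

```python
import numpy as np

def sr_net(lr):
    """Stub first-stage super-resolution network: nearest-neighbor 2x
    upscale standing in for the trained SR model."""
    return lr.repeat(2, axis=0).repeat(2, axis=1)

def classifier(sr):
    """Stub second-stage classifier: returns a disease probability."""
    logit = sr.mean() - 0.5          # placeholder decision rule
    return 1.0 / (1.0 + np.exp(-logit))

def two_stage_predict(lr, threshold=0.5):
    sr = sr_net(lr)                  # stage 1: reconstruct the SR image
    p = classifier(sr)               # stage 2: predict disease probability
    return sr, p, int(p >= threshold)

lr = np.random.default_rng(0).random((112, 112))
sr, prob, label = two_stage_predict(lr)
```

The point of the two-stage layout is that the classifier never sees the low-resolution input directly, only the reconstruction.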
Further, in the training process of step 2, the low-resolution images and their corresponding labels are input into the classifier, a cross-entropy loss is calculated from the predicted and real labels, the classifier parameters are optimized with the Adam algorithm, and the resulting binary classifier is saved as the pre-trained classifier.
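The classifier training described above (cross-entropy loss optimized with Adam) can be sketched with a toy logistic-regression classifier in numpy; the data, dimensions and hyperparameters are illustrative assumptions, not the patent's.

```python
import numpy as np

rng = np.random.default_rng(0)

def adam_step(p, g, m, v, t, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameter p with gradient g and state (m, v)."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# toy data: 64 "images" flattened to 16 features, labels 0 (healthy) / 1 (diseased)
X = rng.normal(size=(64, 16))
y = (X[:, 0] + 0.1 * rng.normal(size=64) > 0).astype(float)

w = np.zeros(16); b = 0.0
mw = vw = np.zeros(16); mb = vb = 0.0
for t in range(1, 201):
    p = 1 / (1 + np.exp(-(X @ w + b)))            # predicted disease probability
    # binary cross-entropy between predicted and real labels
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    gw = X.T @ (p - y) / len(y); gb = np.mean(p - y)
    w, mw, vw = adam_step(w, gw, mw, vw, t)
    b, mb, vb = adam_step(b, gb, mb, vb, t)
acc = np.mean((p >= 0.5) == y)
```

A real implementation would use a deep network (and a framework optimizer), but the loss and update rule are the same in shape.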
Furthermore, in step 3.1 the multi-scale information extraction module is an Inception module as in GoogLeNet; via a residual connection, the input image and the features produced by the module are weighted and fused along the channel dimension using learnable parameters to obtain the extracted features.
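The learnable weighted fusion can be illustrated as below; the Inception-style branches are simplified to mean filters of different receptive fields (the real module uses learned convolutions), and the softmax-weighted fusion is one plausible reading of the "learnable parameters", not the patent's exact formulation.

```python
import numpy as np

def box_filter(x, k):
    """k-by-k mean filter with edge padding: a stand-in for a conv branch
    with receptive field k (the real module uses learned convolutions)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x)
    for i in range(k):
        for j in range(k):
            out += xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out / (k * k)

def lmifb(x, alphas):
    """Learnable multi-scale fusion: Inception-style branches at several
    receptive fields, fused with learnable (here softmax-normalized)
    weights, plus a residual connection back to the input."""
    branches = [x, box_filter(x, 3), box_filter(x, 5)]   # ~1x1 / 3x3 / 5x5
    w = np.exp(alphas) / np.exp(alphas).sum()            # learnable weights
    fused = sum(wi * b for wi, b in zip(w, branches))
    return x + fused                                     # residual connection

x = np.random.default_rng(1).random((32, 32))
y = lmifb(x, alphas=np.zeros(3))
```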
Further, in the process of training the super-resolution network in the step 4, the super-resolution network loss is calculated by utilizing the high-resolution image in the image pair and the generated super-resolution image, the network parameters are optimized by the Adam algorithm, and the pre-trained super-resolution network is stored.
Further, the super-resolution network loss in step 5.2 includes an L1 loss and a perceptual loss, but does not include a contrastive loss.
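A sketch of such a loss is given below, with a fixed image-gradient extractor standing in for the deep feature network (e.g. VGG) that perceptual losses conventionally use; the weighting is an illustrative assumption.

```python
import numpy as np

def l1_loss(sr, hr):
    """Pixel-wise L1 reconstruction loss."""
    return np.mean(np.abs(sr - hr))

def grad_features(x):
    """Fixed horizontal/vertical gradient maps: a stand-in for the deep
    feature extractor a perceptual loss normally compares in."""
    return np.diff(x, axis=1), np.diff(x, axis=0)

def perceptual_loss(sr, hr):
    sx, sy = grad_features(sr)
    hx, hy = grad_features(hr)
    return np.mean(np.abs(sx - hx)) + np.mean(np.abs(sy - hy))

def sr_network_loss(sr, hr, w_perc=0.1):
    # L1 plus a weighted perceptual term; no contrastive term at this stage
    return l1_loss(sr, hr) + w_perc * perceptual_loss(sr, hr)

rng = np.random.default_rng(2)
hr = rng.random((64, 64))
loss = sr_network_loss(hr + 0.1, hr)   # constant shift: only L1 reacts
```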
Another aspect of the invention provides a system for enhancing machine vision of medical images using super-resolution techniques, comprising a preparation module, a classifier module, a super-resolution network module, a network training module, a model training module, and a test module, wherein:
the preparation module is used for constructing low-resolution/high-resolution image pairs; an image pair consists of a low-resolution image, a high-resolution image and a label, the label being 0 or 1, and the module comprises the following components:
the raw-data component takes the original medical image as the high-resolution image, produces a low-resolution image by downsampling, and marks the "healthy" and "diseased" classes with labels 0 and 1 respectively as the real labels;
the image pair A component is used for cutting the original image size of the medical image to retain key information which is beneficial to machine vision tasks, and taking the medical image as a high-resolution image, obtaining a low-resolution image according to a given scale by using a bicubic downsampling method, and combining with a real label to form an image pair A, wherein the image pair A is a low-resolution image, a high-resolution image and a label;
the image pair B component takes the original image of the medical image as a high-resolution image, obtains a low-resolution image by using a bicubic downsampling method according to a given scale, and combines a real label to form an image pair of 'low-resolution image, high-resolution image and label' as an image pair B;
the classifier module is used for training a classifier for the medical image in the image pair A, the classifier is used for predicting the content label of the input image, and the output result is the illness probability;
the super-resolution network module is used for designing a super-resolution network and comprises the following components:
the head component is used for adding a multi-scale information extraction module to the network head to extract characteristics, and reserving and enhancing local information and overall information of an input image;
the tail component is used for connecting the super-resolution image generated by the network and the input low-resolution image in the channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
the network training module is used for inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by utilizing the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network, and comprises the following components:
the processing A component inputs the low-resolution image of the image pair A into a super-resolution network for post-processing to obtain a corresponding super-resolution image, wherein key information in the image is reserved;
the processing component B crops the low-resolution images in image pair B into patches of suitable size, applies data augmentation, and feeds them into the super-resolution network in batches to obtain the corresponding super-resolution images;
the computing component is used for computing super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, and optimizing the super-resolution network by using the loss to obtain a pre-trained super-resolution network;
the model training module is used for designing an image super-resolution model and comprises a super-resolution network pre-trained in the network training module and a classifier pre-trained in the classifier module; inputting the low-resolution image of the image pair A, B into a model, calculating total loss by the generated super-resolution image, the corresponding high-resolution image, the prediction category and the real label, and fine-tuning parameters of the super-resolution network, wherein the method comprises the following components:
the classification loss component inputs the low-resolution image of the image pair A into a super-resolution network to obtain a corresponding super-resolution image SRA, inputs the SRA into a classifier of a second stage, and calculates classification loss by using a prediction label and a real label;
the network loss component crops image pair B into image blocks, applies data augmentation before input to the super-resolution network, and calculates the super-resolution network loss using the super-resolution images and the corresponding high-resolution images;
the contrast loss component uses the high-resolution image in image pair A as the positive sample, the SRA generated in the current round as the anchor sample, and SRA outputs generated a fixed number of rounds earlier as negative samples; it extracts high-frequency information by wavelet transform, converts it through each convolution layer of the classifier into corresponding feature representations while retaining the per-layer features of each sample, and calculates the multi-sample contrastive loss among the positive-sample, anchor-sample and negative-sample features;
an optimization component that optimizes the super-resolution network using a weighted total of the classification loss, the super-resolution network loss and the multi-sample contrastive loss;
the testing module is used for inputting the low-resolution image used for testing in the image pair A into the image super-resolution model, obtaining the super-resolution image by the super-resolution network in the first stage, and inputting the super-resolution image into the classifier in the second stage to obtain the prediction category.
Further, in the training process of the classifier module, the low-resolution images and their corresponding labels are input into the classifier, a cross-entropy loss is calculated from the predicted and real labels, the classifier parameters are optimized with the Adam algorithm, and the resulting binary classifier is saved as the pre-trained classifier.
Furthermore, the multi-scale information extraction module in the head component is an Inception module as in GoogLeNet; via a residual connection, the input image and the features produced by the module are weighted and fused along the channel dimension using learnable parameters to obtain the extracted features.
Further, in the process of training the super-resolution network in the network training module, the super-resolution network loss is calculated by utilizing the high-resolution image in the image pair and the generated super-resolution image, and the network parameters are optimized by the Adam algorithm, so that a pre-training network is finally saved.
Further, in the network loss component the super-resolution network loss includes an L1 loss and a perceptual loss, but does not include a contrastive loss.
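The contrast loss component and the optimization component can be sketched as follows. This is a simplified reading: one Haar wavelet level stands in for both the wavelet-based high-frequency extraction and the classifier's multi-layer features, and the ratio-style loss and the weights are illustrative assumptions, not the patent's exact formulation.

```python
import numpy as np

def haar_highfreq(x):
    """One-level Haar detail coefficients (LH, HL, HH): the extracted
    high-frequency information of an image with even height and width."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    return np.stack([(a - b + c - d), (a + b - c - d), (a - b - c + d)]) / 2

def contrastive_loss(anchor, positive, negatives, eps=1e-8):
    """Multi-sample contrastive term: pull the anchor (current SR output)
    toward the positive (HR image) and away from negatives (SR outputs
    saved from earlier training rounds), compared in high-frequency space."""
    fa, fp = haar_highfreq(anchor), haar_highfreq(positive)
    d_pos = np.mean(np.abs(fa - fp))
    d_neg = sum(np.mean(np.abs(fa - haar_highfreq(n))) for n in negatives)
    return d_pos / (d_neg / len(negatives) + eps)

def total_loss(cls_loss, sr_loss, con_loss, w=(1.0, 1.0, 0.1)):
    # weighted sum of classification, SR-network and contrastive losses
    return w[0] * cls_loss + w[1] * sr_loss + w[2] * con_loss

rng = np.random.default_rng(3)
hr = rng.random((32, 32))
sr_now = hr + 0.05 * rng.normal(size=(32, 32))   # anchor: current SR output
sr_old = [hr + 0.5 * rng.normal(size=(32, 32))]  # negative: earlier SR output
loss = contrastive_loss(sr_now, hr, sr_old)
```

Minimizing the ratio both shrinks the anchor-positive distance and enlarges the anchor-negative distance, which is the intuition behind using earlier-round outputs as negatives.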
After the technical scheme is adopted, compared with the prior art, the method has the following beneficial effects:
1) The two-stage medical-image super-resolution framework design allows medical-image super-resolution to take the performance of the machine vision task into account, improving reconstructed-image quality in a way targeted at medical diagnostic analysis, and solving the prior-art problem that medical-image super-resolution does not reconstruct images according to their actual downstream use;
2) Fine-tuning with the downstream task on top of a pre-trained super-resolution network, combined with the adaptive multi-scale information extraction design, enables the model to learn medical-image characteristics in a more targeted way, and addresses the possible mismatch when a natural-image super-resolution model is transferred to a medical-image task owing to the differences between medical and natural images;
3) The improved multi-sample contrastive learning strategy lets the super-resolution image approach the high-resolution image faster and more closely during optimization, improving training convergence speed and reconstruction quality, and addressing the slow convergence and low efficiency that may arise when training high-performance models with many parameters;
In summary, by combining techniques 1-3, the invention uses the results of the machine vision task to drive medical-image super-resolution toward more targeted improvement of reconstructed-image quality, overcoming the prior-art problem of considering only the human visual effect of medical-image super-resolution while neglecting its influence on machine vision tasks.
Drawings
FIG. 1 is a flow chart of a method for enhancing machine vision of medical images using super resolution techniques.
Fig. 2 shows an X-ray image of the lung, (a) shows a high resolution image, and (b) shows a low resolution image.
FIG. 3 is a flowchart of method step 2 for enhancing machine vision of medical images using super resolution techniques.
Fig. 4 is a multi-scale information extraction design.
FIG. 5 is a flowchart of method step 4 for enhancing machine vision of medical images using super resolution techniques.
Fig. 6 is a diagram of an image super-resolution model structure.
FIG. 7 is a flowchart of method steps 5 for enhancing machine vision of medical images using super resolution techniques.
FIG. 8 is a flowchart of method step 6 for enhancing machine vision of medical images using super resolution techniques.
Detailed Description
Advantages of the invention are further illustrated in the following description, taken in conjunction with the accompanying drawings and detailed description. It is to be understood by persons skilled in the art that the following detailed description is illustrative and not restrictive, and that this invention is not limited to the details given herein.
As shown in fig. 1, the flow chart of the method comprises the following steps:
step 1: using the lung X-ray image data for classification to produce a low-resolution-high-resolution image pair which can be used for image super-resolution tasks, wherein the image pair consists of a low-resolution image, a high-resolution image and a label, and the label is 0 or 1;
step 1.1: in the lung X-ray image data, the lungs are roughly located in the central area of each image, and the image sizes vary; each lung X-ray image, tagged with its patient number, belongs to one of the two classes "healthy" and "pneumonia". To simplify the experiments, the labels of the two classes are marked 0 and 1 respectively (0 for healthy, 1 for pneumonia); the label is used in the machine vision task (medical image classification is taken as the example machine vision task throughout this design) to judge whether the patient corresponding to a given image has pneumonia, and the corresponding "lung X-ray image, label" pairs form the data for the machine vision task. As shown in fig. 2, the original lung X-ray image is taken as the high-resolution image, and a low-resolution image at the given scale is produced from it by downsampling; the corresponding low- and high-resolution images form the data for the medical-image super-resolution task. Finally, "low-resolution image, high-resolution image, label" triples make up the image pairs used in the present invention;
step 1.2: to minimize the influence of image deformation on machine-vision-task performance, each original lung X-ray image is scaled proportionally by its short side and then cropped to the equal width and height (224 × 224) suitable for the classifier's input, so that only the key lung information helpful for the machine vision task remains; this image is then taken as the high-resolution image, a low-resolution image is obtained by bicubic downsampling at the given scale, and the corresponding 0-1 label is attached to form a "low-resolution image, high-resolution image, label" pair, denoted image pair A;
step 1.3: meanwhile, because the cropped image retains only the region useful for the machine vision task, the incomplete image information could harm the image super-resolution task; therefore the original lung X-ray image is also taken as a high-resolution image, a low-resolution image is obtained by bicubic downsampling at the given scale, and the corresponding 0-1 label is attached to form a "low-resolution image, high-resolution image, label" pair, denoted image pair B;
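The pair construction of steps 1.2 and 1.3 can be sketched as follows, with Pillow's bicubic resize standing in for the bicubic downsampling; the scale factor and toy data are assumptions.

```python
import numpy as np
from PIL import Image

def make_pair(hr_array, scale=4, label=0):
    """Build one (LR, HR, label) training triple: bicubic-downsample the
    HR image by the given scale (Pillow's bicubic resize stands in for
    the bicubic downsampling described in the text)."""
    h, w = hr_array.shape
    img = Image.fromarray((hr_array * 255).astype(np.uint8))
    lr_img = img.resize((w // scale, h // scale), Image.BICUBIC)
    lr_array = np.asarray(lr_img).astype(np.float32) / 255.0
    return lr_array, hr_array, label

# toy 224x224 "cropped X-ray" as in image pair A, with label 1 (pneumonia)
hr = np.random.default_rng(4).random((224, 224))
lr, hr_out, label = make_pair(hr, scale=4, label=1)
```

For image pair B the same routine would be applied to the uncropped original instead of the 224 × 224 crop.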
step 2: training a classifier (medical image classification as the example machine vision task) using the lung X-ray images in image pair A (the pair produced by size cropping and resampling); the classifier predicts whether the input image shows a healthy lung or a lung with pneumonia, and outputs the pneumonia probability;
as shown in fig. 3, which is a schematic flow chart of step 2, in the training process, a low-resolution image and a corresponding label are input into a classifier, the probability of the image on both normal and pneumonia is predicted, a cross entropy loss function is calculated by using a predicted label and a real label, the classifier is optimized by using the cross entropy loss function, classifier parameters are optimized by Adam algorithm, a classifier with two classes is obtained through training, and the training result is saved as a pre-trained classifier model;
step 3: designing the first-stage super-resolution network on the basis of a classical network such as RCAN (very deep Residual Channel Attention Network);
step 3.1: adding a multi-scale information extraction design at the network head. The design uses an Inception module as in GoogLeNet and, via a residual connection, performs a weighted fusion of the input image and the module's features along the channel dimension using learnable parameters to obtain the extracted features, which retain and strengthen the local and global information of the input image; this Learnable Multi-scale Information Fusion Block (LMIFB) design is shown in fig. 4, after which the features are fed into the network for the subsequent operations;
step 3.2: at the network tail, concatenating the super-resolution image generated by the network with the input low-resolution image in the channel dimension, obtaining the two parameters of a dynamic normalization from two convolution branches respectively, and correcting the information distribution of the super-resolution image with these parameters to obtain the final super-resolution image;
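A rough sketch of the dynamic normalization tail, under stated assumptions: the low-resolution input is upsampled before channel concatenation (the text does not specify this), and two simple statistics stand in for the two learned convolution branches that would produce the scale and shift parameters.

```python
import numpy as np

def nearest_upsample(lr, scale):
    # Nearest-neighbour upsampling so LR and SR sizes match (assumption).
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

def dynamic_norm_tail(sr, lr, scale=2):
    """Correct the SR image's information distribution (sketch).

    The concatenated SR/LR stack feeds two 'branches'; here simple
    statistics replace the learned convolutions: branch 1 yields a
    scale gamma, branch 2 a shift beta, applied to the SR image.
    """
    ref = nearest_upsample(lr, scale)
    stacked = np.stack([sr, ref])                  # channel-dim concatenation
    gamma = 1.0 + 0.1 * np.tanh(stacked.mean())    # branch 1 -> scale
    beta = 0.1 * (ref.mean() - sr.mean())          # branch 2 -> shift
    return gamma * sr + beta
```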
step 4: inputting the low-resolution images of image pairs A and B into the super-resolution network for training, computing the super-resolution network loss from the generated super-resolution images and the corresponding high-resolution images, and optimizing the network to obtain the pre-trained super-resolution network;
step 4.1: the low-resolution image of image pair A (the size-cropped and downsampled image pair) is passed through the super-resolution network to obtain the corresponding super-resolution image, in which the key information of the image is relatively completely preserved;
step 4.2: when the low-resolution images of image pair B (the image pair obtained by directly downsampling the original image) are input into the super-resolution network, they are first cropped into patches of a suitable size (generally 48 or a multiple of 48), following the conventional data preprocessing of super-resolution networks, and, after data augmentation, input into the network in batches to obtain the corresponding super-resolution images;
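The patch cropping and augmentation of step 4.2 might look as follows; an illustrative sketch in which the flip and rotation choices follow common super-resolution practice rather than any detail claimed here.

```python
import numpy as np

def random_patch(lr, hr, patch=48, scale=2, rng=None):
    """Crop aligned LR/HR patches (patch size 48 on the LR side, per the
    patent) and apply flip/rotation data augmentation."""
    rng = rng if rng is not None else np.random.default_rng()
    y = rng.integers(0, lr.shape[0] - patch + 1)
    x = rng.integers(0, lr.shape[1] - patch + 1)
    lp = lr[y:y + patch, x:x + patch]
    hp = hr[y * scale:(y + patch) * scale, x * scale:(x + patch) * scale]
    if rng.random() < 0.5:            # random horizontal flip
        lp, hp = lp[:, ::-1], hp[:, ::-1]
    k = int(rng.integers(0, 4))       # random 90-degree rotation
    return np.rot90(lp, k), np.rot90(hp, k)

lr_img = np.zeros((96, 96))
hr_img = np.zeros((192, 192))
lp, hp = random_patch(lr_img, hr_img, rng=np.random.default_rng(0))
```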
step 4.3: computing the super-resolution network loss from the super-resolution images generated from image pairs A and B and the corresponding high-resolution images, optimizing the network with this loss, and saving the training result as the pre-trained super-resolution network;
as shown in fig. 5, a flow chart of step 4 is shown.
Two sets of image pairs, A and B, are made from the lung X-ray images: the super-resolution images generated from image pair A (size-cropped and downsampled) after training the super-resolution network are fed into the downstream task model, while image pair B (original image directly downsampled) is used only for training the super-resolution network. Both image pairs are input into the super-resolution network; because image pair B is not resized to a uniform size, it must be cropped into patches (image blocks) and processed with data augmentation before being input. During training of the super-resolution network, the super-resolution network loss is computed from the high-resolution image of each pair and the generated super-resolution image, the network parameters are optimized with the Adam algorithm, and the pre-trained network is finally saved.
Step 5: the whole super-resolution model is designed as a two-stage medical image super-resolution framework comprising a first-stage super-resolution network and a second-stage classifier, both initialized at the start of training from pre-trained networks: the second-stage classifier pre-trained in step 2 and the first-stage super-resolution network pre-trained in step 4. The model is fed the low-resolution images of image pairs A and B, the total loss is computed from the generated super-resolution image, the corresponding high-resolution image, the predicted class and the true label, and then only the parameters of the super-resolution network are fine-tuned; the classifier parameters are not updated, while the classification loss computed by the classifier is used to optimize the super-resolution network. The model design is shown in fig. 6;
step 5.1: the low-resolution image of image pair A (size-cropped and downsampled) is input directly into the super-resolution network, without patch cropping or data augmentation, to obtain the corresponding super-resolution image SRA; since SRA preserves the key information of the image relatively completely, it is input into the second-stage classifier and the classification loss is computed from the predicted and true labels;
step 5.2: image pair B (original image directly downsampled) is cropped into patches and, after data augmentation, input into the super-resolution network, but does not pass through the second-stage classifier; the super-resolution network loss (comprising the L1 loss and the perceptual loss, but not the contrastive loss) is computed only from the super-resolution image and the corresponding high-resolution image;
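The super-resolution network loss of step 5.2 (L1 plus perceptual, no contrastive term) can be sketched as follows. Image gradients stand in for the deep features (e.g. VGG activations) a real perceptual loss would compare, and the weight `w_perc` is an assumed hyperparameter.

```python
import numpy as np

def l1_loss(sr, hr):
    # Mean absolute pixel error between SR and HR images.
    return np.abs(sr - hr).mean()

def perceptual_loss(sr, hr):
    """Perceptual loss compares feature maps rather than pixels; simple
    image gradients stand in here for the deep features a real
    implementation would extract."""
    def feats(x):
        return np.stack([np.diff(x, axis=0)[:, :-1],
                         np.diff(x, axis=1)[:-1, :]])
    return np.abs(feats(sr) - feats(hr)).mean()

def sr_network_loss(sr, hr, w_perc=0.1):
    # L1 + perceptual; no contrastive term (used for image pair B).
    return l1_loss(sr, hr) + w_perc * perceptual_loss(sr, hr)
```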
step 5.3: during optimization, the high-resolution image of image pair A (size-cropped and downsampled) is used as the positive sample, the SRA (the super-resolution image generated from image pair A) of the current epoch (round) as the anchor sample, and the SRA generated a fixed number of epochs earlier as the negative sample; high-frequency information is extracted from each by wavelet transform and converted by each convolution layer of the classifier into corresponding feature representations, the per-layer feature representations of each sample being retained in the process; finally the multi-sample contrastive loss among the positive, anchor and negative sample features is computed;
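Step 5.3 can be illustrated with a minimal sketch: a one-level Haar transform extracts the high-frequency sub-bands, and a simple ratio form stands in for the multi-sample contrastive loss computed over the classifier's per-layer features (which are omitted here; all names are hypothetical).

```python
import numpy as np

def haar_highfreq(x):
    """One-level Haar wavelet transform, returning the three
    high-frequency sub-bands (LH, HL, HH) stacked; stands in for the
    wavelet transform used to extract high-frequency information."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return np.stack([lh, hl, hh])

def contrastive_loss(anchor, positive, negative, eps=1e-8):
    """Pull the anchor (current-epoch SRA) toward the positive (HR image)
    and away from the negative (SRA from a fixed number of epochs
    earlier), comparing Haar high-frequency bands directly in place of
    the classifier's per-layer features."""
    fa, fp, fn = map(haar_highfreq, (anchor, positive, negative))
    return np.abs(fa - fp).mean() / (np.abs(fa - fn).mean() + eps)
```

The weighted total loss of step 5.4 would then combine this term with the super-resolution network loss and the classification loss.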
step 5.4: optimizing the super-resolution network with the weighted total of the super-resolution network loss, the multi-sample contrastive loss and the classification loss;
as shown in fig. 7, a flow chart of step 5 is shown.
In the training stage, the super-resolution network (SR network) is fine-tuned from the pre-trained network, while the classifier is the pre-trained classifier with fixed parameters. Image pairs A and B are input; image pair B (original image directly downsampled) is cropped into patches, the augmented low-resolution (LR) patches are passed through the super-resolution network, and the super-resolution network Loss (SR Loss, which includes the improved multi-sample contrastive loss) is computed from the generated super-resolution (SR) image and the corresponding high-resolution image. The LR image of image pair A directly produces an SR image through the SR network, the classifier then predicts its class, the classification Loss is computed from the predicted and true labels, and finally the super-resolution network parameters are fine-tuned with the weighted total of the SR Loss and the classification Loss.
Step 6: in the test stage, the low-resolution test images of image pair A (size-cropped and downsampled) are input into the image super-resolution model; the first-stage network produces a super-resolution image, from which the second-stage classifier then obtains the predicted class.
As shown in fig. 8, a flow chart of step 6: in the test stage, the lung X-ray image is input into the super-resolution model, the first-stage network produces the super-resolution image, and the second-stage classifier produces the predicted class.
It should be noted that the above embodiments are preferred embodiments of the present invention and do not limit it in any way; any person skilled in the art may, using the technical content disclosed above and without departing from the technical scope of the present invention, change or modify them into equivalent effective embodiments, and any modification or equivalent variation of the above embodiments made according to the technical substance of the present invention still falls within the scope of the present invention.

Claims (10)

1. A method for enhancing machine vision of medical images using super-resolution techniques, comprising the steps of:
step 1: producing a low-resolution-high-resolution image pair, wherein the image pair consists of a low-resolution image, a high-resolution image and a label, the label is 0 or 1, and the method comprises the following steps of:
step 1.1: acquiring an original medical image and the corresponding health condition, and labeling the healthy and diseased classes as 0 and 1 respectively to serve as true labels;
step 1.2: cropping the original medical image while keeping the key information helpful for the machine vision task, using the cropped image as the high-resolution image, obtaining a low-resolution image by bicubic downsampling at a given scale, and combining the true label to form an image pair A of "low-resolution image, high-resolution image, label";
step 1.3: taking the original medical image as the high-resolution image, obtaining a low-resolution image by bicubic downsampling at a given scale, and combining the true label to form an image pair of "low-resolution image, high-resolution image, label", recorded as image pair B;
step 2: training a classifier with the medical images in image pair A, the classifier predicting the label of the input image content and outputting the disease probability;
step 3: designing a super-resolution network on the basis of a classical network;
step 3.1: adding a multi-scale information extraction module to extract characteristics at the network head, and reserving and enhancing local information and whole information of an input image;
step 3.2: connecting a super-resolution image generated by a network with an input low-resolution image in a channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
step 4: inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by using the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network, wherein the method comprises the following steps of:
step 4.1: inputting the low-resolution image of image pair A into the super-resolution network for processing to obtain the corresponding super-resolution image, in which the key information of the image is preserved;
step 4.2: cutting the low-resolution image in the image pair B into image blocks with proper sizes, and inputting the image blocks into a super-resolution network in batches after data enhancement operation treatment to obtain a corresponding super-resolution image;
step 4.3: calculating super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, and optimizing the super-resolution network by using the loss to obtain a pre-trained super-resolution network;
step 5: designing an image super-resolution model, wherein the image super-resolution model comprises a super-resolution network pre-trained in the step 4 and a classifier pre-trained in the step 2; inputting the low-resolution image of the image pair A, B into a model, calculating total loss by the generated super-resolution image, the corresponding high-resolution image, the predicted category and the real label, and fine-tuning the parameters of the super-resolution network, wherein the method comprises the following steps of:
step 5.1: inputting the low-resolution image of the image pair A into a super-resolution network to obtain a corresponding super-resolution image SRA, inputting the SRA into a classifier, and calculating classification loss by using a prediction label and a real label;
step 5.2: the image pair B is cut into image blocks, the image blocks are input into a super-resolution network after data enhancement operation processing, and super-resolution network loss is calculated by utilizing the super-resolution image and a corresponding high-resolution image;
step 5.3: using the high-resolution image of image pair A as the positive sample, the SRA generated in the current round as the anchor sample, and the SRA generated a fixed number of rounds earlier as the negative sample; extracting high-frequency information from each by wavelet transform, converting it through each convolution layer of the classifier into corresponding feature representations, retaining the per-layer feature representations of each sample in the process, and computing the multi-sample contrastive loss among the positive, anchor and negative sample features;
step 5.4: optimizing the super-resolution network using a weighted total of the classification loss, the super-resolution network loss, and the multi-sample contrastive loss;
step 6: in the test stage, the low-resolution image used for testing in the image pair A is input into an image super-resolution model, a super-resolution image is obtained by the super-resolution network in the first stage, and the super-resolution image is input into the classifier in the second stage to obtain a prediction category.
2. The method for enhancing machine vision of medical images using super-resolution techniques according to claim 1, wherein in the training process of step 2, low-resolution images and corresponding labels are input into the classifier, a cross-entropy loss is computed from the predicted and true labels, the classifier parameters are optimized by the Adam algorithm to obtain a two-class classifier, and the classifier is saved as the pre-trained classifier.
3. The method for enhancing machine vision of medical images using super-resolution techniques according to claim 1, wherein the multi-scale information extraction module in step 3.1 is an Inception module as in GoogLeNet, and a residual connection with learnable parameters performs a weighted fusion, in the channel dimension, of the input image and the features produced by the module to obtain the extracted features.
4. The method for enhancing machine vision of medical images using super resolution technology according to claim 1, wherein in the training of super resolution network in step 4, super resolution network loss is calculated by using the high resolution image in the image pair and the generated super resolution image, network parameters are optimized by Adam algorithm, and pre-trained super resolution network is saved.
5. The method of claim 1, wherein the super-resolution network loss in step 5.2 comprises an L1 loss and a perceptual loss, but not a contrastive loss.
6. A system for enhancing machine vision of medical images using super-resolution techniques, comprising a preparation module, a classifier module, a super-resolution network module, a network training module, a model training module, and a test module, wherein:
the preparation module is used for manufacturing a low resolution-high resolution image pair, the image pair consists of a low resolution image, a high resolution image and a label, wherein the label is 0 or 1, and the preparation module comprises the following components:
the raw data component uses the original medical image as the high-resolution image, produces a low-resolution image by downsampling, and marks the healthy and diseased classes with labels 0 and 1 respectively as the true labels;
the image pair A component is used for cutting the original image size of the medical image to retain key information which is beneficial to machine vision tasks, and taking the medical image as a high-resolution image, obtaining a low-resolution image according to a given scale by using a bicubic downsampling method, and combining with a real label to form an image pair A, wherein the image pair A is a low-resolution image, a high-resolution image and a label;
the image pair B component takes the original image of the medical image as a high-resolution image, obtains a low-resolution image by using a bicubic downsampling method according to a given scale, and combines a real label to form an image pair of 'low-resolution image, high-resolution image and label' as an image pair B;
the classifier module is used for training a classifier on the medical images in image pair A, the classifier predicting the label of the input image content and outputting the disease probability;
the super-resolution network module is used for designing a super-resolution network and comprises the following components:
the head component is used for adding a multi-scale information extraction module to the network head to extract characteristics, and reserving and enhancing local information and overall information of an input image;
the tail component is used for connecting the super-resolution image generated by the network and the input low-resolution image in the channel dimension at the tail of the network, respectively obtaining two parameters of dynamic normalization by utilizing two convolution branches, and correcting the information distribution of the super-resolution image by using the parameters to obtain a final super-resolution image;
the network training module is used for inputting the low-resolution image in the image pair A, B into a super-resolution network for training, calculating super-resolution network loss by utilizing the generated super-resolution image and the corresponding high-resolution image, and optimizing the super-resolution network to obtain a pre-training super-resolution network, and comprises the following components:
the processing A component inputs the low-resolution image of the image pair A into a super-resolution network for post-processing to obtain a corresponding super-resolution image, wherein key information in the image is reserved;
the processing component B is used for cutting the low-resolution image in the image pair B into a patch with a proper size, and inputting the patch into a super-resolution network in batches after data enhancement operation processing to obtain a corresponding super-resolution image;
the computing component is used for computing super-resolution network loss by utilizing the super-resolution images generated by the image pair A and the image pair B and the corresponding high-resolution images, and optimizing the super-resolution network by using the loss to obtain a pre-trained super-resolution network;
the model training module is used for designing an image super-resolution model and comprises a super-resolution network pre-trained in the network training module and a classifier pre-trained in the classifier module; inputting the low-resolution image of the image pair A, B into a model, calculating total loss by the generated super-resolution image, the corresponding high-resolution image, the prediction category and the real label, and fine-tuning parameters of the super-resolution network, wherein the method comprises the following components:
the classification loss component inputs the low-resolution image of the image pair A into a super-resolution network to obtain a corresponding super-resolution image SRA, inputs the SRA into a classifier of a second stage, and calculates classification loss by using a prediction label and a real label;
the network loss component is used for cutting the image pair B into image blocks, inputting the image pair B into a super-resolution network after data enhancement operation treatment, and calculating super-resolution network loss by utilizing the super-resolution image and a corresponding high-resolution image;
the contrastive loss component uses the high-resolution image of image pair A as the positive sample, the SRA generated in the current round as the anchor sample, and the SRA generated a fixed number of rounds earlier as the negative sample; it extracts high-frequency information from each by wavelet transform, converts it through each convolution layer of the classifier into corresponding feature representations, retains the per-layer feature representations of each sample in the process, and computes the multi-sample contrastive loss among the positive, anchor and negative sample features;
an optimization component that optimizes the super-resolution network using a weighted total of the classification loss, the super-resolution network loss, and the multi-sample contrastive loss;
the testing module is used for inputting the low-resolution image used for testing in the image pair A into the image super-resolution model, obtaining the super-resolution image by the super-resolution network in the first stage, and inputting the super-resolution image into the classifier in the second stage to obtain the prediction category.
7. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein the classifier module, during training, inputs low-resolution images and corresponding labels into the classifier, computes a cross-entropy loss from the predicted and true labels, optimizes the classifier parameters by the Adam algorithm to obtain a two-class classifier, and saves it as the pre-trained classifier.
8. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein the multi-scale information extraction module in the head component is an Inception module as in GoogLeNet, and a residual connection with learnable parameters performs a weighted fusion, in the channel dimension, of the input image and the features produced by the module to obtain the extracted features.
9. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein in the training of the super-resolution network in the network training module, the super-resolution network loss is computed from the high-resolution image of each image pair and the generated super-resolution image, the network parameters are optimized by the Adam algorithm, and the pre-trained network is finally saved.
10. The system for enhancing machine vision of medical images using super-resolution techniques of claim 6, wherein in the network loss component, the super-resolution network loss comprises an L1 loss and a perceptual loss, but not a contrastive loss.
CN202311580384.4A 2023-11-24 2023-11-24 Method and system for enhancing machine vision of medical images using super resolution techniques Pending CN117522693A (en)

Publications (1)

Publication Number Publication Date
CN117522693A true CN117522693A (en) 2024-02-06

Family

ID=89766038



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination