WO2022041222A1 - Process and system for image classification - Google Patents

Process and system for image classification

Info

Publication number
WO2022041222A1
WO2022041222A1 (PCT/CN2020/112608)
Authority
WO
WIPO (PCT)
Prior art keywords
image
subject
instances
acquired
graph
Prior art date
Application number
PCT/CN2020/112608
Other languages
English (en)
Inventor
Zhouwang YANG
Yanzhi SONG
Original Assignee
Top Team Technology Development Limited
Master Dynamic Limited
Priority date
Filing date
Publication date
Application filed by Top Team Technology Development Limited and Master Dynamic Limited
Priority to PCT/CN2020/112608
Publication of WO2022041222A1

Classifications

    • G06N3/08 Learning methods
    • G06N3/045 Combinations of networks
    • G06T7/0012 Biomedical image inspection
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7788 Active pattern-learning, e.g. online learning of image or video features, based on feedback from supervisors, the supervisor being a human, e.g. interactive learning with a human teacher
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V10/84 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
    • G06T2207/10024 Color image
    • G06T2207/10056 Microscopic image
    • G06T2207/10072 Tomographic images
    • G06T2207/10116 X-ray image
    • G06T2207/10132 Ultrasound image
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30008 Bone
    • G06T2207/30024 Cell structures in vitro; Tissue sections in vitro
    • G06T2207/30041 Eye; Retina; Ophthalmic
    • G06T2207/30088 Skin; Dermal
    • G06T2207/30096 Tumor; Lesion
    • G06V2201/03 Recognition of patterns in medical or anatomical images

Definitions

  • the present invention relates to a process and system for image classification, and more particularly to a process and system for the classification of images acquired from a subject for analysis.
  • this method, in which a medical image with an AI determination is displayed together with the symptom level, is characterized by an AI diagnosis which, in a captured image of an image area indicated by a doctor at the time of examination or diagnosis of a non-human animal, makes it possible to discover symptomatic sites other than those indicated, and to prevent the overlooking or misdiagnosis of symptoms in an image, by means of software obtained by performing machine learning with artificially created training data and defined correct answers, said training data being obtained by a specialist doctor placing a symptomatic site in a captured image within a circle and inputting a numerical value indicating the degree of progression in the encircled area
  • MIL: multi-instance learning
  • ROI: region of interest
  • patch-level annotations are not easy to obtain, as professional doctors need to spend a lot of time re-labelling the patches; therefore, many researchers use the MIL method to train a patch-level classifier and aggregate the patch-level predictions into an image-level score.
  • the present invention provides a process for determining the classification status of an acquired image from a subject according to a pre-determined classification status, wherein the process comprises the steps of:
  • step (i) dividing said acquired image from a subject into a plurality of instances;
  • step (ii) converting said acquired image to a graph according to the spatial relationship between the plurality of instances of step (i);
  • step (iii) extracting initial feature representations from the plurality of instances;
  • step (iv) inputting said graph acquired in step (ii) and said initial feature representations acquired in step (iii) into a pre-trained graph convolutional network, for transforming the instances into low-dimensional embeddings, wherein said pre-trained graph convolutional network has been pre-trained utilising one or more training data input sets, wherein said data inputs include a plurality of training images each of which is labeled according to a predetermined classification status;
  • step (v) computing a weighting vector for each instance utilising the graph of step (ii) and the low-dimensional embeddings of step (iv) by a graph attention mechanism which integrates the spatial relationship of the plurality of instances into the attention mechanism;
  • step (vi) calculating a weighted sum of the plurality of instances based on said low-dimensional embeddings acquired in step (iv) and the weighting vector for each instance of step (v);
  • wherein the weighted sum of the plurality of instances is converted into a class score indicating the classification status of the acquired image (a minimal illustrative sketch of steps (i)-(vi) follows below).
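The following sketch makes steps (i)-(vi) concrete. It is an illustration under stated assumptions, not the patent's reference implementation: the 32-pixel patch size, the mean-colour ("averaged colour pixel") initial features, the single graph-convolution layer, the 1.5-patch-unit distance threshold for the neighborhood graph, and the randomly initialised weight matrices (stand-ins for what would be pre-trained parameters) are all choices made for demonstration.

```python
# Minimal illustrative sketch of steps (i)-(vi), NumPy only.
import numpy as np

def classify_image(image: np.ndarray, patch: int = 32, delta: float = 1.5) -> float:
    rng = np.random.default_rng(0)
    h, w, _ = image.shape
    rows, cols = h // patch, w // patch

    # Step (i): divide the acquired image into equal-sized patches (instances).
    instances, centres = [], []
    for r in range(rows):
        for c in range(cols):
            instances.append(image[r * patch:(r + 1) * patch,
                                   c * patch:(c + 1) * patch])
            centres.append((r, c))            # patch centre, in patch units
    k = len(instances)

    # Step (ii): build the graph from the spatial relationship (cf. Equation 2):
    # an edge exists where the centre distance is at most the threshold delta.
    a = np.zeros((k, k))
    for m in range(k):
        for n in range(k):
            if np.hypot(centres[m][0] - centres[n][0],
                        centres[m][1] - centres[n][1]) <= delta:
                a[m, n] = 1.0

    # Step (iii): initial features -- here simply the averaged colour pixel.
    x = np.stack([p.reshape(-1, 3).mean(axis=0) for p in instances])  # k x 3

    # Step (iv): one graph-convolution layer with symmetric normalisation,
    # transforming instances into low-dimensional embeddings.
    d = np.diag(1.0 / np.sqrt(a.sum(axis=1)))
    w1 = rng.normal(size=(3, 16))             # stand-in for trained weights
    emb = np.tanh(d @ a @ d @ x @ w1)         # k x 16

    # Step (v): graph attention -- per-instance weights computed from
    # neighborhood-averaged embeddings, softmax-normalised over the bag.
    att_w = rng.normal(size=16)
    nbr_mean = (a @ emb) / a.sum(axis=1, keepdims=True)
    scores = np.tanh(nbr_mean) @ att_w
    att = np.exp(scores - scores.max())
    att /= att.sum()

    # Step (vi): weighted sum of embeddings -> bag feature -> class score.
    z = att @ emb                             # bag-level feature vector
    wc = rng.normal(size=16)
    return float(1.0 / (1.0 + np.exp(-z @ wc)))   # sigmoid class score

# Example: a random 256 x 256 RGB "image" yields a class score in (0, 1).
print(classify_image(np.random.default_rng(1).random((256, 256, 3))))
```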
  • said one or more training images have preferably been classified by at least one clinician.
  • the initial feature representation of each of the plurality of instances may be the averaged colour pixel of said instance of the acquired image.
  • the initial feature representations of each of the plurality of instances may be extracted by a neural network.
  • the initial feature representations of each of the plurality of instances may be extracted by a convolutional neural network, a deep neural network, or the like.
  • the acquired image may be a medical image, for example a CT (computed tomography) scan, an X-ray image, an MRI (magnetic resonance imaging) scan, a CBCT (cone beam computed tomography) scan, a DEXA (dual-energy X-ray absorptiometry) scan, an ultrasound image, a PET/CT scan, or the like.
  • the acquired image may be an optical image.
  • the optical image may be a fundus image acquired from the retina of a subject.
  • the process may provide classification of said fundus image of a subject, to determine whether said fundus image indicates diabetic retinopathy within the subject.
  • the optical image may be a histological section image.
  • the optical image may be a histological slide section.
  • the optical image may be a photographic image of skin tissue.
  • the optical image may be an arthroscopic image.
  • the process may provide for classification of an image of tissue cells of a subject, to determine whether said image of tissue cells indicates the presence of cancerous cells within the subject.
  • the process may provide for classification of an image of tissue cells of a subject, to determine whether said image of tissue cells indicates the presence of cancerous cells within the subject, wherein said presence of cancerous cells refers to breast cancer.
  • the process may provide for classification of an image of tissue cells of a subject, to determine whether said image of tissue cells indicates the presence of cancerous cells within the subject, wherein said presence of cancerous cells refers to colon cancer.
  • the present invention provides a system for determining the classification status of an acquired image from a subject according to a pre-determined classification status, wherein the system comprises:
  • an image processing module for dividing said acquired image from a subject into a plurality of instances and converting said acquired image to a graph according to the spatial relationship between the plurality of instances;
  • a pre-trained graph convolutional network for receiving said graph and said initial feature representations, and for transforming the instances into low-dimensional embeddings;
  • pre-trained graph convolutional network has been pre-trained utilising one or more training data input sets, wherein said data inputs include a plurality of training images each of which is labeled according to a predetermined classification status;
  • a graph attention module for computing a weighting vector for each instance utilising the graph obtained at the image processing module and the low-dimensional embeddings obtained at the graph convolution network, by integrating the spatial relationship of the plurality of instances into the attention mechanism;
  • a scoring module wherein a weighted sum of the plurality of instances is calculated based on said low-dimensional embeddings acquired at the graph convolution network and the weighting vector for each instance acquired at the graph attention module;
  • wherein the weighted sum of the plurality of instances is converted into a class score indicating the classification status of the acquired image.
  • said one or more training images have preferably been classified by at least one clinician.
  • the initial feature representation of each of the plurality of instances may be the averaged colour pixel of said instance of the acquired image.
  • the initial feature representations of each of the plurality of instances may be extracted by a neural network.
  • the initial feature representations of each of the plurality of instances may be extracted by a convolutional neural network, a deep neural network, or the like.
  • the acquired image may be a medical image, for example a CT (computed tomography) scan, an X-ray image, an MRI (magnetic resonance imaging) scan, a CBCT (cone beam computed tomography) scan, a DEXA (dual-energy X-ray absorptiometry) scan, an ultrasound image, a PET/CT scan, or the like.
  • the acquired image may be an optical image.
  • the optical image may be a fundus image acquired from the retina of a subject.
  • the system may provide classification of said fundus image of a subject, to determine whether said fundus image indicates diabetic retinopathy within the subject.
  • the optical image may be a histological section image.
  • the optical image may be a histological slide section.
  • the optical image may be a photographic image of skin tissue.
  • the optical image may be an arthroscopic image.
  • the system may provide for classification of an image of tissue cells of a subject, to determine whether said image of tissue cells indicates the presence of cancerous cells within the subject.
  • the system may provide for classification of an image of tissue cells of a subject, to determine whether said image of tissue cells indicates the presence of cancerous cells within the subject, wherein said presence of cancerous cells refers to breast cancer.
  • the system may provide for classification of an image of tissue cells of a subject, to determine whether said image of tissue cells indicates the presence of cancerous cells within the subject, wherein said presence of cancerous cells refers to colon cancer.
  • Figure 1 shows a flow chart of an embodiment of a medical image classification process according to the present invention
  • Figure 2 shows a flow chart of a further embodiment of the medical image classification process according to the present invention.
  • Figure 3 shows a schematic representation of an embodiment of the image classification system according to the present invention
  • Figure 4 shows the five MIL benchmark datasets utilized during an experiment assessing the medical image classification process according to the present invention
  • Figure 5 shows the results on classical MIL datasets during an experiment assessing the medical image classification process according to the present invention
  • Figure 6 shows the results on breast cancer datasets during an experiment assessing the medical image classification process according to the present invention
  • Figure 7 shows the results on colon cancer datasets during an experiment assessing the medical image classification process according to the present invention
  • Figure 8 (a) shows a histopathological image
  • Figure 8 (b) shows the region of interest identified by the process of the present invention
  • Figure 8 (c) shows the region of interest identified by the Prior Art
  • Figure 10 shows the results on the ICIAR dataset.
  • the present invention provides a process and system which ameliorates at least some of the deficiencies present in the prior art.
  • the present invention provides an improved and enhanced image analysis process and system for the diagnosis, screening and pathological examination of human tissue related disorders, as well as diseases, tumors and cancers, for example.
  • in pathology, for example, where a pathologist has provided a cancer annotation for a given histopathological image, weakly supervised learning algorithms aim to automatically detect and segment cancer tissue based on a set of histopathological images annotated by histopathologists.
  • the “acquired image” from a subject is an image which has been obtained for review, analysis, diagnosis, screening and the like, wherein such analysis requires knowledge, experience and training for the determination of variances which may or may not indicate a particular state.
  • Such review, analysis, diagnosis, screening and the like may be performed in “real time” whilst the image is being acquired from a subject, or subsequently, depending upon the application.
  • the “acquired image” may be acquired from a subject in several ways, for example:
  • Medical imaging: for example, X-ray, CT scan, MRI scan, mammography scan, DEXA (dual-energy X-ray absorptiometry), ultrasound image, or PET/CT scan.
  • Optical image: the acquired image may also be acquired by optical means, wherein the image is an optical image for analysis:
  • Such an optical image may be, for example, a fundus image acquired from the retina of a subject, which is indicative of that tissue and its variation, for diagnosis.
  • Such an optical image may also be a photographic image of a histological section, such as a histological slide section from a frozen block or a paraffin-embedded section, which may have been stained or treated for image enhancement or contrast for pathological and histological examination.
  • the “acquired image” may be acquired noninvasively, as in the traditional medical imaging procedures listed and recited above, or as an optical image such as a fundus image, which is likewise noninvasive.
  • Other images may be required to be acquired of tissue which has been removed by way of biopsy for histological analysis, such as cancer screening or cancer detection, which may be considered invasive to the extent that a histological sample or biopsy must be taken from the subject.
  • an optical image acquired of a subject may be an image of skin tissue taken externally of the body, without the necessity to remove tissue, such as a photographic image of the skin or dermis of a patient for skin cancer or tumor diagnosis, or for the screening of such cancers.
  • Another optical image which may be acquired noninvasively from a subject is, for example, one acquired by way of arthroscopic examination within the body of a patient, whereby tissue may not necessarily be required to be removed, but rather an image is acquired for subsequent analysis.
  • Graph Convolution Network based Multiple Instance Learning may be utilised in embodiments of the present invention, and is explained as follows.
  • Multi-instance learning processes weakly annotated data, wherein each data sample (often called a bag) has multiple instances but merely one label.
  • MIL can be expressed as a supervised learning task, with a bag as input and a bag-level label as target.
  • the set of bags is defined as {X_1, X_2, ..., X_N}, and each bag X_i contains K instances.
  • the purpose of MIL is to learn a mapping function S(X) from the N bags to the corresponding labels {Y_1, Y_2, ..., Y_N}.
  • for a typical two-class MIL problem, if a bag contains an instance of the positive class, that bag is a positive sample; otherwise it is a negative sample (as shown in Equation 1).
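Equation 1 itself is not reproduced in this text; a standard statement of the two-class MIL assumption, consistent with the surrounding description, is:

$$S(X_i) = \begin{cases} 1, & \text{if } \exists\, k \text{ such that } y_{i,k} = 1, \\ 0, & \text{otherwise,} \end{cases}$$

where $y_{i,k}$ denotes the (unobserved) instance-level label of the $k$-th instance in bag $X_i$.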
  • GCNs simultaneously carry out end-to-end learning of node feature information and structure information, and are suitable for nodes and graphs of arbitrary topology.
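For reference, the standard single-layer graph-convolution propagation rule (the Kipf-Welling formulation; the text does not specify which GCN variant the invention uses) is:

$$H^{(l+1)} = \sigma\!\left(\tilde{D}^{-\frac{1}{2}}\,\tilde{A}\,\tilde{D}^{-\frac{1}{2}}\,H^{(l)}\,W^{(l)}\right), \qquad \tilde{A} = A + I, \quad \tilde{D}_{mm} = \sum_{n}\tilde{A}_{mn},$$

where $H^{(0)}$ is the initial feature matrix, $W^{(l)}$ is a learnable weight matrix, and $\sigma$ is an activation function.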
  • the present invention as provided by the present inventors focuses on the graphical representation learning of MIL based on GCN, and the present inventors have provided a new theoretical perspective to explain the MIL based on GCN.
  • the technique used in the present invention includes a method for constructing a graph structure by utilizing the spatial relationship of the instances and the graph-attention mechanism.
  • the first technique is to establish the graph structure of the bag by using the similarity between eigenvectors of instances in Euclidean space; this technique is able to handle interdependent instances and therefore solves a problem that the usual multi-instance learning approach cannot, while the second technique utilises independent instances to find key instances.
  • the first technique cannot directly take advantage of the patches' structural relationships in the original spatial domain; and, as noted by the present inventors, the second technique cannot integrate the relationships between the instances.
  • the present invention can solve the above two problems and deficiencies as identified by the present inventors and as associated with the prior art.
  • the image classification process 100 of the present invention may include the following steps:
  • First Step 110 - Divide an input image (bag) into a plurality of patches (instances) of equal dimensions.
  • Second Step 120 - Convert the patches into a graph G(V_i, E_i) according to their spatial relationship; the graph G(V_i, E_i) can be represented by an adjacency matrix A_i.
  • Third Step 130 - Extract initial feature vectors of the patches and summarize them into a feature matrix.
  • Such initial feature vector can, for example, be simply represented by a pixel, or can be extracted by using a convolutional neural network, or a deep neural network, or other strategies in other embodiments.
  • Fourth Step 140 - Input the adjacency matrix obtained in Step 120, together with the feature matrix of Step 130, into a trained graph convolutional network, for transforming the instances into low-dimensional embeddings.
  • the graph convolution network is trained with a set of weakly labeled images before use.
  • Fifth Step 150 - Compute a weighting vector for each instance by the graph attention mechanism, utilising the graph of Step 120 and the instance embeddings of Step 140.
  • Sixth Step 160 - Calculate a weighted sum of the instance embeddings to yield a feature vector of the image, then convert that feature vector into a class score which indicates the classification of the image.
  • the attention mechanism operators mentioned in the prior art have a clear disadvantage: they operate on the feature representation of a single instance.
  • the attention mechanism of the prior art utilizes a weighted average of instances (low dimensional embeddings) where weights are determined by a neural network.
  • a new graph attention mechanism is proposed and provided by the present invention, wherein the weighting vector of each instance is obtained based on the graph structure of the input image and the feature representations of the instances.
  • the graph attention mechanism of the present invention is modified based on the ordinary attention mechanism of the prior art.
  • the attention mechanism used in the usual multi-instance learning approach is based on independent instances; for high-resolution image classification tasks, patches have a significant structural relationship in the original image spatial domain, but the usual attention mechanism cannot deal with such a structural relationship.
  • the present invention builds a new graph attention mechanism by integrating such a structural relationship into the attention mechanism.
  • the input bag X_i is converted into a graph G(V_i, E_i), wherein V_i is a set of nodes (patches) and E_i is a set of edges.
  • the adjacency matrix A_i is computed using the original spatial structural relationship (1-neighborhood) between patches, as shown in Equation 2.
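Equation 2 is likewise not reproduced in this text; a reconstruction consistent with the description (the threshold symbol, here $\delta$, is elided in the source) is:

$$A_i[m, n] = \begin{cases} 1, & \operatorname{dist}(m, n) \le \delta, \\ 0, & \text{otherwise,} \end{cases}$$

where $\operatorname{dist}(m, n)$ is the Euclidean distance between the centres of the $m$-th and $n$-th patches.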
  • the initial feature vector of the instance is extracted by using a convolutional neural network, or a deep neural network, or other strategies.
  • the initial feature representation of a patch can also be directly represented by a pixel; these initial feature vectors are summarized into a feature matrix whose feature dimension is F_1, with the transformation function denoted f.
  • the low-dimensional embeddings of the instances are further extracted by using the graph convolution network, with the adjacency matrix of the first step 210 and the feature matrix of the second step 220 as input; the dimension of each embedding is F_2.
  • the graph attention is similar to the attention mechanism in Ilse, Tomczak, and Welling, but incorporates the graph structure into the attention mechanism to calculate the weighting factor for each instance; the whole set of instances is then combined to obtain the low-dimensional embedding Z_i of the bag X_i.
  • the low-dimensional embedding of the bag is converted into a class score of the bag through a fully-connected layer.
  • in the above, dist(m, n) is the Euclidean distance between the centres of the m-th and n-th patches (instances) in image (bag) X_i; the threshold (whose symbol is elided in this text; denoted δ below) decides whether there is an edge between two instances based on their distance; N_i(k) represents the set of neighborhood nodes of the k-th instance in bag X_i; w ∈ R^{L×1} represents the parameter variables to learn; and tanh is an activation function.
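The graph-attention weighting itself is not reproduced in this text; one plausible form, consistent with the notation above and with the attention of Ilse, Tomczak, and Welling (the aggregation over the neighborhood set $N_i(k)$ being the graph-specific modification), is:

$$a_k = \frac{\exp\!\left(w^{\top} \tanh\!\left(V \bar{h}_k\right)\right)}{\sum_{j=1}^{K} \exp\!\left(w^{\top} \tanh\!\left(V \bar{h}_j\right)\right)}, \qquad \bar{h}_k = \frac{1}{|N_i(k)|} \sum_{j \in N_i(k)} h_j, \qquad Z_i = \sum_{k=1}^{K} a_k\, h_k,$$

where $h_k \in \mathbb{R}^{F_2}$ is the embedding of instance $k$, and $V \in \mathbb{R}^{L \times F_2}$ and $w \in \mathbb{R}^{L \times 1}$ are learnable parameters.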
  • Referring to Figure 3, there is shown a schematic representation of an image classification system 300, particularly designed for the diagnosis of diabetic retinopathy from the fundus images of a person, as an application of an embodiment of the present invention
  • Diabetic retinopathy often has no early warning signs. Even macular edema, which can cause rapid vision loss, may not have any warning signs for some time.
  • a person with macular edema is likely to have blurred vision, making it hard to do things such as reading or driving. In some cases, the vision will get better or worse during the day.
  • the first stage, which is called non-proliferative diabetic retinopathy (NPDR), has no symptoms; patients may not notice the signs or indications of a problem, and may have 20/20 vision.
  • abnormal new blood vessels form at the back of the eye as part of proliferative diabetic retinopathy (PDR) ; these can burst and bleed (vitreous hemorrhage) and blur the vision, because these new blood vessels are fragile.
  • the existing diagnosis process for diabetic retinopathy of the prior art requires a surgical doctor or a trained professional to analyze each fundus image of a subject, to see whether certain features, such as microscopic blood-filled bulges in the artery walls, are present; if so, the person tests positive for diabetic retinopathy.
  • the image classification system 300 of the present invention is developed to provide an automatic diagnostic system to test whether a subject suffers from diabetic retinopathy with the use of graph convolution network.
  • the image classification system 300 is required to first be trained with a plurality of fundus images of patients which are marked positive or negative for diabetic retinopathy by professional personnel.
  • the training set preferably comprises more than 400 marked images.
  • the image classification system 300 includes an image processing module 320, wherein upon the input of the fundus image 310 of the subject, the image processing module 320 divides the input fundus image 310 into a plurality of patches (instances) of equal dimensions.
  • the plurality of instances of the input fundus image 310 are then converted into a graph G(V_i, E_i) by the graph generating module 330 according to the spatial relationship of the patches, wherein V_i is the set of vertices or nodes, and E_i is the set of edges.
  • the graph structure can be represented using an adjacency matrix A_i ∈ {0, 1}^{K×K}.
  • the initial feature vectors of the instances (patches) are extracted by using a pre-trained convolutional neural network (CNN) 340, and the instances belonging to the same fundus image are grouped into an MIL bag.
  • the initial feature vector can be simply represented by a pixel, or can be extracted by using a deep neural network, or other strategies in other embodiments.
  • the adjacency matrix A i generated from module 330 and the initial feature vectors generated from module 340 are then fed to graph convolutional network (GCN) 350 for extracting the low-dimensional embeddings (feature vectors) of the instances.
  • the instance embeddings extracted by the graph convolutional network (GCN) 350 are then passed to the attention mechanism module 360, wherein the graph structure G(V_i, E_i) generated by the graph generating module 330 is also incorporated into the attention mechanism module to calculate the weighting factor of each instance; a weighted sum 365 of the instance embeddings then yields a fixed-dimensional embedding Z_i for each bag (image).
  • the feature vector Z_i of the image is then converted to a class score by the scoring system 370, which indicates the diagnosis result 380 of whether the subject tests positive for diabetic retinopathy.
  • MUSK1 and MUSK2 are datasets for drug activity prediction
  • FOX, TIGER and ELEPHANT are image datasets.
  • the total number of bags, the total number of instances, the number of positive and negative samples, and the feature vector dimensions of each dataset are shown in Figure 4.
  • the breast cancer dataset consists of 58 weakly labeled 896 x 768 Hematoxylin and Eosin (H&E) photographic images; if an image contains breast cancer cells, it is marked as malignant, otherwise benign.
  • each image is divided into 32 x 32 patches with a stride of 32, giving (896/32) x (768/32) = 28 x 24 = 672 patches per bag.
  • the colon cancer images were derived from various tissue appearances of normal and malignant areas; for each image, most cell nuclei were labeled.
  • a total of 22,444 nuclei have associated class tags, namely epithelial cells, inflammatory cells, fibroblasts, and miscellaneous cells.
  • the ICIAR dataset consists of 400 H&E (Hematoxylin and Eosin) stained microscopy images, divided into four categories: normal, benign, invasive carcinoma, and carcinoma in situ.
  • in each experiment, 160 images are used for training and 40 images for testing; these patches have been cropped in advance from the labeled whole-slide tissue image, so from an MIL perspective the large number of patches (instances) extracted from one image (one bag) is consistent with the image label.
  • during training, each image is randomly flipped, then cropped to a patch of size 256 x 256, and finally sliced to a patch of size 224 x 224.
  • during testing, a patch of size 256 x 256 is cropped from each image and scaled directly to a patch of size 224 x 224.
  • IDRiD DATASET (ISBI 2018).
  • the IDRiD dataset was published as a challenge dataset at ISBI 2018.
  • one purpose of this challenge is to evaluate automatic disease grading algorithms for diabetic retinopathy (DR) and diabetic macular edema (DME) using fundus images.
  • the training data contains 413 images and the test data contains 103 images.
  • the resolution of each image is 4288 x 2848.
  • a common evaluation method is used, namely repeating each experiment five times.
  • during training, each image is randomly flipped, then cropped to patches of size 256 x 256, and finally sliced to patches of size 224 x 224.
  • during testing, patches of size 256 x 256 are cropped from each image and scaled directly to patches of size 224 x 224.
  • the basic architecture of the deep network used is the same as Wang et al., but the fully-connected layers, except the last layer, are replaced by graph convolutional layers.
  • the hyperparameter L in the graph attention mechanism is set to 64.
  • the basic architecture of the deep network used is the same as Ilse, Tomczak, and Welling, but the fully-connected layers, except the last layer, are replaced by graph convolutional layers.
  • the hyperparameter L in the graph attention mechanism is set to 64.
  • the purpose was to verify whether the present image classification method can surpass other MIL methods on five MIL benchmark datasets.
  • the MUSK1 and MUSK2 datasets were used to predict drug activity; since molecules can have a variety of shapes, a bag is composed of the shapes belonging to the same molecule (Dietterich, Lathrop, and Lozano-Pérez 1997).
  • the remaining three datasets, FOX, TIGER and ELEPHANT, contain features extracted from the image. Each bag consists of a set of image segments.
  • a positive bag is an image containing the target animal, and a negative bag is an image containing other animals (Andrews, Tsochantaridis, and Hofmann 2003).
  • each dataset is divided into ten folds, with nine folds used for training and one fold for testing.
  • Equation 2 is utilized to create the adjacency matrix A of the graph.
  • the present image classification method outperforms other MIL algorithms, including DNN-based MIL algorithms, traditional non-DNN MIL algorithms, and GNN-based MIL algorithms, on the five benchmark datasets.
  • the experimental results verify that the present image classification method commendably integrates the relationships between instances into the embedded learning of the bag, and also verify that the GCN integrates the graph structure better than the GNN.
  • Equation 2 is utilized to create the adjacency matrix A of the graph.
  • the present image classification method has been shown to have high recall.
  • high recall is especially important in the medical field because false negatives can lead to serious consequences, including patient death, and the present invention seeks to reduce their incidence.
  • an image is used to verify that the present image classification method can provide the ROI of the disease.
  • a histopathological image is segmented into patches containing individual cells.
  • a heatmap is created by multiplying the patch by the corresponding attention weight.
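A sketch of that heatmap construction, assuming the patches lie on a regular grid and that `weights` holds the per-instance attention weights produced by the graph attention mechanism (both assumptions; the text does not specify the rendering):

```python
import numpy as np

def attention_heatmap(weights: np.ndarray, rows: int, cols: int,
                      patch: int = 32) -> np.ndarray:
    """Spread each patch's attention weight over its pixel area."""
    grid = weights.reshape(rows, cols)
    # Repeat each weight over the patch's pixels to get an image-sized map.
    return np.kron(grid, np.ones((patch, patch)))

# Example: a 28 x 24 grid of patches (672 instances, as for the breast
# cancer dataset above) yields an 896 x 768 heatmap.
w = np.random.default_rng(0).random(28 * 24)
print(attention_heatmap(w / w.sum(), 28, 24).shape)   # (896, 768)
```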
  • Figure 8(a) shows a histopathological H&E (Hematoxylin and Eosin stain) photographic image.
  • Figure 8 (b) shows the region of interest identified by the process of the present invention
  • Figure 8 (c) shows the region of interest identified by the prior art.
  • Equation 2 was still used to create the adjacency matrix A of the graph.
  • the convolutional layers of ResNet50 (He et al. 2016) are used.
  • All models are trained using the Adam optimization algorithm (Kingma and Ba 2014) .
  • the present approach of the present invention outperforms other methods, including the best MIL algorithm (Combalia and Vilaplana 2018), which focuses on the most relevant regions of the high-resolution image through a Monte Carlo sampling strategy and then iterates to obtain the most related patches.
  • the first technique is to establish a graph structure of the bag by using the similarity between eigenvectors of instances in Euclidean space; this allows the handling of interdependent instances, which similar technologies of the prior art cannot, while the second technique utilises independent instances to find key instances, enabling it to be more focused on multi-instance learning than the prior art.
  • however, the first technique cannot directly take advantage of the patches' structural relationships in the original spatial domain, and the second technique cannot integrate the relationships between the instances; the present invention solves both of these problems.
  • the present inventors therefore have provided the present invention, modifying the attention mechanism of the prior art.
  • the attention mechanism of the prior art only utilizes a weighted average of independent instances wherein the corresponding weights are determined by a neural network. Structural relationship between instances within an image is not considered in the prior art.
  • the present inventors utilize a graph attention mechanism by integrating the structural relationship of instances within an image into the attention mechanism.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a process for determining the classification status of an image acquired from a subject according to a pre-determined classification status, the process comprising: (i) dividing an image acquired from a subject into a plurality of instances; (ii) converting said acquired image into a graph according to the spatial relationship between the plurality of instances of step (i); (iii) extracting initial feature representations from the plurality of instances; (iv) inputting said graph acquired in step (ii) and said initial feature representations acquired in step (iii) into a pre-trained graph convolutional network, for transforming the instances into low-dimensional embeddings, the pre-trained graph convolutional network having been pre-trained utilising one or more training data input sets, said data inputs including a plurality of training images each of which is labeled according to a predetermined classification status; (v) computing a weighting vector for each instance utilising the graph of step (ii) and the low-dimensional embeddings of step (iv) by a graph attention mechanism which integrates the spatial relationship of the plurality of instances into the attention mechanism; (vi) calculating a weighted sum of the plurality of instances based on said low-dimensional embeddings acquired in step (iv) and the weighting vector for each instance of step (v); the weighted sum of the plurality of instances being converted into a class score indicating the classification status of the acquired image.
PCT/CN2020/112608 2020-08-31 2020-08-31 Process and system for image classification WO2022041222A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/112608 WO2022041222A1 (fr) 2020-08-31 2020-08-31 Process and system for image classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/112608 WO2022041222A1 (fr) 2020-08-31 2020-08-31 Process and system for image classification

Publications (1)

Publication Number Publication Date
WO2022041222A1 true WO2022041222A1 (fr) 2022-03-03

Family

ID=80354297

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/112608 WO2022041222A1 (fr) 2020-08-31 2020-08-31 Procédé et système de classification d'image

Country Status (1)

Country Link
WO (1) WO2022041222A1 (fr)



Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130148881A1 (en) * 2011-12-12 2013-06-13 Alibaba Group Holding Limited Image Classification
US20180012107A1 (en) * 2015-12-11 2018-01-11 Tencent Technology (Shenzhen) Company Limited Image classification method, electronic device, and storage medium
CN107918782A (zh) Method and system for generating natural language describing image content
US20200074243A1 (en) * 2017-11-30 2020-03-05 Tencent Technology (Shenzhen) Company Limited Image classification method, personalized recommendation method, computer device and storage medium
CN108549876A (zh) Sitting posture detection method based on object detection and human pose estimation
CN111414962A (zh) Image classification method introducing object relations
CN111461258A (zh) Remote sensing image scene classification method coupling convolutional neural networks and graph convolutional networks

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115641955A (zh) Deep learning-based gastric cancer staging discrimination system and storage medium
CN117038023A (zh) dMMR germline mutation subtype classification method and system based on colorectal cancer pathology images
CN117636064A (zh) Intelligent neuroblastoma classification system based on pediatric pathology slides
CN117636064B (zh) Intelligent neuroblastoma classification system based on pediatric pathology slides

Similar Documents

Publication Publication Date Title
Tong et al. Application of machine learning in ophthalmic imaging modalities
Parvaiz et al. Vision transformers in medical computer vision—A contemplative retrospection
JP6998474B2 (ja) Computer classification of biological tissue
WO2022041222A1 (fr) Process and system for image classification
Kharazmi et al. A computer-aided decision support system for detection and localization of cutaneous vasculature in dermoscopy images via deep feature learning
Ghassemi et al. Automatic diagnosis of covid-19 from ct images using cyclegan and transfer learning
Abuared et al. Skin cancer classification model based on VGG 19 and transfer learning
Merjulah et al. Classification of myocardial ischemia in delayed contrast enhancement using machine learning
Manic et al. Extraction and evaluation of corpus callosum from 2D brain MRI slice: a study with cuckoo search algorithm
Rodríguez et al. Computer aided detection and diagnosis in medical imaging: a review of clinical and educational applications
Kaliyugarasan et al. Pulmonary nodule classification in lung cancer from 3D thoracic CT scans using fastai and MONAI
Seetha et al. The Smart Detection and Analysis on Skin Tumor Disease Using Bio Imaging Deep Learning Algorithm
Özbay et al. Brain tumor detection with mRMR-based multimodal fusion of deep learning from MR images using Grad-CAM
Meswal et al. A weighted ensemble transfer learning approach for melanoma classification from skin lesion images
Nair et al. Investigation of breast melanoma using hybrid image-processing-tool
Barin et al. Hybrid Convolutional neural network-based diagnosis system for intracranial hemorrhage
Wang et al. AVDNet: Joint coronary artery and vein segmentation with topological consistency
Jathanna et al. Diagnostic utility of artificial intelligence for left ventricular scar identification using cardiac magnetic resonance imaging—A systematic review
Abd Hamid et al. Incorporating attention mechanism in enhancing classification of alzheimer’s disease
Farrag et al. An Explainable AI System for Medical Image Segmentation With Preserved Local Resolution: Mammogram Tumor Segmentation
Sajiv et al. Machine Learning based Analysis of Histopathological Images of Breast Cancer Classification using Decision Tree Classifier
Bandyopadhyay et al. Artificial-intelligence-based diagnosis of brain tumor diseases
Gowri et al. An improved classification of MR images for cervical cancer using convolutional neural networks
Kantheti et al. Medical Image Classification for Disease Prediction with the aid of Deep Learning approaches
US20230401697A1 (en) Radiogenomics for cancer subtype feature visualization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20950891

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20950891

Country of ref document: EP

Kind code of ref document: A1