WO2023101564A1 - Skin lesion classification system and method - Google Patents
- Publication number: WO2023101564A1 (PCT application PCT/NZ2022/050154)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- classification
- classifiers
- skin lesion
- feature
- transformer
- Prior art date
Classifications
- G06V10/809 — Fusion of classification results, e.g. where the classifiers operate on the same input data
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V40/10 — Human or animal bodies; body parts
- G06V2201/03 — Recognition of patterns in medical or anatomical images
- G06T7/0012 — Biomedical image inspection
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30088 — Skin; Dermal
- G06T2207/30096 — Tumor; Lesion
- G06N20/20 — Ensemble learning
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/096 — Transfer learning
- G16H30/40 — ICT specially adapted for processing medical images, e.g. editing
- G16H50/20 — ICT for computer-aided diagnosis, e.g. based on medical expert systems
- G16H50/30 — ICT for calculating health indices; for individual health risk assessment
Definitions
- the invention relates to methods and systems for performing a risk assessment on a skin lesion.
- both settings have limitations: 1) outputting the probability of a lesion being non-cancerous or cancerous provides little information and may confuse dermatologists; 2) giving fine-level predictions helps a dermatologist to comprehensively understand a lesion, but learning a flat model to distinguish mixed sub-types of lesions ignores the separability and correlation among different classes, which may decrease the model's performance.
- HD-CNN Hierarchical Deep Convolutional Neural Networks
- B-CNN Branch Convolutional Neural Network
- An additional or alternative object is to at least provide the public with a useful choice.
- a system for classifying skin lesions comprises: a feature extraction module that determines feature information from an image of a skin lesion; a transformer that determines relationships between classifiers classifying distinct parameters of the skin lesion; and a hierarchical classifier having at least two of the classifiers ordered based on the number of classification categories in each classifier.
- the classifiers classify distinct parameters in parallel with each other based on the feature information and the relationships between classifiers.
- the transformer further comprises: at least two encoders that determine the global context of the feature information; and at least two decoders that determine the dependencies of the classifiers in the hierarchical classifier.
- the classifiers classify distinct parameters in parallel with each other based on the global context and the dependencies.
- a computer implemented method for classifying skin lesions comprises: determining feature information from an image of a skin lesion; determining relationships between classifiers classifying distinct parameters of the skin lesion; classifying distinct parameters with at least two of the classifiers in parallel, wherein the classifiers are ordered in a hierarchy based on the number of classification categories in each classifier.
- the invention in one aspect comprises several steps. The relation of one or more of such steps with respect to each of the others, the apparatus embodying features of construction, and combinations of elements and arrangement of parts that are adapted to effect such steps, are all exemplified in the following detailed disclosure.
- '(s)' following a noun means the plural and/or singular forms of the noun.
- 'and/or' means 'and' or 'or' or both.
- Figure 1 shows an example of classification levels in a hierarchical skin lesion classification
- Figure 2 shows an embodiment system for hierarchical skin lesion classification
- Figure 3 shows a schematic view of an embodiment transformer encoder in the system of figure 2;
- Figure 4 shows another schematic view of the embodiment transformer encoder in figure 3;
- Figure 5 shows a schematic view of an embodiment transformer encoder block and transformer decoder block in the system of figure 2;
- Figure 6 shows an example of results obtained from the hierarchical skin lesion classification
- Figure 7 shows a schematic view of a hierarchical knowledge distillation training strategy
- Figure 8 shows an embodiment method for hierarchical skin lesion classification
- Figure 9 shows an example of skin lesion classification classes organised in a 3-level hierarchical semantic tree.
- Disclosed herein is a system and method for hierarchical skin lesion classification that reflects hierarchical structure on skin diseases and improves the inference of classification results.
- FIG. 1 shows an example of the different classification levels involved in a hierarchical skin lesion classification.
- Each classification level 102, 104, 106 classifies a distinct parameter of the skin lesion.
- classification level 102 classifies a skin lesion in input image 202 into the benign or malignant categories/classes.
- Classification level 104 classifies the skin lesion in input image 202 according to 8 categories/classes.
- Classification level 106 classifies the skin lesion in input image 202 according to 65 categories/classes.
- the lower or finer classification levels 104, 106 have more classification categories than their previous (coarser) classification levels.
- FIG. 2 shows a system 200.
- the system 200 is a CNN Transformer model.
- the system has a feature extraction module 204 that is a convolutional neural network (CNN) for example.
- the feature extraction module 204 determines feature information from an image of a skin lesion.
- the feature information is a feature map 212 extracted from an input image 202 using feature extraction module 204.
- the input image is an image of a skin lesion.
- the feature map 212 is the output of the feature extraction module 204.
- Each input image 202 can be denoted with a symbol Xi, where i represents that the image is the i-th input image.
- the feature extraction module 204 backbone extracts a feature map 212 representation F ∈ ℝ^(H×W×C).
- Feature map 212 is a 3D feature map with a shape of H x W x C.
- the feature map 212 is provided as inputs to the transformer encoder 206.
- the transformer encoder 206 uses a sequence as input.
- the feature map 212 can be flattened or collapsed to a 2D feature representation along the spatial dimensions (H × W), resulting in a feature sequence 214 represented as F ∈ ℝ^((H·W)×C).
- Feature sequence 214 consists of H x W vectors and each of them is C-dimensional.
- Corresponding position encodings are obtained for each feature sequence or collapsed feature map. This allows the spatial information of the feature map 212 to be retained when it is collapsed to a feature sequence 214. In other words, the pixel relationship in an image is preserved.
- the position encodings are added to the feature sequence 214 to form the feature vectors.
- Each feature vector is a C-dimensional vector that corresponds to a spatial position of the feature map, and hence to a region of the input image.
- the feature information of the input image can be represented in more than one way, as a feature map 212, feature sequence 214 or feature vector for example.
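The flatten-and-encode step described above can be sketched as follows. This is an illustrative numpy sketch, not the patented implementation: the shapes and the sinusoidal position-encoding scheme are assumptions for demonstration.

```python
import numpy as np

# Hypothetical shapes: an H x W x C CNN feature map (feature map 212)
H, W, C = 12, 12, 256
feature_map = np.random.rand(H, W, C).astype(np.float32)

# Collapse the spatial dimensions: (H, W, C) -> (H*W, C) feature sequence 214
feature_seq = feature_map.reshape(H * W, C)

# Assumed sinusoidal position encoding for each of the H*W positions,
# so spatial information survives the flattening
pos = np.arange(H * W)[:, None]                       # (H*W, 1)
dim = np.arange(C)[None, :]                           # (1, C)
angle = pos / np.power(10000.0, (2 * (dim // 2)) / C)
pos_enc = np.where(dim % 2 == 0, np.sin(angle), np.cos(angle))

# Position-aware feature vectors fed to the transformer encoder
feature_vectors = feature_seq + pos_enc.astype(np.float32)
print(feature_vectors.shape)  # (144, 256)
```

Each row of `feature_vectors` is one C-dimensional vector for one spatial location, matching the H × W vectors described for the feature sequence.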
- a transformer determines relationships between classifiers 210 classifying distinct parameters of the skin lesion.
- the transformer has a transformer encoder 206 and transformer decoder 208.
- the transformer encoder 206 has at least two encoding blocks that determine the global context of the feature information.
- the transformer decoder 208 has at least two decoding blocks that determine the dependencies of the classifiers in the hierarchical classifier.
- the transformer encoder 206 receives the feature sequence 214 and positional encoding as the feature vector to learn global context.
- Transformer encoder 206 obtains global image context from local image features.
- Figure 3 shows an example of the transformer encoder
- each of the multiple encoding blocks has a multi-head self-attention layer (MSA), multi-layer perceptron layer (MLP), patch attention layer (PA), and class attention layer (CA).
- MSA multi-head self-attention layer
- MLP multi-layer perceptron layer
- PA patch attention layer
- CA class attention layer
- Each encoding block may also have a feed forward layer (FF).
- the MSA layer first normalises the feature vectors using a layer normalization function LN(), then maps the normalized features into separate vectors using linear projections. In an example, these are query vectors Q, key vectors K and value vectors V, used so that self-attention does not have to be performed by multiplying the normalized feature vectors by themselves. However, in another example the normalized feature vectors do not have to be separated into separate vectors.
- the MSA layer then generates new feature vectors 216, represented as Z', using a self-attention function of the form A(Q, K, V) = Softmax(QKᵀ/√d)V applied to LN(Z), where LN() and A() represent a layer normalization function and the attention function respectively.
- the attention function is used for the self-attention procedure in each encoding block.
- Self-attention is a mechanism for computing relations within an input sequence. Self-attention compares each feature vector to the rest and generates a group of weights. Each weight is a number indicating the similarity of each feature vector to all other feature vectors. Each feature vector can then be re-computed by multiplying its weights by the other feature vectors.
- the pairwise dot product measures how similar each pair of query vector and key vector is.
- the softmax function (Softmax()) computes a group of attention weights for every query vector and key vector.
- in the output of the attention function, a value vector gets a larger weight if its corresponding key vector has a larger dot product with its corresponding query vector.
- a larger weight is based on the attention weight of the softmax function.
- the transformer encoder mechanism enables global context of the feature vectors to be obtained by relating every feature vector to the others.
- Self-attention is used to calculate relationships between all feature vectors, which are projected to query, key and value vectors.
- the weight is based on the dot product of the query and key vectors and indicates the strength of correlation between a feature vector and others. The larger the weight, the stronger the correlation of the feature vector.
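The dot-product weighting described above can be sketched with a minimal single-head self-attention example. This is a hedged numpy sketch: the random projection matrices stand in for the learned linear projections, and the dimensions are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, C, d = 144, 256, 64             # sequence length, feature dim, head dim
Z = rng.standard_normal((N, C))    # feature sequence (already normalized)

# Stand-ins for learned projections to query, key and value vectors
Wq, Wk, Wv = (rng.standard_normal((C, d)) for _ in range(3))
Q, K, V = Z @ Wq, Z @ Wk, Z @ Wv

# Pairwise dot products -> attention weights; each row sums to 1
weights = softmax(Q @ K.T / np.sqrt(d))   # (N, N) correlation strengths

# Each feature vector is re-computed as a weighted mix of value vectors
Z_new = weights @ V                       # (N, d)
```

A larger entry in `weights[i, j]` means feature vector j contributes more to the re-computed vector i, which is exactly the "stronger correlation" behaviour described above.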
- the weight denotes the global context representing the semantic information of the entire image.
- a sequence containing multiple feature vectors is represented as Z_i.
- Self-attention calculates the relation of each feature vector within Z_i to the others.
- a set of local image features represented by the feature vectors are extracted by the CNN.
- the transformer encoder then aggregates the local features by determining the relationship between all the local features using global context.
- the transformer decoder 208 receives feature vectors 216 from the transformer encoder 206 and classification task queries to compute output embeddings for each classification task.
- the transformer decoder 208 is similar to the transformer encoder 206 and contains multiple decoding blocks which consist of two MSA layers and an FF layer. The difference between the transformer encoder 206 and the transformer decoder 208 lies in their inputs.
- the transformer decoder 208 receives feature vectors 216 from the encoder 206 along with classification task queries.
- the classification task queries are randomly initialized vectors and they are updated during model training. Each task has a task query vector that interacts with global image features for final classification.
- the classification task queries are learnable parameters that can be updated to compute output embeddings for each classification task respectively by the transformer decoder 208.
- Hierarchical classification classifiers 210 have more than one level of classification that ranges from a coarse level of classification (level-0) to a finer level of classification (level-2).
- the classification task queries are used to determine the output embedding for each level of classification 102, 104, 106.
- Each classification level has one of classifiers 210 to carry out the classification.
- each classification task query is associated with the feature vectors 216 from the transformer encoder 206, which capture the global image context, for generating the final prediction for the task using cross-attention.
- the three levels of classification have three corresponding tasks.
- the first task is to classify whether a skin lesion in input image 202 is benign or malignant.
- the second task is to classify the skin lesion in input image 202 according to 8 classes in classification level-1.
- the third task is to classify the skin lesion in input image 202 according to 65 classes in classification level-2. Therefore, each of the three levels of classification has a corresponding task and classification task query.
- Each classification task query is a task query vector used for cross attention.
- the transformer decoder 208 performs self-attention over the classification task queries followed by cross-attention with the encoder output, where U⁰ is a zero-initialized tensor with the same size as the classification task query.
- in the transformer decoder 208, self-attention is used to compute the relations between the classification task queries of the different classification tasks.
- the self-attention will compare each classification task query to the rest and generate a group of weights.
- Each weight is a number that indicates the relationship between each classification task query to the other classification task queries.
- Each classification task query vector can be re-computed by multiplying its weights by the other classification task query vectors.
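The decoder step described in the preceding bullets can be sketched as follows: the three task queries (one per classification level) first relate to each other via self-attention, then gather global image context via cross-attention. This is an illustrative numpy sketch under assumed dimensions; the random vectors stand in for the learnable task queries and encoder output.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: queries attend over keys/values
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

rng = np.random.default_rng(1)
d = 64
task_queries = rng.standard_normal((3, d))    # level-0, level-1, level-2
enc_features = rng.standard_normal((144, d))  # feature vectors 216

# Self-attention among the task queries: their mutual relationships
U = attention(task_queries, task_queries, task_queries)

# Cross-attention: queries come from the tasks, keys/values from the image
embeddings = attention(U, enc_features, enc_features)  # one embedding per task
```

Each row of `embeddings` is the output embedding for one classification level, computed in a single pass so the levels can be classified in parallel.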
- Figure 5 shows a visual representation of the interactions between the equations of each encoder and decoder block.
- Hierarchical classifiers 210 include at least two of classifiers ordered based on the number of classification categories in each classifier.
- the classifiers 210 classify distinct parameters in parallel with each other.
- the classifiers 210 classify distinct parameters in parallel with each other based on the global context of the feature information of an input image and dependencies between the classifiers in the hierarchical classifier.
- the output embeddings 218 of the transformer decoder 208 allow multiple classification tasks in a hierarchical classification to be performed simultaneously. For example, all three classification levels shown in Figure 1 can be carried out simultaneously once the output embeddings for each corresponding classification level are passed into the corresponding classifiers 210 for calculating coarse to fine prediction scores.
- Figure 6 shows an example of the results obtained from the hierarchical classification.
- Results 602 correspond to the classification level 102
- results 604 correspond to the classification level 104
- results 606 correspond to the classification level 106.
- the self-attention mechanism used on the task query vectors determines the relationships between different classification tasks. Obtaining the relationship between different classification tasks allows all classification tasks in a hierarchical classification process to be performed concurrently or in parallel with each other. This provides more accurate classification results of a skin lesion than if classification tasks were performed separately without regard to the relationship between classification tasks.
- the final prediction is computed by three separate classifiers 210, one for each classification level 102, 104, 106.
- Each classifier may consist of a dropout layer, a linear projection layer and a Soft-max layer for example.
- the outputs of the separate classifiers can be represented as a hierarchical array from a coarse level to a fine level of classification.
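The per-level classifier heads can be sketched as below, with 2, 8 and 65 classes as in Figure 1. The random matrices are placeholders for the learned linear projections, and dropout is omitted at inference; this is a sketch, not the patented classifier implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
d = 64
embeddings = rng.standard_normal((3, d))   # decoder output embeddings 218
num_classes = [2, 8, 65]                   # coarse -> fine hierarchy levels

predictions = []
for emb, n in zip(embeddings, num_classes):
    W = rng.standard_normal((d, n))        # stand-in for a learned linear layer
    predictions.append(softmax(emb @ W))   # per-level class probabilities

print([p.shape[0] for p in predictions])   # [2, 8, 65]
```

All three heads run on their own embedding, so the coarse-to-fine prediction scores are produced simultaneously rather than sequentially.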
- the system 200 can be used to perform a computer implemented method for classifying skin lesions using hierarchical classification as shown in Figure 8.
- the hierarchical classification method 800 uses feature extraction module 204 to determine 802 feature information from an input image 202 of a skin lesion.
- the feature information is a spatial feature map for example.
- the spatial feature maps are used in a transformer model with the transformer encoder 206 and a transformer decoder 208.
- the transformer encoder 206 uses a self-attention mechanism to generate discriminative features. As described previously, the transformer encoder 206 uses self-attention to re-represent feature map 212 by considering global context. This results in more discriminative features being output by the transformer encoder 206 as feature vectors 216.
- the discriminative features are inputted into the transformer decoder 208 along with classification task queries representing the different levels of classification tasks from coarse-to-fine level.
- the transformer decoder 208 also uses a self-attention mechanism and a cross-attention mechanism to determine the relationships between the feature information and classifiers and determine 804 relationships between classifiers classifying distinct parameters of the skin lesion. At least two separate classifiers are used to classify 806 distinct parameters of the skin lesion in parallel. The classifiers are ordered in a hierarchy based on the number of classification categories in each classifier. The hierarchical classification method is capable of capturing relationships among different skin classes while performing hierarchical classification in parallel.
- the generalization performance can be improved and interaction between different levels of classifiers can be encouraged using a hierarchical knowledge distillation training strategy as shown in Figure 7.
- the hierarchical knowledge distillation training strategy may consist of ensemble knowledge distillation and mutual knowledge distillation.
- Ensemble knowledge distillation aggregates the multiple outputs from the multiple classifiers.
- the aggregated outputs/predictions are aligned with each of the multiple outputs from each classifier during training.
- the ensemble knowledge distillation utilises both ensemble learning and knowledge distillation.
- the aggregated outputs or ensemble predictions can provide a better result by combining outputs from different classification levels, while knowledge distillation is capable of generating soft targets with the ensemble prediction to guide the learning of individual classifiers and improve their performance.
- the logits from each level of classifier can be represented as g₁, g₂, …, g_(M−1).
- the ensembled logit and probability are computed as g_ens = w·[g₁; g₂; …; g_(M−1)] and P_ens = Softmax(g_ens), where w represents the parameters of the linear layer, and g_ens and P_ens denote the ensembled logit and probability respectively.
- the ensemble prediction can be optimised with cross entropy loss. Since outputs from different classification heads vary in dimension, which is undesired for performing distillation with the ensemble prediction, all of them are then mapped into the same dimension of the coarsest level along the path from leaf nodes to root nodes in the hierarchical structure: [g₁→₀, g₂→₀, …, g_(M−1)→₀] = map(g₁, g₂, …, g_(M−1)), where map() performs logits mapping by summing all logits of fine-level classes belonging to the same coarse-level class. A Kullback-Leibler (KL) divergence loss between the ensemble prediction and all mapped predictions of each head is then calculated.
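The map() operation described above can be sketched as summing fine-level logits over the fine classes that share a coarse parent, so every head's prediction can be compared at the coarsest level. The parent assignment and logit values below are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def kl_div(p, q):
    # KL(p || q) for two discrete distributions
    return float(np.sum(p * np.log(p / q)))

# Hypothetical fine-level logits (8 classes) and their coarse parents (2 classes)
fine_logits = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 0.2, 2.2, 0.8])
parent = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # fine class -> coarse class

# map(): sum the logits of fine classes belonging to the same coarse class
coarse_logits = np.zeros(2)
np.add.at(coarse_logits, parent, fine_logits)  # -> [3.5, 7.7]

# KL divergence between an (assumed) ensemble prediction and the mapped one
p_ens = softmax(np.array([3.6, 7.5]))
p_map = softmax(coarse_logits)
loss = kl_div(p_ens, p_map)
```

`np.add.at` performs the unbuffered per-parent accumulation; the same mapping would be applied to every finer head before computing its distillation loss against the ensemble prediction.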
- Ensembled outputs act as a strong teacher for distilling knowledge to each classification layer, but this ignores the relationship between different classification layers. Therefore, mutual knowledge distillation can be used to align outputs from consecutive levels of classifiers to maintain consistency, because a hierarchical model with multiple classification layers should favor functions that give consistent outputs for the same inputs. The consensus of outputs from different classification layers on the same input provides supplementary information and regularization to each classifier.
- the final objective function for optimizing the hierarchical model includes cross entropy loss on outputs from each classifier as well as the ensemble prediction, ensemble knowledge distillation, and mutual knowledge distillation.
- the present system and method for hierarchical skin lesion classification is used for skin image datasets. Skin image datasets may be collected from a clinical environment and follow tele-dermatology labelling standards. Skin image datasets may also be collected from publicly available International Skin Imaging Collaboration (ISIC) archives for example.
- ISIC International Skin Imaging Collaboration
- a dataset following tele-dermatology labelling standards includes 235,268 tele-dermatology-verified clinical and dermoscopic images.
- the dataset may contain a total of 65 skin conditions organised in a 3-level hierarchical semantic tree as shown in Figure 9.
- data from ISIC archives have 25,331 dermatoscopic images across 8 different categories: melanoma (MEL), melanocytic nevus (MN), basal cell carcinoma (BCC), actinic keratosis (AK), benign keratosis (BKL), dermatofibroma (DF), vascular lesion (VASC), and squamous cell carcinoma (SCC).
- MEL melanoma
- MN melanocytic nevus
- BCC basal cell carcinoma
- AK actinic keratosis
- BKL benign keratosis
- DF dermatofibroma
- VASC vascular lesion
- SCC squamous cell carcinoma
- a dataset can be split into training, validation and testing datasets with a ratio of 7:1:2.
- the standard data augmentation techniques such as random resized cropping, colour transformation, and flipping can be used on the datasets.
- Each dermoscopic image is resized to a fixed size of 384x384 for example.
- An ImageNet pre-trained ResNet-34 may be used as the backbone (He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, pp. 770-778. doi:10.1109/CVPR.2016.90).
- The Adam optimizer may be used (Kingma, D., Ba, J., 2014. Adam: A method for stochastic optimization. International Conference on Learning Representations).
- batch size equal to 128 and initial learning rates of 1×10⁻⁵ and 3×10⁻⁴ for the backbone and newly added layers, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2022400601A AU2022400601A1 (en) | 2021-12-02 | 2022-11-25 | Skin lesion classification system and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NZ782937 | 2021-12-02 | ||
NZ78293721 | 2021-12-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023101564A1 true WO2023101564A1 (en) | 2023-06-08 |
Family
ID=86612806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/NZ2022/050154 WO2023101564A1 (en) | 2021-12-02 | 2022-11-25 | Skin lesion classification system and method |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2022400601A1 (en) |
WO (1) | WO2023101564A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10878567B1 (en) * | 2019-09-18 | 2020-12-29 | Triage Technologies Inc. | System to collect and identify skin conditions from images and expert knowledge |
CN112330621A (en) * | 2020-10-30 | 2021-02-05 | 康键信息技术(深圳)有限公司 | Method and device for carrying out abnormity classification on skin image based on artificial intelligence |
- 2022
- 2022-11-25 WO PCT/NZ2022/050154 patent/WO2023101564A1/en active Application Filing
- 2022-11-25 AU AU2022400601A patent/AU2022400601A1/en active Pending
Non-Patent Citations (5)
Title |
---|
"Color Medical Image Analysis", vol. 6, 30 November 2012, SPRINGER, ISBN: 978-94-007-5388-4, article BALLERINI LUCIA, FISHER ROBERT, ALDRIDGE BEN, REES JONATHAN, CELEBI M. EMRE, SCHAEFER GERALD: "A Color and Texture Based Hierarchical K- NN Approach to the Classification of Non-melanoma Skin Lesions", pages: 63 - 86, XP009546955, DOI: 10.1007/978-94-007-5389-1_4 * |
ANDRE ESTEVA, BRETT KUPREL, ROBERTO A. NOVOA, JUSTIN KO, SUSAN M. SWETTER, HELEN M. BLAU, SEBASTIAN THRUN: "Dermatologist-level classification of skin cancer with deep neural networks", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 542, no. 7639, 1 February 2017 (2017-02-01), London, pages 115 - 118, XP055536881, ISSN: 0028-0836, DOI: 10.1038/nature21056 * |
BARATA CATARINA; CELEBI M. EMRE; MARQUES JORGE S.: "Explainable skin lesion diagnosis using taxonomies", PATTERN RECOGNITION., ELSEVIER., GB, vol. 110, 16 May 2020 (2020-05-16), GB , XP086328204, ISSN: 0031-3203, DOI: 10.1016/j.patcog.2020.107413 * |
BARATA CATARINA; MARQUES JORGE S.: "Deep Learning For Skin Cancer Diagnosis With Hierarchical Architectures", 2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), IEEE, 8 April 2019 (2019-04-08), pages 841 - 845, XP033576692, DOI: 10.1109/ISBI.2019.8759561 * |
SHIMIZU, K ET AL.: "Four-Class Classification of Skin Lesions With Task Decomposition Strategy", IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, vol. 62, no. 1, January 2015 (2015-01-01), XP011568255, Retrieved from the Internet <URL:https://ieeexplore.ieee.org/abstract/document/6879310> [retrieved on 20230223], DOI: 10.1109/TBME.2014.2348323 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116721302A (en) * | 2023-08-10 | 2023-09-08 | 成都信息工程大学 | Ice and snow crystal particle image classification method based on lightweight network |
CN116721302B (en) * | 2023-08-10 | 2024-01-12 | 成都信息工程大学 | Ice and snow crystal particle image classification method based on lightweight network |
CN117636064A (en) * | 2023-12-21 | 2024-03-01 | 浙江大学 | Intelligent neuroblastoma classification system based on pathological sections of children |
CN117636064B (en) * | 2023-12-21 | 2024-05-28 | 浙江大学 | Intelligent neuroblastoma classification system based on pathological sections of children |
Also Published As
Publication number | Publication date |
---|---|
AU2022400601A1 (en) | 2024-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Costa et al. | End-to-end adversarial retinal image synthesis | |
Bargshady et al. | Enhanced deep learning algorithm development to detect pain intensity from facial expression images | |
Acharya et al. | TallyQA: Answering complex counting questions | |
Allaouzi et al. | A novel approach for multi-label chest X-ray classification of common thorax diseases | |
Boughrara et al. | Facial expression recognition based on a mlp neural network using constructive training algorithm | |
WO2023101564A1 (en) | Skin lesion classification system and method | |
He et al. | Learning and incorporating top-down cues in image segmentation | |
JP2006252559A (en) | Method of specifying object position in image, and method of classifying images of objects in different image categories | |
Chen et al. | Mobile convolution neural network for the recognition of potato leaf disease images | |
CN112418041A (en) | Multi-pose face recognition method based on face orthogonalization | |
CN116129141A (en) | Medical data processing method, apparatus, device, medium and computer program product | |
Bodapati | Enhancing brain tumor diagnosis using a multi-architecture deep convolutional neural network on MRI scans | |
CN115392474B (en) | Local perception graph representation learning method based on iterative optimization | |
Aktürk et al. | Classification of eye images by personal details with transfer learning algorithms | |
Swarup et al. | Biologically inspired CNN network for brain tumor abnormalities detection and features extraction from MRI images | |
Babahenini et al. | Using saliency detection to improve multi-focus image fusion | |
Balamurugan | Deep Wavelet Autoencoder Based Brain Tumor Detection Analysis Using Deep Neural Network | |
Doan et al. | Deep multi-view learning from sequential data without correspondence | |
Rao et al. | Novel approach of Using Periocular and Iris Biometric Recognition in the Authentication of ITS | |
Pundir et al. | Multiview Human Gait Recognition using a Hybrid CNN Approach | |
CN117974693B (en) | Image segmentation method, device, computer equipment and storage medium | |
Manaa et al. | A systematic review for image enhancement using deep learning techniques | |
Enireddy et al. | Compressed Medical Image Retrieval Using Data Mining and Optimized Recurrent Neural Network Techniques | |
Si et al. | Multi-view representation learning from local consistency and global alignment | |
Guo et al. | Medical Image Retrieval Based on Attention Triplet Hashing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22901909; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2022400601; Country of ref document: AU; Ref document number: 809081; Country of ref document: NZ; Ref document number: AU2022400601; Country of ref document: AU |
| ENP | Entry into the national phase | Ref document number: 2022400601; Country of ref document: AU; Date of ref document: 20221125; Kind code of ref document: A |