WO2023101564A1 - Skin lesion classification system and method - Google Patents

Skin lesion classification system and method

Info

Publication number
WO2023101564A1
Authority
WO
WIPO (PCT)
Prior art keywords
classification
classifiers
skin lesion
feature
transformer
Prior art date
2021-12-02
Application number
PCT/NZ2022/050154
Other languages
French (fr)
Inventor
Zhen Yu
Toan Nguyen
Zongyuan Ge
Paul BONNINGTON
Original Assignee
Kahu.Ai Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2022-11-25
Publication date
2023-06-08
Application filed by Kahu.Ai Limited filed Critical Kahu.Ai Limited
Priority to AU2022400601A priority Critical patent/AU2022400601A1/en
Publication of WO2023101564A1 publication Critical patent/WO2023101564A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/809Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30088Skin; Dermal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment


Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Radiology & Medical Imaging (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • General Engineering & Computer Science (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Pathology (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

Some previously proposed classification methods of skin lesions may lead to misclassifications. Disclosed herein is a system for classifying skin lesions. The system comprises a feature extraction module (204) that determines feature information from an image of a skin lesion. The system comprises a transformer (206, 208) that determines relationships between classifiers classifying distinct parameters of the skin lesion. The system comprises a hierarchical classifier (210) having at least two of the classifiers ordered based on the number of classification categories in each classifier. The classifiers classify distinct parameters in parallel with each other based on the feature information and the relationships between classifiers.

Description

SKIN LESION CLASSIFICATION SYSTEM AND METHOD
FIELD OF THE INVENTION
The invention relates to methods and systems for performing a risk assessment on a skin lesion.
BACKGROUND TO THE INVENTION
The incidence of skin cancer has been rising for several decades, and computer-aided diagnostic algorithms are desired for assisting dermatologists in diagnosing lesions more efficiently. Existing algorithms for skin cancer diagnosis either simply perform binary classification of benign versus malignant or classify lesions directly into multiple subcategories.
However, both settings have limitations: 1) outputting the probability of a lesion being non-cancerous or cancerous provides little information and may confuse dermatologists; 2) giving fine-level predictions helps a dermatologist to comprehensively understand a lesion, but learning a flat model to distinguish mixed sub-types of lesions ignores the separability and correlation among different classes, which may decrease the model's performance.
(Yan et al., 2015) proposed Hierarchical Deep Convolutional Neural Networks (HD-CNN) to classify objects by first focusing on coarse categories and then on fine categories. The study shows that the model can achieve better performance than non-hierarchical models. However, the approach has some limitations: it needs two steps of training for coarse and fine categories, and it cannot be used to classify data in a hierarchy having more than two levels.
(Zhu and Bain, 2017) presented a Branch Convolutional Neural Network (B-CNN) model that constructs multiple branches on top of different layers of a CNN to output multiple predictions corresponding to levels from coarse to fine in the hierarchical structure. However, this method computed the predictions independently without considering the hierarchical dependency among different classes. Computing classification results sequentially may lead to misclassification if images are classified into incorrect coarse categories.
It is an object of at least preferred embodiments to address at least some of the aforementioned disadvantages. An additional or alternative object is to at least provide the public with a useful choice.
SUMMARY OF THE INVENTION
In accordance with an aspect, a system for classifying skin lesions comprises: a feature extraction module that determines feature information from an image of a skin lesion; a transformer that determines relationships between classifiers classifying distinct parameters of the skin lesion; and a hierarchical classifier having at least two of the classifiers ordered based on the number of classification categories in each classifier. The classifiers classify distinct parameters in parallel with each other based on the feature information and the relationships between classifiers.
The term 'comprising' as used in this specification means 'consisting at least in part of'. When interpreting each statement in this specification that includes the term 'comprising', features other than that or those prefaced by the term may also be present. Related terms such as 'comprise' and 'comprises' are to be interpreted in the same manner.
In an embodiment the transformer further comprises: at least two encoders that determine the global context of the feature information; and at least two decoders that determine the dependencies of the classifiers in the hierarchical classifier. The classifiers classify distinct parameters in parallel with each other based on the global context and the dependencies.
In accordance with a further aspect of the invention, a computer implemented method for classifying skin lesions comprises: determining feature information from an image of a skin lesion; determining relationships between classifiers classifying distinct parameters of the skin lesion; classifying distinct parameters with at least two of the classifiers in parallel, wherein the classifiers are ordered in a hierarchy based on the number of classification categories in each classifier. The invention in one aspect comprises several steps. The relation of one or more of such steps with respect to each of the others, the apparatus embodying features of construction, and combinations of elements and arrangement of parts that are adapted to affect such steps, are all exemplified in the following detailed disclosure.
To those skilled in the art to which the invention relates, many changes in construction and widely differing embodiments and applications of the invention will suggest themselves without departing from the scope of the invention as defined in the appended claims. The disclosures and the descriptions herein are purely illustrative and are not intended to be in any sense limiting. Where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
In addition, where features or aspects of the invention are described in terms of Markush groups, those persons skilled in the art will appreciate that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.
As used herein, '(s)' following a noun means the plural and/or singular forms of the noun.
As used herein, the term 'and/or' means 'and' or 'or' or both.
It is intended that reference to a range of numbers disclosed herein (for example, 1 to 10) also incorporates reference to all rational numbers within that range (for example, 1, 1.1, 2, 3, 3.9, 4, 5, 6, 6.5, 7, 8, 9, and 10) and also any range of rational numbers within that range (for example, 2 to 8, 1.5 to 5.5, and 3.1 to 4.7) and, therefore, all sub-ranges of all ranges expressly disclosed herein are hereby expressly disclosed. These are only examples of what is specifically intended and all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application in a similar manner.
In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents or such sources of information is not to be construed as an admission that such documents or such sources of information, in any jurisdiction, are prior art or form part of the common general knowledge in the art.
Although the present invention is broadly as defined above, those persons skilled in the art will appreciate that the invention is not limited thereto and that the invention also includes embodiments of which the following description gives examples.
BRIEF DESCRIPTION OF THE DRAWINGS
Preferred forms of the system and method will now be described by way of example only with reference to the accompanying figures in which:
Figure 1 shows an example of classification levels in a hierarchical skin lesion classification;
Figure 2 shows an embodiment system for hierarchical skin lesion classification;
Figure 3 shows a schematic view of an embodiment transformer encoder in the system of figure 2;
Figure 4 shows another schematic view of the embodiment transformer encoder in figure 3;
Figure 5 shows a schematic view of an embodiment transformer encoder block and transformer decoder block in the system of figure 2;
Figure 6 shows an example of results obtained from the hierarchical skin lesion classification;
Figure 7 shows a schematic view of a hierarchical knowledge distillation training strategy;
Figure 8 shows an embodiment method for hierarchical skin lesion classification; and
Figure 9 shows an example of skin lesion classification classes organised in a 3-level hierarchical semantic tree.
DETAILED DESCRIPTION
Disclosed herein is a system and method for hierarchical skin lesion classification that reflects hierarchical structure on skin diseases and improves the inference of classification results.
The system and method capture the relationship between different classification levels that classify different parameters of skin lesions and allow hierarchical classification to be performed in a parallel way. Figure 1 shows an example of the different classification levels involved in a hierarchical skin lesion classification. Each classification level 102, 104, 106 classifies a distinct parameter of the skin lesion. For example, classification level 102 classifies a skin lesion in input image 202 into the benign or malignant categories/classes. Classification level 104 classifies the skin lesion in input image 202 according to 8 categories/classes. Classification level 106 classifies the skin lesion in input image 202 according to 65 categories/classes. In this example, the lower or finer classification levels 104, 106 have more classification categories than their previous (coarser) classification levels.
Figure 2 shows a system 200. The system 200 is a CNN Transformer model. The system has a feature extraction module 204 that is a convolutional neural network (CNN) for example. The feature extraction module 204 determines feature information from an image of a skin lesion. As shown in Figure 2, the feature information is a feature map 212 extracted from an input image 202 using feature extraction module 204. The input image is an image of a skin lesion. In other words, the feature map 212 is the output of the feature extraction module 204.
Each input image 202 can be denoted with a symbol $X_i$, where $i$ indicates that the image is the i-th input image. For each input image 202, the feature extraction module 204 backbone extracts a feature map 212 representation $F_i \in \mathbb{R}^{H \times W \times C}$. Feature map 212 is a 3D feature map with a shape of $H \times W \times C$.
The feature map 212 is provided as input to the transformer encoder 206. In an example, the transformer encoder 206 uses a sequence as input. In order for the transformer encoder to receive the feature map 212 as a sequence, the feature map 212 can be flattened or collapsed to a 2D feature representation along the spatial dimensions ($H \times W$), resulting in a feature sequence 214 represented as $F_i' \in \mathbb{R}^{HW \times C}$. Feature sequence 214 consists of $H \times W$ vectors, each of which is $C$-dimensional.
Corresponding position encodings are obtained for each feature sequence or collapsed feature map. This allows the spatial information of the feature map 212 to be retained when it is collapsed to a feature sequence 214. In other words, the pixel relationship in an image is preserved. The position encodings $P \in \mathbb{R}^{HW \times C}$ and the feature sequence 214 form feature vectors that can be represented as $Z_i = F_i' + P$. Each feature vector corresponds to a spatial location (pixel) of the feature map of the input image. As demonstrated, the feature information of the input image can be represented in more than one way: as a feature map 212, a feature sequence 214 or feature vectors, for example.
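A minimal sketch of this feature-extraction and flattening step is given below, assuming a ResNet-34 backbone, 384x384 inputs and learned position encodings; the class name `CnnToSequence` and the fixed 12x12 feature grid are illustrative assumptions rather than details from the patent.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class CnnToSequence(nn.Module):
    """Sketch: CNN backbone -> flattened feature sequence + learned position encodings."""
    def __init__(self, embed_dim=512, grid_size=12):
        super().__init__()
        backbone = models.resnet34(weights=None)  # pre-trained weights could be loaded here
        # keep everything up to the last convolutional stage (drop avgpool and fc)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        # one learnable position encoding per spatial location (H*W positions)
        self.pos_embed = nn.Parameter(torch.zeros(1, grid_size * grid_size, embed_dim))

    def forward(self, x):                       # x: (B, 3, 384, 384)
        fmap = self.backbone(x)                 # feature map F_i: (B, C, H, W) = (B, 512, 12, 12)
        seq = fmap.flatten(2).transpose(1, 2)   # feature sequence F_i': (B, H*W, C)
        return seq + self.pos_embed             # position-aware feature vectors Z_i

z = CnnToSequence()(torch.randn(2, 3, 384, 384))
print(z.shape)  # torch.Size([2, 144, 512])
```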
A transformer determines relationships between classifiers 210 classifying distinct parameters of the skin lesion. The transformer has a transformer encoder 206 and transformer decoder 208. The transformer encoder 206 has at least two encoding blocks that determine the global context of the feature information. The transformer decoder 208 has at least two decoding blocks that determine the dependencies of the classifiers in the hierarchical classifier.
The transformer encoder 206 receives the feature sequence 214 and positional encoding as the feature vector to learn global context. Transformer encoder 206 obtains global image context from local image features. Figure 3 shows an example of the transformer encoder
206 having multiple encoding blocks 302, 304, 306. In the example shown in Figure 4, each of the multiple encoding blocks has a multi-head self-attention layer (MSA), a multi-layer perceptron layer (MLP), a patch attention layer (PA), and a class attention layer (CA). Each encoding block may also have a feed-forward layer (FF). The MSA layer first normalises the feature vectors using a layer normalization function LN(), then maps the normalized features into separate vectors using linear projections. In an example, these vectors are query vectors $Q \in \mathbb{R}^{HW \times C}$, key vectors $K \in \mathbb{R}^{HW \times C}$ and value vectors $V \in \mathbb{R}^{HW \times C}$. These vectors are used so that self-attention does not have to be performed by multiplying the normalized feature vectors by themselves. However, in another example the normalized feature vectors do not have to be separated into separate vectors. The MSA layer then generates new feature vectors 216, represented as $Z^l$, using a self-attention function (with $Z^0 = Z_i$):

$$[Q^l, K^l, V^l] = \mathrm{Linear}(\mathrm{LN}(Z^{l-1})), \quad l = 1, \dots, L$$
$$Z^l = A(Q^l, K^l, V^l) + Z^{l-1}$$

where Linear() denotes a linear function realised by a fully connected layer.
LN() and A() represent a layer normalization function and the attention function respectively. The attention function is used for the self-attention procedure in each encoding block. Self-attention is a mechanism for computing relations within an input sequence. The self-attention compares each feature vector to the rest and generates a group of weights. Each weight is a number indicating the similarity of each feature vector to all other feature vectors. Each feature vector can be re-computed by multiplying its weights by the other feature vectors.
The pairwise dot product $QK^\top$ measures how similar each pair of query vector and key vector is. The softmax function (Softmax()) computes a group of attention weights for every query vector and key vector. The output of the attention function is

$$A(Q, K, V) = \mathrm{Softmax}\!\left(\frac{QK^\top}{\sqrt{d}}\right)V,$$

where $d$ is the dimension of the key vectors and a value vector gets a larger weight if its corresponding key vector has a larger dot product with its corresponding query vector. In other words, a larger weight is based on the attention weight of the softmax function. The transformer encoder mechanism enables the global context of the feature vectors to be obtained by relating every feature vector to the others. Self-attention is used to calculate relationships between all feature vectors, which are projected to query, key and value vectors. The weight is based on the dot product of the query and key vectors and indicates the strength of correlation between a feature vector and the others. The larger the weight, the stronger the correlation of the feature vector. The weights denote the global context representing the semantic information of the entire image. A sequence containing multiple feature vectors is represented as $Z_i$. Self-attention therefore calculates the relation of each feature vector within $Z_i$ to the others. In other words, a set of local image features represented by the feature vectors is extracted by the CNN. The transformer encoder then aggregates the local features by determining the relationship between all the local features using global context.
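As a rough illustration of one such encoding block, the following pre-norm sketch keeps only the multi-head self-attention and MLP layers (the patch-attention and class-attention layers are omitted), with an assumed head count and MLP width:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """Sketch of one encoder block: LN -> multi-head self-attention -> LN -> MLP, with residuals."""
    def __init__(self, dim=512, heads=8, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(), nn.Linear(dim * mlp_ratio, dim)
        )

    def forward(self, z):                                 # z: (B, HW, C) feature vectors
        h = self.norm1(z)
        # query, key and value are projections of the same normalised sequence, so every
        # feature vector attends to every other one (global context)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        z = z + attn_out
        z = z + self.mlp(self.norm2(z))
        return z

z = EncoderBlock()(torch.randn(2, 144, 512))
print(z.shape)  # torch.Size([2, 144, 512])
```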
The transformer decoder 208 receives feature vectors 216 from the transformer encoder 206 and classification task queries to compute output embeddings for each classification task. In an example, the transformer decoder 208 is similar to the transformer encoder 206 and contains multiple decoding blocks, each consisting of two MSA layers and an FF layer. The difference between the transformer encoder 206 and the transformer decoder 208 lies in their inputs. The transformer decoder 208 receives feature vectors 216 from the encoder 206 along with classification task queries $T \in \mathbb{R}^{M \times C}$, where $M$ is the number of classification levels. The classification task queries are randomly initialized vectors that are updated during model training. Each task has a task query vector that interacts with global image features for final classification. The classification task queries are learnable parameters that can be updated to compute output embeddings for each classification task respectively by the transformer decoder 208.
Hierarchical classification classifiers 210 have more than one level of classification that ranges from a coarse level of classification (level-0) to a finer level of classification (level-2). The classification task queries are used to determine the output embedding for each level of classification 102, 104, 106. Each classification level has one of classifiers 210 to carry out the classification.
In the transformer decoder 208, each classification task query is associated with the feature vectors 216 from the transformer encoder 206, which capture the global image context, for generating the final prediction for the task using cross-attention. In the example shown in Figure 1, the three levels of classification have three corresponding tasks. The first task is to classify whether a skin lesion in input image 202 is benign or malignant. The second task is to classify the skin lesion in input image 202 according to 8 classes in classification level-1. The third task is to classify the skin lesion in input image 202 according to 65 classes in classification level-2. Therefore, each of the three levels of classification has a corresponding task and classification task query. Each classification task query is a task query vector used for cross attention.
The transformer decoder 208 performs the following calculations, where $U^0$ is a zero-initialized tensor with the same size as the classification task queries and MA() denotes the multi-head attention function:

$$\hat{U}^l = \mathrm{MA}(Q^l, K^l, V^l) + U^{l-1}, \quad [Q^l, K^l, V^l] = \mathrm{Linear}\big(\mathrm{LN}(U^{l-1} + T)\big)$$
$$U^l = \mathrm{MA}\big(\mathrm{LN}(\hat{U}^l), Z^L, Z^L\big) + \hat{U}^l$$

Within each decoder block, the first attention $\hat{U}^l$ performs self-attention by computing attention weights based on the classification task queries and the output embeddings from the previous decoder block $U^{l-1}$. It links decoder blocks within the transformer decoder 208 and provides the relationship or dependencies between the different classification levels.

Again, self-attention is a mechanism for computing the relation between the input sequence of different classification tasks in the transformer decoder 208. For a series of classification task queries, the self-attention compares each classification task query to the rest and generates a group of weights. Each weight is a number that indicates the relationship between each classification task query and the other classification task queries. Each classification task query vector can be re-computed by multiplying its weights with the other classification task query vectors.

The second attention $U^l$ is cross attention (encoder-decoder attention) and transforms features from the transformer encoder 206 into output embeddings 218 for each task corresponding to each level of classification. The output embeddings $U = [u_0, u_1, \dots, u_{M-1}]$ are used to predict labels for the $M$ levels of hierarchical classification tasks. Figure 5 shows a visual representation of the interactions between the equations of each encoder and decoder block.
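A decoder-block sketch in this spirit — self-attention over the task queries and previous output embeddings, followed by cross-attention to the encoder features — is shown below; the three-level example follows Figure 1, while the exact residual and normalisation layout is an assumption rather than the patent's precise formulation:

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Sketch: task-query self-attention, then cross-attention to the encoder output."""
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm3 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))

    def forward(self, u, task_queries, enc_out):
        # u: (B, M, C) output embeddings from the previous block (U^0 is zero-initialised)
        # task_queries: (B, M, C) one learnable query per classification level
        # enc_out: (B, HW, C) global-context feature vectors from the transformer encoder
        q = self.norm1(u + task_queries)
        u = u + self.self_attn(q, q, q, need_weights=False)[0]            # relate the M tasks to each other
        u = u + self.cross_attn(self.norm2(u), enc_out, enc_out, need_weights=False)[0]  # use image context
        u = u + self.ff(self.norm3(u))
        return u

# usage: M = 3 levels, zero-initialised U^0 and learnable task queries
B, M, C = 2, 3, 512
task_queries = nn.Parameter(torch.randn(1, M, C))
u = DecoderBlock()(torch.zeros(B, M, C), task_queries.expand(B, -1, -1), torch.randn(B, 144, C))
print(u.shape)  # torch.Size([2, 3, 512])
```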
Hierarchical classifiers 210 include at least two classifiers ordered based on the number of classification categories in each classifier. The classifiers 210 classify distinct parameters in parallel with each other, based on the global context of the feature information of an input image and the dependencies between the classifiers in the hierarchical classifier. The output embeddings 218 of the transformer decoder 208 allow multiple classification tasks in a hierarchical classification to be performed simultaneously. For example, all three classification levels shown in Figure 1 can be carried out simultaneously once the output embeddings for each corresponding classification level are passed into the corresponding classifiers 210 for calculating coarse-to-fine prediction scores.
Figure 6 shows an example of the results obtained from the hierarchical classification. Results 602 correspond to the classification level 102, results 604 correspond to the classification level 104 and results 606 correspond to the classification level 106.
Therefore, combining the self-attention and encoder-decoder attention in the transformer decoder enables the dependency between different classification levels to be taken into account when classifying, while still being able to use global image context in a parallel way. In particular, the self-attention mechanism used on the task query vectors determines the relationships between different classification tasks. Obtaining the relationship between different classification tasks allows all classification tasks in a hierarchical classification process to be performed concurrently or in parallel with each other. This provides more accurate classification results for a skin lesion than if classification tasks were performed separately without regard to the relationship between classification tasks.
The final prediction is computed by three separate classifiers 210, one for each classification level 102, 104, 106. Each classifier may consist of a dropout layer, a linear projection layer and a Softmax layer, for example. The outputs of the separate classifiers can be represented as a hierarchical array $[\hat{y}_0, \hat{y}_1, \dots, \hat{y}_{M-1}]$, ordered from a coarse level to a fine level of classification.
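The per-level classifiers could be realised as in the sketch below; the dropout rate is assumed, and the class counts (2, 8, 65) follow the example hierarchy of Figure 1:

```python
import torch
import torch.nn as nn

class HierarchicalHeads(nn.Module):
    """Sketch: one classifier (dropout + linear + softmax) per level, applied in parallel."""
    def __init__(self, dim=512, level_classes=(2, 8, 65), p_drop=0.1):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Dropout(p_drop), nn.Linear(dim, n)) for n in level_classes
        )

    def forward(self, u):                       # u: (B, M, C) decoder output embeddings
        # head m sees only output embedding m; softmax turns logits into per-level probabilities
        logits = [head(u[:, m]) for m, head in enumerate(self.heads)]
        probs = [g.softmax(dim=-1) for g in logits]
        return logits, probs

logits, probs = HierarchicalHeads()(torch.randn(2, 3, 512))
print([tuple(p.shape) for p in probs])  # [(2, 2), (2, 8), (2, 65)]
```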
The system 200 can be used to perform a computer implemented method for classifying skin lesions using hierarchical classification as shown in Figure 8. The hierarchical classification method 800 uses feature extraction module 204 to determine 802 feature information from an input image 202 of a skin lesion. The feature information is a spatial feature map for example.
The spatial feature maps are used in a transformer model with the transformer encoder 206 and a transformer decoder 208. The transformer encoder 206 uses a self-attention mechanism to generate discriminative features. As described previously, the transformer encoder 206 uses self-attention to re-represent feature map 212 by considering global context. This results in more discriminative features, the feature vectors 216, being output by the transformer encoder 206. The discriminative features are input into the transformer decoder 208 along with classification task queries representing the different levels of classification tasks from coarse to fine level.
The transformer decoder 208 also uses a self-attention mechanism and a cross-attention mechanism to determine the relationships between the feature information and classifiers and determine 804 relationships between classifiers classifying distinct parameters of the skin lesion. At least two separate classifiers are used to classify 806 distinct parameters of the skin lesion in parallel. The classifiers are ordered in a hierarchy based on the number of classification categories in each classifier. The hierarchical classification method is capable of capturing relationships among different skin classes while performing hierarchical classification in parallel.
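Tying the preceding sketches together, one possible end-to-end forward pass for method 800 is outlined below; it simply reuses the hypothetical modules sketched above and is not the patent's actual implementation:

```python
import torch

def classify_lesion(image, cnn_to_seq, encoder_blocks, decoder_blocks, task_queries, heads):
    """Sketch of method 800: extract features (802), relate tasks (804), classify in parallel (806)."""
    z = cnn_to_seq(image)                                     # step 802: feature information
    for blk in encoder_blocks:                                # global context via self-attention
        z = blk(z)
    queries = task_queries.expand(image.size(0), -1, -1)
    u = torch.zeros_like(queries)                             # zero-initialised U^0
    for blk in decoder_blocks:                                # step 804: task relationships + cross-attention
        u = blk(u, queries, z)
    return heads(u)                                           # step 806: coarse-to-fine predictions in parallel
```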
In an example, the generalization performance can be improved and interaction between different levels of classifiers can be encouraged using a hierarchical knowledge distillation training strategy as shown in Figure 7. The hierarchical knowledge distillation training strategy may consist of ensemble knowledge distillation and mutual knowledge distillation.
Ensemble knowledge distillation aggregates the multiple outputs from the multiple classifiers. The aggregated outputs/predictions are aligned with each of the multiple outputs from each classifier during training. The ensemble knowledge distillation utilises both ensemble learning and knowledge distillation. The aggregated outputs or ensemble predictions can provide a better result by combining outputs from different classification levels, while knowledge distillation is capable of generating soft targets with the ensemble prediction to guide the learning of individual classifiers and improve their performance.
In an example, the logits from each level of classifier can be represented as $g_0, g_1, \dots, g_{M-1}$. An ensemble prediction can be constructed by linearly combining the logits of all classifiers into the coarsest level:

$$g_{\mathrm{ens}} = W^\top \cdot \mathrm{Concat}(g_0, g_1, \dots, g_{M-1})$$
$$P_{\mathrm{ens}} = \mathrm{Softmax}(g_{\mathrm{ens}})$$

where $W$ represents the parameters of the linear layer, and $g_{\mathrm{ens}}$ and $P_{\mathrm{ens}}$ denote the ensembled logit and probability respectively. The ensemble prediction can be optimised with a cross entropy loss. Since outputs from different classification heads vary in dimension, which is undesirable for performing distillation with the ensemble prediction, all of them are then mapped into the same dimension as the coarsest level along the path from leaf nodes to root nodes in the hierarchical structure:

$$[g_{1\to 0}, g_{2\to 0}, \dots, g_{M-1\to 0}] = \mathrm{map}(g_1, g_2, \dots, g_{M-1})$$

where map() performs logits mapping by summing all logits of fine-level classes belonging to the same coarse-level class. A Kullback-Leibler (KL) divergence loss between the ensemble prediction and the mapped predictions of each head is then calculated:

$$\mathcal{L}_{\mathrm{ekd}} = \sum_{m=1}^{M-1} \mathrm{KL}\big(P_{\mathrm{ens}} \,\|\, \mathrm{Softmax}(g_{m\to 0})\big)$$
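A sketch of this ensemble distillation step is given below; the `fine_to_coarse` index mappings and the linear-layer setup are illustrative placeholders rather than the actual 8-class and 65-class trees:

```python
import torch
import torch.nn.functional as F

def map_to_coarse(fine_logits, fine_to_coarse):
    """The map() operation: sum the logits of all fine classes sharing the same coarse parent."""
    coarse = torch.zeros(fine_logits.size(0), max(fine_to_coarse) + 1, device=fine_logits.device)
    for fine_idx, coarse_idx in enumerate(fine_to_coarse):
        coarse[:, coarse_idx] += fine_logits[:, fine_idx]
    return coarse

def ensemble_distillation_loss(logits_per_level, fine_to_coarse_maps, ens_linear):
    # logits_per_level: [g0 (B,2), g1 (B,8), g2 (B,65)]; ens_linear projects the concatenation
    # of all logits down to the coarsest (2-class) level
    g_ens = ens_linear(torch.cat(logits_per_level, dim=1))
    p_ens = F.softmax(g_ens, dim=-1)
    loss = 0.0
    for g, mapping in zip(logits_per_level[1:], fine_to_coarse_maps):
        log_p_mapped = F.log_softmax(map_to_coarse(g, mapping), dim=-1)
        # KL(P_ens || P_mapped): the ensemble prediction acts as the teacher for each mapped head
        loss = loss + F.kl_div(log_p_mapped, p_ens, reduction="batchmean")
    return loss
```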
Ensembled outputs act as a strong teacher for distilling knowledge to each classification layer, but this ignores the relationship between different classification layers. Therefore, mutual knowledge distillation can be used to align outputs from consecutive levels of classifiers to maintain consistency, because a hierarchical model with multiple classification layers should favour functions that give consistent outputs for the same inputs. The consensus of outputs from different classification layers on the same input provides supplementary information and regularization to each classifier.
Similar to the ensemble knowledge distillation, the output logits of every pair of consecutive classifiers are mapped into the same dimension and a KL divergence loss is calculated. The calculation can be summarised as:

$$\mathcal{L}_{\mathrm{mkd}} = \sum_{m=1}^{M-1} \mathrm{KL}\big(\mathrm{Softmax}(g_{m\to m-1}) \,\|\, \mathrm{Softmax}(g_{m-1})\big)$$

where $g_{m\to m-1}$ denotes the logits of level $m$ mapped to the classes of level $m-1$. The final objective function for optimizing the hierarchical model includes the cross entropy loss on the outputs from each classifier as well as the ensemble prediction, the ensemble knowledge distillation loss, and the mutual knowledge distillation loss:

$$\mathcal{L} = \sum_{m=0}^{M-1} \mathcal{L}_{\mathrm{CE}}(P_m, y_m) + \mathcal{L}_{\mathrm{CE}}(P_{\mathrm{ens}}, y_0) + \mathcal{L}_{\mathrm{ekd}} + \mathcal{L}_{\mathrm{mkd}}$$
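The mutual distillation and the combined objective can then be sketched as follows, reusing `map_to_coarse` from the previous sketch; equal weighting of the loss terms is an assumption, since the weights are not stated here:

```python
import torch.nn.functional as F

def mutual_distillation_loss(logits_per_level, parent_maps):
    """Align each classifier with the next-coarser one by mapping its logits one level up."""
    loss = 0.0
    for m in range(1, len(logits_per_level)):
        mapped = map_to_coarse(logits_per_level[m], parent_maps[m - 1])       # level m -> level m-1
        loss = loss + F.kl_div(F.log_softmax(mapped, dim=-1),
                               F.softmax(logits_per_level[m - 1], dim=-1),
                               reduction="batchmean")
    return loss

def total_loss(logits_per_level, targets_per_level, g_ens, coarse_targets, l_ekd, l_mkd):
    """Cross entropy on every classifier and on the ensemble, plus both distillation terms."""
    ce = sum(F.cross_entropy(g, t) for g, t in zip(logits_per_level, targets_per_level))
    ce = ce + F.cross_entropy(g_ens, coarse_targets)
    return ce + l_ekd + l_mkd
```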
The present system and method for hierarchical skin lesion classification is used with skin image datasets. Skin image datasets may be collected from a clinical environment and follow tele-dermatology labelling standards. Skin image datasets may also be collected from the publicly available International Skin Imaging Collaboration (ISIC) archives, for example.
In an example, dataset with tele-dermatology labelling standards includes 235,268 teledermatology verified clinical and dermoscopic images. The dataset may contain a total of 65 skin conditions organised in a 3-level hierarchical semantic tree as shown in Figure 9.
In another example, data from ISIC archives have 25,331 dermatoscopic images across 8 different categories: melanoma (MEL), melanocytic nevus (MN), basal cell carcinoma (BCC), actinic keratosis (AK), benign keratosis (BKL), dermatofibroma (DF), vascular lesion (VASC), and squamous cell carcinoma (SCC).
To train the system for hierarchical skin lesion classification, a dataset can be split into training, validation and testing sets with a ratio of 7:1:2. Standard data augmentation techniques such as random resized cropping, colour transformation, and flipping can be applied to the datasets. Each dermoscopic image is resized to a fixed size of 384x384, for example. An ImageNet pre-trained ResNet-34 (He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, pp. 770-778. doi:10.1109/CVPR.2016.90) may be used as the backbone, and the model can be trained with the ADAM optimizer (Kingma, D., Ba, J., 2014. Adam: A method for stochastic optimization. International Conference on Learning Representations.) with a batch size of 128 and initial learning rates of 1 × 10^-5 and 3 × 10^-4 for the backbone and the newly added layers, respectively.
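The training configuration described above may, for example, be expressed as in the following sketch. The specific ColorJitter values, the nn.Linear placeholder standing in for the transformer and hierarchical classification heads, and the torchvision calls are illustrative assumptions rather than the exact implementation.

```python
import torch
import torch.nn as nn
import torchvision.transforms as T
from torchvision.models import resnet34

# Augmentation and resizing as described above.
train_transform = T.Compose([
    T.RandomResizedCrop(384),                                      # random resized cropping to 384x384
    T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),   # colour transformation (assumed values)
    T.RandomHorizontalFlip(),
    T.RandomVerticalFlip(),
    T.ToTensor(),
])

backbone = resnet34(pretrained=True)   # ImageNet pre-trained backbone (newer torchvision uses weights=)
backbone.fc = nn.Identity()            # expose the 512-d feature vector

# Placeholder for the transformer and hierarchical classification heads.
new_layers = nn.Linear(512, 65)

optimizer = torch.optim.Adam([
    {"params": backbone.parameters(),   "lr": 1e-5},   # backbone learning rate
    {"params": new_layers.parameters(), "lr": 3e-4},   # newly added layers
])
# Batch size 128; dataset split 7:1:2 into training, validation and test sets.
```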
The foregoing description of the invention includes preferred forms thereof. Modifications may be made thereto without departing from the scope of the invention.

Claims

1. A system for classifying skin lesions, the system comprising: a feature extraction module that determines feature information from an image of a skin lesion; a transformer that determines relationships between classifiers classifying distinct parameters of the skin lesion; and a hierarchical classifier having at least two of the classifiers ordered based on the number of classification categories in each classifier; wherein the classifiers classify distinct parameters in parallel with each other based on the feature information and the relationships between classifiers.
2. The system of claim 1, wherein the transformer further comprises: at least two encoders that determine the global context of the feature information; and at least two decoders that determine the dependencies of the classifiers in the hierarchical classifier; wherein the classifiers classify distinct parameters in parallel with each other based on the global context and the dependencies.
3. A computer implemented method for classifying skin lesions, the method comprising: determining feature information from an image of a skin lesion; determining relationships between classifiers classifying distinct parameters of the skin lesion; and classifying distinct parameters with at least two of the classifiers in parallel, wherein the classifiers are ordered in a hierarchy based on the number of classification categories in each classifier.
PCT/NZ2022/050154 2021-12-02 2022-11-25 Skin lesion classification system and method WO2023101564A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2022400601A AU2022400601A1 (en) 2021-12-02 2022-11-25 Skin lesion classification system and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ782937 2021-12-02
NZ78293721 2021-12-02

Publications (1)

Publication Number Publication Date
WO2023101564A1 2023-06-08

Family

ID=86612806

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NZ2022/050154 WO2023101564A1 (en) 2021-12-02 2022-11-25 Skin lesion classification system and method

Country Status (2)

Country Link
AU (1) AU2022400601A1 (en)
WO (1) WO2023101564A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10878567B1 (en) * 2019-09-18 2020-12-29 Triage Technologies Inc. System to collect and identify skin conditions from images and expert knowledge
CN112330621A (en) * 2020-10-30 2021-02-05 康键信息技术(深圳)有限公司 Method and device for carrying out abnormity classification on skin image based on artificial intelligence

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Color Medical Image Analysis", vol. 6, 30 November 2012, SPRINGER, ISBN: 978-94-007-5388-4, article BALLERINI LUCIA, FISHER ROBERT, ALDRIDGE BEN, REES JONATHAN, CELEBI M. EMRE, SCHAEFER GERALD: "A Color and Texture Based Hierarchical K- NN Approach to the Classification of Non-melanoma Skin Lesions", pages: 63 - 86, XP009546955, DOI: 10.1007/978-94-007-5389-1_4 *
ANDRE ESTEVA, BRETT KUPREL, ROBERTO A. NOVOA, JUSTIN KO, SUSAN M. SWETTER, HELEN M. BLAU, SEBASTIAN THRUN: "Dermatologist-level classification of skin cancer with deep neural networks", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 542, no. 7639, 1 February 2017 (2017-02-01), London, pages 115 - 118, XP055536881, ISSN: 0028-0836, DOI: 10.1038/nature21056 *
BARATA CATARINA; CELEBI M. EMRE; MARQUES JORGE S.: "Explainable skin lesion diagnosis using taxonomies", PATTERN RECOGNITION., ELSEVIER., GB, vol. 110, 16 May 2020 (2020-05-16), GB , XP086328204, ISSN: 0031-3203, DOI: 10.1016/j.patcog.2020.107413 *
BARATA CATARINA; MARQUES JORGE S.: "Deep Learning For Skin Cancer Diagnosis With Hierarchical Architectures", 2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), IEEE, 8 April 2019 (2019-04-08), pages 841 - 845, XP033576692, DOI: 10.1109/ISBI.2019.8759561 *
SHIMIZU, K ET AL.: "Four-Class Classification of Skin Lesions With Task Decomposition Strategy", IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, vol. 62, no. 1, January 2015 (2015-01-01), XP011568255, Retrieved from the Internet <URL:https://ieeexplore.ieee.org/abstract/document/6879310> [retrieved on 20230223], DOI: 10.1109/TBME.2014.2348323 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721302A (en) * 2023-08-10 2023-09-08 成都信息工程大学 Ice and snow crystal particle image classification method based on lightweight network
CN116721302B (en) * 2023-08-10 2024-01-12 成都信息工程大学 Ice and snow crystal particle image classification method based on lightweight network
CN117636064A (en) * 2023-12-21 2024-03-01 浙江大学 Intelligent neuroblastoma classification system based on pathological sections of children
CN117636064B (en) * 2023-12-21 2024-05-28 浙江大学 Intelligent neuroblastoma classification system based on pathological sections of children

Also Published As

Publication number Publication date
AU2022400601A1 (en) 2024-03-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22901909; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2022400601; Country of ref document: AU) (Ref document number: 809081; Country of ref document: NZ) (Ref document number: AU2022400601; Country of ref document: AU)
ENP Entry into the national phase (Ref document number: 2022400601; Country of ref document: AU; Date of ref document: 20221125; Kind code of ref document: A)