WO2023096503A1 - Methods and systems for automatic risk assessment of a skin lesion from multiple images - Google Patents
- Publication number
- WO2023096503A1 (PCT/NZ2022/050156)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/44—Detecting, measuring or recording for evaluating the integumentary system, e.g. skin, hair or nails
- A61B5/441—Skin evaluation, e.g. for skin disorder diagnosis
- A61B5/444—Evaluating skin marks, e.g. mole, nevi, tumour, scar
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/44—Detecting, measuring or recording for evaluating the integumentary system, e.g. skin, hair or nails
- A61B5/441—Skin evaluation, e.g. for skin disorder diagnosis
- A61B5/445—Evaluating skin irritation or skin trauma, e.g. rash, eczema, wound, bed sore
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7271—Specific aspects of physiological measurement analysis
- A61B5/7275—Determining trends in physiological measurement data; Predicting development of a medical condition based on physiological measurements, e.g. determining a risk factor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
- G06V10/12—Details of acquisition arrangements; Constructional details thereof
- G06V10/14—Optical characteristics of the device performing the acquisition or on the illumination arrangements
- G06V10/143—Sensing or illuminating at different wavelengths
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/803—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/0059—Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
- A61B5/0077—Devices for viewing the surface of the body, e.g. camera, magnifying lens
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/0059—Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
- A61B5/0082—Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence adapted for particular medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10048—Infrared image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30088—Skin; Dermal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Definitions
- the invention relates to methods and systems for performing a risk assessment on a skin lesion.
- To diagnose or classify a skin lesion, such as a mole, it is common for a dermatologist to simply observe the skin lesion through human vision.
- a skin lesion can also be diagnosed or classified automatically using classification algorithms. These classification algorithms often use images of the skin lesion of interest as inputs. This is because dermatologists often use visual information of the skin lesion to perform diagnosis of the skin lesion from visual inspection.
- An additional or alternative object is to at least provide the public with a useful choice.
- a method for performing a risk assessment on a skin lesion comprises receiving at least two images of the skin lesion, the at least two images including a first image and a second image, the first image being different to the second image in at least one of modality, field of view, distance, lighting configuration, optical settings or time; providing the at least two images as input to an AI network with at least one fully connected layer, wherein the AI network is trained based on at least two different images of each of a plurality of skin lesions; and determining a risk assessment associated to the skin lesion based at least partly on at least one output of the AI network.
- the term 'comprising' as used in this specification means 'consisting at least in part of'. When interpreting each statement in this specification that includes the term 'comprising', features other than that or those prefaced by the term may also be present. Related terms such as 'comprise' and 'comprises' are to be interpreted in the same manner.
- the first image is a micro image and the second image is a macro image.
- the first image is a polarised image and the second image is a non-polarised image.
- the first image is a non-polarised image and the second image is a polarised image.
- both the first image and the second image are polarised.
- both the first image and the second image are non-polarised.
- both the first image and the second image are micro images.
- both the first image and the second image are macro images.
- the polarised images are cross-polarised.
- the first image is taken at a time a day or more apart from a time that the second image is taken.
- the at least two images are obtained at different wavelengths.
- the risk assessment is associated to a likelihood that the skin lesion is malignant.
- the invention in one aspect comprises several steps.
- the relation of one or more of such steps with respect to each of the others, the apparatus embodying features of construction, and combinations of elements and arrangement of parts that are adapted to effect such steps, are all exemplified in the following detailed disclosure.
- many changes in construction and widely differing embodiments and applications of the invention will suggest themselves without departing from the scope of the invention as defined in the appended claims.
- the disclosures and the descriptions herein are purely illustrative and are not intended to be in any sense limiting. Where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
- '(s)' following a noun means the plural and/or singular forms of the noun.
- 'and/or' means 'and' or 'or' or both.
- Figure 1 shows the method for performing a risk assessment on a skin lesion; and
- Figures 2 to 4 show examples of AI networks from Figure 1.
- Figure 1 shows a method 100 for performing a risk assessment on a skin lesion.
- the method comprises the steps of receiving 102 at least two images of the skin lesion, providing 104 the at least two images as input to an AI network, and determining 106 a risk assessment associated to the skin lesion based at least partly on at least one output of the AI network.
- the skin lesion is a mole for example.
- Figures 2 - 4 show examples of the AI networks 200, 300, 400 having deep-neural network architectures with at least one fully connected layer 206, 306, 406.
- the structures of AI networks 200, 300, 400 are trained to receive two or more independent image inputs concurrently.
- the image inputs do not need to be aligned and do not have to have the same dimensions.
- the image inputs may be different in terms of at least modality, field of view, distance, lighting configuration or optical settings.
- AI networks 200, 300, 400 combine convolutional layers 208, 210, 310, 312, 314, 402, 404 to produce an output classification or diagnosis of a skin lesion.
- Each image is passed through a sequence of convolutional layers 208, 210, 310, 312, 314, 402, 404 with additional contraction and activation layers between them (e.g. max pooling and ReLU activation).
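The contraction and activation steps mentioned above (max pooling and ReLU) can be sketched in plain Python. This is an illustrative implementation only, not taken from the patent, operating on a small feature map stored as nested lists.

```python
# Illustrative sketch only (not from the patent): ReLU activation followed
# by 2x2 max pooling, the "contraction and activation" layers placed
# between convolutional layers.

def relu(feature_map):
    """Element-wise ReLU: clamp negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; halves each spatial dimension."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            row.append(max(feature_map[i][j], feature_map[i][j + 1],
                           feature_map[i + 1][j], feature_map[i + 1][j + 1]))
        pooled.append(row)
    return pooled

fmap = [[-1.0, 2.0, 0.5, -3.0],
        [4.0, 0.0, -0.5, 1.0],
        [1.0, 1.0, 2.0, 2.0],
        [0.0, 3.0, 0.0, 0.0]]
out = max_pool_2x2(relu(fmap))
# out == [[4.0, 1.0], [3.0, 2.0]]
```

Each pooling step shrinks the feature map, which is why a sequence of such layers can reduce an image to a compact feature vector for the fully connected layers.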
- convolutional layers 208 have a sequence of layers that can receive an image.
- the outputs of the convolutional layers 208, 210, 310, 312, 314, 402, 404 are used as inputs for a respective fully connected layer 206, 306, 406.
- the outputs of the convolutional layers can be joined (i.e. concatenating the two output vectors into a single, double-length vector) at the two convolutional blocks from the input layer, or kept separate through any of the fully connected layers (including the first one) and joined in a later layer to form a single output.
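The joining of branch outputs described above can be illustrated with a minimal sketch. This is an assumed structure, not the patent's exact architecture: two per-image feature vectors are concatenated into a single, double-length vector that feeds one fully connected layer.

```python
# Assumed sketch (not the patent's implementation): concatenate the two
# branch outputs, then apply one fully connected (dense) layer.

def concatenate(features_a, features_b):
    """Join the two branch outputs into one double-length vector."""
    return features_a + features_b

def fully_connected(inputs, weights, biases):
    """One dense layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

micro_features = [0.2, 0.7]  # hypothetical output of the micro-image branch
macro_features = [0.5, 0.1]  # hypothetical output of the macro-image branch

joined = concatenate(micro_features, macro_features)  # length 4
weights = [[1.0, 0.0, 1.0, 0.0],
           [0.0, 1.0, 0.0, 1.0]]
biases = [0.0, 0.0]
scores = fully_connected(joined, weights, biases)
```

Because the dense layer weights every element of the joined vector, it can combine evidence from both images; joining later instead (after separate fully connected layers per branch) simply moves this combination deeper into the network.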
- the AI network is trained based on at least two images of each of a plurality of skin lesions.
- the images of each skin lesion used to train the AI network are different in terms of at least modality, field of view, distance, lighting configuration or optical settings.
- the AI network is trained based on multiple skin lesions and each of the multiple skin lesions is represented by at least two images containing different visual information of the skin lesion.
- An example skin lesion used to train the AI network can be represented by a micro image and a macro image; both images provide different information about the skin lesion that is useful for the classification of the skin lesion.
- a micro image can be obtained from a dermatological camera that presses a skin lesion flat against the lens and/or the skin around the skin lesion to put the whole of the skin lesion in focus. This can help prevent any polarisation and/or reflections from the skin that may influence the visual information in a micro image.
- a micro image can be acquired from a short distance of 10 - 20 mm away from the skin lesion instead of being in contact with the lesion.
- a micro image can also be obtained with a narrow field of view of 5 - 20 mm for example.
- a macro image can provide visual information of the skin lesion when the skin lesion is not compressed. Therefore, these macro images contain visual information that is also useful when classifying a skin lesion.
- the macro images can also be obtained through a normal camera placed a certain distance away from the skin lesion.
- Using both micro and macro images to train an AI network allows the AI network to utilise additional information when classifying skin lesions.
- the use of macro images in addition to micro images provides additional information and context for the skin lesion, such as normal skin texture and an image of the skin lesion in its un-pressed form, where blood normally flows through it.
- the AI network can be trained based on a database of pairs (or more) of images of each of multiple skin lesions.
- Each of these training images has a known diagnosis or classification associated with its corresponding skin lesion.
- these training images or pairs of training images are different in terms of at least modality, field of view, distance, lighting configuration or optical settings.
- these training images can be pairs of micro and macro images of each of multiple skin lesions.
- Each pair of training images is acquired from the same skin lesion within a short time frame to prevent the appearance of the skin lesion from changing between the images.
- the time gap between the acquisition of images can vary and the images of the skin lesion can be taken in any order.
- the known diagnosis or classification can also be considered a ground truth label or a clinical diagnosis outcome.
- two or more images of the same lesion can be acquired with an intentional time-gap (e.g. weeks or months) and the network can then be trained and used to make a diagnosis based on changes in appearance between the two images.
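One way the training database described above might be organised is sketched below; the class name, field names, and file names are hypothetical, chosen only to illustrate pairing multiple images of a lesion with its ground-truth label.

```python
# Hypothetical record layout for a training database of image pairs (or
# more) per lesion; names are illustrative, not taken from the patent.
from dataclasses import dataclass

@dataclass
class LesionRecord:
    lesion_id: str
    image_paths: list   # e.g. [micro image path, macro image path]
    ground_truth: str   # label from specialist analysis or histopathology

record = LesionRecord(
    lesion_id="lesion-001",
    image_paths=["lesion-001_micro.png", "lesion-001_macro.png"],
    ground_truth="benign",
)
```

Keeping all images of one lesion in a single record makes it straightforward to feed them to the network's separate input branches during training.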
- Ground truth labels refer to the labels which were assigned to the specific lesion by means of diagnostic procedures. These can include specialist analysis, by a dermatologist for example, or a lab diagnosis, through histopathology for example.
- the ground truth labels determine how a skin lesion can be classified and a skin lesion can be classified to different levels of accuracy. Examples of ground-truth labels include whether a skin lesion is malignant or benign. More distinct or detailed ground-truths can have more than two classifications. These more detailed classifications can include the lesion type or lesion categorisation. For example, a skin lesion can have the lesion types of melanoma, basal-cell-carcinoma (BCC), squamous cell carcinoma (SCC), melanocytic, or vascular.
- a skin lesion can have a more specific lesion categorisation of SCC in situ, melanocytic dermal, or nodular melanoma.
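The levels of label granularity described above, from binary labels through lesion types to finer categorisations, can be written down as a simple mapping. The scheme names below are illustrative, not from the patent.

```python
# Illustrative mapping only: three levels of ground-truth granularity.
label_schemes = {
    "binary": ["malignant", "benign"],
    "lesion_type": ["melanoma", "basal-cell-carcinoma",
                    "squamous-cell-carcinoma", "melanocytic", "vascular"],
    "lesion_categorisation": ["SCC in situ", "melanocytic dermal",
                              "nodular melanoma"],
}

def num_classes(scheme):
    """A network trained on a given scheme outputs one score per class."""
    return len(label_schemes[scheme])
```

The choice of scheme fixes the size of the network's output layer: a binary scheme needs two output scores, a lesion-type scheme one score per type.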
- the trained AI network can assess the risk of a new or 'not seen before' skin lesion being malignant.
- the trained AI network can classify the new skin lesion based on multiple images of the new skin lesion of the same category as the images on which the AI network was trained.
- the at least two images include a first image and a second image, the first image being different to the second image in at least one of modality, field of view, distance, lighting configuration or optical settings.
- Examples include a micro image and a macro image, polarised and non-polarised images, visible light and infra-red images, different time points (e.g. weeks apart), or combinations of the above. If the AI network is trained on pairs of micro and macro images of skin lesions, the AI network can classify a new skin lesion based on micro and macro images of the new skin lesion. In other words, the images contain non-identical information for the AI network to exploit.
- Figure 2 shows micro image 202 and macro image 204 as inputs to AI network 200. This means that the AI network 200 was trained on pairs of micro and macro images of skin lesions.
- Figure 3 shows two micro images 302, 304 and macro image 308 as inputs to AI network 300. This means that the AI network 300 was trained on sets of three images, two micro and one macro, of skin lesions.
- Training images and input images can also be polarised or non-polarised.
- Input and training images can be of different categories in more than one way.
- input images of AI network 200 are micro and macro images, but they can also be polarised or non-polarised.
- a first image 202 is also polarised and the second image 204 is also non-polarised.
- the first image 202 is also non-polarised instead and the second image 204 is also polarised.
- Training images and input images can also be acquired at different wavelengths. For example, visible light, infra-red, or ultra-violet electromagnetic waves may be used to illuminate a skin lesion for an image capture device to obtain images at different wavelengths.
- as images 202 and 204 are already different, one being a micro image while the other is a macro image, the first image 202 and the second image 204 can both be either polarised or non-polarised.
- AI network 300 shows both the first image 302 and the second image 304 being micro images.
- one of the micro images 302, 304 may be polarised while the other is non-polarised.
- both a first input image and a second input image may be polarised.
- Polarised images can also be cross-polarised, where light reflecting from the skin lesion is polarised orthogonally to the illumination.
- the images are orthogonally polarised from light received by a lens of an image capture device for the skin lesion.
- Polarised images may also have varying degrees of polarisation.
- input images may be captured at different times that create the difference in information contained in the input images.
- the outputs of the AI networks 200, 300, 400 may be processed or presented in different ways, based on the application and the user needs.
- the AI network provides a risk assessment that is associated to a likelihood that a skin lesion is malignant or some level of lesion classification (i.e. into types, sub-types, etc.).
- the AI networks 200, 300, 400 may output a score between 0 and 1, for example, with each possible classification based on the ground truth labels used to train the AI networks. For example, a score between 0 and 1 may be assigned to whether a lesion is malignant and whether the lesion is benign. In another example, a score between 0 and 1 may be assigned to each classification category of lesion type.
- the classification category with the highest score out of all the possible classes or labels can indicate the diagnosis of a skin lesion. There are many other ways to interpret these scores for different clinical applications that would be considered common practices in the art.
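A minimal sketch of the score interpretation described above, under the assumption that the network emits one score per class and the class with the highest score indicates the suggested diagnosis:

```python
# Sketch only: pick the classification category with the highest score.
# The score values below are a hypothetical network output.

def top_class(scores):
    """Return the label with the highest score, and that score."""
    label = max(scores, key=scores.get)
    return label, scores[label]

scores = {"malignant": 0.83, "benign": 0.17}
diagnosis, confidence = top_class(scores)
# diagnosis == "malignant", confidence == 0.83
```

In a clinical setting the raw score might instead be compared against a risk threshold tuned for sensitivity, which is one of the alternative interpretations the text alludes to.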
- the AI network may be trained on any computer.
- the AI network may be implemented on a computer that can receive images of skin lesions.
- the images may be obtained from an image capture device such as a dermatological camera or a normal camera.
- the images can be acquired at a location separate from where the AI network analyses the images.
Abstract
Some previously proposed skin lesion classification algorithms use images of the skin lesion of interest as inputs. Disclosed herein is a method for performing a risk assessment on a skin lesion. The method comprises receiving (102) at least two images of the skin lesion, the at least two images including a first image and a second image, the first image being different to the second image in at least one of modality, field of view, distance, lighting configuration, optical settings or time. The method comprises providing (104) the at least two images as input to an AI network with at least one fully connected layer, wherein the AI network is trained based on at least two different images of each of a plurality of skin lesions. The method comprises determining (106) a risk assessment associated to the skin lesion based at least partly on at least one output of the AI network.
Description
METHODS AND SYSTEMS FOR AUTOMATIC RISK ASSESSMENT OF A SKIN LESION FROM MULTIPLE IMAGES
FIELD OF THE INVENTION
The invention relates to methods and systems for performing a risk assessment on a skin lesion.
BACKGROUND TO THE INVENTION
To diagnose or classify a skin lesion, such as a mole, it is common for a dermatologist to simply observe the skin lesion through human vision.
A skin lesion can also be diagnosed or classified automatically using classification algorithms. These classification algorithms often use images of the skin lesion of interest as inputs. This is because dermatologists often use visual information of the skin lesion to perform diagnosis of the skin lesion from visual inspection.
Existing skin lesion classification algorithms classify skin lesions based on a single image. However, a single image contains limited information of the skin lesion.
It is an object of at least preferred embodiments to address at least some of the aforementioned disadvantages. An additional or alternative object is to at least provide the public with a useful choice.
SUMMARY OF THE INVENTION
In accordance with an aspect, a method for performing a risk assessment on a skin lesion comprises receiving at least two images of the skin lesion, the at least two images including a first image and a second image, the first image being different to the second image in at least one of modality, field of view, distance, lighting configuration, optical settings or time; providing the at least two images as input to an AI network with at least one fully connected layer, wherein the AI network is trained based on at least two different images of each of a plurality of skin lesions; and determining a risk assessment associated to the skin lesion based at least partly on at least one output of the AI network.
The term 'comprising' as used in this specification means 'consisting at least in part of'. When interpreting each statement in this specification that includes the term 'comprising', features other than that or those prefaced by the term may also be present. Related terms such as 'comprise' and 'comprises' are to be interpreted in the same manner.
In an embodiment, the first image is a micro image and the second image is a macro image.
In an embodiment, the first image is a polarised image and the second image is a non-polarised image.
In an embodiment, the first image is a non-polarised image and the second image is a polarised image.
In an embodiment, both the first image and the second image are polarised.
In an embodiment, both the first image and the second image are non-polarised.
In an embodiment, both the first image and the second image are micro images.
In an embodiment, both the first image and the second image are macro images.
In an embodiment, the polarised images are cross-polarised.
In an embodiment, the first image is taken at a time a day or more apart from a time that the second image is taken.
In an embodiment, the at least two images are obtained at different wavelengths.
In an embodiment, the risk assessment is associated to a likelihood that the skin lesion is malignant.
The invention in one aspect comprises several steps. The relation of one or more of such steps with respect to each of the others, the apparatus embodying features of construction, and combinations of elements and arrangement of parts that are adapted to effect such steps, are all exemplified in the following detailed disclosure.
To those skilled in the art to which the invention relates, many changes in construction and widely differing embodiments and applications of the invention will suggest themselves without departing from the scope of the invention as defined in the appended claims. The disclosures and the descriptions herein are purely illustrative and are not intended to be in any sense limiting. Where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
In addition, where features or aspects of the invention are described in terms of Markush groups, those persons skilled in the art will appreciate that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.
As used herein, '(s)' following a noun means the plural and/or singular forms of the noun.
As used herein, the term 'and/or' means 'and' or 'or' or both.
It is intended that reference to a range of numbers disclosed herein (for example, 1 to 10) also incorporates reference to all rational numbers within that range (for example, 1, 1.1, 2, 3, 3.9, 4, 5, 6, 6.5, 7, 8, 9, and 10) and also any range of rational numbers within that range (for example, 2 to 8, 1.5 to 5.5, and 3.1 to 4.7) and, therefore, all sub-ranges of all ranges expressly disclosed herein are hereby expressly disclosed. These are only examples of what is specifically intended and all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application in a similar manner.
In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents or such sources of information is not to be construed as an admission that such documents or such sources of information, in any jurisdiction, are prior art or form part of the common general knowledge in the art.
Although the present invention is broadly as defined above, those persons skilled in the art will appreciate that the invention is not limited thereto and that the invention also includes embodiments of which the following description gives examples.
BRIEF DESCRIPTION OF THE DRAWINGS
Preferred forms of the method for performing a risk assessment on a skin lesion will now be described by way of example only with reference to the accompanying figures in which:
Figure 1 shows the method for performing a risk assessment on a skin lesion; and
Figures 2 to 4 show examples of AI networks from Figure 1.
DETAILED DESCRIPTION
Figure 1 shows a method 100 for performing a risk assessment on a skin lesion. The method comprises the steps of receiving 102 at least two images of the skin lesion, providing 104 the at least two images as input to an AI network, and determining 106 a risk assessment associated with the skin lesion based at least partly on at least one output of the AI network. The skin lesion is a mole, for example.
Figures 2 - 4 show examples of the AI networks 200, 300, 400 having deep neural network architectures with at least one fully connected layer 206, 306, 406. The AI networks 200, 300, 400 are trained to receive two or more independent image inputs concurrently. The image inputs do not need to be aligned and do not have to have the same dimensions. The image inputs may be different in terms of at least modality, field of view, distance, lighting configuration or optical settings. AI networks 200, 300, 400 combine convolutional layers 208, 210, 310, 312, 314, 402, 404 to produce an output classification or diagnosis of a skin lesion. Each image is passed through a sequence of convolutional layers 208, 210, 310, 312, 314, 402, 404 with additional contraction and activation layers between them (e.g. max pooling and ReLU activation). For example, convolutional layers 208 form a sequence of layers that can receive an image. The outputs of the convolutional layers 208, 210, 310, 312, 314, 402, 404 are used as inputs for a respective fully connected layer 206, 306, 406. The outputs of the two convolutional blocks from the input layer can be joined (i.e. the two output vectors concatenated into a single, double-length vector), or they can remain separate through any of the fully connected layers (including the first one) and be joined in a later layer to form a single output.
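The two-branch structure described above can be sketched in a minimal, illustrative way. The following plain-Python example is not the patented implementation: each branch stands in for a convolutional block that reduces one image to a feature vector, the two vectors are concatenated, and a single fully connected layer maps the joint features to class scores. All layer sizes, weights and function names here are hypothetical.

```python
import random

def branch_features(image, n_features):
    """Stand-in for one convolutional branch: reduce a 2-D image
    (a list of rows) to a fixed-length feature vector. A real network
    would use convolution, pooling and ReLU layers instead."""
    flat = [px for row in image for px in row]
    random.seed(len(flat))  # deterministic hypothetical weights
    weights = [[random.uniform(-1, 1) for _ in flat] for _ in range(n_features)]
    return [sum(w * x for w, x in zip(ws, flat)) for ws in weights]

def fully_connected(features, n_classes):
    """Single fully connected layer mapping joint features to class scores."""
    random.seed(len(features))
    weights = [[random.uniform(-1, 1) for _ in features] for _ in range(n_classes)]
    return [sum(w * f for w, f in zip(ws, features)) for ws in weights]

def classify(micro_image, macro_image, n_classes=2):
    # Each image passes through its own branch, so the two images
    # need not be aligned or share dimensions.
    f_micro = branch_features(micro_image, 8)
    f_macro = branch_features(macro_image, 8)
    joint = f_micro + f_macro  # concatenation of the two branch outputs
    return fully_connected(joint, n_classes)

micro = [[0.1, 0.5], [0.3, 0.9]]   # 2x2 "micro" image
macro = [[0.2, 0.4, 0.6]]          # 1x3 "macro" image, a different shape
scores = classify(micro, macro)
```

The point of the sketch is the data flow: two independent inputs, branch-wise feature extraction, concatenation, then shared fully connected layers producing one score per class.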
The AI network is trained based on at least two images of each of a plurality of skin lesions. The images of each skin lesion used to train the AI network are different in terms of at least modality, field of view, distance, lighting configuration or optical settings. In other words, the AI network is trained based on multiple skin lesions, and each of the multiple skin lesions is represented by at least two images containing different visual information of the skin lesion.
An example skin lesion used to train the AI network can be represented by a micro image and a macro image; both images provide different information about the skin lesion that is useful for the classification of the skin lesion.
A micro image can be obtained from a dermatological camera that presses a skin lesion flat against the lens and/or the skin around the skin lesion to put the whole of the skin lesion in focus. This can help prevent any polarisation and/or reflections from the skin that may influence the visual information in a micro image.
Certain issues regarding the accuracy of a micro image exist, as one cannot be certain that a skin lesion is completely flat when compressed. Therefore, regions of the skin lesion might not be in focus. Further, when a skin lesion is compressed, its color and texture can be altered by the compression of the blood vessels in the skin lesion. These issues can bias the results of classification using such micro images. In some examples, a micro image can be acquired from a short distance of 10 - 20 mm away from the skin lesion instead of being in contact with the lesion. A micro image can also be obtained with a narrow field of view of 5 - 20 mm, for example.
A macro image can provide visual information of the skin lesion when the skin lesion is not compressed. Therefore, these macro images contain visual information that is also useful when classifying a skin lesion. The macro images can also be obtained through a normal camera placed a certain distance away from the skin lesion.
Using both micro and macro images to train an AI network allows the AI network to utilise additional information when classifying skin lesions. As described above, the use of macro images in addition to micro images provides additional information and context for the skin lesion, such as normal skin texture and an image of the skin lesion in its un-pressed form, where blood normally flows through it.
The AI network can be trained based on a database of pairs (or more) of images of each of multiple skin lesions. Each of these training images has a known diagnosis or classification associated with its corresponding skin lesion. Again, these training images or pairs of training images are different in terms of at least modality, field of view, distance, lighting configuration or optical settings. For example, these training images can be pairs of micro and macro images of each of multiple skin lesions. Each pair of training images is acquired from the same skin lesion within a short time frame so that the appearance of the skin lesion does not change between the images. The time gap between the acquisition of images can vary, and the images of the skin lesion can be taken in any order. The known diagnosis or classification can also be considered a ground truth label or a clinical diagnosis outcome. Alternatively, two or more images of the same lesion can be acquired with an intentional time gap (e.g. weeks or months), and the network can then be trained and used to base the diagnosis on changes in appearance between the two images.
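As a concrete sketch of how such a training database might be organised, the following record structure pairs each lesion's images with its ground-truth label and per-image acquisition times. The class and field names are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class LesionRecord:
    """One training example: two or more images of the same lesion
    plus the ground-truth label from specialist or histopathology review."""
    lesion_id: str
    image_paths: list   # e.g. [micro_path, macro_path]; order is not significant
    acquired_at: list   # acquisition time of each image
    label: str          # ground-truth label, e.g. "benign" or "malignant"

    def max_time_gap_days(self):
        """Gap between earliest and latest acquisition. A small gap keeps the
        lesion's appearance consistent across images; a large gap may be
        intentional when training on changes in appearance over time."""
        times = sorted(self.acquired_at)
        return (times[-1] - times[0]).days

# Hypothetical record: a micro/macro pair taken two minutes apart
record = LesionRecord(
    lesion_id="lesion-0001",
    image_paths=["micro.png", "macro.png"],
    acquired_at=[datetime(2022, 1, 10, 9, 0), datetime(2022, 1, 10, 9, 2)],
    label="benign",
)
gap = record.max_time_gap_days()
```

Storing acquisition times per image lets the same database serve both training regimes described above: same-session pairs (gap near zero) and intentionally time-separated pairs.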
Ground truth labels refer to the labels which were assigned to the specific lesion by means of diagnostic procedures. These can include specialist analysis (by a dermatologist, for example) or a lab diagnosis (through histopathology, for example). The ground truth labels determine how a skin lesion can be classified, and a skin lesion can be classified to different levels of accuracy. Examples of ground-truth labels include whether a skin lesion is malignant or benign. More distinct or detailed ground-truths can have more than two classifications. These more detailed classifications can include the lesion type or lesion categorisation. For example, a skin lesion can have the lesion type of melanoma, basal-cell-carcinoma (BCC), squamous cell carcinoma (SCC), melanocytic or vascular. In another example, a skin lesion can have a more specific lesion categorisation of SCC in situ, melanocytic dermal or nodular melanoma.
The trained AI network can assess the risk of a new or 'not seen before' skin lesion being malignant. The trained AI network can classify the new skin lesion based on multiple images of the new skin lesion of the same categories as the images on which the AI network was trained. The at least two images include a first image and a second image, the first image being different to the second image in at least one of modality, field of view, distance, lighting configuration or optical settings. Examples include a micro image and a macro image, polarised and non-polarised images, visible light and infra-red images, images from different time points (e.g. weeks apart), or combinations of the above. If the AI network is trained on pairs of micro and macro images of skin lesions, the AI network can classify a new skin lesion based on micro and macro images of the new skin lesion. In other words, the images contain non-identical information for the AI network to exploit.
Figure 2 shows micro image 202 and macro image 204 as inputs to AI network 200. This means that the AI network 200 was trained on pairs of micro and macro images of skin lesions. Figure 3 shows two micro images 302, 304 and macro image 308 as inputs to AI network 300. This means that the AI network 300 was trained on sets of more than two images (two micro images and one macro image) of skin lesions.
Training images and input images can also be polarised or non-polarised. Input and training images can be of different categories in more than one way. For example, the input images of AI network 200 are micro and macro images, but they can also be polarised or non-polarised. Using AI network 200 as an example, the first image 202 is also polarised and the second image 204 is also non-polarised. In a contrasting example, the first image 202 is instead non-polarised and the second image 204 is polarised. Training images and input images can also be acquired at different wavelengths. For example, visible light, infra-red or ultra-violet electromagnetic waves may be used to illuminate a skin lesion for an image capture device to obtain images at different wavelengths.
Further, since images 202 and 204 are already different, as one is a micro image while the other is a macro image, the first image 202 and the second image 204 can both be either polarised or non-polarised.
AI network 300 shows both the first image 302 and the second image 304 being micro images. In this example, one of the micro images 302, 304 may be polarised while the other is non-polarised. In another example, where the input images contain different information through different polarisation, both the first input image and the second input image may be polarised.
Polarised images can also be cross-polarised, where the light reflecting from the skin lesion is orthogonally polarised relative to the light received by a lens of an image capture device. Polarised images may also have varying degrees of polarisation. Further, input images may be captured at different times, which creates the difference in the information contained in the input images.
The outputs of the AI networks 200, 300, 400 may be processed or presented in different ways, based on the application and the user needs. In an example, the AI network provides a risk assessment that is associated with a likelihood that a skin lesion is malignant, or some level of lesion classification (i.e. into types, sub-types, etc.). The AI networks 200, 300, 400 may output a score between 0 and 1, for example, for each possible classification based on the ground truth labels used to train the AI networks. For example, a score between 0 and 1 may be assigned to whether a lesion is malignant and whether the lesion is benign. In another example, a score between 0 and 1 may be assigned to each classification category of lesion type. The classification category with the highest score out of all the possible classes or labels can indicate the diagnosis of a skin lesion. There are many other ways to interpret these scores for different clinical applications that would be considered common practice in the art.
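The score interpretation described above, one value per class with the highest-scoring class indicating the diagnosis, can be sketched as follows. The softmax normalisation and the class names are illustrative assumptions; the patent does not specify a particular output layer.

```python
import math

def softmax(raw_scores):
    """Normalise raw network outputs into scores between 0 and 1
    that sum to 1 across all classification categories."""
    exps = [math.exp(s) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_class(raw_scores, class_names):
    """Return the class with the highest normalised score and that score."""
    probs = softmax(raw_scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best], probs[best]

# Hypothetical raw outputs for a two-class (benign/malignant) network
classes = ["benign", "malignant"]
name, score = top_class([0.4, 1.6], classes)
```

The same `top_class` helper works unchanged for finer-grained label sets, such as lesion types, by passing a longer list of class names and scores.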
Given an image training dataset, the AI network may be trained on any computer. The AI network may be implemented on a computer that can receive images of skin lesions. The images may be obtained from an image capture device such as a dermatological camera or a normal camera. The images can be acquired at a location separate from where the AI network analyses the images.
Claims
1. A method for performing a risk assessment on a skin lesion, the method comprising: receiving at least two images of the skin lesion, the at least two images including a first image and a second image, the first image being different to the second image in at least one of modality, field of view, distance, lighting configuration, optical settings or time; providing the at least two images as input to an AI network with at least one fully connected layer, wherein the AI network is trained based on at least two different images of each of a plurality of skin lesions; and determining a risk assessment associated to the skin lesion based at least partly on at least one output of the AI network.
2. The method of claim 1, wherein the first image is a micro image and the second image is a macro image.
3. The method of claim 1 or 2, wherein the first image is a polarised image and the second image is a non-polarised image.
4. The method of claim 2, wherein the first image is a non-polarised image and the second image is a polarised image.
5. The method of claim 2, wherein both the first image and the second image are polarised.
6. The method of claim 2, wherein both the first image and the second image are non-polarised.
7. The method of claim 3, when dependent on claim 1, wherein both the first image and the second image are micro images.
8. The method of claim 3, when dependent on claim 1, wherein both the first image and the second image are macro images.
9. The method of one of claims 3 - 8, wherein the polarised images are cross-polarised.
10. The method of any one of the preceding claims, wherein the first image is taken at a time a day or more apart from a time that the second image is taken.
11. The method of any one of the preceding claims, wherein the at least two images are obtained at different wavelengths.
12. The method of any one of the preceding claims, wherein the risk assessment is associated to a likelihood that the skin lesion is malignant.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NZ782715 | 2021-11-25 | ||
NZ78271521 | 2021-11-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023096503A1 true WO2023096503A1 (en) | 2023-06-01 |
Family
ID=86540222
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/NZ2022/050156 WO2023096503A1 (en) | 2021-11-25 | 2022-11-25 | Methods and systems for automatic risk assessment of a skin lesion from multiple images |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023096503A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020173516A1 (en) * | 2019-02-28 | 2020-09-03 | FotoFinder Systems GmbH | Method for evaluating skin lesions using artificial intelligence |
US10878567B1 (en) * | 2019-09-18 | 2020-12-29 | Triage Technologies Inc. | System to collect and identify skin conditions from images and expert knowledge |
2022-11-25: WO PCT/NZ2022/050156 patent/WO2023096503A1/en unknown
Non-Patent Citations (4)
Title |
---|
HERNÁNDEZ PÉREZ CARLOS: "Skin lesion analysis on the use of contextual information for melanoma identification in dermoscopic images", THESIS, 13 July 2020 (2020-07-13), pages 1 - 53, XP093070711 * |
KASSEM MOHAMED A., HOSNY KHALID M., DAMAŠEVIČIUS ROBERTAS, ELTOUKHY MOHAMED MESELHY: "Machine Learning and Deep Learning Methods for Skin Lesion Classification and Diagnosis: A Systematic Review", DIAGNOSTICS, vol. 11, no. 8, 1 January 2021 (2021-01-01), pages 1 - 29, XP093070708, DOI: 10.3390/diagnostics11081390 * |
MOBINY ARYAN, SINGH ADITI, VAN NGUYEN HIEN: "Risk-Aware Machine Learning Classifier for Skin Lesion Diagnosis", JOURNAL OF CLINICAL MEDICINE, vol. 8, no. 8, 1 January 2019 (2019-01-01), pages 1 - 24, XP093070706, DOI: 10.3390/jcm8081241 * |
NARAYANAMURTHY VIGNESWARAN, PADMAPRIYA P., NOORASAFRIN A., POOJA B., HEMA K., FIRUS KHAN AL'AINA YUHAINIS, NITHYAKALYANI K., SAMSU: "Skin cancer detection using non-invasive techniques", RSC ADVANCES, vol. 8, no. 49, 6 August 2018 (2018-08-06), pages 28095 - 28130, XP093070709, DOI: 10.1039/C8RA04164D * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Gour et al. | Multi-class multi-label ophthalmological disease detection using transfer learning based convolutional neural network | |
Williams et al. | An artificial intelligence-based deep learning algorithm for the diagnosis of diabetic neuropathy using corneal confocal microscopy: a development and validation study | |
Zhang et al. | Automatic cataract detection and grading using deep convolutional neural network | |
Zhang et al. | Sparse representation classifier for microaneurysm detection and retinal blood vessel extraction | |
Akbar et al. | Decision support system for detection of hypertensive retinopathy using arteriovenous ratio | |
Zhang et al. | Detection of microaneurysms using multi-scale correlation coefficients | |
Dias et al. | Retinal image quality assessment using generic image quality indicators | |
Giancardo et al. | Microaneurysms detection with the radon cliff operator in retinal fundus images | |
KR20210101285A (en) | Machine Learning Systems and Methods for Assessment, Healing Prediction and Treatment of Wounds | |
US20200134820A1 (en) | Tumor boundary reconstruction using hyperspectral imaging | |
Koukiou et al. | Neural networks for identifying drunk persons using thermal infrared imagery | |
Jawahar et al. | Diabetic foot ulcer segmentation using color space models | |
WO2017020045A1 (en) | System and methods for malarial retinopathy screening | |
Aloupogianni et al. | Hyperspectral and multispectral image processing for gross-level tumor detection in skin lesions: a systematic review | |
JP6471559B2 (en) | Diagnostic device, image processing method, image processing system, and program for the diagnostic device | |
Chen et al. | Two-stage hemoglobin prediction based on prior causality | |
US20230255467A1 (en) | Diagnostic imaging device, diagnostic imaging method, diagnostic imaging program, and learned model | |
Mohan et al. | Exudate detection with improved u-net using fundus images | |
Kesarwani et al. | Non-invasive anaemia detection by examining palm pallor: A smartphone-based approach | |
Leopold et al. | Segmentation and feature extraction of retinal vascular morphology | |
Appiahene et al. | CP-AnemiC: A conjunctival pallor dataset and benchmark for anemia detection in children | |
WO2023096503A1 (en) | Methods and systems for automatic risk assessment of a skin lesion from multiple images | |
Li et al. | Computer-aided Diagnosis (CAD) for cervical cancer screening and diagnosis: a new system design in medical image processing | |
CA3147017C (en) | System and method for classifying dermatological images using machine learning | |
WO2023096971A1 (en) | Artificial intelligence-based hyperspectrally resolved detection of anomalous cells |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22899165 Country of ref document: EP Kind code of ref document: A1 |