CN115511861A - Identification method based on artificial neural network - Google Patents

Identification method based on artificial neural network

Info

Publication number
CN115511861A
CN115511861A
Authority
CN
China
Prior art keywords
neural network
artificial neural
examples
image
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211242487.5A
Other languages
Chinese (zh)
Inventor
彭璨 (Peng Can)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Siji Intelligent Control Technology Co ltd
Original Assignee
Shenzhen Siji Intelligent Control Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Siji Intelligent Control Technology Co., Ltd.
Publication of CN115511861A (legal status: pending)

Classifications

    • G06T 7/0012 — Image analysis; inspection of images, e.g. flaw detection; biomedical image inspection
    • G06V 10/454 — Extraction of image or video features; local feature extraction; biologically inspired filters integrated into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning; classification, e.g. of video objects
    • G06V 10/7753 — Generating sets of training patterns; incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06T 2207/20081 — Special algorithmic details; training; learning
    • G06T 2207/20084 — Special algorithmic details; artificial neural networks [ANN]
    • G06T 2207/30041 — Biomedical image processing; eye; retina; ophthalmic
    • G06T 2207/30096 — Biomedical image processing; tumor; lesion
    • G06V 2201/03 — Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The present disclosure describes an artificial neural network-based identification method, comprising: acquiring a tissue image; receiving the tissue image with an artificial neural network module and performing lesion recognition and training on it; recognizing an examination image on the basis of a feature map and an attention heat map to obtain a recognition result; combining the recognition result with an annotation image to obtain a first loss function when the attention mechanism is not used and a second loss function when the attention mechanism is used; obtaining from these a total loss function comprising a first loss term based on the first loss function and a second loss term based on the difference between the second loss function and the first loss function; and optimizing the artificial neural network module with the total loss function. The recognition rate of tissue lesions can thereby be effectively improved.

Description

Identification method based on artificial neural network
This application is a divisional application of the patent application filed on 27.11.2020 with application No. 202011364685X, entitled "Identification method and identification system for tissue lesion identification based on an artificial neural network".
Technical Field
The present disclosure relates generally to artificial neural network based identification methods.
Background
With the development and maturation of artificial intelligence technology, it has gradually spread into many aspects of the medical field; medical imaging in particular is a popular field of application. Medical imaging is a useful tool for diagnosing many diseases, but it generates a large amount of image data whose processing and interpretation cost physicians a great deal of time, and the accuracy of that interpretation is difficult to guarantee. Artificial intelligence technology is therefore mainly applied to medical images to recognize tissue lesions in the imaged tissue and thereby improve the accuracy of tissue lesion recognition.
Convolutional Neural Networks (CNNs) are currently the networks most commonly used when identifying medical images with artificial intelligence techniques. The convolution structure reduces the memory occupied by a deep network, and a CNN relies on three key operations: local receptive fields, weight sharing, and pooling layers. These effectively reduce the number of network parameters and relieve the overfitting problem of the convolutional neural network. The structure of a convolutional neural network is also well suited to the structure of medical images, from which features can be extracted and identified.
However, for some lesion sites, such as fundus lesions, the lesion areas are relatively small and irregularly distributed. A generic convolutional neural network using an attention mechanism tends to ignore lesion areas that receive low attention in the attention heat map, which leads to misjudgments, so the accuracy of tissue lesion recognition for such lesion areas is low.
Disclosure of Invention
The present disclosure has been made in view of the above-described state of the art, and an object of the present disclosure is to provide a tissue lesion recognition method and a tissue lesion recognition system using an artificial neural network, which can effectively improve the accuracy of tissue lesion recognition.
To this end, the present disclosure provides, in a first aspect, an identification method for tissue lesion identification based on an artificial neural network, including: acquiring a tissue image, wherein the tissue image is acquired by an acquisition device; and receiving the tissue image and performing lesion recognition on the tissue image by using an artificial neural network module, wherein the artificial neural network module comprises a first artificial neural network, a second artificial neural network and a third artificial neural network, the first artificial neural network is configured to perform feature extraction on the tissue image to obtain a feature map, the second artificial neural network is configured to obtain an attention heat map indicating a lesion region, and the third artificial neural network is configured to recognize the tissue image based on the feature map. The training of the artificial neural network module comprises: preparing a training data set comprising a plurality of examination images and annotation images associated with the examination images, the annotation images comprising an annotation result with or without a lesion; performing feature extraction on the examination images using the first artificial neural network to obtain a feature map; obtaining, using the second artificial neural network, an attention heat map indicating a lesion region and a complementary attention heat map indicating a non-lesion region, the examination image being composed of the lesion region and the non-lesion region; recognizing the examination image using the third artificial neural network based on the feature map to obtain a first recognition result, based on the feature map and the attention heat map to obtain a second recognition result, and based on the feature map and the complementary attention heat map to obtain a third recognition result; combining the first recognition result with the annotation image to obtain a first loss function when the attention mechanism is not used; combining the second recognition result with the annotation image to obtain a second loss function when the attention mechanism is used; combining the third recognition result with the annotation result without a lesion to obtain a third loss function when the complementary attention mechanism is used; obtaining a total loss function that includes a first loss term based on the first loss function, a second loss term based on the difference between the second loss function and the first loss function, and a loss term based on the third loss function; and optimizing the artificial neural network module by using the total loss function. In this case, the recognition result of the tissue lesion identification can be obtained by using the artificial neural network module, and the artificial neural network module can be optimized by using the total loss function, so that the accuracy of tissue lesion identification can be improved.
In addition, in the identification method of tissue lesion identification based on an artificial neural network according to the first aspect of the present disclosure, optionally, the total loss function further includes a total area term of the attention heat map, and the total area term is used for evaluating the area of the lesion region. In this case, the total area term can be used to estimate the area of the lesion region within the attention heat map and to control the number of pixels that strongly influence the recognition result, thereby limiting the network's attention to those pixels.
In addition, in the identification method of tissue lesion identification based on an artificial neural network according to the first aspect of the present disclosure, optionally, the total loss function further includes a regularization term for the attention heat map. In this case, overfitting of the artificial neural network module can be suppressed.
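As a rough illustration of how the loss terms described in the preceding paragraphs might be assembled, the following Python (PyTorch) sketch combines a first loss term, a second loss term based on the difference between the second and first loss functions, a loss term from the complementary-attention branch, a total-area term for the attention heat map, and a regularization term. The helper names and weighting factors (bce via binary cross-entropy, lambda_*, the L2 regularizer) are assumptions made for illustration, not the patent's prescribed formulation.

    import torch
    import torch.nn.functional as F

    def total_loss(pred_plain, pred_att, pred_comp,
                   label, no_lesion_label, att_map,
                   lambda_diff=1.0, lambda_comp=1.0,
                   lambda_area=0.01, lambda_reg=0.01):
        """Hypothetical composition of the total loss described in the text."""
        # First loss: recognition result without the attention mechanism vs. annotation.
        l1 = F.binary_cross_entropy(pred_plain, label)
        # Second loss: recognition result with the attention mechanism vs. annotation.
        l2 = F.binary_cross_entropy(pred_att, label)
        # Third loss: complementary-attention result vs. the "no lesion" annotation.
        l3 = F.binary_cross_entropy(pred_comp, no_lesion_label)
        # Total-area term: penalizes the number of high-attention pixels.
        area = att_map.mean()
        # Regularization term on the attention heat map (here a simple L2 penalty;
        # the disclosure elsewhere mentions total-variation regularization).
        reg = (att_map ** 2).mean()
        return l1 + lambda_diff * (l2 - l1) + lambda_comp * l3 \
               + lambda_area * area + lambda_reg * reg

Minimizing this total loss with a standard optimizer would correspond to the optimization step described above.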
In addition, in the identification method of tissue lesion identification based on artificial neural network according to the first aspect of the present disclosure, optionally, the first artificial neural network, the second artificial neural network, and the third artificial neural network are trained simultaneously. In this case, the training speed can be increased.
In addition, in the identification method of tissue lesion identification based on an artificial neural network according to the first aspect of the present disclosure, optionally, the third artificial neural network includes an input layer, an intermediate layer, and an output layer connected in sequence, and the output layer is configured to output an identification result reflecting the inspection image. In this case, the recognition result reflecting the tissue image can be output using the third artificial neural network.
In addition, in the identification method of tissue lesion identification based on an artificial neural network according to the first aspect of the present disclosure, optionally, the artificial neural network module is trained in a weakly supervised manner. In this case, a recognition result carrying a large amount of information can be obtained by the artificial neural network module from labeling results carrying only a small amount of information.
In addition, in the identification method based on tissue lesion identification of artificial neural network according to the first aspect of the present disclosure, optionally, the first loss function is used to evaluate a degree of inconsistency between the identification result and the labeling result of the examination image when the attention mechanism is not used. In this case, the accuracy of tissue lesion identification by the artificial neural network module when the attention mechanism is not used can be improved.
In addition, in the identification method of tissue lesion identification based on artificial neural network according to the first aspect of the present disclosure, optionally, the second loss function is used to evaluate a degree of inconsistency between the identification result and the labeling result of the examination image when the attention mechanism is used. In this case, the accuracy of tissue lesion identification by the artificial neural network module when using the attention mechanism can be improved.
In addition, in the identification method of tissue lesion identification based on artificial neural network according to the first aspect of the present disclosure, optionally, the third loss function is used to evaluate a degree of inconsistency between an identification result of the examination image when the complementary attention mechanism is used and a labeling result without a lesion. In this case, the accuracy of tissue lesion identification by the artificial neural network module when using the complementary attention mechanism can be improved.
In addition, in the identification method of tissue lesion identification based on artificial neural network according to the first aspect of the present disclosure, optionally, the artificial neural network module is optimized by using the total loss function to minimize the total loss function. In this case, the total loss function can be minimized to improve the accuracy of tissue lesion identification by the artificial neural network module.
In addition, in the identification method based on the tissue lesion recognition of the artificial neural network according to the first aspect of the present disclosure, optionally, the tissue lesion is a fundus lesion. In this case, the recognition result of the fundus image with respect to the fundus lesion can be obtained by the artificial neural network module.
The second aspect of the present disclosure provides an identification system for tissue lesion identification based on an artificial neural network, which is characterized in that the identification method provided by the first aspect of the present disclosure is used for tissue lesion identification. In this case, the tissue image can be subjected to tissue lesion recognition using a recognition system.
According to the present disclosure, it is possible to provide an identification method and an identification system for tissue lesion identification based on an artificial neural network, which can effectively improve the accuracy of identifying tissue lesions.
Drawings
Embodiments of the present disclosure will now be explained in further detail, by way of example only, with reference to the accompanying drawings, in which:
fig. 1 is a schematic diagram illustrating an electronic device to which examples of the present disclosure relate.
Fig. 2 is a diagram illustrating a tissue image according to an example of the present disclosure.
Fig. 3 is a block diagram illustrating an identification system for tissue lesion identification based on an artificial neural network according to an example of the present disclosure.
Fig. 4 is a block diagram illustrating one example of an artificial neural network module to which examples of the present disclosure relate.
Fig. 5 is a block diagram illustrating a variation of an artificial neural network module to which examples of the present disclosure relate.
Fig. 6 is a schematic diagram illustrating a structure of a first artificial neural network according to an example of the present disclosure.
Fig. 7 is a block diagram illustrating a training system for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
Fig. 8 is a flow chart illustrating a training method for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
Fig. 9 (a) is a schematic diagram showing an example of a fundus image obtained without using attention mechanism training according to an example of the present disclosure.
Fig. 9 (b) is a schematic diagram showing an example of a lesion region of a fundus image obtained using a complementary attention mechanism training according to an example of the present disclosure.
The main reference numbers: 1…electronic device; 10…processor; 20…memory; 30…computer program; 40…identification system; 410…acquisition module; 4200…backbone neural network; 420…artificial neural network module; 421…first artificial neural network; 422…second artificial neural network; 423…third artificial neural network; 424…feature combination module; 430…training system; 431…storage module; 432…processing module; 433…optimization module; C1…first convolutional layer; C2…second convolutional layer; C3…third convolutional layer; S1…first pooling layer; S2…second pooling layer; S3…third pooling layer.
Detailed Description
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the following description, the same components are denoted by the same reference numerals, and redundant description thereof is omitted. The drawings are schematic and the ratio of the dimensions of the components and the shapes of the components may be different from the actual ones.
Fig. 1 is a schematic diagram illustrating an electronic device according to an embodiment of the present disclosure.
As shown in fig. 1, the identification system 40 for tissue lesion identification based on artificial neural network according to the present disclosure may be carried by an electronic device 1 (e.g., a computer). In some examples, the electronic device 1 may include one or more processors 10, a memory 20, and a computer program 30 disposed in the memory 20. The one or more processors 10 may include, among other things, a central processing unit, a graphics processing unit, and any other electronic components capable of processing data. For example, the processor 10 may execute instructions stored on the memory 20.
In some examples, memory 20 may be a computer-readable medium that can be used to carry or store data. In some examples, the Memory 20 may include, but is not limited to, a non-volatile Memory or a Flash Memory (Flash Memory), or the like. In some examples, the memory 20 may also be, for example, a ferroelectric random access memory (FeRAM), a Magnetic Random Access Memory (MRAM), a phase change random access memory (PRAM), or a Resistive Random Access Memory (RRAM). This can reduce the possibility of data loss due to sudden power outage.
In other examples, the memory 20 may be another type of readable storage medium, such as Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), or Compact Disc Read-Only Memory (CD-ROM).
In some examples, the memory 20 may be an optical disk memory, a magnetic disk memory, or a tape memory. Thus, the appropriate memory 20 can be selected in accordance with different situations.
In some examples, computer program 30 may include instructions for execution by one or more processors 10, which may cause recognition system 40 to perform tissue lesion recognition on a tissue image. In some examples, the computer program 30 may be deployed in a local computer or may be deployed in a cloud server.
In some examples, computer program 30 may be stored in a computer readable medium. The computer-readable storage medium may include one or more of a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, and a magnetic storage device.
Fig. 2 is a diagram illustrating a tissue image according to an example of the present disclosure. Fig. 3 is a block diagram illustrating an identification system 40 for tissue lesion identification based on an artificial neural network according to an example of the present disclosure.
In some examples, tissue lesion recognition of the tissue image may be implemented with the artificial neural network-based recognition system 40 to obtain a recognition result. In some examples, the identification system 40 for tissue lesion identification may also simply be referred to as the identification system 40.
In some examples, as shown in fig. 3, recognition system 40 may include an acquisition module 410, an artificial neural network module 420, and a training system 430 for artificial neural network-based tissue lesion recognition. In some examples, the acquisition module 410 may be used to acquire tissue images. In some examples, the artificial neural network module 420 may be configured to perform feature extraction, tissue lesion recognition, and the like on the tissue image and obtain a recognition result of the tissue lesion recognition. In some examples, the training system 430 may be used to train the artificial neural network module 420: it takes the first, second, and third recognition results produced by the artificial neural network module 420, builds a total loss function from them, and optimizes the artificial neural network module 420 with that total loss function, thereby improving the accuracy of tissue lesion recognition of the artificial neural network module 420.
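Purely as an illustration of how these modules could be wired together, the Python sketch below composes an acquisition module, an artificial neural network module, and a training system; all class and method names are hypothetical and not part of the disclosure.

    class RecognitionSystem:
        """Hypothetical wiring of the modules described above (names are illustrative)."""
        def __init__(self, acquisition_module, ann_module, training_system):
            self.acquisition_module = acquisition_module   # acquires tissue images
            self.ann_module = ann_module                   # feature extraction + lesion recognition
            self.training_system = training_system         # optimizes ann_module with the total loss

        def recognize(self, source):
            image = self.acquisition_module.acquire(source)
            return self.ann_module.predict(image)

        def train(self, dataset):
            # The training system uses the first/second/third recognition results
            # produced by the ANN module to build the total loss and optimize it.
            self.training_system.fit(self.ann_module, dataset)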
In some examples, training system 430 for tissue lesion recognition based on an artificial neural network may also be referred to as training system 430.
In some examples, the recognition system 40 can also include a preprocessing module and a determination module (not shown).
In some examples, the tissue image may be an image from a tissue cavity taken by a CT scan, PET-CT scan, SPECT scan, MRI, ultrasound, X-ray, mammogram, angiogram, fluoroscope, capsule endoscope, or a combination thereof. In some examples, the tissue image may be acquired by acquisition module 410.
In some examples, the acquisition module 410 may be configured to acquire tissue images, which may be tissue images acquired by an acquisition device such as a camera, an ultrasound imager, or an X-ray scanner.
In some examples, the tissue image may be, for example, a fundus image, an esophagus image, a stomach image, a large intestine image, a colon image, or a small intestine image. As shown in fig. 2, the tissue image may be a fundus image. In this case, fundus lesion recognition can be performed on the fundus image by the recognition system 40.
In some examples, the tissue lesion identification may be to identify a tissue lesion of the tissue image to obtain an identification result.
In some examples, where the tissue image is a fundus image, the tissue lesion may be a fundus lesion. In this case, the recognition result of the fundus image with respect to the fundus lesion can be obtained by the artificial neural network module 420.
In some examples, the tissue image may be comprised of a lesion region and a non-lesion region.
In some examples, tissue images (color images) with tissue lesions typically contain distinctive features such as erythema or redness, so these features can be automatically extracted and identified using a trained artificial neural network to help identify possible lesions. This improves the accuracy and speed of recognition and avoids the large errors and long turnaround times of physicians reading the images one by one from their own experience.
In some examples, where the tissue image is a fundus image, the tissue image may be classified by function. For example, in the training step, the tissue image may be an examination image, an annotation image (described later).
In some examples, the image input to the artificial neural network module 420 may be a tissue image. In this case, tissue lesion recognition can be performed on the tissue image through the artificial neural network module 420.
In some examples, identification system 40 may be used for tissue lesion identification on tissue images. In some examples, the tissue image may be pre-processed, feature extracted, and tissue lesion identified after entering the identification system 40.
In some examples, the recognition system 40 may also include a preprocessing module and a determination module. The pre-processing module may be used to pre-process the tissue image and input the pre-processed tissue image into the artificial neural network module 420.
In some examples, the pre-processing module may pre-process the tissue image. In some examples, the pre-processing may include at least one of region of interest detection, image cropping, resizing, and normalization. In this case, the tissue lesion recognition and judgment of the tissue image by the subsequent artificial neural network module 420 can be facilitated. In some examples, the tissue image may be, for example, a fundus image, an esophagus image, a stomach image, a large intestine image, a colon image, or a small intestine image.
In some examples, the pre-processing module may include a region detection unit, an adjustment unit, and a normalization unit.
In some examples, the region detection unit may detect a region of interest from the tissue image. For example, if the tissue image is a fundus image, a fundus region centered on the optic disk, or a fundus region centered on the macula and including the optic disk, may be detected from the fundus image. In some examples, the region detection unit may detect the region of interest in the tissue image by, for example, a sampling thresholding method or a Hough transform.
In some examples, the adjustment unit may be used to crop and resize the tissue image. Due to different apparatuses for acquiring tissue images or different shooting conditions, the obtained tissue images may differ in resolution, size, and the like. In this case, the tissue images may be cropped and resized to reduce the difference. In some examples, the tissue image may be cropped to a particular shape. In some examples, the particular shape may include, but is not limited to, a square, rectangle, circle, or oval, etc.
In other examples, the size of the tissue image may be adjusted to a prescribed size by the adjusting unit. For example, the specified size may be 256 × 256, 512 × 512, 1024 × 1024, or the like. Examples of the disclosure are not limited thereto, and in other examples, the size of the tissue image may be any other specification size. For example, the size of the tissue image may be 128 × 128, 768 × 768, 2048 × 2048, or the like.
In some examples, the pre-processing module may include a normalization unit. The normalization unit may be configured to perform normalization processing on a plurality of tissue images.
In some examples, the normalization method of the normalization unit is not particularly limited, and may, for example, use zero mean and unit standard deviation. Additionally, in some examples, normalization may map pixel values into the range [0, 1]. In this case, normalization can overcome differences between different tissue images.
In some examples, normalization includes normalization of image format, image slice spacing, image intensity, image contrast, and image orientation. In some examples, the tissue images may be normalized to the DICOM format, the NIfTI format, or a raw binary format.
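A minimal Python preprocessing sketch along the lines described above (region-of-interest detection is omitted; the target size, the use of PIL/NumPy, and the choice between z-score and [0, 1] scaling are assumptions):

    import numpy as np
    from PIL import Image

    def preprocess(path, size=(512, 512), mode="zscore"):
        """Resize a tissue image to a prescribed size and normalize its pixel values."""
        img = Image.open(path).convert("RGB").resize(size)   # e.g. 512 x 512, as in the text
        arr = np.asarray(img, dtype=np.float32)
        if mode == "zscore":
            # zero mean / unit standard deviation
            arr = (arr - arr.mean()) / (arr.std() + 1e-8)
        else:
            # scale pixel values into [0, 1]
            arr = arr / 255.0
        return arr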
Figure 4 is a block diagram illustrating one example of an artificial neural network module to which examples of the present disclosure relate.
As described above, the recognition system 40 may include an artificial neural network module 420. In some examples, the artificial neural network module 420 may be used to perform tissue lesion identification on the tissue image. In some examples, the artificial neural network module 420 may include a plurality of artificial neural networks. In some examples, the artificial neural network may be trained using one or more processors 10. In general, an artificial neural network may include artificial neurons or nodes that may be used to receive an image of tissue and perform operations on the image of tissue based on weights, and then selectively pass the results of the operations on to other neurons or nodes. Where weights may be associated with artificial neurons or nodes while constraining the output of other artificial neurons. The weights (i.e., network parameters) may be determined by iteratively training the artificial neural network through a training data set (described later).
In some examples, as shown in fig. 4, the artificial neural network module 420 may include a backbone neural network 4200 and a second artificial neural network 422.
In some examples, the backbone neural network 4200 may include a first artificial neural network 421, a third artificial neural network 423, and a feature combination module 424.
In some examples, the first artificial neural network 421 may receive the tissue image and perform feature extraction on the tissue image to obtain a feature map.
In some examples, the second artificial neural network 422 may receive the feature map and the recognition result from the third artificial neural network 423 and obtain a heat of attention map indicative of a diseased region and a complementary heat of attention map indicative of a non-diseased region. It is noted that in other examples, the above-described attention heat map or complementary attention heat map may also be considered a feature map.
In some examples, feature combination module 424 may receive the feature map, the attention heat map, and the complementary attention heat map and output a set of feature combinations. In some examples, the feature combination module 424 may also output the feature map directly.
In some examples, the third artificial neural network 423 may receive the feature map or feature combination set and output an identification result of tissue lesion identification of the tissue image.
In some examples, the tissue image (e.g., the pre-processed tissue image) input to the artificial neural network module 420 may enter the first artificial neural network 421, and the recognition result may be finally output by the third artificial neural network 423.
Fig. 5 is a block diagram illustrating a variation of an artificial neural network module to which examples of the present disclosure relate.
Additionally, in some examples, as shown in fig. 5, artificial neural network module 420 may include a backbone neural network 4200 and a second artificial neural network 422.
In some examples, as shown in fig. 5, the backbone neural network 4200 may include a first artificial neural network 421 and a third artificial neural network 423.
In some examples, the third artificial neural network 423 may have a feature combination function. For details, see the description associated with feature combination module 424.
In some examples, the first artificial neural network 421 may receive the tissue image and perform feature extraction on the tissue image to obtain a feature map.
In some examples, the second artificial neural network 422 may also obtain a complementary attention heat map indicating non-diseased regions from the attention heat map.
In some examples, the third artificial neural network 423 may receive the feature map, the attention heat map, and the complementary attention heat map and output a recognition result of tissue lesion recognition of the tissue image. In some examples, the attention heat map may be a heat map indicative of the lesion area obtained based on an attention mechanism. In some examples, the attention heat map may show the importance of various pixel points in the tissue image when forming the feature map.
In some examples, the complementary attention heat map may be a heat map indicative of non-diseased regions obtained based on a complementary attention mechanism.
In some examples, the complementary attention heat map may be a complementary image of the attention heat map. In some examples, the size and format of the complementary attention heat map may be the same as the size and format of the attention heat map.
As described above, the artificial neural network module 420 may include the first artificial neural network 421 (see fig. 5).
In some examples, the first artificial neural network 421 may use one or more deep neural networks to automatically identify features in the tissue image.
In some examples, the first artificial neural network 421 may be used to receive the tissue image pre-processed by the pre-processing module and generate one or more feature maps. In some examples, the first artificial neural network 421 may be constructed by, for example, combining multiple layers of low-level features (pixel-level features). In this case, an abstract description of the tissue image can be realized.
In some examples, the first artificial neural network 421 may include an input layer, an intermediate layer, and an output layer connected in series. The input layer may be configured to receive an image of the tissue pre-processed by the pre-processing module. The middle layer is configured to be used for extracting a feature map based on the tissue image, and the output layer is configured to be used for outputting the feature map.
In some examples, the tissue image input to the artificial neural network module 420 may be a matrix of pixels, for example, a matrix of pixels that may be three-dimensional. The length and width of the three-dimensional matrix may represent the size of the image and the depth of the three-dimensional matrix represents the color channels of the image. In some examples, the depth may be 1 (i.e., the tissue image is a grayscale image), and in some examples, the depth may be 3 (i.e., the tissue image is a color image in RGB color mode).
In some examples, the first artificial neural network 421 may employ a convolutional neural network. Because the convolutional neural network has the advantages of local receptive field, weight sharing and the like, the training of parameters can be greatly reduced, the processing speed can be improved, and the hardware cost can be saved. In addition, the convolutional neural network can more effectively identify the tissue image.
Fig. 6 is a schematic diagram illustrating a structure of a first artificial neural network 421 according to an example of the present disclosure.
In some examples, the first artificial neural network 421 may have a plurality of intermediate layers, and each intermediate layer may include a plurality of neurons or nodes; each neuron or node applies an activation function (e.g., a ReLU (rectified linear unit), sigmoid, or tanh function) to its output. The activations produced by different neurons thus affect the outputs passed on to other neurons.
In some examples, as shown in fig. 6, the middle layer of the first artificial neural network 421 may include multiple convolutional layers and multiple pooling layers. In some examples, the convolutional layers and the pooling layers may be combined alternately. In some examples, the tissue image may pass sequentially through the first convolutional layer C1, the first pooling layer S1, the second convolutional layer C2, the second pooling layer S2, the third convolutional layer C3, and the third pooling layer S3. In this case, the convolution processing and the pooling processing can be alternately performed on the tissue image.
In other examples, the first artificial neural network 421 may not include a pooling layer, thereby being able to avoid losing data during pooling and being able to simplify the network structure.
In some examples, the convolutional layer may utilize a convolutional core to convolve the tissue image in a convolutional neural network. In this case, features with higher abstraction can be obtained to make the matrix depth deeper.
In some examples, the convolution kernel size may be 3 × 3. In other examples, the convolution kernel size may be 5 × 5. In some examples, a 5 × 5 convolution kernel may be used for the first convolutional layer C1 and 3 × 3 kernels for the other convolutional layers. In this case, training efficiency can be improved. In some examples, the size of the convolution kernel may be set to any size, chosen according to the image size and the computational cost.
In some examples, the pooling layer may also be referred to as a downsampling layer. In some examples, the input tissue image may be processed using pooling approaches such as max-pooling, mean-pooling, or random-pooling. Under the condition, through the pooling operation, on one hand, the feature dimensionality can be reduced, and the operation efficiency is improved, and on the other hand, the convolutional neural network can extract more abstract high-level features, so that the accuracy of tissue lesion identification is improved.
In addition, in some examples, in the convolutional neural network, the number of convolutional layers and pooling layers may be increased correspondingly according to circumstances. In this case, the convolutional neural network can also be made to extract higher-level features more abstract, so as to further improve the accuracy of tissue lesion identification.
In some examples, after the pre-processed tissue image passes through the first artificial neural network 421, a feature map corresponding to the tissue image may be output. In some examples, the feature map may have multiple depths. In some examples, after the pre-processed tissue image passes through the first artificial neural network 421, a plurality of feature maps may be output. In some examples, multiple feature maps may each correspond to a feature. In some examples, tissue lesion recognition may be performed on the tissue image based on features corresponding to the feature map.
In some examples, the feature map may be sequentially deconvoluted and upsampled before the first artificial neural network 421 outputs the feature map. In some examples, the feature map may undergo multiple deconvolution and upsampling processes. For example, the feature map may sequentially pass through a first deconvolution layer, a first upsampling layer, a second deconvolution layer, a second upsampling layer, a third deconvolution layer, and a third upsampling layer. In this case, it is possible to change the size of the feature map and retain data information of a part of the tissue image.
In some examples, the number of deconvolution layers may be the same as the number of convolution layers, and the number of pooling layers (downsampling layers) may be the same as the number of upsampling layers. This makes it possible to make the size of the feature map the same as that of the tissue image.
In some examples, tissue images processed by the convolutional layer (pooling layer) may be selected for convolution before the feature map is passed through the deconvolution layer (upsampling). For example, before the feature map enters the second deconvolution layer (second upsampling layer), the feature map may be convolved with the output image of the second convolution layer C2 (second pooling layer S2). Before the feature map enters the third deconvolution layer (third upsampling layer), the feature map may be convolved with the output image of the first convolution layer C1 (first pooling layer S1). In this case, data information lost when passing through the pooling layer or the convolutional layer can be supplemented.
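The structure described for the first artificial neural network 421 (alternating convolution and pooling, followed by deconvolution and upsampling back to the input size, reusing the outputs of earlier layers) could be sketched roughly as follows in PyTorch. The channel counts, the use of bilinear upsampling in place of explicit deconvolution layers, and the additive skip connections are assumptions made for illustration, not values given in the disclosure.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FeatureExtractor(nn.Module):
        """Rough sketch of the first artificial neural network 421 (channel sizes assumed)."""
        def __init__(self, in_ch=3, ch=16):
            super().__init__()
            self.c1 = nn.Conv2d(in_ch, ch, kernel_size=5, padding=2)        # C1 (5 x 5)
            self.c2 = nn.Conv2d(ch, ch * 2, kernel_size=3, padding=1)       # C2 (3 x 3)
            self.c3 = nn.Conv2d(ch * 2, ch * 4, kernel_size=3, padding=1)   # C3 (3 x 3)
            self.pool = nn.MaxPool2d(2)                                     # pooling layers S1-S3
            self.d3 = nn.Conv2d(ch * 4, ch * 2, kernel_size=3, padding=1)   # stand-ins for the
            self.d2 = nn.Conv2d(ch * 2, ch, kernel_size=3, padding=1)       # deconvolution stages
            self.d1 = nn.Conv2d(ch, ch, kernel_size=3, padding=1)

        def forward(self, x):
            f1 = self.pool(F.relu(self.c1(x)))    # C1 -> S1
            f2 = self.pool(F.relu(self.c2(f1)))   # C2 -> S2
            f3 = self.pool(F.relu(self.c3(f2)))   # C3 -> S3
            # Upsample back and reuse the earlier outputs (skip connections), so that
            # information lost in pooling/convolution can be partially recovered.
            u3 = F.interpolate(F.relu(self.d3(f3)), size=f2.shape[-2:],
                               mode="bilinear", align_corners=False) + f2
            u2 = F.interpolate(F.relu(self.d2(u3)), size=f1.shape[-2:],
                               mode="bilinear", align_corners=False) + f1
            u1 = F.interpolate(F.relu(self.d1(u2)), size=x.shape[-2:],
                               mode="bilinear", align_corners=False)
            return u1   # feature map with the same spatial size as the input tissue image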
In some examples, after the feature map is generated by the first artificial neural network 421, an attention heat map matching the feature map may be generated by the second artificial neural network 422.
In this embodiment, the second artificial neural network 422 is an artificial neural network with attention mechanism. In some examples, the output image of the second artificial neural network 422 may include an attention heat map and a complementary attention heat map.
In some examples, the second artificial neural network 422 may include an input layer, an intermediate layer, and an output layer connected in series. The input layer is configured to receive the feature map together with partial weights of, or tissue lesion recognition results from, the third artificial neural network 423. The intermediate layer may be configured to obtain feature weights based on those partial weights or recognition results, and to generate an attention heat map and/or a complementary attention heat map based on the feature map and the feature weights. The output layer is configured to output the attention heat map and/or the complementary attention heat map. In some examples, the feature map may be generated by the first artificial neural network 421.
In some examples, the attention mechanism may be to selectively screen out and focus on a small amount of important information from a large amount of information of the input feature map.
In some examples, the attention heat map may be an image that represents attention in the form of a heat map. In general, pixels of corresponding positions in the attention heat map, which appear in red or white, have a large influence on tissue lesion recognition in the tissue image. Pixels in corresponding positions in the attention heat map that appear blue or black have less effect on tissue lesion identification in the tissue image.
In some examples, the individual feature maps may be weighted with feature weights and the attention heat map is derived. In some examples, the feature weights may be obtained through an attention mechanism. In some examples, the attention mechanism may include, but is not limited to, a Channel Attention Mechanism (CAM), a gradient-based channel attention mechanism (Grad-CAM), a gradient-based enhanced channel attention mechanism (Grad-CAM + +), or a Spatial Attention Mechanism (SAM), among others.
In some examples, the third artificial neural network 423 has a global pooling layer and a fully connected layer. In some examples, the feature weights may be the weights in the fully connected layer of the third artificial neural network 423 that lead to its output layer. For example, in the case where the tissue image is a fundus image, the third artificial neural network 423 may receive a feature map of the fundus image and obtain a first recognition result (described later). If the first recognition result is "macular degeneration", the weights in the fully connected layer leading from each neuron or node of the global pooling layer to the "macular degeneration" output are extracted as the feature weights.
In some examples, the feature weights may be calculated based on the tissue lesion recognition results in the third artificial neural network 423. In some examples, the partial derivatives of the first recognition result (e.g., the probability of tissue lesion) of the third artificial neural network 423 for all pixels in a feature map may be calculated, and the partial derivatives of all pixels of the feature map may be globally pooled to obtain the feature weight corresponding to the feature map.
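A gradient-based feature-weight computation of the kind described in the two preceding paragraphs might look like the following sketch, a simplified Grad-CAM-style calculation in which the partial derivatives of the lesion probability with respect to the feature map are globally pooled per channel and the resulting weights are used to form the attention heat map. The function and tensor names are illustrative.

    import torch

    def attention_weights(feature_map, lesion_prob):
        """
        feature_map: tensor of shape (C, H, W) from the first network, tracked in the
                     autograd graph.
        lesion_prob: scalar tensor, e.g. the predicted lesion probability from the
                     third network.
        Returns one weight per feature channel (global pooling of the partial derivatives).
        """
        grads = torch.autograd.grad(lesion_prob, feature_map, retain_graph=True)[0]
        return grads.mean(dim=(1, 2))   # global average pooling over H and W

    def attention_heat_map(feature_map, weights):
        # Weight each channel of the feature map, sum over channels, then rescale to [0, 1].
        cam = torch.relu((weights[:, None, None] * feature_map).sum(dim=0))
        return cam / (cam.max() + 1e-8)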
In some examples, a heat of attention map that matches the signature map may be generated by the second artificial neural network 422. In some examples, a complementary attention heat map may be generated by the second artificial neural network 422. In some examples, two pixel values corresponding to pixels having the same location in the attention heat map and the complementary attention heat map are inversely related. In some examples, the attention heat map and/or the complementary attention heat map may be normalized. In some examples, the sum or product of two pixel values corresponding to pixels with the same position of the attention heat map and the complementary attention heat map is a constant value.
In some examples, the attention heat map and/or the complementary attention heat map may be regularized with a total variation.
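One simple way to realize the complementary attention heat map and the total-variation regularization mentioned above is sketched below, assuming the attention heat map has been normalized to [0, 1] so that the pixel-wise sum of the two maps is a constant.

    import torch

    def complementary_map(att_map):
        # Pixel values of the complementary map are inversely related to those of the
        # attention map; with values in [0, 1] the pixel-wise sum is constant (1).
        return 1.0 - att_map

    def total_variation(att_map):
        # Total-variation regularization of an (H, W) heat map: mean absolute
        # difference between neighbouring pixels, vertically and horizontally.
        dh = (att_map[1:, :] - att_map[:-1, :]).abs().mean()
        dw = (att_map[:, 1:] - att_map[:, :-1]).abs().mean()
        return dh + dw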
In some examples, a feature combination module 424 may be connected at the output layer of the first artificial neural network 421 and the second artificial neural network 422.
In some examples, feature combining module 424 may have an input layer and an output layer, and in some examples, the output layer of feature combining module 424 may be a feature map or a set of feature combinations. In some examples, an input layer of the feature combination module 424 may receive a feature map, a heat of attention map, or a complementary heat of attention map.
In some examples, the feature combination module 424 may feature combine the feature map output by the first artificial neural network 421 and the attention heat map or complementary attention heat map output by the second artificial neural network 422 to form a feature combination set.
In some examples, the set of feature combinations may include at least one of a first set of feature combinations and a second set of feature combinations.
In some examples, the feature combination module 424 may feature combine the feature map output by the first artificial neural network 421 and the attention heat map output by the second artificial neural network 422 to form a first set of feature combinations.
In some examples, the feature combination module 424 may feature combine the feature map output by the first artificial neural network 421 and the complementary attention heat map output by the second artificial neural network 422 to form a second set of feature combinations.
In some examples, feature combining module 424 may output the feature map directly.
In some examples, the feature combination module 424 may also calculate the difference of the feature map and the attention heat map to obtain a first set of feature combinations.
In some examples, the feature combination module 424 may also calculate the difference of the feature map and the complementary attention heat map to obtain a second set of feature combinations.
In some examples, the feature combination module 424 may also calculate a convolution of the feature map and the attention heat map to obtain a first set of feature combinations.
In some examples, the feature combination module 424 may also calculate a convolution of the feature map and the complementary attention heat map to obtain a second set of feature combinations.
In some examples, the feature combination module 424 may also calculate a mean of the feature map and the attention heat map to obtain a first set of feature combinations.
In some examples, the feature combination module 424 may also calculate a mean of the feature map and the complementary attention heat map to obtain a second set of feature combinations.
Further, in other examples, feature combination module 424 may transform the feature map and the attention heat map linearly or non-linearly to obtain a first set of feature combinations.
Further, in other examples, feature combination module 424 may transform the feature map and the complementary attention heat map linearly or non-linearly to obtain a second set of feature combinations.
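The feature combinations described above admit several concrete forms; the sketch below shows a few of them (an element-wise product as the default, plus the difference and mean variants mentioned in the text). The broadcasting of a single-channel heat map over all feature channels is an assumption for illustration.

    import torch

    def combine(feature_map, heat_map, mode="product"):
        """
        feature_map: (C, H, W); heat_map: (H, W) attention or complementary attention map.
        Returns a feature combination set with the same shape as feature_map.
        """
        h = heat_map.unsqueeze(0)            # broadcast over the channel dimension
        if mode == "product":
            return feature_map * h           # weight each pixel by its attention value
        if mode == "difference":
            return feature_map - h           # difference variant mentioned in the text
        if mode == "mean":
            return (feature_map + h) / 2.0   # mean variant mentioned in the text
        raise ValueError(mode)

Applying combine with the attention heat map yields the first feature combination set, and applying it with the complementary attention heat map yields the second.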
In some examples, an output layer of feature combination module 424 may output the feature map, the first set of feature combinations, and the second set of feature combinations. In some examples, the feature map, the first feature combination set, and the second feature combination set output by the feature combination module 424 may be input to the third artificial neural network 423 and tissue lesion recognition performed by the third artificial neural network 423.
In some examples, the feature combination module 424 may be incorporated into the third artificial neural network 423 as part of the third artificial neural network 423. In this case, the artificial neural network module 420 may include a first artificial neural network 421, a second artificial neural network 422, and a third artificial neural network 423.
In some examples, where the feature combination module 424 is incorporated into the third artificial neural network 423, the input layer of the third artificial neural network 423 may receive a feature map, a heat of attention map, or a complementary heat of attention map.
In some examples, the third artificial neural network 423 may include an input layer, an intermediate layer, and an output layer connected in sequence. In some examples, the output layer is configured to be operable to output a recognition result reflecting the tissue image. In this case, the recognition result reflecting the tissue image can be output using the third artificial neural network 423. In some examples, the output layer of the third artificial neural network 423 may include a Softmax layer. In some examples, the middle layer of the third artificial neural network 423 may be a fully connected layer.
In some examples, the final classification may be performed by the fully connected layer and the probability that the tissue image belongs to the category of the respective tissue lesion may be finally obtained through the Softmax layer. In this case, the recognition result of the tissue lesion recognition of the tissue image can be obtained based on the probability.
In some examples, the third artificial neural network 423 may include various linear classifiers, such as a single layer of fully connected layers.
In some examples, the third artificial neural network 423 may include various non-linear classifiers. Such as Logistic Regression (Logistic Regression), random Forest (Random Forest), support Vector Machines (Support Vector Machines), etc.
In some examples, the third artificial neural network 423 may include a plurality of classifiers. In some examples, the classifier may give an identification result of tissue lesion identification of the tissue image. For example, in the case where the tissue image is a fundus image, a recognition result of fundus lesion recognition of the fundus image may be given. In this case, fundus lesion recognition can be performed on the fundus image.
In some examples, the output of the third artificial neural network 423 may be values between 0 and 1, which may be used to represent the probability that the tissue image belongs to each category of tissue lesion.
In some examples, the category with the highest probability is taken as the recognition result of tissue lesion recognition of the tissue image. For example, if among all categories the highest probability is that of "no lesion", the recognition result for the tissue image is determined to be lesion-free. Likewise, in fundus lesion recognition, if the prediction probabilities output by the third artificial neural network 423 for macular degeneration and for the non-lesion state are 0.8 and 0.2, respectively, the fundus image can be considered to show macular degeneration.
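Concretely, the mapping from the classifier output to a recognition result could look like this sketch; the category names follow the fundus example above and are purely illustrative.

    import torch

    def recognition_result(logits, categories=("no lesion", "macular degeneration")):
        # Softmax turns the classifier outputs into per-category probabilities in [0, 1];
        # the highest-probability category is taken as the recognition result.
        probs = torch.softmax(logits, dim=-1)
        return categories[int(probs.argmax())], probs

    # Example: probabilities of roughly 0.2 / 0.8 -> "macular degeneration"
    result, probs = recognition_result(torch.tensor([0.0, 1.3862944]))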
In some examples, the third artificial neural network 423 may output recognition results that match the tissue image. In some examples, the recognition results may include a first recognition result when the attentional mechanism is not used, a second recognition result when the attentional mechanism is used, and a third recognition result when the attentional mechanism and the complementary attentional mechanism are used.
In some examples, the third artificial neural network 423 may perform tissue lesion recognition on the feature map output by the feature combination module 424 and obtain a first recognition result.
In some examples, the third artificial neural network 423 may perform tissue lesion recognition on the first feature combination set output by the feature combination module 424 and obtain a second recognition result.
In some examples, the third artificial neural network 423 may perform tissue lesion recognition on the second feature combination set output by the feature combination module 424 and obtain a third recognition result.
In some examples, the recognition result may include a lesion result or a non-lesion result. In some examples, the recognition result may also be no lesion or a specific type of lesion. For example, in the case where the tissue image is a fundus image, the recognition result may include, but is not limited to, one of no lesion, hypertensive retinopathy, or diabetic retinopathy. In this case, a recognition result of fundus lesion recognition of the fundus image can be obtained. In some examples, one tissue image may have more than one recognition result. For example, the recognition result may include both hypertensive retinopathy and diabetic retinopathy.
In some examples, the recognition system 40 may also include a determination module.
In some examples, the determination module may receive an output of the artificial neural network module 420. In this case, the output results of the artificial neural network module 420 can be integrated by the determination module and the final recognition result can be output, so that an integrated report can be generated.
In some examples, the first recognition result may be used as a final recognition result of the tissue image. In this case, when the tissue lesion recognition is performed on the tissue image by using the artificial neural network module 420, the tissue lesion recognition may be performed on the tissue image through the main neural network 4200 including the first artificial neural network 421 and the third artificial neural network 423, thereby increasing the recognition speed.
In some examples, the second recognition result may be used as a final recognition result of the tissue image.
As described above, the third recognition result may be acquired based on the complementary attention mechanism. In some examples, a final recognition result of the tissue image may be obtained based on the first recognition result, the second recognition result, and the third recognition result. For example, in some examples, the final recognition result may include the second recognition result and the third recognition result. In some examples, the final recognition result may include the first recognition result and the third recognition result.
In some examples, the summary report generated by the determination module may include at least one of the first recognition result, the second recognition result, the third recognition result, and the final recognition result. In some examples, the determination module may color code the tissue image based on the attention heat map to generate a lesion indication map to indicate the lesion region. The summary report generated by the determination module may include the lesion indication map.
In some examples, the summary report generated by the determination module may include a location of the respective lesion and mark the location with a marking box.
In some examples, the aggregated report generated by the determination module may display the lesion region of the tissue image in a heat map. Specifically, in the heat map, the region with a high probability of being diseased may appear red or white, and the region with a low probability of being diseased may appear blue or black. In this case, the lesion area can be indicated in an intuitive manner.
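A possible way to render such a heat-map display is sketched below. The use of OpenCV, the JET colormap, and the blending weights are illustrative assumptions and are not prescribed by the present disclosure:

```python
import cv2
import numpy as np

def lesion_indication_map(tissue_image_bgr: np.ndarray, attention_heat_map: np.ndarray) -> np.ndarray:
    """Color code the tissue image with the attention heat map (illustrative sketch)."""
    heat_uint8 = np.uint8(255 * np.clip(attention_heat_map, 0.0, 1.0))
    # With the JET colormap, high-probability regions appear red and
    # low-probability regions appear blue.
    colored = cv2.applyColorMap(heat_uint8, cv2.COLORMAP_JET)
    colored = cv2.resize(colored, (tissue_image_bgr.shape[1], tissue_image_bgr.shape[0]))
    # Blend the colormap with the original tissue image (assumed 0.5/0.5 weights).
    return cv2.addWeighted(tissue_image_bgr, 0.5, colored, 0.5, 0)
```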
In some examples, the determination module may also be used for framing of the lesion region. In some examples, the lesion area may be framed by a fixed shape (e.g., a regular shape such as a triangle, circle, quadrilateral, etc.). In some examples, the lesion area may also be delineated. In this case, the lesion region can be visually displayed.
In some examples, the determination module may also be used to delineate a diseased region. For example, the values corresponding to the pixels in the attention heat map may be analyzed, and the pixels with the values larger than the first preset value may be classified as a lesion region, and the pixels with the values smaller than the first preset value may be classified as a non-lesion region.
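The thresholding described above may be sketched as follows. The default threshold value and the axis-aligned marking box are illustrative assumptions:

```python
import numpy as np

def delineate_lesion(attention_heat_map: np.ndarray, first_preset_value: float = 0.5):
    """Split the attention heat map into lesion / non-lesion pixels by a threshold
    and frame the lesion region with a simple marking box (illustrative sketch)."""
    lesion_mask = attention_heat_map > first_preset_value   # lesion region pixels
    ys, xs = np.nonzero(lesion_mask)
    box = None
    if ys.size > 0:
        box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))  # (left, top, right, bottom)
    return lesion_mask, box

heat = np.random.rand(256, 256)          # stand-in for an attention heat map M(X)
mask, box = delineate_lesion(heat, 0.6)
```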
In some examples, the identification method for tissue lesion identification based on an artificial neural network may be implemented by the identification system 40.
In some examples, the identification method includes: acquiring a tissue image and acquiring a recognition result of tissue lesion recognition by using the artificial neural network module 420. In some examples, the tissue image may be a tissue image acquired by an acquisition device. In some examples, the artificial neural network module 420 is trained by a training system 430. In this case, the recognition result of the tissue lesion recognition can be obtained by using the artificial neural network module 420, and the artificial neural network module 420 can be optimized by using the total loss function, so that the accuracy of the tissue lesion recognition can be improved.
Hereinafter, a training method for tissue lesion recognition based on an artificial neural network (which may be simply referred to as the training method) and a training system according to the present embodiment will be described in detail with reference to the drawings.
In some examples, the training method may be implemented with a training system 430 for tissue lesion recognition based on an artificial neural network. In this case, the artificial neural network module 420 can be trained using the training system 430.
Fig. 7 is a block diagram illustrating a training system 430 for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
In some examples, as shown in fig. 7, training system 430 may include a storage module 431, a processing module 432, and an optimization module 433. In some examples, the storage module 431 may be configured to store a training data set. In some examples, the processing module 432 may utilize the artificial neural network module 420 for feature extraction, generating attention and complementary attention heat maps, and tissue lesion recognition. In some examples, the optimization module 433 may obtain a total loss function based on the recognition results (including the first recognition result, the second recognition result, and the third recognition result) of the tissue lesion recognition to optimize the artificial neural network module 420. In this case, the recognition result of the tissue lesion recognition can be obtained by using the attention mechanism and the complementary attention mechanism, and the total loss function can be obtained based on the recognition result of the tissue lesion recognition, so that the artificial neural network module 420 can be optimized by using the total loss function, thereby improving the accuracy of the tissue lesion recognition of the artificial neural network module 420.
In some examples, the training mode of the artificial neural network module 420 may be weakly supervised. In this case, the artificial neural network module 420 can produce a recognition result carrying a large amount of information from annotation results carrying only a small amount of information. In some examples, where the annotation result is a text annotation, the location and size of the lesion region may be included in the recognition result. In some examples, the training mode of the artificial neural network module 420 may also be an unsupervised mode, a semi-supervised mode, a reinforcement learning mode, and the like.
In some examples, the artificial neural network module 420 may be trained using the first loss function, the second loss function, and the third loss function. It should be noted that, since the training model and the loss functions involved are generally complex, the model generally has no analytical solution. In some examples, the value of the loss function may be reduced as much as possible by iterating the model parameters a finite number of times with an optimization algorithm (e.g., batch gradient descent (BGD), stochastic gradient descent (SGD), etc.), that is, a numerical solution of the model is found. In some examples, the artificial neural network module 420 may be trained using a back propagation algorithm, in which case the network parameters with the smallest error can be obtained, thereby improving the recognition accuracy.
Fig. 8 is a flow chart illustrating a training method for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
In some examples, as shown in fig. 8, the training method may include preparing a training data set (step S100); inputting the training data set into the artificial neural network module 420, and obtaining a first recognition result, a second recognition result and a third recognition result which are matched with each inspection image (step S200); calculating a total loss function based on the first recognition result, the second recognition result, and the third recognition result (step S300) and optimizing the artificial neural network module 420 using the total loss function (step S400). In this case, the first recognition result, the second recognition result, and the third recognition result can be obtained, and the total loss function can be obtained based on the first recognition result, the second recognition result, and the third recognition result, so that the artificial neural network module 420 can be optimized using the total loss function, thereby improving the accuracy of tissue lesion recognition of the artificial neural network module 420.
In step S100, a training data set may be prepared. In some examples, the training data set may include a plurality of examination images and either lesion-bearing or lesion-free annotation results associated with the examination images.
In some examples, the training data set may include a plurality of inspection images and annotation images associated with the inspection images.
In some examples, the examination images may be 50,000 to 200,000 tissue images from cooperating hospitals with patient information removed. In some examples, the examination image may be a tissue image from a CT scan, a PET-CT scan, a SPECT scan, an MRI, ultrasound, X-ray, mammogram, angiogram, fluoroscopic image, capsule endoscopic photograph, or a combination thereof. In some examples, the inspection image may be a fundus image. In some examples, the examination image may be composed of a lesion region and a non-lesion region. In some examples, the inspection image may be used for training of the artificial neural network module 420.
In some examples, the inspection image may be acquired by the acquisition module 410.
In some examples, the annotation image can include an annotation result with a lesion or an annotation result without a lesion. In some examples, the annotation result may be a true value to measure the size of the loss function.
In some examples, the annotation result can be an image annotation or a text annotation. In some examples, the image annotation may be an annotation box for framing the lesion region by manual annotation.
In some examples, the label box may be a fixed shape, such as a regular shape like a triangle, circle, or quadrilateral. In some examples, the annotation box may also be an irregular shape based on the delineation of the lesion region.
In some examples, the text annotation may be a determination to check whether a lesion exists in the image. Such as "diseased" or "non-diseased". In some examples, the text annotation may also be a type of lesion. For example, in the case where the inspection image is a fundus image, the text label may be "macular degeneration", "hypertensive retinopathy", or "diabetic retinopathy", or the like.
In some examples, the training data set may be stored in storage module 431. In some examples, the storage module 431 may be configured to store the training data set.
In some examples, the training data set may include 30%-60% of examination images whose annotation result is lesion-free. In some examples, the training data set may include 10%, 20%, 30%, 40%, 50%, or 60% of examination images whose annotation result is lesion-free.
In some examples, the training data set may be stored using storage module 431. In some examples, the storage module 431 may include the memory 20.
In some examples, the storage module 431 may be configured to store the inspection image and the annotation image associated with the inspection image.
In some examples, artificial neural network module 420 may receive a training data set stored by storage module 431.
In some examples, the training data set may be pre-processed.
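As an illustration of such preprocessing, a minimal pipeline is sketched below. The specific operations and values (image size, normalization statistics) are assumptions for illustration and are not prescribed by this part of the disclosure:

```python
from torchvision import transforms

# Assumed preprocessing applied to each examination image before it is
# input to the artificial neural network module.
preprocess = transforms.Compose([
    transforms.Resize((512, 512)),                       # unify the image size
    transforms.ToTensor(),                               # convert to a tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],     # per-channel normalization
                         std=[0.229, 0.224, 0.225]),
])
```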
In step S200, the training data set may be input to the artificial neural network module 420, and the first recognition result, the second recognition result, and the third recognition result that match the respective inspection images are obtained. In some examples, the training data set may be input to the artificial neural network module 420 to obtain a feature map, a heat of attention map, and a complementary heat of attention map. In some examples, feature extraction may be performed on the inspection image to obtain a feature map. In some examples, the feature map may be processed based on an attention mechanism to obtain an attention heat map. In some examples, the attention heat map may be processed based on a complementary attention mechanism to obtain a complementary attention heat map.
In some examples, step S200 may be implemented with processing module 432. In some examples, processing module 432 may include at least one processor 10.
In some examples, as described above, the artificial neural network module 420 may include a first artificial neural network 421, a second artificial neural network 422, and a third artificial neural network 423.
In some examples, the processing module 432 may be configured to perform feature extraction on the inspection image using the first artificial neural network 421 to obtain a feature map. In some examples, the processing module 432 may be configured for obtaining an attention heat map indicative of a diseased region and a complementary attention heat map indicative of a non-diseased region using the second artificial neural network 422.
In some examples, the processing module 432 may be configured for obtaining an identification result including tissue lesion identification using the third artificial neural network 423. As described above, the third artificial neural network 423 may include an output layer. In some examples, the output layer may be configured to output a recognition result reflecting the inspection image. In this case, the third artificial neural network 423 can output a recognition result reflecting the inspection image.
In some examples, the processing module 432 may identify the inspection image based on the feature map using the third artificial neural network 423 to obtain a first identification result.
In some examples, the processing module 432 may identify the inspection image based on the feature map and the attention heat map using the third artificial neural network 423 to obtain a second identification result.
In some examples, the processing module 432 may utilize the third artificial neural network 423 to identify the inspection image based on the feature map and the complementary attention heat map to obtain a third identification result.
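The processing flow of step S200 may be sketched as follows. The network names and the use of 1 − M(X) for the complementary attention heat map are assumptions made for illustration:

```python
import torch

def forward_pass(x: torch.Tensor, first_net, second_net, classifier):
    """Sketch of step S200 for a batch of inspection images `x`.

    `first_net`, `second_net` and `classifier` stand for the first, second
    and third artificial neural networks; their architectures are not fixed here.
    """
    feature_map = first_net(x)                   # feature extraction
    attention = second_net(feature_map)          # attention heat map M(X), values in [0, 1]
    complementary = 1.0 - attention              # complementary attention heat map (assumed as 1 - M(X))

    first_result = classifier(feature_map)                   # recognition without the attention mechanism
    second_result = classifier(feature_map * attention)      # recognition with the attention mechanism
    third_result = classifier(feature_map * complementary)   # recognition with the complementary attention mechanism
    return first_result, second_result, third_result, attention
```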
In some examples, the tissue lesion may be a fundus lesion. In this case, the artificial neural network module 420 can be used for fundus lesion recognition of the fundus image.
In step S300, a total loss function may be calculated based on the first recognition result, the second recognition result, and the third recognition result.
In some examples, step S300 may be implemented with the optimization module 433.
In some examples, the optimization module 433 may obtain an overall loss function of the artificial neural network module 420 based on the first, second, and third loss functions. In this case, the artificial neural network module 420 can be optimized with the total loss function.
In some examples, the optimization module 433 may combine the first recognition result with the annotation image to obtain a first loss function when the attentiveness mechanism is not used. In some examples, the first loss function may be used to evaluate a degree of inconsistency between the recognition result and the annotation result of the inspection image when the attention mechanism is not used. In this case, the accuracy of tissue lesion identification by the artificial neural network module 420 when attention is not used can be improved.
In some examples, the optimization module 433 may combine the second recognition result with the annotation image to obtain a second loss function when using the attentiveness mechanism. In some examples, a second loss function may be used to evaluate a degree of inconsistency between the recognition result and the annotation result of the inspection image when the attentiveness mechanism is used. In this case, the accuracy of tissue lesion identification by the artificial neural network module 420 when using the attention mechanism can be improved.
In some examples, the optimization module 433 may combine the third recognition result with an annotation image with an annotation result without a lesion to obtain a third loss function when using a complementary attention mechanism. In some examples, a third loss function may be used to evaluate a degree of inconsistency between the recognition result of the examination image when the complementary attention mechanism is used and the lesion-free recognition. In this case, the accuracy of tissue lesion identification by the artificial neural network module 420 when using the complementary attention mechanism can be improved.
In some examples, the first loss function, the second loss function, and the third loss function may be obtained by an error loss function. In some examples, the error loss function may be a function used to evaluate the correlation between the true value (i.e., the annotation result) and the predicted value (i.e., the recognition result), such as an L1 loss function, an L2 loss function, a Huber loss function, or the like.
In some examples, the overall loss function may include a first loss term, a second loss term, and a third loss term.
In some examples, the first loss term may be positively correlated with the first loss function. In this case, the degree of inconsistency between the recognition result and the labeling result of the examination image when the attention mechanism is not used can be evaluated using the first loss term, so that the accuracy of tissue lesion recognition can be improved.
In some examples, the second loss term may be positively correlated with a difference of the second loss function and the first loss function. In some examples, the second loss term may be a constant value when the second loss function is less than the first loss function. In this case, the degree of inconsistency between the recognition result of the inspection image when the attention mechanism is used and the recognition result when the attention mechanism is not used can be evaluated using the second loss term.
In some examples, the second loss term can be positively correlated with a difference of the second loss function and the first loss function. Specifically, when the difference between the second loss function and the first loss function is greater than zero, the difference between the second loss function and the first loss function may be set as the second loss term, and when the difference between the second loss function and the first loss function is less than zero, the second loss term may be set to zero. In this case, the degree of inconsistency between the first recognition result and the second recognition result can be evaluated using the second loss term, so that the second recognition result can be brought closer to the annotation result with respect to the first recognition result.
In some examples, the third loss term may be positively correlated with the third loss function. In this case, the degree of inconsistency between the third recognition result and the lesion-free labeling result of the examination image when the complementary attention mechanism is used can be evaluated using the third loss term, so that the occurrence of erroneous judgment or missing judgment can be reduced.
In some examples, the total loss function may further include a fourth loss term. In some examples, the fourth loss term may be a regularization term. In some examples, the fourth loss term may be a regularization term for the attention heat map. In some examples, the regularization term may be obtained based on total variation. In this case, overfitting of the artificial neural network module 420 can be suppressed.
In some examples, the overall loss function may include loss term weight coefficients that match the individual loss terms. In some examples, the overall loss function may further include a first loss term weight coefficient matching the first loss term, a second loss term weight coefficient matching the second loss term, a third loss term weight coefficient matching the third loss term, a fourth loss term weight coefficient matching the fourth loss term, and so on.
In some examples, the first loss term may be multiplied by a first loss term weight coefficient, the second loss term may be multiplied by a second loss term weight coefficient, the third loss term may be multiplied by a third loss term weight coefficient, the fourth loss term may be multiplied by a fourth loss term weight coefficient, and the fifth loss term may be multiplied by a fifth loss term weight coefficient. Thus, the influence degree of each loss term on the total loss function can be adjusted through the loss term weight coefficient.
In some examples, the loss term weight coefficient may be set to 0. In some examples, the loss term weighting factor may be set to a positive number. In this case, since each loss term is a non-negative number, the value of the total loss function can be made not less than zero.
In some examples, the functional form of the total loss function may be:

L = λ1·f(C(F(X)), l(X)) + λ2·max(f(C(F(X)·M(X)), l(X)) − f(C(F(X)), l(X)), margin) + λ3·f(C(F(X)·M̄(X)), 0) + λ4·Regularize(M(X))

wherein L is the total loss function, λ1 is the first loss term weight coefficient, λ2 is the second loss term weight coefficient, λ3 is the third loss term weight coefficient, λ4 is the fourth loss term weight coefficient, f is the error loss function, X is an inspection image, F(X) is the feature map generated after the inspection image X passes through the first artificial neural network 421, l(X) is the annotation result of the inspection image X, with 0 denoting the lesion-free annotation result, max is the maximum function, C is the classifier function that outputs a recognition result based on an input feature map or feature combination set, margin is a preset parameter, M(X) is the attention heat map matched with the inspection image X, M̄(X) is the complementary attention heat map matched with the inspection image X, "·" in the functional expression of the total loss function is the dot product of matrices, and Regularize(M) is the regularization term for the attention heat map M. In some examples, the classifier function may be implemented by the third artificial neural network 423.
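A sketch of the total loss function is given below. The term structure follows the formula above, while the concrete values of the weight coefficients, the margin, and the use of total variation as the regularization term are illustrative assumptions:

```python
import torch

def tv_regularize(m: torch.Tensor) -> torch.Tensor:
    """Total-variation regularization term Regularize(M) for the attention heat map."""
    return (m[..., 1:, :] - m[..., :-1, :]).abs().mean() + \
           (m[..., :, 1:] - m[..., :, :-1]).abs().mean()

def total_loss(first_loss: torch.Tensor, second_loss: torch.Tensor, third_loss: torch.Tensor,
               attention: torch.Tensor, lambdas=(1.0, 1.0, 1.0, 1e-3), margin: float = 0.0) -> torch.Tensor:
    """Sketch of the total loss function L.

    `first_loss`, `second_loss` and `third_loss` are scalar tensors holding the values
    of the first, second and third loss functions (the error loss between the respective
    recognition result and the annotation result); `lambdas` and `margin` are assumed
    example values.
    """
    l1, l2, l3, l4 = lambdas
    first_term = first_loss
    second_term = torch.clamp(second_loss - first_loss, min=margin)  # max(difference, margin)
    third_term = third_loss
    fourth_term = tv_regularize(attention)
    # An optional fifth loss term, SUM(M(X)) = attention.sum(), may be added to
    # penalize the total area of the lesion region in the attention heat map.
    return l1 * first_term + l2 * second_term + l3 * third_term + l4 * fourth_term
```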
In some examples, the optimization module 433 may obtain an overall loss function including a first loss term based on the first loss function, a second loss term based on a difference of the second loss function and the first loss function, and a third loss term based on the third loss function using the first loss function, the second loss function, and the third loss function and optimize the artificial neural network module 420 using the overall loss function.
In some examples, the total loss function may further include a fifth loss term. In some examples, the fifth loss term may be a total area term of the attention heat map. Specifically, the total area term of the attention heat map may be the area determined as a lesion region in the attention heat map. In some examples, the total area term of the attention heat map M(X) may be represented by the formula SUM(M(X)). In some examples, the artificial neural network module 420 may be trained with the fifth loss term to make the lesion area within the attention heat map smaller. In this case, it is possible to estimate the area of the lesion region within the attention heat map using the fifth loss term and control the number of pixels in the attention heat map that have a greater influence on the recognition result, thereby limiting the attention of the network to pixels that have a greater influence on the recognition result. Thus, the accuracy of lesion region identification can be increased.
In some examples, the total loss function may further include a sixth loss term. In some examples, the sixth loss term may be used to evaluate a degree of inconsistency between the framed region of the lesion region in the recognition result and the labeling frame of the lesion region labeled manually in the labeling image.
In step S400, the artificial neural network module 420 may be optimized using the total loss function.
In some examples, step S400 may be implemented with the optimization module 433.
In some examples, the optimization module 433 may optimize the artificial neural network module 420 with a total loss function to minimize the total loss function. In this case, the total loss function can be minimized to improve the accuracy of tissue lesion identification by the artificial neural network module 420.
In some examples, the optimization module 433 may obtain a total loss function based on the first loss term, the second loss term, the third loss term, and the total area term of the attention heat map, and optimize the artificial neural network module 420 with the total loss function to obtain the artificial neural network module 420 that may be used for tissue lesion recognition. This can further improve the accuracy of tissue lesion recognition by the artificial neural network module 420.
In some examples, the optimization module 433 may adjust the total loss function by changing weights of the first, second, third, and fourth loss terms.
In some examples, the optimization module 433 may optimize the artificial neural network module 420 based on the first and sixth loss terms as a total loss function (i.e., setting the loss term weight coefficients of the other loss terms to zero). Thus, the accuracy of the attention heat map and the complementary attention heat map generated by the second artificial neural network 422 can be improved.
In some examples, the loss term weighting coefficients in the overall loss function may be modified during the optimization process.
In some examples, the optimization module 433 may iterate the parameters in the total loss function a plurality of times with an optimization algorithm to reduce the value of the total loss function. For example, in the present embodiment, the value of the loss function may be reduced by a mini-batch stochastic gradient descent (mini-batch SGD) algorithm, which first randomly selects a set of initial values for the model parameters and then iterates the parameters a plurality of times.
In some examples, the training is suspended when the total loss function is less than a second preset value or the number of iterations exceeds a third preset value.
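A training loop implementing the iteration and stopping criteria above may be sketched as follows. The learning rate, the `compute_total_loss` callback, and the concrete preset values are assumptions for illustration:

```python
import torch

def train(module, data_loader, compute_total_loss,
          lr=1e-3, second_preset_value=1e-3, third_preset_value=10_000):
    """Mini-batch stochastic gradient descent with the stopping criteria described above.

    `module` stands for the artificial neural network module 420 and
    `compute_total_loss(module, images, annotations)` for the total loss function.
    """
    optimizer = torch.optim.SGD(module.parameters(), lr=lr)
    iteration = 0
    for images, annotations in data_loader:
        optimizer.zero_grad()
        loss = compute_total_loss(module, images, annotations)
        loss.backward()       # back propagation
        optimizer.step()      # iterate the parameters
        iteration += 1
        # Suspend training when the total loss is below the second preset value
        # or the number of iterations exceeds the third preset value.
        if loss.item() < second_preset_value or iteration >= third_preset_value:
            break
```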
In some examples, the optimization module 433 may pre-train the artificial neural network module 420 without applying an attention mechanism and then train the artificial neural network module 420 with applying the attention mechanism. In this case, the training speed can be increased.
In some examples, the optimization module 433 may train the first artificial neural network 421, the second artificial neural network 422, and the third artificial neural network 423 simultaneously. In this case, the training speed can be increased.
In some examples, after training is complete, the optimization module 433 may employ, for example, 0-20000 tissue images (e.g., fundus images) as test tissue images to compose a test set.
In some examples, the test tissue image may be used for post-training testing of the artificial neural network module 420.
Fig. 9 (a) is a schematic diagram showing an example of a lesion region of a fundus image obtained without using attention mechanism training according to an example of the present disclosure. Fig. 9 (b) is a schematic diagram showing an example of a lesion region of a fundus image obtained using a complementary attention mechanism training according to an example of the present disclosure.
In some examples, the accuracy of tissue lesion identification is higher for fundus images trained using the complementary attention mechanism. As an example where the attention mechanism is not used, fig. 9 (a) shows the lesion region A of a fundus image obtained from training without the attention mechanism. As an example where the complementary attention mechanism is used, fig. 9 (b) shows the lesion region B of a fundus image obtained from training with the complementary attention mechanism.
While the present disclosure has been described in detail in connection with the drawings and examples, it should be understood that the above description is not intended to limit the disclosure in any way. Those skilled in the art can make modifications and variations to the present disclosure as needed without departing from the true spirit and scope of the disclosure, and such modifications and variations are intended to be within the scope of the disclosure.

Claims (10)

1. An identification method based on an artificial neural network is characterized by comprising the following steps:
acquiring a tissue image; receiving the tissue image and carrying out lesion recognition on the tissue image by utilizing an artificial neural network module, wherein the artificial neural network module comprises a first artificial neural network, a second artificial neural network and a third artificial neural network, the first artificial neural network is configured to be capable of carrying out feature extraction on the tissue image so as to obtain a feature map, the second artificial neural network is configured to be capable of obtaining an attention heat map indicating a lesion region, the third artificial neural network is configured to be capable of carrying out recognition on the tissue image based on the feature map, and the training step of the artificial neural network module comprises the following steps: preparing a training data set including a plurality of examination images and an annotation image associated with the examination images, the annotation image including an annotation result with or without a lesion, feature extracting the examination images using the first artificial neural network to obtain a feature map, obtaining an attention heat map indicating a lesion region using the second artificial neural network, the examination images being composed of a lesion region and a non-lesion region, identifying the examination images using the third artificial neural network based on the feature map to obtain a first identification result, identifying the examination images using the third artificial neural network based on the feature map and the attention heat map to obtain a second identification result, combining the first identification result and the annotation image to obtain a first loss function when an attention mechanism is not used, combining the second identification result and the annotation image to obtain a second loss function when the attention mechanism is used, obtaining a total loss function including a first loss term based on the first loss function and a second loss term based on the second loss function using the first loss function and the second loss function, and optimizing the artificial neural network module using the total loss function including the first loss term and the second loss term.
2. The identification method of claim 1,
the total loss function further includes a total area term of the attention heat map, which is used to estimate the area of the lesion region.
3. The identification method of claim 1,
the total loss function further includes a regularization term of the attention heat map, the regularization term being obtained based on a total variation.
4. The identification method of claim 1,
the first loss function is used for evaluating the degree of inconsistency between the identification result and the annotation result of the inspection image when the attention mechanism is not used, and the second loss function is used for evaluating the degree of inconsistency between the identification result and the annotation result of the inspection image when the attention mechanism is used.
5. The identification method of claim 1,
the overall loss function includes a third loss term for evaluating a degree of inconsistency between the recognition result of the examination image when a complementary attention mechanism is used and the labeling result without a lesion.
6. The identification method of claim 1,
training the first artificial neural network, the second artificial neural network, and the third artificial neural network simultaneously.
7. The identification method of claim 1,
pre-training the artificial neural network module without using the attention mechanism, and then training the artificial neural network module by using the attention mechanism.
8. The identification method of claim 1,
the tissue lesion is a fundus lesion, and the examination image is a fundus image.
9. The identification method of claim 1,
in the optimization process, the total loss function is changed by modifying the loss term weighting coefficients in the total loss function.
10. The identification method of claim 1,
and carrying out multiple iterations on the total loss function through an optimization algorithm so as to reduce the value of the total loss function.
CN202211242487.5A 2019-11-28 2020-11-27 Identification method based on artificial neural network Pending CN115511861A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201911194176 2019-11-28
CN2019111941764 2019-11-28
CN202011364685.XA CN112862746B (en) 2019-11-28 2020-11-27 Tissue lesion identification method and system based on artificial neural network

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202011364685.XA Division CN112862746B (en) 2019-11-28 2020-11-27 Tissue lesion identification method and system based on artificial neural network

Publications (1)

Publication Number Publication Date
CN115511861A true CN115511861A (en) 2022-12-23

Family

ID=75996687

Family Applications (6)

Application Number Title Priority Date Filing Date
CN202211242487.5A Pending CN115511861A (en) 2019-11-28 2020-11-27 Identification method based on artificial neural network
CN202210631057.6A Pending CN114972278A (en) 2019-11-28 2020-11-27 Training method based on complementary attention
CN202011359970.2A Active CN112862745B (en) 2019-11-28 2020-11-27 Training method and training system for tissue lesion recognition based on artificial neural network
CN202210631673.1A Pending CN115049602A (en) 2019-11-28 2020-11-27 Optimization method of artificial neural network module
CN202011364685.XA Active CN112862746B (en) 2019-11-28 2020-11-27 Tissue lesion identification method and system based on artificial neural network
CN202211242486.0A Pending CN115511860A (en) 2019-11-28 2020-11-27 Tissue lesion identification method based on complementary attention mechanism

Family Applications After (5)

Application Number Title Priority Date Filing Date
CN202210631057.6A Pending CN114972278A (en) 2019-11-28 2020-11-27 Training method based on complementary attention
CN202011359970.2A Active CN112862745B (en) 2019-11-28 2020-11-27 Training method and training system for tissue lesion recognition based on artificial neural network
CN202210631673.1A Pending CN115049602A (en) 2019-11-28 2020-11-27 Optimization method of artificial neural network module
CN202011364685.XA Active CN112862746B (en) 2019-11-28 2020-11-27 Tissue lesion identification method and system based on artificial neural network
CN202211242486.0A Pending CN115511860A (en) 2019-11-28 2020-11-27 Tissue lesion identification method based on complementary attention mechanism

Country Status (1)

Country Link
CN (6) CN115511861A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022226949A1 (en) * 2021-04-29 2022-11-03 深圳硅基智控科技有限公司 Artificial neural network-based identification method and system for tissue lesion identification
CN117292310A (en) * 2023-08-22 2023-12-26 杭州空介视觉科技有限公司 Virtual digital person application method

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107045720B (en) * 2017-05-04 2018-11-30 深圳硅基仿生科技有限公司 The processing system of identification eye fundus image lesion based on artificial neural network
US11501154B2 (en) * 2017-05-17 2022-11-15 Samsung Electronics Co., Ltd. Sensor transformation attention network (STAN) model
CN108021916B (en) * 2017-12-31 2018-11-06 南京航空航天大学 Deep learning diabetic retinopathy sorting technique based on attention mechanism
CN108830157B (en) * 2018-05-15 2021-01-22 华北电力大学(保定) Human behavior identification method based on attention mechanism and 3D convolutional neural network
CN108846829B (en) * 2018-05-23 2021-03-23 平安科技(深圳)有限公司 Lesion site recognition device, computer device, and readable storage medium
CN110674664A (en) * 2018-06-15 2020-01-10 阿里巴巴集团控股有限公司 Visual attention recognition method and system, storage medium and processor
CN109766936B (en) * 2018-12-28 2021-05-18 西安电子科技大学 Image change detection method based on information transfer and attention mechanism
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN110084794B (en) * 2019-04-22 2020-12-22 华南理工大学 Skin cancer image identification method based on attention convolution neural network
CN110097559B (en) * 2019-04-29 2024-02-23 李洪刚 Fundus image focus region labeling method based on deep learning
CN110335261B (en) * 2019-06-28 2020-04-17 山东科技大学 CT lymph node detection system based on space-time circulation attention mechanism
CN110349147B (en) * 2019-07-11 2024-02-02 腾讯医疗健康(深圳)有限公司 Model training method, fundus macular region lesion recognition method, device and equipment

Also Published As

Publication number Publication date
CN112862745B (en) 2022-06-14
CN114972278A (en) 2022-08-30
CN112862746B (en) 2022-09-02
CN112862746A (en) 2021-05-28
CN115049602A (en) 2022-09-13
CN115511860A (en) 2022-12-23
CN112862745A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
Mahapatra et al. Image super-resolution using progressive generative adversarial networks for medical image analysis
US10482603B1 (en) Medical image segmentation using an integrated edge guidance module and object segmentation network
US10861134B2 (en) Image processing method and device
US10430946B1 (en) Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
US10496884B1 (en) Transformation of textbook information
CN113496489B (en) Training method of endoscope image classification model, image classification method and device
JP2020518915A (en) System and method for automated fundus image analysis
US20220198230A1 (en) Auxiliary detection method and image recognition method for rib fractures based on deep learning
CN110110808B (en) Method and device for performing target labeling on image and computer recording medium
CN112862746B (en) Tissue lesion identification method and system based on artificial neural network
CN113781488A (en) Tongue picture image segmentation method, apparatus and medium
KR20200110111A (en) Method and devices for diagnosing dynamic multidimensional disease based on deep learning in medical image information
CN114693719A (en) Spine image segmentation method and system based on 3D-SE-Vnet
CN115136189A (en) Automated detection of tumors based on image processing
CN113240655A (en) Method, storage medium and device for automatically detecting type of fundus image
Li et al. Robust deep 3d blood vessel segmentation using structural priors
Singh et al. Attention-guided residual W-Net for supervised cardiac magnetic resonance imaging segmentation
EP4318497A1 (en) Training method for training artificial neural network for determining breast cancer lesion area, and computing system performing same
CN112862786B (en) CTA image data processing method, device and storage medium
CN112862785B (en) CTA image data identification method, device and storage medium
WO2022226949A1 (en) Artificial neural network-based identification method and system for tissue lesion identification
Vanmore et al. Liver Lesions Classification System using CNN with Improved Accuracy
CN112862787B (en) CTA image data processing method, device and storage medium
CN117392468B (en) Cancer pathology image classification system, medium and equipment based on multi-example learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination