WO2022226949A1 - Recognition method and recognition system for tissue lesion recognition based on an artificial neural network - Google Patents
Recognition method and recognition system for tissue lesion recognition based on an artificial neural network
- Publication number: WO2022226949A1 (application PCT/CN2021/091227)
- Authority: WIPO (PCT)
- Prior art keywords: neural network, artificial neural, examples, tissue, loss function
Classifications
- G06F18/00 — Pattern recognition (G: Physics; G06: Computing, calculating or counting; G06F: Electric digital data processing)
Description
- the present disclosure generally relates to a recognition method and recognition system for tissue lesion recognition based on an artificial neural network.
- Convolutional neural networks are commonly used when medical images are recognized with artificial intelligence technology.
- The convolutional structure of a convolutional neural network can reduce the amount of memory occupied by a deep network. It relies on three key operations: local receptive fields, weight sharing, and pooling layers. In this way, the number of network parameters can be effectively reduced, and the overfitting problem of the convolutional neural network can be alleviated.
- The structure of a convolutional neural network is well suited to the structure of medical images for extracting and recognizing features.
- In medical images, however, the lesion area is often relatively small and irregularly distributed.
- the convolutional neural network applying the attention mechanism tends to ignore the low-attention lesion area in the attention heat map, resulting in errors. Therefore, the accuracy of tissue lesion identification in these lesion areas is low.
- The present disclosure is made in view of the above-mentioned state of the art, and its object is to provide a recognition method and recognition system for tissue lesion identification based on an artificial neural network that can effectively improve the accuracy of tissue lesion identification.
- A first aspect of the present disclosure provides a method for identifying tissue lesions based on an artificial neural network, which includes: acquiring a tissue image, where the tissue image is collected by a collection device; and using an artificial neural network module to receive the tissue image and perform pathological identification on it. The artificial neural network module includes a first artificial neural network, a second artificial neural network, and a third artificial neural network. The first artificial neural network is configured to be able to perform feature extraction on the tissue image to obtain a feature map, the second artificial neural network is configured to be able to obtain an attention heat map indicating a lesion area, and the third artificial neural network is configured to be able to analyze the tissue image based on the feature map.
- The training step of the artificial neural network module includes: preparing a training data set, where the training data set includes a plurality of inspection images and annotated images associated with the inspection images, and each annotated image carries a labeling result with lesions or a labeling result without lesions; using the first artificial neural network to perform feature extraction on an inspection image to obtain a feature map; and using the second artificial neural network to obtain an attention heat map indicating the lesion area and a complementary attention heat map indicating the non-lesion area.
- The third artificial neural network identifies the inspection image based on the feature map to obtain a first recognition result, identifies the inspection image based on the feature map and the attention heat map to obtain a second recognition result, and identifies the inspection image based on the feature map and the complementary attention heat map to obtain a third recognition result.
- The first recognition result is combined with the annotated image to obtain the first loss function, for the case where the attention mechanism is not used; the second recognition result is combined with the annotated image to obtain the second loss function, for the case where the attention mechanism is used; and the third recognition result is combined with an annotated image carrying the labeling result without lesions to obtain the third loss function, for the case where the complementary attention mechanism is used.
- A total loss function is then formed from a first loss term based on the first loss function, a second loss term based on the difference between the second loss function and the first loss function, and a third loss term based on the third loss function, and the total loss function is used to optimize the artificial neural network module.
- the recognition result of tissue lesion recognition can be obtained by using the artificial neural network module, and the artificial neural network module can be optimized by using the total loss function, so that the accuracy of tissue lesion recognition can be improved.
- In some examples, the total loss function further includes a total area term for the attention heat map; the total area term is used to evaluate the area of the lesion area.
- In this case, the fifth loss term can be used to evaluate the area of the lesion area in the attention heat map and to control the number of pixels in the attention heat map that have a greater impact on the recognition result, so that the network's attention is limited to the pixels that affect the recognition result more.
- In some examples, the total loss function further includes a regularization term for the attention heat map.
- In this case, overfitting of the artificial neural network module can be suppressed.
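As a concrete illustration, the following is a minimal PyTorch sketch of a total loss with this shape. The use of cross-entropy, the weights `w_area` and `w_tv`, and total variation as the regularization term are assumptions made for illustration; the disclosure does not fix these choices.

```python
import torch
import torch.nn.functional as F

def total_loss(logits_plain, logits_attn, logits_comp, labels,
               no_lesion_labels, attn_map, w_area=0.1, w_tv=0.01):
    # First loss term: loss of the recognition result without attention.
    l1 = F.cross_entropy(logits_plain, labels)
    # Second loss term: difference between the loss with the attention
    # mechanism and the first loss function.
    l2 = F.cross_entropy(logits_attn, labels) - l1
    # Third loss term: with the complementary attention mechanism, the
    # result is compared against the "no lesions" labeling result.
    l3 = F.cross_entropy(logits_comp, no_lesion_labels)
    # Total area term: evaluates the area of the lesion region by
    # penalizing the number of high-attention pixels.
    area = attn_map.mean()
    # Regularization term (here: total variation) on the heat map.
    tv = (attn_map[..., 1:, :] - attn_map[..., :-1, :]).abs().mean() + \
         (attn_map[..., :, 1:] - attn_map[..., :, :-1]).abs().mean()
    return l1 + l2 + l3 + w_area * area + w_tv * tv
```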
- In some examples, the first artificial neural network, the second artificial neural network, and the third artificial neural network are trained concurrently. In this case, the training speed can be increased.
- In some examples, the third artificial neural network includes an input layer, an intermediate layer, and an output layer that are connected in sequence, and the output layer is configured to output a recognition result reflecting the inspection image.
- the recognition result reflecting the tissue image can be output using the third artificial neural network.
- the training mode of the artificial neural network module is weak supervision.
- the identification results with more information can be obtained through the artificial neural network module by using the annotation results with less information.
- In some examples, the first loss function is used to evaluate the degree of inconsistency between the recognition result of the inspection image and the labeling result when the attention mechanism is not used. In this case, the accuracy of tissue lesion recognition by the artificial neural network module without the attention mechanism can be improved.
- In some examples, the second loss function is used to evaluate the degree of inconsistency between the recognition result of the inspection image and the labeling result when the attention mechanism is used. In this case, the accuracy of tissue lesion recognition by the artificial neural network module when using the attention mechanism can be improved.
- In some examples, the third loss function is used to evaluate the degree of inconsistency between the recognition result of the inspection image and the labeling result without lesions when the complementary attention mechanism is used. In this case, the accuracy of tissue lesion recognition by the artificial neural network module when using the complementary attention mechanism can be improved.
- In some examples, the artificial neural network module is optimized by using the total loss function so that the total loss function is minimized.
- the total loss function can be minimized to improve the accuracy of tissue lesion identification by the artificial neural network module.
- the tissue lesions are fundus lesions.
- the recognition result of the fundus lesions of the fundus image can be obtained using the artificial neural network module.
- a second aspect of the present disclosure provides an artificial neural network-based identification system for tissue lesion identification, characterized in that the identification method provided in the first aspect of the present disclosure is used to identify tissue lesions.
- tissue lesion recognition can be performed on the tissue image using the recognition system.
- Thereby, a recognition method and recognition system for tissue lesion identification that can effectively improve the accuracy of tissue lesion identification can be provided.
- FIG. 1 is a schematic diagram illustrating an electronic device involved in an example of the present disclosure.
- FIG. 2 is an image showing a tissue involved in an example of the present disclosure.
- FIG. 3 is a block diagram showing the structure of an artificial neural network-based tissue lesion identification recognition system according to an example of the present disclosure.
- FIG. 4 is a block diagram illustrating an example of an artificial neural network module involved in an example of the present disclosure.
- FIG. 5 is a block diagram showing a modification of the artificial neural network module according to the example of the present disclosure.
- FIG. 6 is a schematic diagram showing the structure of the first artificial neural network involved in the example of the present disclosure.
- FIG. 7 is a block diagram showing the structure of the training system for tissue lesion recognition based on the artificial neural network according to the example of the present disclosure.
- FIG. 8 is a flow chart illustrating a training method for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
- FIG. 9( a ) is a schematic diagram showing an example of a fundus image obtained by training without using the attention mechanism involved in the example of the present disclosure.
- FIG. 9( b ) is a schematic diagram illustrating an example of a lesion area of a fundus image obtained by training using a complementary attention mechanism according to an example of the present disclosure.
- FIG. 1 is a schematic diagram illustrating an electronic device according to an embodiment of the present disclosure.
- the recognition system 40 for tissue lesion recognition based on the artificial neural network may use an electronic device 1 (eg, a computer) as a carrier.
- the electronic device 1 may include one or more processors 10 , a memory 20 and a computer program 30 arranged in the memory 20 .
- the one or more processors 10 may include a central processing unit, a graphics processing unit, and any other electronic components capable of processing data.
- processor 10 may execute instructions stored on memory 20 .
- memory 20 may be a computer-readable medium that can be used to carry or store data.
- the memory 20 may include, but is not limited to, non-volatile memory or flash memory (Flash Memory).
- memory 20 may also be, for example, ferroelectric random access memory (FeRAM), magnetic random access memory (MRAM), phase change random access memory (PRAM), or resistive random access memory (RRAM).
- the memory 20 may also be other types of readable storage media, such as read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically-erasable programmable read-only memory (EEPROM), or compact disc read-only memory (CD-ROM).
- memory 20 may be optical disk storage, magnetic disk storage, or tape storage. Thereby, the appropriate memory 20 can be selected according to different situations.
- computer program 30 may include instructions executed by one or more processors 10 that may cause identification system 40 to perform tissue lesion identification on tissue images.
- computer program 30 may be deployed on a local computer or on a server in the cloud.
- Computer program 30 may be stored on a computer-readable medium.
- Computer readable storage media may include portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM) or flash memory, optical fiber, portable compact disk read only memory (CD-ROM), one or more of optical storage devices, magnetic storage devices.
- FIG. 2 is an image showing a tissue involved in an example of the present disclosure.
- FIG. 3 is a block diagram showing the structure of a recognition system 40 for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
- the identification system 40 for tissue lesion identification based on an artificial neural network may be used to perform tissue lesion identification of tissue images and obtain identification results.
- identification system 40 for tissue lesion identification may also be referred to as identification system 40 .
- the identification system 40 may include an acquisition module 410 , an artificial neural network module 420 , and an artificial neural network-based training system 430 for tissue lesion identification.
- acquisition module 410 may be used to acquire tissue images.
- the artificial neural network module 420 may be used to perform feature extraction on tissue images, tissue lesion identification, etc., and obtain identification results of tissue lesion identification.
- the artificial neural network based tissue lesion identification training system 430 may be used to train the artificial neural network module 420 .
- the training system 430 may utilize the first recognition result, the second recognition result and the third recognition result obtained by the artificial neural network module 420, and obtain the total recognition result based on the first recognition result, the second recognition result and the third recognition result Loss function to optimize the artificial neural network module 420.
- the first recognition result, the second recognition result, and the third recognition result can be obtained, and the total loss function can be obtained based on the first recognition result, the second recognition result, and the third recognition result, so that the total loss function can be utilized
- the artificial neural network module 420 is optimized, thereby improving the accuracy of tissue lesion identification by the artificial neural network module 420 .
- the artificial neural network-based training system 430 for tissue lesion identification may also be referred to as the training system 430 .
- the recognition system 40 may also include a preprocessing module and a determination module (not shown).
- the tissue image may be from a CT scan, a PET-CT scan, a SPECT scan, an MRI, an ultrasound, an X-ray, a mammogram, an angiogram, a fluorogram, an image of a tissue cavity captured by a capsule endoscope, or a combination thereof.
- tissue images may be acquired by acquisition module 410 .
- the acquisition module 410 may be configured to acquire tissue images, which may be tissue images acquired by acquisition devices such as a camera, an ultrasound imager, or an X-ray scanner.
- the tissue image may be, for example, a fundus image, an esophagus image, a stomach image, an image of the large intestine, an image of the colon, or an image of the small intestine.
- the tissue image may be a fundus image.
- fundus lesion identification can be performed on the fundus image by the identification system 40 .
- the tissue lesion identification may be to identify tissue lesions of the tissue image to obtain an identification result.
- the tissue lesion may be a fundus lesion.
- the artificial neural network module 420 can be used to obtain the recognition result of the fundus lesion of the fundus image.
- the tissue image may be composed of diseased and non-diseased regions.
- tissue images (color images) with tissue lesions generally contain obvious features such as erythema and redness. Therefore, a trained artificial neural network can automatically extract and identify these features to help identify possible lesions. In this way, the accuracy and speed of recognition can be improved, and problems such as large errors and long turnaround times caused by physicians interpreting images one by one based on their own experience can be reduced.
- the tissue image may be classified by function.
- the tissue images may be inspection images, annotation images (described later).
- the images input to the artificial neural network module 420 may be tissue images.
- tissue lesion recognition can be performed on the tissue image through the artificial neural network module 420 .
- identification system 40 may be used for tissue lesion identification of tissue images. In some examples, after the tissue image enters the identification system 40, operations such as preprocessing, feature extraction, and tissue lesion identification may be performed on the tissue image.
- the recognition system 40 may also include a preprocessing module and a judgment module.
- the preprocessing module can be used to preprocess the tissue image and input the preprocessed tissue image to the artificial neural network module 420 .
- the preprocessing module may preprocess the tissue image.
- the preprocessing may include at least one of region of interest detection, image cropping, resizing, and normalization. In this case, it is convenient for the subsequent artificial neural network module 420 to perform tissue lesion identification and judgment on the tissue image.
- the tissue image may be, for example, a fundus image, an esophagus image, a stomach image, an image of the large intestine, an image of the colon, or an image of the small intestine.
- the preprocessing module may include a region detection unit, an adjustment unit, and a normalization unit.
- the region detection unit may detect regions of interest from the tissue image. For example, if the tissue image is a fundus image, a fundus region centered on the optic disc, or a fundus region including the optic disc and centered on the macula, can be detected from the fundus image. In some examples, the region detection unit may detect regions of interest in the tissue image by, for example, thresholding or the Hough transform.
- the adjustment unit may be used to crop and resize the tissue image. Due to different equipment for collecting tissue images or different shooting conditions, the obtained tissue images may have differences in resolution, size, and the like. In this case, these tissue images can be cropped and resized to reduce discrepancies. In some examples, the tissue image may be cropped in a particular shape. In some examples, the specific shape may include, but is not limited to, square, rectangular, circular or oval, and the like.
- the size of the tissue image can be adjusted to a prescribed size by the adjustment unit.
- the specified size may be 256 ⁇ 256, 512 ⁇ 512, or 1024 ⁇ 1024.
- the examples of the present disclosure are not limited thereto, and in other examples, the size of the tissue image may be any other size.
- the size of the tissue image may be 128 ⁇ 128, 768 ⁇ 768, or 2048 ⁇ 2048.
- the preprocessing module may include a normalization unit.
- the normalization unit can be used to normalize multiple tissue images.
- the normalization manner of the normalization unit is not particularly limited, for example, zero mean, unit standard deviation, or the like may be used. Additionally, in some examples, normalization may also be in the range [0, 1]. In this case, through normalization, the variability of different tissue images can be overcome.
- normalization includes normalization of image format, image slice interval, image intensity, image contrast, and image orientation.
- tissue images may be normalized to DICOM format, NIfTI format, or raw binary format.
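A minimal sketch of such a preprocessing pipeline, assuming OpenCV and a fundus-style image; the intensity threshold, the 512×512 target size, and zero-mean/unit-standard-deviation normalization are illustrative choices, not values fixed by the disclosure.

```python
import cv2
import numpy as np

def preprocess(img_bgr: np.ndarray, size: int = 512) -> np.ndarray:
    # Region-of-interest detection by simple intensity thresholding:
    # keep the bounding box of pixels brighter than the dark border.
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    ys, xs = np.where(gray > 10)
    img = img_bgr[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # Resize the cropped image to the prescribed size (e.g. 512x512).
    img = cv2.resize(img, (size, size)).astype(np.float32)
    # Normalize each channel to zero mean and unit standard deviation.
    img = (img - img.mean(axis=(0, 1))) / (img.std(axis=(0, 1)) + 1e-8)
    return img
```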
- FIG. 4 is a block diagram illustrating an example of an artificial neural network module involved in an example of the present disclosure.
- the recognition system 40 may include an artificial neural network module 420 .
- the artificial neural network module 420 may be used to perform tissue lesion identification on tissue images.
- the artificial neural network module 420 may include multiple artificial neural networks.
- the artificial neural network may be trained using one or more processors 10 .
- an artificial neural network can include artificial neurons or nodes that can be used to receive tissue images and perform operations on the tissue images based on weights, and then selectively pass the results of the operations on to other neurons or nodes.
- weights can be associated with artificial neurons or nodes and at the same time constrain the outputs of other artificial neurons. The weights (i.e., network parameters) can be determined by iteratively training the artificial neural network with a training data set (described later).
- the artificial neural network module 420 may include a backbone neural network 4200 and a second artificial neural network 422 .
- the backbone neural network 4200 may include a first artificial neural network 421 , a third artificial neural network 423 and a feature combining module 424 .
- the first artificial neural network 421 may receive a tissue image and perform feature extraction on the tissue image to obtain a feature map.
- the second artificial neural network 422 may receive the feature map and the recognition results from the third artificial neural network 423 and obtain an attention heat map indicating lesion areas and a complementary attention heat map indicating non-lesion areas. It should be noted that, in other examples, the above-mentioned attention heat map or complementary attention heat map can also be considered a feature map.
- feature combination module 424 may receive feature maps, attention heatmaps, and complementary attention heatmaps and output a feature combination set. In some examples, feature combination module 424 may also output feature maps directly.
- the third artificial neural network 423 may receive the feature map or feature combination set and output the identification result of tissue lesion identification of the tissue image.
- the tissue image (eg, the preprocessed tissue image) input to the artificial neural network module 420 may enter the first artificial neural network 421 , and finally the recognition result is output by the third artificial neural network 423 .
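A minimal PyTorch sketch of this wiring. The class and argument names are hypothetical, and the element-wise product used here for feature combination is just one of the combination options described later.

```python
import torch.nn as nn

class ArtificialNeuralNetworkModule(nn.Module):
    def __init__(self, feature_net, attention_net, classifier):
        super().__init__()
        self.feature_net = feature_net      # first artificial neural network 421
        self.attention_net = attention_net  # second artificial neural network 422
        self.classifier = classifier        # third artificial neural network 423

    def forward(self, tissue_image):
        fmap = self.feature_net(tissue_image)         # feature map
        first = self.classifier(fmap)                 # first recognition result
        attn, comp = self.attention_net(fmap, first)  # heat map + complement
        second = self.classifier(fmap * attn)         # with attention
        third = self.classifier(fmap * comp)          # with complementary attention
        return first, second, third, attn
```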
- FIG. 5 is a block diagram showing a modification of the artificial neural network module according to the example of the present disclosure.
- the artificial neural network module 420 may include a backbone neural network 4200 and a second artificial neural network 422 .
- the backbone neural network 4200 may include a first artificial neural network 421 and a third artificial neural network 423 .
- the third artificial neural network 423 may have a feature combining function.
- the feature combination module 424 may have a feature combining function.
- the first artificial neural network 421 may receive a tissue image and perform feature extraction on the tissue image to obtain a feature map.
- the second artificial neural network 422 may also obtain a complementary attention heat map indicating non-lesioned regions from the attention heat map.
- the third artificial neural network 423 may receive the feature map, the attention heat map, and the complementary attention heat map and output the recognition result of tissue lesion recognition of the tissue image.
- the attention heatmap may be a heatmap obtained based on an attention mechanism indicating a lesion area.
- an attention heatmap can show how important individual pixels in a tissue image are when forming the feature map.
- the complementary attention heatmap may be a heatmap indicative of non-lesioned regions obtained based on a complementary attention mechanism.
- the complementary attention heatmap may be a complementary image of the attention heatmap. In some examples, the complementary attention heatmap may be the same size and format as the attention heatmap.
- the artificial neural network module 420 may include the first artificial neural network 421 (see FIG. 5 ).
- the first artificial neural network 421 may use one or more deep neural networks to automatically identify features in the tissue image.
- the first artificial neural network 421 may be used to receive tissue images preprocessed by a preprocessing module and generate one or more feature maps.
- the first artificial neural network 421 may combine multiple layers of low-level features (pixel-level features), for example. In this case, an abstract description of the tissue image can be achieved.
- the first artificial neural network 421 may include an input layer, an intermediate layer, and an output layer connected in sequence.
- the input layer may be configured to receive tissue images preprocessed by the preprocessing module.
- the intermediate layer is configured to be capable of extracting feature maps based on the tissue image, and the output layer is configured to be capable of outputting feature maps.
- the tissue image input to the artificial neural network module 420 may be converted into a pixel matrix, which may be, for example, a three-dimensional pixel matrix.
- the length and width of the three-dimensional matrix can represent the size of the image, and the depth of the three-dimensional matrix represents the color channel of the image.
- the depth may be 1 (ie, the tissue image is a grayscale image), and in some examples, the depth may be 3 (ie, the tissue image is a color image in RGB color mode).
- the first artificial neural network 421 may employ a convolutional neural network. Since the convolutional neural network has the advantages of local receptive field and weight sharing, it can greatly reduce the training of parameters, so it can improve the processing speed and save the hardware overhead. In addition, convolutional neural networks can more effectively identify tissue images.
- FIG. 6 is a schematic diagram showing the structure of the first artificial neural network 421 involved in the example of the present disclosure.
- the first artificial neural network 421 may contain multiple intermediate layers, the intermediate layers may include multiple neurons or nodes, and an excitation function (such as the ReLU (rectified linear unit) function, the sigmoid function, or the tanh function) may be applied to the output of each neuron or node in the intermediate layer.
- the excitation functions applied by different neurons or nodes may differ from one another.
- the intermediate layers of the first artificial neural network 421 may include multiple convolutional layers and multiple pooling layers.
- convolutional layers and pooling layers can be combined alternately.
- the tissue image may be sequentially passed through a first convolutional layer C1, a first pooling layer S1, a second convolutional layer C2, a second pooling layer S2, a third convolutional layer C3, a third pooling layer S3. In this case, the tissue images can be alternately convoluted and pooled.
- the first artificial neural network 421 may not include a pooling layer, thereby avoiding data loss during the pooling process and simplifying the network structure.
- a convolutional layer may convolve an image of tissue in a convolutional neural network with a convolution kernel. In this case, more abstract features can be obtained to make the matrix depth deeper.
- the convolution kernel size can be 3×3. In other examples, the convolution kernel size can be 5×5. In some examples, a 5×5 kernel can be used in the first convolutional layer C1 and 3×3 kernels in the other convolutional layers. In this case, the training efficiency can be improved. In some examples, the size of the convolution kernel can be set to any size. In this case, the size of the convolution kernel can be chosen according to the size of the image and the computational cost.
- the pooling layer may also be referred to as a downsampling layer.
- input tissue images may be processed using pooling approaches such as max-pooling, mean-pooling, or stochastic-pooling. In this case, through the pooling operation, the feature dimension can be reduced and the operation efficiency improved, while the most salient features are retained.
- the number of layers of convolutional layers and pooling layers may be correspondingly increased according to the situation.
- the convolutional neural network can also be made to extract more abstract high-level features to further improve the accuracy of tissue lesion identification.
- a feature map corresponding to the tissue image may be output.
- feature maps may have multiple depths.
- multiple feature maps can be output.
- multiple feature maps may correspond to one feature respectively.
- tissue lesion identification may be performed on the tissue image based on the features corresponding to the feature maps.
- deconvolution and upsampling may be sequentially performed on the feature map.
- the feature maps can be deconvolved and upsampled multiple times.
- the feature map may sequentially go through the first deconvolution layer, the first upsampling layer, the second deconvolution layer, the second upsampling layer, the third deconvolution layer, and the third upsampling layer.
- the size of the feature map can be changed and the data information of part of the tissue image can be preserved.
- the number of deconvolutional layers may be the same as the number of convolutional layers, and the number of pooling layers (downsampling layers) may be the same as the number of upsampling layers. Thereby, the size of the feature map can be made the same as that of the tissue image.
- before the feature map enters a deconvolution layer (upsampling layer), it may be convolved with the output of a corresponding convolutional layer applied to the tissue image. For example, before the feature map enters the second deconvolution layer (second upsampling layer), it can be convolved with the output image of the second convolutional layer C2 (second pooling layer S2); before the feature map enters the third deconvolution layer (third upsampling layer), it can be convolved with the output image of the first convolutional layer C1 (first pooling layer S1). In this case, the data information lost when passing through the pooling or convolutional layers can be supplemented.
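A minimal sketch of such an encoder-decoder in PyTorch, assuming three conv/pool stages, a 5×5 kernel in C1 and 3×3 kernels elsewhere, and plain convolutions plus bilinear upsampling in place of learned deconvolution; the channel widths are illustrative.

```python
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    def __init__(self, ch=3, w=16):
        super().__init__()
        self.c1 = nn.Sequential(nn.Conv2d(ch, w, 5, padding=2), nn.ReLU())
        self.c2 = nn.Sequential(nn.Conv2d(w, 2 * w, 3, padding=1), nn.ReLU())
        self.c3 = nn.Sequential(nn.Conv2d(2 * w, 4 * w, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.d1 = nn.Sequential(nn.Conv2d(4 * w, 2 * w, 3, padding=1), nn.ReLU())
        self.d2 = nn.Sequential(nn.Conv2d(4 * w, w, 3, padding=1), nn.ReLU())
        self.d3 = nn.Sequential(nn.Conv2d(2 * w, w, 3, padding=1), nn.ReLU())

    def forward(self, x):
        s1 = self.pool(self.c1(x))   # C1 -> S1 output, 1/2 resolution
        s2 = self.pool(self.c2(s1))  # C2 -> S2 output, 1/4 resolution
        s3 = self.pool(self.c3(s2))  # C3 -> S3 output, 1/8 resolution
        y = self.d1(self.up(s3))                     # first decoder stage, 1/4
        y = self.d2(self.up(torch.cat([y, s2], 1)))  # fuse S2 output, 1/2
        y = self.d3(self.up(torch.cat([y, s1], 1)))  # fuse S1 output, full size
        return y                                     # feature map, input-sized
```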
- an attention heat map matching the feature map may be generated by the second artificial neural network 422 .
- the second artificial neural network 422 is an artificial neural network with an attention mechanism.
- the output image of the second artificial neural network 422 may include an attention heatmap and a complementary attention heatmap.
- the second artificial neural network 422 may include an input layer, an intermediate layer, and an output layer connected in sequence.
- the input layer of the second artificial neural network 422 is configured to receive the feature map together with partial weights of, or tissue lesion recognition results from, the third artificial neural network 423; the intermediate layer is configured to compute feature weights based on those partial weights or recognition results and to generate attention heat maps and/or complementary attention heat maps from the feature maps and feature weights; and the output layer is configured to output the attention heat maps and/or complementary attention heat maps.
- the feature map may be generated by the first artificial neural network 421 .
- the attention mechanism can selectively filter out a small amount of important information from the large amount of information in the input feature map and focus on these important information.
- the attention heatmap may be an image that represents attention in a heatmap fashion.
- the pixels at the corresponding positions in red or white in the attention heat map have a greater impact on the identification of tissue lesions in tissue images.
- Pixels at corresponding positions in blue or black in the attention heatmap have less influence on the identification of tissue lesions in tissue images.
- feature weights may be used to weight each feature map to obtain an attention heat map.
- feature weights can be obtained through an attention mechanism.
- the attention mechanism may include, but is not limited to, the channel attention module (CAM), the gradient-based channel attention mechanism (Grad-CAM), its enhanced variant (Grad-CAM++), or the spatial attention module (SAM).
- the feature weights may be the weights of the third artificial neural network 423 along the path from the fully connected layer to the output layer of the third artificial neural network 423.
- the third artificial neural network 423 can receive the feature map of the fundus image and obtain a first recognition result (described later). If the first recognition result is "macula", the weights from each neuron or node in the global pooling layer to the "macula" output in the fully connected layer are extracted as the feature weights.
- feature weights may be calculated based on the tissue lesion identification results in the third artificial neural network 423 .
- the partial derivatives of the first identification result (e.g., the probability of tissue lesions) of the third artificial neural network 423 with respect to all pixels in a feature map may be calculated, and global pooling may be performed on these partial derivatives to obtain the feature weight corresponding to that feature map.
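A sketch of this gradient-based weighting (in the spirit of Grad-CAM); it assumes `fmap` is a `(C, H, W)` activation tensor retained in the autograd graph and `score` is the scalar lesion probability output by the third network.

```python
import torch
import torch.nn.functional as F

def attention_heatmap(fmap: torch.Tensor, score: torch.Tensor) -> torch.Tensor:
    # Partial derivatives of the recognition result with respect to
    # every pixel of every feature map.
    grads = torch.autograd.grad(score, fmap, retain_graph=True)[0]
    # Global (average) pooling of the gradients yields one feature
    # weight per feature map.
    weights = grads.mean(dim=(1, 2))                      # shape (C,)
    # Weight each feature map and sum; rectify and scale to [0, 1].
    cam = F.relu((weights[:, None, None] * fmap).sum(dim=0))
    return cam / (cam.max() + 1e-8)
```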
- an attention heatmap that matches the feature map can be generated by the second artificial neural network 422 .
- a complementary attention heatmap may be generated by the second artificial neural network 422 .
- two pixel values corresponding to co-located pixels in the attention heatmap and the complementary attention heatmap are inversely correlated.
- the attention heatmap and/or the complementary attention heatmap may be normalized.
- the sum or product of two pixel values corresponding to pixels in the same location in the attention heatmap and the complementary attention heatmap is a constant value.
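Under the constant-sum convention the complement is a one-liner; this sketch assumes the attention heat map has been normalized to [0, 1].

```python
def complementary_heatmap(attn):
    # Co-located pixels of the two maps then sum to the constant 1,
    # so high-attention (lesion) pixels become low and vice versa.
    return 1.0 - attn
```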
- the attention heatmap and/or the complementary attention heatmap may be regularized with total variation.
- a feature combination module 424 may be connected at the output layers of the first artificial neural network 421 and the second artificial neural network 422 .
- the feature combination module 424 may have an input layer and an output layer; the output layer of the feature combination module 424 may output a feature map or a feature combination set. In some examples, the input layer of the feature combination module 424 may receive a feature map, an attention heat map, or a complementary attention heat map.
- the feature combination module 424 may perform feature combination of the feature map output by the first artificial neural network 421 and the attention heat map or the complementary attention heat map output by the second artificial neural network 422 to form a feature combination set.
- the feature combination set may include at least one of a first feature combination set and a second feature combination set.
- the feature combining module 424 may perform feature combining of the feature map output by the first artificial neural network 421 and the attention heat map output by the second artificial neural network 422 to form a first feature combination set.
- the feature combination module 424 may perform feature combination of the feature map output by the first artificial neural network 421 and the complementary attention heat map output by the second artificial neural network 422 to form a second feature combination set.
- the feature combination module 424 may output the feature map directly.
- the feature combination module 424 may also calculate the difference between the feature map and the attention heat map to obtain the first feature combination set.
- the feature combination module 424 may also compute the difference between the feature map and the complementary attention heatmap to obtain a second feature combination set.
- the feature combination module 424 may also compute the convolution of the feature map and the attention heatmap to obtain the first feature combination set.
- the feature combination module 424 may also compute the convolution of the feature map with the complementary attention heatmap to obtain a second feature combination set.
- the feature combination module 424 may also calculate the mean of the feature map and the attention heat map to obtain the first feature combination set.
- the feature combination module 424 may also calculate the mean of the feature map and the complementary attention heat map to obtain a second feature combination set.
- the feature combination module 424 may perform linear or non-linear transformation on the feature map and the attention heat map to obtain the first feature combination set.
- the feature combination module 424 may perform linear or non-linear transformation on the feature map and the complementary attention heat map to obtain the second feature combination set.
- the output layer of the feature combination module 424 may output a feature map, a first set of feature combinations, and a second set of feature combinations.
- the feature map, the first feature combination set, and the second feature combination set output by the feature combination module 424 may be input to the third artificial neural network 423, and the third artificial neural network 423 performs tissue lesion identification.
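A sketch of the feature combination options listed above; `attn` may be either the attention heat map or the complementary attention heat map (yielding the first or second feature combination set, respectively), and broadcasting the heat map over the feature map's channels is an implementation assumption.

```python
import torch

def combine(fmap: torch.Tensor, attn: torch.Tensor, mode: str = "product"):
    if mode == "product":     # element-wise weighting of the feature map
        return fmap * attn
    if mode == "difference":  # difference between feature map and heat map
        return fmap - attn
    if mode == "mean":        # mean of feature map and heat map
        return (fmap + attn) / 2
    raise ValueError(f"unknown combination mode: {mode}")
```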
- feature combination module 424 may be incorporated into and part of third artificial neural network 423 .
- the artificial neural network module 420 may include a first artificial neural network 421 , a second artificial neural network 422 and a third artificial neural network 423 .
- the input layer of the third artificial neural network 423 may receive a feature map, an attention heat map, or a complementary attention heat map.
- the third artificial neural network 423 may include an input layer, an intermediate layer, and an output layer connected in sequence.
- the output layer is configured to be operable to output recognition results reflecting tissue images.
- the recognition result reflecting the tissue image can be output using the third artificial neural network 423 .
- the output layer of the third artificial neural network 423 may include a Softmax layer.
- the middle layers of the third artificial neural network 423 may be fully connected layers.
- the final classification may be performed by the fully connected layer, and the probability that the tissue image belongs to the category of each tissue lesion is finally obtained through the Softmax layer.
- the identification result of the tissue lesion identification of the tissue image can be obtained based on the probability.
- the third artificial neural network 423 may include various linear classifiers, such as a single layer of fully connected layers.
- the third artificial neural network 423 may include various non-linear classifiers. For example, Logistic Regression, Random Forest or Support Vector Machines, etc.
- the third artificial neural network 423 may include multiple classifiers.
- the classifier may give identification results for tissue lesion identification of the tissue image. For example, in the case where the tissue image is a fundus image, the identification result of the fundus lesion identification of the fundus image can be given. In this case, fundus lesion recognition can be performed on the fundus image.
- the outputs of the third artificial neural network 423 may be values between 0 and 1, which may be used to represent the probabilities that the tissue image belongs to the respective categories of tissue lesions.
- the category with the highest probability is used as the identification result of tissue lesion identification of the tissue image. For example, among the probabilities that the tissue image belongs to each category of tissue lesions, if the no-lesion category has the highest probability, the identification result of the tissue lesion recognition of the tissue image may be no lesion. For another example, in identifying fundus lesions in a fundus image, if the predicted probabilities of macular lesions and no lesions output by the third artificial neural network 423 are 0.8 and 0.2 respectively, it can be considered that the fundus image shows macular degeneration.
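A sketch of such a head: global pooling feeding a fully connected layer and a Softmax layer. The channel count and category names are illustrative only.

```python
import torch
import torch.nn as nn

classes = ["no lesion", "hypertensive retinopathy", "diabetic retinopathy"]
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),           # global pooling over each feature map
    nn.Flatten(),
    nn.Linear(16, len(classes)),       # fully connected layer
    nn.Softmax(dim=1),                 # probabilities per lesion category
)

fmap = torch.randn(1, 16, 64, 64)      # dummy feature combination set
probs = head(fmap)                     # values in [0, 1] that sum to 1
result = classes[int(probs.argmax())]  # category with the highest probability
```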
- the third artificial neural network 423 may output recognition results that match the tissue image.
- the recognition results may include a first recognition result obtained when the attention mechanism is not used, a second recognition result obtained when the attention mechanism is used, and a third recognition result obtained when the complementary attention mechanism is used.
- the third artificial neural network 423 may perform tissue lesion identification on the feature map output by the feature combination module 424 and obtain a first identification result.
- the third artificial neural network 423 may perform tissue lesion recognition on the first feature combination set output by the feature combination module 424 and obtain a second recognition result.
- the third artificial neural network 423 may perform tissue lesion identification on the second feature combination set output by the feature combination module 424 and obtain a third identification result.
- the identification results may include with lesions and without lesions. In some examples, the identification results may also include no lesions or a specific lesion type. For example, when the tissue image is a fundus image, the identification result may include, but is not limited to, one of no lesions, hypertensive retinopathy, or diabetic retinopathy. In this case, the recognition result of the fundus lesion recognition of the fundus image can be obtained. In some examples, one tissue image may have multiple recognition results. For example, the identification results may be the two results of hypertensive retinopathy and diabetic retinopathy.
- identification system 40 may also include a determination module.
- the decision module may receive the output of the artificial neural network module 420 .
- the output result of the artificial neural network module 420 can be integrated by the judgment module and the final identification result can be output, so that a summary report can be generated.
- the first recognition result may be used as the final recognition result of the tissue image.
- in this case, when using the artificial neural network module 420 to perform tissue lesion recognition on the tissue image, the tissue image can be recognized by the backbone neural network 4200 consisting of the first artificial neural network 421 and the third artificial neural network 423, thereby speeding up recognition.
- the second recognition result may be used as the final recognition result of the tissue image.
- the third recognition result can be obtained based on the complementary attention mechanism.
- a final identification result of the tissue image may be obtained based on the first identification result, the second identification result, and the third identification result.
- the final recognition result may include the second recognition result and the third recognition result.
- the final recognition result may include the first recognition result and the third recognition result.
- the summary report generated by the judgment module may include at least one of the first identification result, the second identification result, the third identification result, and the final identification result.
- the determination module may color-code the tissue image based on the attention heatmap to generate a lesion indicator map to indicate a lesion area.
- the summary report generated by the decision module may include a lesion indicator map.
- the summary report generated by the determination module may include the location of the corresponding lesion and mark the location with a marker box.
- the summary report generated by the determination module may display the lesion area of the tissue image as a heat map.
- areas with high likelihood of lesions can be colored red or white, and areas with low likelihood of lesions can be colored blue or black.
- the lesion area can be indicated in an intuitive manner.
- the judgment module can also be used to frame the lesion area.
- the lesion area can be framed by a fixed shape (eg, a regular shape such as a triangle, a circle, a quadrangle, etc.).
- the lesion area may also be delineated. In this case, the lesion area can be visually displayed.
- the judgment module can also be used to delineate the lesion area. For example, the values corresponding to each pixel in the attention heat map can be analyzed, and the pixels whose values are greater than a first preset value are classified as the lesion area, while the pixels whose values are less than the first preset value are classified as the non-lesion area.
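A sketch of this delineation, with the heat map assumed normalized to [0, 1] and a hypothetical preset value of 0.5.

```python
import numpy as np

def delineate_lesion(attn: np.ndarray, preset: float = 0.5) -> np.ndarray:
    # Pixels above the first preset value belong to the lesion area;
    # the remaining pixels belong to the non-lesion area.
    return attn > preset  # boolean lesion mask, same shape as the heat map
```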
- the identification method of the tissue lesion identification based on the artificial neural network involved in the present disclosure is implemented by the identification system 40 .
- the identification method includes: acquiring a tissue image and using the artificial neural network module 420 to acquire an identification result of tissue lesion identification.
- the tissue image may be a tissue image acquired by an acquisition device.
- artificial neural network module 420 is trained by training system 430 . In this case, the artificial neural network module 420 can be used to obtain the recognition result of tissue lesion recognition, and the total loss function can be used to optimize the artificial neural network module 420, so that the accuracy of tissue lesion recognition can be improved.
- hereinafter, the training method for tissue lesion identification based on the artificial neural network according to the present embodiment (sometimes simply referred to as the training method) and the corresponding training system will be described in detail with reference to the accompanying drawings.
- the training method may be implemented using an artificial neural network-based training system 430 for tissue lesion identification.
- the artificial neural network module 420 can be trained using the training system 430 .
- FIG. 7 is a structural block diagram illustrating a training system 430 for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
- the training system 430 may include a storage module 431 , a processing module 432 , and an optimization module 433 .
- storage module 431 may be configured to store training data sets.
- the processing module 432 may utilize the artificial neural network module 420 to perform operations such as feature extraction, generating an attention heat map and a complementary attention heat map, and identifying tissue lesions.
- the optimization module 433 may obtain a total loss function to optimize the artificial neural network module 420 based on the identification results of the tissue lesion identification (including the first identification result, the second identification result, and the third identification result).
- the recognition result of tissue lesion recognition can be obtained by using the attention mechanism and the complementary attention mechanism, and a total loss function can be obtained based on the recognition result of tissue lesion recognition, so that the artificial neural network module 420 can be optimized by using the total loss function, Further, the accuracy of tissue lesion identification by the artificial neural network module 420 is improved.
- the artificial neural network module 420 may be trained in a weakly supervised manner. In this case, the artificial neural network module 420 can use the labeling result with less information to obtain the recognition result with more information. In some examples, where the annotation result is a text annotation, the location and size of the lesion area may be included in the recognition result. In some examples, the training method of the artificial neural network module 420 may also be unsupervised, semi-supervised, reinforcement learning or the like.
- the artificial neural network module 420 may be trained using the first loss function, the second loss function, and the third loss function. It should be noted that since the training model and loss functions involved are generally complex, the model generally does not have an analytical solution. In some examples, optimization algorithms such as batch gradient descent (BGD) or stochastic gradient descent (SGD) can be used to iterate the model parameters a finite number of times to reduce the value of the loss function as much as possible, that is, to find a numerical solution of the model. In some examples, the artificial neural network module 420 can be trained using the back-propagation algorithm; in this case, network parameters with the smallest error can be obtained, thereby improving recognition accuracy.
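A minimal sketch of one training epoch with SGD and back-propagation, reusing the hypothetical `ArtificialNeuralNetworkModule` and `total_loss` sketched earlier; the data loader is assumed to yield inspection images together with both labelings.

```python
import torch

def train_one_epoch(model, loader, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for images, labels, no_lesion_labels in loader:
        first, second, third, attn = model(images)
        loss = total_loss(first, second, third,
                          labels, no_lesion_labels, attn)
        optimizer.zero_grad()
        loss.backward()    # back-propagation of the total loss
        optimizer.step()   # gradient-descent update of the weights
```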
- FIG. 8 is a flow chart illustrating a training method for tissue lesion recognition based on an artificial neural network according to an example of the present disclosure.
- the training method may include: preparing a training data set (step S100); inputting the training data set into the artificial neural network module 420 and obtaining the first recognition result, the second recognition result, and the third recognition result matching each inspection image (step S200); calculating the total loss function based on the first recognition result, the second recognition result, and the third recognition result (step S300); and optimizing the artificial neural network module 420 by using the total loss function (step S400).
- the first recognition result, the second recognition result, and the third recognition result can be obtained, and the total loss function can be obtained based on the first recognition result, the second recognition result, and the third recognition result, so that the total loss function can be utilized
- the artificial neural network module 420 is optimized, thereby improving the accuracy of tissue lesion identification by the artificial neural network module 420 .
- a training data set may be prepared.
- the training dataset may include a plurality of examination images and annotated results with or without lesions associated with the examination images.
- the training dataset may include multiple inspection images and annotated images associated with the inspection images.
- the examination images may be 50-200,000 tissue images from a partner hospital, with patient information removed.
- the examination image may be a tissue image from a CT scan, PET-CT scan, SPECT scan, MRI, ultrasound, X-ray, mammogram, angiogram, fluorogram, capsule endoscopy, or a combination thereof.
- the examination image may be a fundus image.
- the examination image may consist of a lesion area and a non-lesion area.
- the inspection images may be used for training of the artificial neural network module 420 .
- the inspection image may be acquired by acquisition module 410 .
- the annotated image may include annotated results with lesions or annotated results without lesions.
- the annotation results can be used as the ground truth against which the value of the loss function is measured.
- annotation results may be image annotations or text annotations.
- image annotations may be manually annotated boxes for framing the lesion area.
- the annotation box may be a fixed shape, such as a regular shape like a triangle, circle, or quadrilateral. In some examples, the annotation box may also be an irregular shape traced along the outline of the lesion area.
- the text annotation may be a judgment on whether the examination image contains lesions, for example "with lesions" or "without lesions".
- the text annotation can also be the type of lesion.
- for example, where the examination image is a fundus image, the text annotation may be "macular degeneration", "hypertensive retinopathy", "diabetic retinopathy", or the like.
- training data sets may be stored in storage module 431 .
- storage module 431 may be configured to store training data sets.
- the training dataset may include 30%-60% examination images annotated as lesion-free. In some examples, the training dataset may include 10%, 20%, 30%, 40%, 50%, or 60% examination images annotated as lesion-free.
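- as an illustrative sketch of assembling such a training set (the file names, pool sizes, and the 40% share are assumptions chosen for the example):

```python
import random

# Assumed input pools of (image_path, text_annotation) pairs prepared elsewhere.
with_lesion = [(f"lesion_{i}.png", "with lesions") for i in range(70_000)]
lesion_free = [(f"normal_{i}.png", "without lesions") for i in range(60_000)]

n_total = 100_000
frac_lesion_free = 0.4        # assumed target within the 30%-60% range
n_free = int(n_total * frac_lesion_free)

train_set = (random.sample(lesion_free, n_free)
             + random.sample(with_lesion, n_total - n_free))
random.shuffle(train_set)     # 40% lesion-free, 60% with lesions
```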
- storage module 431 may include memory 20 .
- the storage module 431 may be configured to store inspection images and annotated images associated with the inspection images.
- artificial neural network module 420 may receive training data sets stored by storage module 431 .
- the training dataset can be preprocessed.
- in step S200, the training data set can be input into the artificial neural network module 420, and the first recognition result, the second recognition result, and the third recognition result matching each inspection image can be obtained.
- the training dataset may be input to the artificial neural network module 420 to obtain feature maps, attention heat maps, and complementary attention heat maps.
- feature extraction may be performed on the inspection image to obtain a feature map.
- the feature map can be processed based on an attention mechanism to obtain an attention heatmap.
- attention heatmaps may be processed based on complementary attention mechanisms to obtain complementary attention heatmaps.
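- as a sketch of one way to realize the complementary attention computation (the 1 − M form is an assumption for illustration; the disclosure only requires that the two maps indicate the lesion and non-lesion areas complementarily):

```python
import numpy as np

def complementary_heatmap(m: np.ndarray) -> np.ndarray:
    """Return the complement of an attention heat map: after normalizing the
    map to [0, 1], corresponding pixel values of the two maps sum to 1, so
    high-attention (lesion) pixels become low values and vice versa."""
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)
    return 1.0 - m

attention = np.random.rand(256, 256)           # stand-in attention heat map M(X)
complement = complementary_heatmap(attention)  # indicates the non-lesion area
```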
- step S200 may be implemented using processing module 432 .
- processing module 432 may include at least one processor 10 .
- the artificial neural network module 420 may include a first artificial neural network 421 , a second artificial neural network 422 , and a third artificial neural network 423 , as described above.
- the processing module 432 may be configured to perform feature extraction on the inspection image using the first artificial neural network 421 to obtain a feature map. In some examples, the processing module 432 may be configured to utilize the second artificial neural network 422 to obtain an attention heatmap indicative of a lesion area and a complementary attention heatmap indicative of a non-lesion area.
- processing module 432 may be configured to utilize third artificial neural network 423 to obtain identification results including tissue lesion identification.
- the third artificial neural network 423 may include an output layer.
- the output layer may be configured to output a recognition result reflecting the inspection image.
- the third artificial neural network 423 can output a recognition result reflecting the inspection image.
- the processing module 432 may utilize the third artificial neural network 423 to identify the inspection image based on the feature map to obtain the first identification result.
- the processing module 432 may utilize the third artificial neural network 423 to identify the inspection image based on the feature map and the attention heat map to obtain the second identification result.
- the processing module 432 may utilize the third artificial neural network 423 to identify the inspection image based on the feature map and the complementary attention heatmap to obtain a third identification result.
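- the three identification passes may be sketched as follows (a toy stand-in for the third artificial neural network; the linear classifier and all weights are assumptions): the classifier is applied to the feature map alone, to the feature map weighted by the attention heat map, and to the feature map weighted by the complementary heat map:

```python
import numpy as np

def classify(features: np.ndarray) -> np.ndarray:
    """Toy classifier C(.): flatten, apply a fixed linear layer, softmax over
    two classes ("with lesions", "without lesions")."""
    rng = np.random.default_rng(42)
    w = rng.normal(scale=0.01, size=(features.size, 2))  # assumed toy weights
    logits = features.reshape(-1) @ w
    e = np.exp(logits - logits.max())
    return e / e.sum()

F = np.random.rand(8, 8, 4)        # feature map F(X) from the first network
M = np.random.rand(8, 8, 1)        # attention heat map M(X)
M_bar = 1.0 - M                    # complementary heat map (see sketch above)

r1 = classify(F)                   # first recognition result: feature map only
r2 = classify(F * M)               # second: feature map . attention heat map
r3 = classify(F * M_bar)           # third: feature map . complementary map
```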
- the tissue lesions may be fundus lesions.
- the artificial neural network module 420 can be used for fundus lesion identification of the fundus image.
- in step S300, a total loss function may be calculated based on the first recognition result, the second recognition result, and the third recognition result.
- step S300 may be implemented using optimization module 433 .
- optimization module 433 may obtain a total loss function for artificial neural network module 420 based on the first loss function, the second loss function, and the third loss function. In this case, the artificial neural network module 420 can be optimized using the total loss function.
- the optimization module 433 may combine the first recognition result with the annotated image to obtain the first loss function when the attention mechanism is not used.
- the first loss function may be used to evaluate the degree of inconsistency between the recognition results and the annotation results of the inspection image when the attention mechanism is not used. In this case, the accuracy of tissue lesion recognition by the artificial neural network module 420 when the attention mechanism is not used can be improved.
- the optimization module 433 may combine the second recognition result with the annotated image to obtain a second loss function when using the attention mechanism.
- the second loss function may be used to evaluate the degree of inconsistency between the recognition results and the annotation results of the inspection image when using the attention mechanism. In this case, the accuracy of tissue lesion identification when the artificial neural network module 420 uses the attention mechanism can be improved.
- the optimization module 433 may combine the third recognition result with the annotated image with the lesion-free annotation result to obtain a third loss function when using the complementary attention mechanism.
- the third loss function may be used to assess the degree of inconsistency between the recognition results of the inspection image when using the complementary attention mechanism and the lesion-free recognition. In this case, the accuracy of tissue lesion recognition by the artificial neural network module 420 when using the complementary attention mechanism can be improved.
- the first loss function, the second loss function, and the third loss function may be obtained by an error loss function.
- the error loss function may be a correlation function, an L1 loss function, an L2 loss function, or a Huber loss function, i.e., a function used to evaluate the correlation between the true value (that is, the labeling result) and the predicted value (that is, the recognition result).
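- the listed error loss functions can be written out as follows (a sketch; the disclosure does not fix which one is chosen, and delta is an assumed parameter):

```python
import numpy as np

def l1_loss(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def l2_loss(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small residuals, linear for large ones."""
    r = np.abs(y_true - y_pred)
    return float(np.mean(np.where(r <= delta,
                                  0.5 * r ** 2,
                                  delta * (r - 0.5 * delta))))
```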
- the overall loss function may include a first loss term, a second loss term, and a third loss term.
- the first loss term may be positively correlated with the first loss function.
- the first loss term can be used to evaluate the degree of inconsistency between the recognition result of the inspection image when the attention mechanism is not used and the labeling result, so that the accuracy of tissue lesion recognition can be improved.
- the second loss term may be positively related to the difference between the second loss function and the first loss function.
- when the second loss function is smaller than the first loss function, the second loss term may be a constant value. In this case, the second loss term can be used to evaluate the degree of inconsistency between the recognition results of the inspection image obtained with and without the attention mechanism.
- the second loss term may be positively related to the difference between the second loss function and the first loss function. Specifically, when the difference between the second loss function and the first loss function is greater than zero, that difference can be used as the second loss term; when the difference is less than zero, the second loss term can be set to zero. In this case, the second loss term can be used to evaluate the degree of inconsistency between the first recognition result and the second recognition result, so that the second recognition result can be brought closer to the labeling result than the first recognition result.
- the third loss term may be positively correlated with the third loss function.
- the third loss term can be used to evaluate the degree of inconsistency between the third recognition result of the inspection image when the complementary attention mechanism is used and the labeling result without lesions, so that the occurrence of misjudgment or missed judgment can be reduced.
- the overall loss function may also include a fourth loss term.
- the fourth loss term may be a regularization term.
- the fourth loss term may be a regularization term for the attention heatmap.
- the regularization term may be obtained based on total variation. In this case, overfitting of the artificial neural network module 420 can be suppressed.
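- a total-variation regularization term for the attention heat map can be sketched as follows (the anisotropic form used here is one common choice and is an assumption, not the disclosure's exact definition):

```python
import numpy as np

def tv_regularizer(m: np.ndarray) -> float:
    """Anisotropic total variation of a 2-D attention heat map: the summed
    absolute differences between neighboring pixels. Penalizing it favors
    smooth, compact attention maps and helps suppress overfitting."""
    return float(np.abs(np.diff(m, axis=0)).sum()
                 + np.abs(np.diff(m, axis=1)).sum())
```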
- the overall loss function may include loss term weight coefficients that match the individual loss terms.
- the total loss function may further include a first loss term weight coefficient matching the first loss term, a second loss term weight coefficient matching the second loss term, a third loss term weight coefficient matching the third loss term, a fourth loss term weight coefficient matching the fourth loss term, and the like.
- the first loss term may be multiplied by the first loss term weight coefficient, the second loss term by the second loss term weight coefficient, the third loss term by the third loss term weight coefficient, the fourth loss term by the fourth loss term weight coefficient, and the fifth loss term by the fifth loss term weight coefficient. Thereby, the influence of each loss term on the total loss function can be adjusted through its weight coefficient.
- the loss term weight coefficient may be set to 0. In some examples, the loss term weight coefficient can be set to a positive number. In this case, since each loss term is non-negative, the value of the total loss function can be made not less than zero.
- the functional formulation of the total loss function can be:

$$
L = \lambda_1 f\bigl(C(F(X)),\, l(X)\bigr)
  + \lambda_2 \max\Bigl(f\bigl(C(F(X)\cdot M(X)),\, l(X)\bigr) - f\bigl(C(F(X)),\, l(X)\bigr) + \mathrm{margin},\, 0\Bigr)
  + \lambda_3 f\bigl(C(F(X)\cdot \overline{M}(X)),\, l_0\bigr)
  + \lambda_4\, \mathrm{Regularize}(M)
$$

- where L is the total loss function; λ1, λ2, λ3, and λ4 are the weight coefficients of the first, second, third, and fourth loss terms, respectively; f is the error loss function; X is the inspection image; F(X) is the feature map generated by the inspection image X after passing through the first artificial neural network 421; l(X) is the labeling result of the inspection image X; max is the maximum value function; C is the classifier function that outputs a recognition result based on the input feature map or feature combination set; margin is a preset parameter; l_0 is the lesion-free labeling result; M(X) is the attention heat map matching the inspection image X; M̄(X) is the complementary attention heat map matching the inspection image X; the "·" in the formula is the element-wise (dot) product of matrices; and Regularize(M) is the regularization term for the attention heat map M.
- the classifier function may be implemented by the third artificial neural network 423.
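- putting the terms together, a sketch of the total loss following the formula above could read (the L2 choice for f, the weight values, and the margin are assumptions):

```python
import numpy as np

def total_loss(r1, r2, r3, label, label_no_lesion, M,
               lam=(1.0, 1.0, 1.0, 0.1), margin=0.0):
    """L = l1*f1 + l2*max(f2 - f1 + margin, 0) + l3*f3 + l4*Regularize(M)."""
    f = lambda a, b: float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))
    f1 = f(r1, label)                # first loss function: no attention
    f2 = f(r2, label)                # second loss function: with attention
    f3 = f(r3, label_no_lesion)      # third loss function: complementary attention
    reg = float(np.abs(np.diff(M, axis=0)).sum()     # total-variation
                + np.abs(np.diff(M, axis=1)).sum())  # regularization term
    l1, l2, l3, l4 = lam
    return l1 * f1 + l2 * max(f2 - f1 + margin, 0.0) + l3 * f3 + l4 * reg
```

- an optional fifth term λ5·SUM(M(X)), the total area term of the attention heat map described below, can be added to the returned sum in the same way.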
- the optimization module 433 may use the first loss function, the second loss function, and the third loss function to obtain a total loss function that includes a first loss term based on the first loss function, a second loss term based on the difference between the second loss function and the first loss function, and a third loss term based on the third loss function, and may optimize the artificial neural network module 420 with the total loss function.
- the overall loss function may also include a fifth loss term.
- the fifth loss term may be the total area term of the attention heatmap.
- the total area item of the attention heat map may be the area of the area determined to be the lesion area in the attention heat map.
- the total area term of the attention heatmap M(X) may be represented by the formula SUM(M(X)).
- the artificial neural network module 420 may be trained with the fifth loss term to make the lesion area within the attention heatmap smaller.
- the fifth loss term can be used to evaluate the area of the lesion region in the attention heatmap and to control the number of pixels in the attention heatmap that strongly influence the recognition result, so that the network's attention is confined to the pixels that affect the recognition result the most. Thereby, the accuracy of lesion area identification can be increased.
- the overall loss function may also include a sixth loss term.
- the sixth loss term may be used to evaluate the degree of inconsistency between the framed area for the lesion area in the recognition result and the annotated frame of the manually annotated lesion area in the annotated image.
- in step S400, the artificial neural network module 420 may be optimized using the total loss function.
- step S400 may be implemented using optimization module 433 .
- optimization module 433 may optimize artificial neural network module 420 with the overall loss function to minimize the overall loss function. In this case, the total loss function can be minimized to improve the accuracy of tissue lesion identification by the artificial neural network module 420 .
- the optimization module 433 may obtain a total loss function based on the first loss term, the second loss term, the third loss term, and the total area term of the attention heatmap, and optimize the artificial neural network module 420 with this total loss function to obtain an artificial neural network module 420 usable for tissue lesion identification. Thereby, the accuracy of tissue lesion identification by the artificial neural network module 420 can be further improved.
- the optimization module 433 may adjust the overall loss function by changing the weights of the first loss term, the second loss term, the third loss term, and the fourth loss term.
- the optimization module 433 may optimize the artificial neural network module 420 using the first loss term and the sixth loss term as the total loss function (i.e., setting the loss term weight coefficients of the other loss terms to zero). Thereby, the accuracy of the attention heat map and the complementary attention heat map generated by the second artificial neural network 422 can be improved.
- during optimization, the loss term weight coefficients in the total loss function may be modified.
- optimization module 433 may perform multiple iterations of parameters in the overall loss function with an optimization algorithm to reduce the value of the overall loss function.
- for example, a mini-batch stochastic gradient descent algorithm can be used: a group of inputs is randomly selected, and the parameters are then iterated multiple times to reduce the value of the loss function.
- training may be paused when the total loss function is smaller than a second preset value or when the number of iterations exceeds a third preset value.
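- a schematic optimization loop combining these pieces might look as follows (a PyTorch-flavored sketch; the model, data loader, thresholds, and the total_loss_fn callable are assumptions standing in for the total loss described above):

```python
import torch

def train(model, loader, total_loss_fn,
          lr=0.01, loss_threshold=1e-3, max_iters=100_000):
    """Mini-batch SGD that pauses training once the total loss falls below a
    preset value or the iteration count exceeds another preset value."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    iters = 0
    while True:
        for images, labels in loader:          # randomly sampled mini-batches
            opt.zero_grad()
            loss = total_loss_fn(model, images, labels)
            loss.backward()                    # back-propagation
            opt.step()                         # iterate the parameters
            iters += 1
            if loss.item() < loss_threshold or iters > max_iters:
                return model                   # stopping criteria reached
```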
- the optimization module 433 may pre-train the artificial neural network module 420 without the attention mechanism before training the artificial neural network module 420 with the attention mechanism. In this case, the training speed can be accelerated.
- the optimization module 433 may simultaneously train the first artificial neural network 421 , the second artificial neural network 422 and the third artificial neural network 423 . In this case, the training speed can be accelerated.
- after training is completed, the optimization module 433 may employ, for example, 0-20,000 tissue images (e.g., fundus images) as test tissue images to form a test set.
- test tissue images may be used for post-training testing of artificial neural network module 420 .
- FIG. 9( a ) is a schematic diagram illustrating an example of a lesion area of a fundus image obtained by training without using an attention mechanism according to an example of the present disclosure.
- FIG. 9( b ) is a schematic diagram illustrating an example of a lesion area of a fundus image obtained by training using a complementary attention mechanism according to an example of the present disclosure.
- in some examples, the tissue lesion recognition of fundus images obtained by training with the complementary attention mechanism is more accurate. As an example without the attention mechanism, Fig. 9(a) shows the lesion area A of a fundus image obtained by training without using the attention mechanism; as an example with the complementary attention mechanism, Fig. 9(b) shows the lesion area B of a fundus image obtained by training using the complementary attention mechanism.
- although the present disclosure has been described in detail above with reference to the drawings and examples, it should be understood that the above description does not limit the present disclosure in any way. Those skilled in the art may make variations and modifications to the present disclosure as needed without departing from its essential spirit and scope, and all such variations and modifications fall within the scope of the present disclosure.
Abstract
The present disclosure describes a recognition method and a recognition system for tissue lesion recognition based on an artificial neural network, comprising: acquiring a tissue image, the tissue image being captured by an acquisition device; and receiving the tissue image with an artificial neural network module and performing lesion recognition on the tissue image. The artificial neural network module comprises a first artificial neural network configured to perform feature extraction on the tissue image to obtain a feature map, a second artificial neural network configured to obtain an attention heat map indicating a lesion area, and a third artificial neural network configured to recognize the tissue image based on the feature map, a total loss function being derived from the recognition results to optimize the artificial neural network module, whereby the recognition rate of tissue lesions can be effectively improved.
Claims (12)
- A recognition method for tissue lesion recognition based on an artificial neural network, characterized by comprising: acquiring a tissue image, the tissue image being a tissue image captured by an acquisition device; and receiving the tissue image with an artificial neural network module and performing lesion recognition on the tissue image, the artificial neural network module comprising a first artificial neural network, a second artificial neural network, and a third artificial neural network, the first artificial neural network being configured to perform feature extraction on the tissue image to obtain a feature map, the second artificial neural network being configured to obtain an attention heat map indicating a lesion area, and the third artificial neural network being configured to recognize the tissue image based on the feature map; wherein the training step of the artificial neural network module comprises: preparing a training data set, the training data set comprising a plurality of examination images and annotated images associated with the examination images, the annotated images comprising annotation results with lesions or annotation results without lesions; performing feature extraction on the examination image with the first artificial neural network to obtain a feature map; obtaining, with the second artificial neural network, an attention heat map indicating a lesion area and a complementary attention heat map indicating a non-lesion area, the examination image being composed of the lesion area and the non-lesion area; recognizing the examination image with the third artificial neural network based on the feature map to obtain a first recognition result; recognizing the examination image with the third artificial neural network based on the feature map and the attention heat map to obtain a second recognition result; recognizing the examination image with the third artificial neural network based on the feature map and the complementary attention heat map to obtain a third recognition result; combining the first recognition result with the annotated image to obtain a first loss function for when the attention mechanism is not used; combining the second recognition result with the annotated image to obtain a second loss function for when the attention mechanism is used; combining the third recognition result with the annotated image having the lesion-free annotation result to obtain a third loss function for when the complementary attention mechanism is used; and using the first loss function, the second loss function, and the third loss function to obtain a total loss function comprising a first loss term based on the first loss function, a second loss term based on the difference between the second loss function and the first loss function, and a third loss term based on the third loss function, and optimizing the artificial neural network module with the total loss function.
- The recognition method according to claim 1, characterized in that the total loss function further comprises a total area term of the attention heat map, the total area term being used to evaluate the area of the lesion area.
- The recognition method according to claim 1, characterized in that the total loss function further comprises a regularization term for the attention heat map.
- The recognition method according to claim 1, characterized in that the first artificial neural network, the second artificial neural network, and the third artificial neural network are trained simultaneously.
- The recognition method according to claim 1, characterized in that the third artificial neural network comprises an input layer, an intermediate layer, and an output layer connected in sequence, the output layer being configured to output a recognition result reflecting the examination image.
- The recognition method according to claim 1, characterized in that the artificial neural network module is trained in a weakly supervised manner.
- The recognition method according to claim 1, characterized in that the first loss function is used to evaluate the degree of inconsistency between the recognition result of the examination image when the attention mechanism is not used and the annotation result.
- The recognition method according to claim 1, characterized in that the second loss function is used to evaluate the degree of inconsistency between the recognition result of the examination image when the attention mechanism is used and the annotation result.
- The recognition method according to claim 1, characterized in that the third loss function is used to evaluate the degree of inconsistency between the recognition result of the examination image when the complementary attention mechanism is used and the lesion-free annotation result.
- The recognition method according to claim 1, characterized in that the artificial neural network module is optimized with the total loss function so as to minimize the total loss function.
- The recognition method according to claim 1, characterized in that the tissue lesion is a fundus lesion.
- A recognition system for tissue lesion recognition based on an artificial neural network, characterized in that tissue lesion recognition is performed using the recognition method according to any one of claims 1 to 11.