WO2020151307A1 - Automatic lesion identification method, device, and computer-readable storage medium - Google Patents

Automatic lesion identification method, device, and computer-readable storage medium

Info

Publication number: WO2020151307A1 (PCT/CN2019/116558)
Authority: WO, WIPO (PCT)
Prior art keywords: lesion, fundus image, image, fundus, image data
Application number: PCT/CN2019/116558
Other languages: English (en), French (fr)
Inventors: 刘莉红 (Liu Lihong), 马进 (Ma Jin), 王健宗 (Wang Jianzong)
Original Assignee: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2019-01-23
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2020151307A1 (zh)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device and computer-readable storage medium for automatic lesion identification.
  • Diabetic retinopathy is a major cause of blindness. However, if diabetic patients detect it in time and receive standardized treatment, most can avoid blindness. Almost all eye diseases can occur in diabetic patients, such as fundus hemangioma, fundus hemorrhage, dacryocystitis, glaucoma, cataract, vitreous opacity, optic nerve atrophy, macular degeneration, and retinal detachment, and the risk of these eye diseases in diabetic patients is significantly higher than in non-diabetic people.
  • The key issue in the diagnosis and treatment of diabetic retinopathy is how to detect related symptoms earlier, more safely, and more accurately, so that corresponding measures can be taken to prevent visual impairment.
  • At present, the main diagnostic method still relies on manual reading: whether lesions are present is identified by hand from color fundus photographs.
  • As the number of patients grows, traditional manual diagnosis has exposed problems such as high cost, low efficiency, and large accidental factors, so automated auxiliary diagnosis methods are urgently needed.
  • Deep learning has been widely and maturely applied in medical imaging.
  • Medical image analysis is widely used in clinical auxiliary screening, diagnosis, grading, treatment decision-making and guidance, and efficacy evaluation for major diseases such as benign and malignant tumors, brain function and mental disorders, and cardiovascular and cerebrovascular diseases.
  • Medical image classification and recognition, localization and detection, and segmentation of tissues, organs, and lesions are the main application areas of current deep learning methods for medical image analysis; at the same time, medical images produced by different imaging principles differ considerably from the natural images studied in computer vision.
  • In view of this, the present application provides an automatic lesion identification method, device, and computer-readable storage medium, whose main purpose is to provide a solution for the automatic identification of fundus lesions.
  • To achieve this, the automatic lesion identification method of this application includes: collecting fundus image data and performing preprocessing operations on the fundus image data; training a lesion recognition model with the preprocessed fundus image data; and uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  • In addition, the present application also provides a device that includes a memory and a processor. The memory stores an automatic lesion identification program that can be run on the processor, and the program, when executed by the processor, implements the following steps: collecting fundus image data and performing preprocessing operations on the fundus image data; training a lesion recognition model with the preprocessed fundus image data; and uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  • In addition, the present application also provides a computer-readable storage medium on which an automatic lesion identification program is stored; the program can be executed by one or more processors to realize the steps of the automatic lesion identification method described above.
  • The automatic lesion identification method, device, and computer-readable storage medium proposed in this application collect fundus image data and preprocess it; train a lesion recognition model with the preprocessed fundus image data; and use the trained lesion recognition model to perform lesion recognition on fundus images, outputting the probability of a lesion. This application can therefore realize automatic recognition of fundus lesions.
  • FIG. 1 is a schematic flowchart of a method for automatically identifying a lesion according to an embodiment of the application
  • FIG. 2 is a schematic diagram of the internal structure of a device provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of modules of an automatic lesion recognition program in a device provided by an embodiment of the application.
  • This application provides a method for automatically identifying lesions.
  • Referring to FIG. 1, which is a schematic flowchart of the automatic lesion identification method according to an embodiment of this application, the method can be executed by a device, and the device can be implemented by software and/or hardware.
  • S1. Collect fundus image data, and perform preprocessing operations on the image data.
  • Deep learning model training requires a large amount of training data. The preferred embodiment of this application uses a digital fundus camera with a 50-degree field of view (FOV), such as the Kowa VX-10α, to acquire fundus images, and all images are required to be centered near the macula.
  • Preferably, the acquired fundus images have a resolution of 4288×2848 pixels and are stored in JPG file format for later use.
  • Since photographs taken by a digital fundus camera usually cannot be used directly for model training, a preferred embodiment of the present application further preprocesses the image data to generate a suitable training data set.
  • In a preferred embodiment, the preprocessing operations include image cropping and normalization.
  • The image cropping operation extracts the fundus region from the background of the image data; this application uses threshold-based image segmentation for cropping.
  • Threshold segmentation is a region-based image segmentation technique: the set of image pixels is partitioned by gray level, and each resulting subset forms a region corresponding to the real scene. Each region has consistent internal attributes, while adjacent regions do not share those attributes. Such a partition can be obtained by selecting one or more thresholds over the gray levels.
  • In a fundus image, the human fundus and the background are clearly separated: the image consists of a bright object on a dark background, which splits the object and background pixels into two distinct dominant gray-level modes.
  • The object is extracted from the background by choosing a threshold T: in the fundus image, a pixel (x, y) with gray value f(x, y) > T is called an object point, while a pixel (x, y) with f(x, y) <= T is called a background point.
  • This application uses this threshold segmentation method to crop away the background of the fundus image, obtaining the target region, namely the fundus region.
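As an illustration of this cropping step, a minimal Python sketch follows, assuming an OpenCV/numpy pipeline; the use of Otsu's method to pick the threshold T is an assumption, since the application does not specify how T is chosen.

```python
import cv2
import numpy as np

def crop_fundus(image_bgr: np.ndarray) -> np.ndarray:
    """Crop a fundus photograph to the bright fundus region via thresholding."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Pixels with f(x, y) > T become object (fundus) points, the rest background.
    # Otsu's method is used here to choose T automatically (an assumption).
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return image_bgr  # no foreground found; return the input unchanged
    return image_bgr[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```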
  • Normalization converts the original image into a corresponding unique standard form through a series of transformations (that is, the invariant moments of the image are used to find a set of parameters that eliminate the influence of other transformations); the standard-form image is invariant to affine transformations such as translation, rotation, and scaling.
  • In a preferred embodiment, the normalization of the fundus image converts the color fundus image from the RGB color space to the LUV color space using formulas ① and ②. [The formulas appear as images in the original publication; formula ② includes u* = 13L* · (u′ - u′_n) and v* = 13L* · (v′ - v′_n).]
  • Here b is a fixed value and u′ and v′ are chromaticity coordinates. The X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
  • The L channel of the LUV model describes brightness, so a normalization algorithm (such as ordinary averaging) is applied to the L channel to remove unwanted characteristics such as local contrast and uneven brightness.
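Since formulas ① and ② survive only as images, the following sketch stands in for the normalization step using OpenCV's built-in RGB/LUV conversion; treating "ordinary averaging" as shifting the L channel toward a fixed target mean is an assumption.

```python
import cv2
import numpy as np

def normalize_luminance(image_bgr: np.ndarray, target_mean: float = 128.0) -> np.ndarray:
    """Equalize brightness on the L channel of LUV, then convert back to RGB space."""
    luv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2Luv).astype(np.float32)
    l_channel = luv[:, :, 0]
    # Shift the luminance so its global mean matches a fixed target,
    # suppressing uneven illumination; the u and v channels stay untouched.
    luv[:, :, 0] = np.clip(l_channel - l_channel.mean() + target_mean, 0, 255)
    return cv2.cvtColor(luv.astype(np.uint8), cv2.COLOR_Luv2BGR)
```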
  • In a preferred embodiment, the lesion recognition model is a convolutional neural network (CNN) model.
  • A convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units within a local receptive field, and it performs well on large-scale image processing.
  • A CNN comprises convolutional layers and pooling layers; pixel features extracted from an image are passed through successive convolution and pooling operations to train the model on the target.
  • This application uses a patch-based CNN image segmentation model, taking image blocks extracted from the fundus images in a preset manner as the training data samples of the convolutional neural network. To build these samples, this application extracts 256×256 image blocks from the fundus images with a stride of 64 pixels.
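A sketch of the patch extraction described above, namely 256×256 blocks taken at a 64-pixel stride; the row-major scan order is an implementation choice, not something the application specifies.

```python
import numpy as np

def extract_patches(image: np.ndarray, size: int = 256, stride: int = 64) -> list:
    """Slide a size x size window over the image with the given stride."""
    h, w = image.shape[:2]
    return [image[y:y + size, x:x + size]
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]
```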
  • To improve the recognition performance of the trained convolutional neural network, a large amount of training data is needed, so this application uses data augmentation to enlarge the training set.
  • The augmentation methods described in this application include mirroring; random rotation by 90, 180, and 270 degrees; and color augmentation, which mainly applies PCA to the RGB values of the pixels and, for each image, adds to each principal component a Gaussian random variable whose magnitude is proportional to the corresponding eigenvalue.
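The color augmentation reads like AlexNet-style PCA jitter; a sketch under that assumption is shown below, with the noise scale sigma chosen arbitrarily.

```python
import numpy as np

def pca_color_augment(image: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    """Add Gaussian noise along the principal components of the RGB distribution."""
    pixels = image.reshape(-1, 3).astype(np.float64) / 255.0
    eigvals, eigvecs = np.linalg.eigh(np.cov(pixels, rowvar=False))
    alphas = np.random.normal(0.0, sigma, 3)
    # Shift each pixel by a random amount proportional to each eigenvalue.
    shift = eigvecs @ (alphas * eigvals)
    return (np.clip(pixels + shift, 0.0, 1.0).reshape(image.shape) * 255).astype(np.uint8)

def random_flip_rotate(image: np.ndarray) -> np.ndarray:
    """Mirroring plus a random rotation of 0, 90, 180, or 270 degrees."""
    if np.random.rand() < 0.5:
        image = np.fliplr(image)
    return np.rot90(image, k=int(np.random.randint(4)))
```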
  • Further, for fundus pictures with hard exudates, the preferred embodiment of the present application places image blocks containing lesions into the positive training set and image blocks without lesions into the negative training set.
  • Since image blocks containing hard-exudate lesions usually occupy only a small part of the whole fundus picture, the number of negative samples in the training data is typically much larger than the number of positive samples.
  • Such an unbalanced training set degrades the recognition performance of the final convolutional neural network model.
  • To address this severe imbalance between positive and negative samples, a preferred embodiment of the present application uses bootstrapping to resample the positive samples multiple times so that the ratio of positive to negative samples approaches 1:1.
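A minimal sketch of the bootstrap balancing, assuming sampling with replacement from the positive patches until the counts match:

```python
import random

def bootstrap_balance(positives: list, negatives: list) -> tuple:
    """Resample positives with replacement so positives:negatives is roughly 1:1."""
    resampled = [random.choice(positives) for _ in range(len(negatives))]
    return resampled, negatives
```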
  • To further mitigate the imbalance between positive and negative samples, this application also designs the following loss function, which combines the Dice function and the cross-entropy function:
  • L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y)
  • where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples. In a preferred embodiment, w10 can be set to 0.7 and w11 to 0.3.
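A numpy sketch of this loss follows, where x is the predicted lesion probability map and y the binary ground-truth mask; the exact form of the dice(y) term is not spelled out in the application, so the usual soft Dice loss is assumed.

```python
import numpy as np

def combined_loss(x: np.ndarray, y: np.ndarray,
                  w10: float = 0.7, w11: float = 0.3, w2: float = 1.0,
                  eps: float = 1e-7) -> float:
    """Weighted cross-entropy plus a soft Dice term, as in the formula above."""
    x = np.clip(x, eps, 1.0 - eps)
    ce = -np.mean(w10 * y * np.log(x) + w11 * (1.0 - y) * np.log(1.0 - x))
    dice = 1.0 - (2.0 * np.sum(x * y) + eps) / (np.sum(x) + np.sum(y) + eps)
    return float(ce + w2 * dice)
```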
  • This application uses the training set of positive and negative samples obtained above to train the convolutional neural network model.
  • In a preferred embodiment, the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path.
  • The down-sampling path has two down-sampling blocks (TD) and two dilated blocks (DL), the up-sampling path includes two up-sampling blocks, and the dense blocks (DenseNets, DB) consist of four layers each.
  • The dilated block (DL) concatenates the outputs of convolution blocks with dilation rates (1, 3, 5) as the input to the following convolution, in order to exploit multi-scale features.
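A PyTorch sketch of the dilated (DL) block is given below: three parallel 3×3 convolutions with dilation rates 1, 3, and 5 whose outputs are concatenated and fed to the next convolution. The channel counts and the 1×1 fusion convolution are assumptions; the application does not specify them.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Concatenate convolutions with dilation rates (1, 3, 5) for multi-scale features."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d)
            for d in (1, 3, 5)])
        self.fuse = nn.Conv2d(3 * out_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # padding == dilation keeps the spatial size identical across branches
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))
```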
  • Image blocks are uniformly generated from the fundus image requiring lesion identification with a preset step of N pixels, the lesion recognition model is applied to each image block to obtain its probability value, and the probability values of all image blocks are averaged to obtain the probability of a lesion in the input fundus image.
  • Once the lesion recognition model has been trained, it is applied to all extracted image blocks to obtain a probability map of the whole fundus image.
  • In the lesion recognition test phase, the preferred embodiment of the present application uniformly generates image blocks from the fundus image with a preset step of N pixels, for example 32 pixels, applies the lesion recognition model to each image block to obtain its probability value, and averages the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
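A sketch of this test-time procedure, where `model` stands for any callable that maps a 256×256 patch to a lesion probability in [0, 1] (the trained CNN in the application):

```python
import numpy as np

def image_lesion_probability(image: np.ndarray, model,
                             size: int = 256, stride: int = 32) -> float:
    """Average the model's patch probabilities into one image-level probability."""
    h, w = image.shape[:2]
    probs = [model(image[y:y + size, x:x + size])
             for y in range(0, h - size + 1, stride)
             for x in range(0, w - size + 1, stride)]
    return float(np.mean(probs))
```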
  • This application also provides a device for performing automatic lesion identification.
  • Referring to FIG. 2, which is a schematic diagram of the internal structure of the device provided by an embodiment of this application.
  • In this embodiment, the device 1 may be a terminal device such as a smartphone, tablet computer, or portable computer, a PC (personal computer), or a server, a server group, and the like.
  • The device 1 at least includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
  • The memory 11 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disk, and the like.
  • In some embodiments the memory 11 may be an internal storage unit of the device 1, such as the hard disk of the device 1. In other embodiments, the memory 11 may also be an external storage device of the device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the device 1. Further, the memory 11 may also include both an internal storage unit of the device 1 and an external storage device.
  • the memory 11 can be used not only to store application software and various data installed in the device 1, such as the code of the automatic lesion identification program 01, etc., but also to temporarily store data that has been output or will be output.
  • In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run the program code stored in the memory 11 or to process data, for example to execute the automatic lesion identification program 01.
  • The communication bus 13 is used to realize connection and communication between these components.
  • The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the device 1 and other electronic devices.
  • Optionally, the device 1 may further include a user interface, which may include a display and an input unit such as a keyboard; the optional user interface may also include a standard wired interface and a wireless interface.
  • In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also appropriately be called a display screen or display unit, and is used to display the information processed in the device 1 and a visualized user interface.
  • Figure 2 only shows the device 1 with components 11-14 and the automatic lesion identification program 01.
  • Those skilled in the art will understand that the structure shown in FIG. 2 does not constitute a limitation on the device 1, which may include fewer or more components than shown, combine certain components, or adopt a different arrangement of components.
  • In the embodiment of the device 1 shown in FIG. 2, the memory 11 stores the automatic lesion identification program 01; when the processor 12 executes the automatic lesion identification program 01 stored in the memory 11, the following steps are implemented:
  • Step 1: Collect fundus image data, and perform preprocessing operations on the image data.
  • Deep learning model training requires a large amount of training data.
  • The preferred embodiment of this application uses a digital fundus camera with a 50-degree field of view (FOV), such as the Kowa VX-10α, to acquire fundus images, and all images are required to be centered near the macula.
  • Preferably, the acquired fundus images have a resolution of 4288×2848 pixels and are stored in JPG file format for later use.
  • Since photographs taken by a digital fundus camera usually cannot be used directly for model training, a preferred embodiment of the present application further preprocesses the image data to generate a suitable training data set.
  • In a preferred embodiment, the preprocessing operations include image cropping, normalization, and the like.
  • The image cropping operation extracts the fundus region from the background of the image data; this application uses threshold-based image segmentation for cropping.
  • Threshold segmentation is a region-based image segmentation technique: the set of image pixels is partitioned by gray level, and each resulting subset forms a region corresponding to the real scene. Each region has consistent internal attributes, while adjacent regions do not share those attributes. Such a partition can be obtained by selecting one or more thresholds over the gray levels.
  • In a fundus image, the human fundus and the background are clearly separated: the image consists of a bright object on a dark background, which splits the object and background pixels into two distinct dominant gray-level modes. The object is extracted from the background by choosing a threshold T: a pixel (x, y) with gray value f(x, y) > T is called an object point, while a pixel (x, y) with f(x, y) <= T is called a background point.
  • This application uses this threshold segmentation method to crop away the background of the fundus image, obtaining the target region, namely the fundus region.
  • Normalization converts the original image into a corresponding unique standard form through a series of transformations (that is, the invariant moments of the image are used to find a set of parameters that eliminate the influence of other transformations); the standard-form image is invariant to affine transformations such as translation, rotation, and scaling.
  • In a preferred embodiment, the normalization of the fundus image converts the color fundus image from the RGB color space to the LUV color space using formulas ① and ②. [The formulas appear as images in the original publication; formula ② includes u* = 13L* · (u′ - u′_n) and v* = 13L* · (v′ - v′_n).] Here b is a fixed value and u′ and v′ are chromaticity coordinates. The X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
  • The L channel of the LUV model describes brightness, so a normalization algorithm (such as ordinary averaging) is applied to the L channel to remove unwanted characteristics such as local contrast and uneven brightness.
  • Step 2: Train the lesion recognition model with the preprocessed fundus image data.
  • In a preferred embodiment, the lesion recognition model is a convolutional neural network (CNN) model.
  • A convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units within a local receptive field, and it performs well on large-scale image processing.
  • A CNN comprises convolutional layers and pooling layers; pixel features extracted from an image are passed through successive convolution and pooling operations to train the model on the target.
  • This application uses a patch-based CNN image segmentation model, taking image blocks extracted from the fundus images in a preset manner as the training data samples of the convolutional neural network. To build these samples, this application extracts 256×256 image blocks from the fundus images with a stride of 64 pixels, forming the initial training samples.
  • To improve the recognition performance of the trained convolutional neural network, a large amount of training data is needed, so this application uses data augmentation to enlarge the initial training samples. The augmentation methods described in this application include mirroring; random rotation by 90, 180, and 270 degrees; and color augmentation, which mainly applies PCA to the RGB values of the pixels and, for each image, adds to each principal component a Gaussian random variable whose magnitude is proportional to the corresponding eigenvalue.
  • Further, for fundus pictures with hard exudates, the preferred embodiment of the present application places image blocks containing lesions into the positive training set and image blocks without lesions into the negative training set.
  • Since image blocks containing hard-exudate lesions usually occupy only a small part of the whole fundus picture, the number of negative samples in the training data is typically much larger than the number of positive samples.
  • Such an unbalanced training set degrades the recognition performance of the final convolutional neural network model.
  • To address this severe imbalance between positive and negative samples, a preferred embodiment of the present application uses bootstrapping to resample the positive samples multiple times so that the ratio of positive to negative samples approaches 1:1.
  • To further mitigate the imbalance between positive and negative samples, this application also designs the following loss function, which combines the Dice function and the cross-entropy function:
  • L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y)
  • where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples. In a preferred embodiment, w10 can be set to 0.7 and w11 to 0.3.
  • This application uses the training set of positive and negative samples obtained by the above operations to train the convolutional neural network model.
  • In a preferred embodiment, the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path.
  • The down-sampling path has two down-sampling blocks (TD) and two dilated blocks (DL), the up-sampling path includes two up-sampling blocks, and the dense blocks (DenseNets, DB) consist of four layers each.
  • The dilated block (DL) concatenates the outputs of convolution blocks with dilation rates (1, 3, 5) as the input to the following convolution, in order to exploit multi-scale features.
  • Step 3: Uniformly generate image blocks from the fundus image requiring lesion identification with a preset step of N pixels, apply the lesion recognition model to each image block to obtain its probability value, and average the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  • Once the lesion recognition model has been trained, it is applied to all extracted image blocks to obtain a probability map of the whole fundus image.
  • In the lesion recognition test phase, the preferred embodiment of the present application uniformly generates image blocks from the fundus image with a preset step of N pixels, for example 32 pixels, applies the lesion recognition model to each image block to obtain its probability value, and averages the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  • Optionally, in an embodiment of the present application, the automatic lesion identification program 01 may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (in this embodiment, the processor 12) to carry out this application.
  • A module in this application refers to a series of computer program instruction segments capable of completing a specific function, used to describe the execution process of the automatic lesion identification program in the device.
  • For example, referring to FIG. 3, the automatic lesion identification program 01 can be divided into a data collection module 10, a model training module 20, and a lesion identification module 30.
  • The data collection module 10 is used to collect fundus image data and perform preprocessing operations on the fundus image data.
  • In a preferred embodiment, the preprocessing operations include image cropping and normalization.
  • Preferably, the image cropping operation extracts the fundus region from the background of the fundus image data by threshold-based image segmentation; and
  • the normalization operation converts the color fundus image from the RGB color space to the LUV color space using formulas ① and ② [which appear as images in the original publication], where b is a fixed value and u′ and v′ are chromaticity coordinates. The X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
  • The model training module 20 is used to train a lesion recognition model with the preprocessed fundus image data.
  • In a preferred embodiment, the lesion recognition model is a convolutional neural network model, and the training of the lesion recognition model with the preprocessed fundus image data includes: extracting 256×256 image blocks from the fundus images with a stride of 64 pixels to form the initial training samples; enlarging the initial training samples by data augmentation; placing image blocks that contain lesions into the positive training set and image blocks that do not into the negative training set; resampling the positive samples multiple times by bootstrapping so that the ratio of positive to negative samples approaches 1:1; and training the convolutional neural network model on the positive and negative training sets.
  • A preferred embodiment further uses the loss function L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y) to adjust the ratio of positive and negative samples, where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples.
  • The lesion identification module 30 is configured to uniformly generate image blocks from the fundus image requiring lesion identification with a preset step of N pixels, apply the lesion recognition model to each image block to obtain its probability value, and average the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  • In addition, an embodiment of the present application also proposes a computer-readable storage medium storing an automatic lesion identification program, which can be executed by one or more processors to implement the following operations: collecting fundus image data and performing preprocessing operations on the image data; training a lesion recognition model with the preprocessed fundus image data; and uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.

Abstract

This application relates to artificial intelligence technology and discloses an automatic lesion identification method. The method includes: collecting fundus image data and performing preprocessing operations on the fundus image data; training a lesion recognition model with the preprocessed fundus image data; and uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image. This application also proposes a device and a computer-readable storage medium. This application can realize automatic recognition of fundus lesions.

Description

Automatic lesion identification method, device, and computer-readable storage medium
Under the Paris Convention, this application claims priority to Chinese patent application No. CN201910064338.6, filed on January 23, 2019 and entitled "Automatic lesion identification method, device, and computer-readable storage medium", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to an automatic lesion identification method, device, and computer-readable storage medium.
Background
Diabetic retinopathy is a major blinding disease; however, if diabetic patients detect it in time and receive standardized treatment, most can avoid blindness. Almost all eye diseases can occur in diabetic patients, such as fundus hemangioma, fundus hemorrhage, dacryocystitis, glaucoma, cataract, vitreous opacity, optic nerve atrophy, macular degeneration, and retinal detachment, and the risk of these eye diseases in diabetic patients is significantly higher than in non-diabetic people.
At present, the key issue in the diagnosis and treatment of diabetic retinopathy is how to detect related symptoms earlier, more safely, and more accurately, so that corresponding measures can be taken to prevent visual impairment. The main diagnostic method still relies on manual reading, identifying by hand whether color fundus photographs contain lesions; as the number of patients grows, traditional manual diagnosis has exposed problems such as high cost, low efficiency, and large accidental factors, and automated auxiliary diagnosis methods are urgently needed.
Deep learning has been widely and maturely applied in medical imaging. Medical image analysis is widely used in clinical auxiliary screening, diagnosis, grading, treatment decision-making and guidance, and efficacy evaluation for major diseases such as benign and malignant tumors, brain function and mental disorders, and cardiovascular and cerebrovascular diseases. Medical image classification and recognition, localization and detection, and segmentation of tissues, organs, and lesions are the main application areas of current deep learning research in medical image analysis; at the same time, medical images produced by different imaging principles differ considerably from the natural images studied in computer vision.
Summary
This application provides an automatic lesion identification method, device, and computer-readable storage medium, whose main purpose is to provide a solution for the automatic identification of fundus lesions.
To achieve the above objective, the automatic lesion identification method of this application includes:
collecting fundus image data, and performing preprocessing operations on the fundus image data;
training a lesion recognition model with the preprocessed fundus image data; and
uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
In addition, to achieve the above objective, this application also provides a device that includes a memory and a processor. The memory stores an automatic lesion identification program that can be run on the processor, and the program, when executed by the processor, implements the following steps:
collecting fundus image data, and performing preprocessing operations on the fundus image data;
training a lesion recognition model with the preprocessed fundus image data; and
uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
In addition, to achieve the above objective, this application also provides a computer-readable storage medium storing an automatic lesion identification program, which can be executed by one or more processors to implement the steps of the automatic lesion identification method described above.
The automatic lesion identification method, device, and computer-readable storage medium proposed in this application collect fundus image data and preprocess it; train a lesion recognition model with the preprocessed fundus image data; and use the trained lesion recognition model to perform lesion recognition on fundus images, outputting the probability of a lesion. This application can therefore realize automatic recognition of fundus lesions.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an automatic lesion identification method according to an embodiment of this application;
FIG. 2 is a schematic diagram of the internal structure of a device according to an embodiment of this application;
FIG. 3 is a schematic diagram of the modules of the automatic lesion identification program in a device according to an embodiment of this application.
The realization of the objectives, functional features, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative work fall within the protection scope of this application.
The terms "first", "second", "third", "fourth", etc. (if present) in the specification, claims, and drawings of this application are used to distinguish similar objects and need not describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, descriptions such as "first" and "second" are for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features; a feature qualified by "first" or "second" may thus explicitly or implicitly include at least one such feature.
Further, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
In addition, the technical solutions of the various embodiments can be combined with each other, provided that the combination can be realized by those of ordinary skill in the art; where a combination of technical solutions is contradictory or cannot be realized, it should be considered not to exist and not to fall within the protection scope claimed by this application.
This application provides an automatic lesion identification method.
In detail, referring to FIG. 1, which is a schematic flowchart of the automatic lesion identification method according to an embodiment of this application, the method can be executed by a device, and the device can be implemented by software and/or hardware.
S1. Collect fundus image data, and perform preprocessing operations on the image data.
Deep learning model training requires a large amount of training data. A preferred embodiment of this application uses a digital fundus camera with a 50-degree field of view (FOV), such as the Kowa VX-10α, to acquire fundus images, and all images are required to be centered near the macula.
Preferably, the acquired fundus images have a resolution of 4288×2848 pixels and are stored in JPG file format for later use.
Since photographs taken by a digital fundus camera usually cannot be used directly for model training, a preferred embodiment of this application further preprocesses the image data to generate a suitable training data set.
In a preferred embodiment of this application, the preprocessing operations include image cropping, normalization, and the like.
The image cropping operation extracts the fundus region from the background of the image data; threshold-based image segmentation is used for cropping.
Threshold segmentation is a region-based image segmentation technique: the set of image pixels is partitioned by gray level, and each resulting subset forms a region corresponding to the real scene; each region has consistent internal attributes, while adjacent regions do not share those attributes. Such a partition can be obtained by selecting one or more thresholds over the gray levels.
In a fundus image, the human fundus and the background are clearly separated: the image consists of a bright object on a dark background, which splits the object and background pixels into two distinct dominant gray-level modes. The object is extracted from the background by choosing a threshold T: in the fundus image, a pixel (x, y) with gray value f(x, y) > T is called an object point, while a pixel (x, y) with f(x, y) <= T is called a background point.
This application uses this threshold segmentation method to crop away the background of the fundus image, obtaining the target region, namely the fundus region.
Normalization converts the original image into a corresponding unique standard form through a series of transformations (that is, the invariant moments of the image are used to find a set of parameters that eliminate the influence of other transformations); the standard-form image is invariant to affine transformations such as translation, rotation, and scaling.
In a preferred embodiment of this application, normalization of the fundus image converts the color fundus image from the RGB color space to the LUV color space using the following formulas:
[Formula ① (RGB to XYZ conversion) and formula ② (computing L*, u*, and v*) appear as images in the original publication; formula ② includes u* = 13L* · (u′ - u′_n) and v* = 13L* · (v′ - v′_n).]
Here b is a fixed value and u′ and v′ are chromaticity coordinates. The X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization. In the LUV model, the L channel describes brightness, so a normalization algorithm (such as ordinary averaging) is applied to the L channel to remove unwanted characteristics such as local contrast and uneven brightness.
S2. Train a lesion recognition model with the preprocessed fundus image data.
In a preferred embodiment of this application, the lesion recognition model is a convolutional neural network (CNN) model. A convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units within a local receptive field, and it performs well on large-scale image processing. A CNN comprises convolutional layers and pooling layers; pixel features extracted from an image are passed through successive convolution and pooling operations to train the model on the target.
This application uses a patch-based CNN image segmentation model, taking image blocks extracted from the fundus images in a preset manner as the training data samples of the convolutional neural network.
To build the training data samples, this application extracts 256×256 image blocks from the fundus images with a stride of 64 pixels.
To improve the recognition performance of the trained convolutional neural network, a large amount of training data is needed, so this application uses data augmentation to enlarge the training set. The augmentation methods described in this application include mirroring; random rotation by 90, 180, and 270 degrees; and color augmentation of the data. The color augmentation mainly applies PCA to the RGB values of the pixels and, for each image, adds to each principal component a Gaussian random variable whose magnitude is proportional to the corresponding eigenvalue.
Further, for fundus pictures with hard exudates, a preferred embodiment of this application places image blocks containing lesions into the positive training set and image blocks without lesions into the negative training set.
Usually, image blocks containing hard-exudate lesions occupy only a small part of the whole fundus picture, so in the training data the number of negative samples is typically larger than the number of positive samples, and such an unbalanced training set degrades the recognition performance of the final convolutional neural network model.
To address this severe imbalance between positive and negative samples in the training data, a preferred embodiment of this application uses bootstrapping to resample the positive samples multiple times so that the ratio of positive to negative samples approaches 1:1.
Further, to mitigate the imbalance between positive and negative samples, the following loss function, combining the Dice function and the cross-entropy function, is also designed:
L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y)
where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples. In a preferred embodiment of this application, w10 can be set to 0.7 and w11 to 0.3.
This application trains the convolutional neural network model on the positive and negative training sets obtained above.
In a preferred embodiment of this application, the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path: the down-sampling path has two down-sampling blocks (TD) and two dilated blocks (DL), the up-sampling path includes two up-sampling blocks, and the dense blocks (DenseNets, DB) consist of four layers each. The dilated block (DL) concatenates the outputs of convolution blocks with dilation rates (1, 3, 5) as the input to the following convolution, in order to exploit multi-scale features.
S3. Uniformly generate image blocks from the fundus image requiring lesion identification with a preset step of N pixels, apply the lesion recognition model to each image block to obtain its probability value, and average the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
Once the lesion recognition model has been trained, a preferred embodiment of this application applies it to all extracted image blocks to obtain a probability map of the whole fundus image. In the lesion recognition test phase, the preferred embodiment uniformly generates image blocks from the fundus image with a preset step of N pixels, for example 32 pixels, applies the lesion recognition model to each image block to obtain its probability value, and averages the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
This application also provides a device for performing automatic lesion identification. Referring to FIG. 2, which is a schematic diagram of the internal structure of the device according to an embodiment of this application.
In this embodiment, the device 1 may be a terminal device such as a smartphone, tablet computer, or portable computer, a PC (personal computer), or a server, a server group, and the like. The device 1 at least includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 11 may be an internal storage unit of the device 1, such as its hard disk; in other embodiments, it may be an external storage device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the device 1. Further, the memory 11 may include both an internal storage unit of the device 1 and an external storage device. The memory 11 can be used not only to store application software installed on the device 1 and various data, such as the code of the automatic lesion identification program 01, but also to temporarily store data that has been or will be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the automatic lesion identification program 01.
The communication bus 13 is used to realize connection and communication between these components.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the device 1 and other electronic devices.
Optionally, the device 1 may further include a user interface, which may include a display and an input unit such as a keyboard; the optional user interface may also include standard wired and wireless interfaces. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also appropriately be called a display screen or display unit, and is used to display the information processed in the device 1 and a visualized user interface.
FIG. 2 only shows the device 1 with components 11-14 and the automatic lesion identification program 01. Those skilled in the art will understand that this structure does not constitute a limitation on the device 1, which may include fewer or more components than shown, combine certain components, or adopt a different arrangement of components.
In the embodiment of the device 1 shown in FIG. 2, the memory 11 stores the automatic lesion identification program 01; when the processor 12 executes the automatic lesion identification program 01 stored in the memory 11, the following steps are implemented:
Step 1. Collect fundus image data, and perform preprocessing operations on the image data.
Deep learning model training requires a large amount of training data. A preferred embodiment of this application uses a digital fundus camera with a 50-degree field of view (FOV), such as the Kowa VX-10α, to acquire fundus images, and all images are required to be centered near the macula.
Preferably, the acquired fundus images have a resolution of 4288×2848 pixels and are stored in JPG file format for later use.
Since photographs taken by a digital fundus camera usually cannot be used directly for model training, a preferred embodiment of this application further preprocesses the image data to generate a suitable training data set.
In a preferred embodiment of this application, the preprocessing operations include image cropping, normalization, and the like.
The image cropping operation extracts the fundus region from the background of the image data; threshold-based image segmentation is used for cropping.
Threshold segmentation is a region-based image segmentation technique: the set of image pixels is partitioned by gray level, and each resulting subset forms a region corresponding to the real scene; each region has consistent internal attributes, while adjacent regions do not share those attributes. Such a partition can be obtained by selecting one or more thresholds over the gray levels.
In a fundus image, the human fundus and the background are clearly separated: the image consists of a bright object on a dark background, which splits the object and background pixels into two distinct dominant gray-level modes. The object is extracted from the background by choosing a threshold T: a pixel (x, y) with gray value f(x, y) > T is called an object point, while a pixel (x, y) with f(x, y) <= T is called a background point.
This application uses this threshold segmentation method to crop away the background of the fundus image, obtaining the target region, namely the fundus region.
Normalization converts the original image into a corresponding unique standard form through a series of transformations (that is, the invariant moments of the image are used to find a set of parameters that eliminate the influence of other transformations); the standard-form image is invariant to affine transformations such as translation, rotation, and scaling.
In a preferred embodiment of this application, normalization of the fundus image converts the color fundus image from the RGB color space to the LUV color space using the following formulas:
[Formulas ① and ② appear as images in the original publication.]
Here b is a fixed value and u′ and v′ are chromaticity coordinates. The X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization. In the LUV model, the L channel describes brightness, so a normalization algorithm (such as ordinary averaging) is applied to the L channel to remove unwanted characteristics such as local contrast and uneven brightness.
Step 2. Train a lesion recognition model with the preprocessed fundus image data.
In a preferred embodiment of this application, the lesion recognition model is a convolutional neural network (CNN) model. A convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units within a local receptive field, and it performs well on large-scale image processing. A CNN comprises convolutional layers and pooling layers; pixel features extracted from an image are passed through successive convolution and pooling operations to train the model on the target.
This application uses a patch-based CNN image segmentation model, taking image blocks extracted from the fundus images in a preset manner as the training data samples of the convolutional neural network.
To build the training data samples, this application extracts 256×256 image blocks from the fundus images with a stride of 64 pixels, forming the initial training samples.
To improve the recognition performance of the trained convolutional neural network, a large amount of training data is needed, so this application uses data augmentation to enlarge the initial training samples. The augmentation methods described in this application include mirroring; random rotation by 90, 180, and 270 degrees; and color augmentation of the data. The color augmentation mainly applies PCA to the RGB values of the pixels and, for each image, adds to each principal component a Gaussian random variable whose magnitude is proportional to the corresponding eigenvalue.
Further, for fundus pictures with hard exudates, a preferred embodiment of this application places image blocks containing lesions into the positive training set and image blocks without lesions into the negative training set.
Usually, image blocks containing hard-exudate lesions occupy only a small part of the whole fundus picture, so in the training data the number of negative samples is typically larger than the number of positive samples, and such an unbalanced training set degrades the recognition performance of the final convolutional neural network model.
To address this severe imbalance between positive and negative samples in the training data, a preferred embodiment of this application uses bootstrapping to resample the positive samples multiple times so that the ratio of positive to negative samples approaches 1:1.
Further, to mitigate the imbalance between positive and negative samples, the following loss function, combining the Dice function and the cross-entropy function, is also designed:
L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y)
where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples. In a preferred embodiment of this application, w10 can be set to 0.7 and w11 to 0.3.
This application trains the convolutional neural network model on the positive and negative training sets obtained by the above operations.
In a preferred embodiment of this application, the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path: the down-sampling path has two down-sampling blocks (TD) and two dilated blocks (DL), the up-sampling path includes two up-sampling blocks, and the dense blocks (DenseNets, DB) consist of four layers each. The dilated block (DL) concatenates the outputs of convolution blocks with dilation rates (1, 3, 5) as the input to the following convolution, in order to exploit multi-scale features.
Step 3. Uniformly generate image blocks from the fundus image requiring lesion identification with a preset step of N pixels, apply the lesion recognition model to each image block to obtain its probability value, and average the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
Once the lesion recognition model has been trained, a preferred embodiment of this application applies it to all extracted image blocks to obtain a probability map of the whole fundus image. In the lesion recognition test phase, the preferred embodiment uniformly generates image blocks from the fundus image with a preset step of N pixels, for example 32 pixels, applies the lesion recognition model to each image block to obtain its probability value, and averages the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
Optionally, in an embodiment of this application, the automatic lesion identification program 01 may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (in this embodiment, the processor 12) to carry out this application. A module in this application refers to a series of computer program instruction segments capable of completing a specific function, used to describe the execution process of the automatic lesion identification program in the device.
For example, referring to FIG. 3, which is a schematic diagram of the program modules of the automatic lesion identification program in an embodiment of the device of this application, the automatic lesion identification program 01 can be divided into a data collection module 10, a model training module 20, and a lesion identification module 30.
Illustratively:
The data collection module 10 is used to collect fundus image data and perform preprocessing operations on the fundus image data.
In a preferred embodiment of this application, the preprocessing operations include image cropping and normalization.
Preferably, the image cropping operation extracts the fundus region from the background of the fundus image data by threshold-based image segmentation; and
the normalization operation converts the color fundus image from the RGB color space to the LUV color space using the following formulas:
[Formulas ① and ② appear as images in the original publication.]
Here b is a fixed value and u′ and v′ are chromaticity coordinates; the X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
The model training module 20 is used to train a lesion recognition model with the preprocessed fundus image data.
In a preferred embodiment of this application, the lesion recognition model is a convolutional neural network model, and training the lesion recognition model with the preprocessed fundus image data includes:
extracting 256×256 image blocks from the fundus images with a stride of 64 pixels to form the initial training samples;
enlarging the initial training samples by data augmentation;
placing image blocks that contain lesions into the positive training set and image blocks that do not into the negative training set;
resampling the positive samples multiple times by bootstrapping so that the ratio of positive to negative samples approaches 1:1; and
training the convolutional neural network model on the positive and negative training sets.
A preferred embodiment of this application further uses a loss function to adjust the ratio of positive and negative samples:
L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y),
where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples.
The lesion identification module 30 is used to uniformly generate image blocks from the fundus image requiring lesion identification with a preset step of N pixels, apply the lesion recognition model to each image block to obtain its probability value, and average the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
The functions or operating steps implemented when the program modules such as the data collection module 10, model training module 20, and lesion identification module 30 are executed are substantially the same as in the above embodiments and are not repeated here.
In addition, an embodiment of this application also proposes a computer-readable storage medium storing an automatic lesion identification program, which can be executed by one or more processors to implement the following operations:
collecting fundus image data, and performing preprocessing operations on the image data;
training a lesion recognition model with the preprocessed fundus image data; and
uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
The specific implementation of the computer-readable storage medium of this application is substantially the same as the embodiments of the automatic lesion identification device and method described above and is not repeated here.
It should be noted that the serial numbers of the above embodiments are for description only and do not represent their relative merits. The terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium as described above (such as ROM/RAM, magnetic disk, or optical disk), including several instructions that cause a terminal device (which may be a mobile phone, computer, server, or network device) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope; any equivalent structural or process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (20)

  1. An automatic lesion identification method, characterized in that the method comprises:
    collecting fundus image data, and performing preprocessing operations on the fundus image data;
    training a lesion recognition model with the preprocessed fundus image data; and
    uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  2. The automatic lesion identification method according to claim 1, characterized in that the preprocessing operations include an image cropping operation, which extracts the fundus region from the fundus image data by threshold-based image segmentation.
  3. The automatic lesion identification method according to claim 1 or 2, characterized in that the preprocessing operations include a normalization operation, which converts the fundus image data from the RGB color space to the LUV color space using the following formulas:
    [Formula ① appears as an image in the original publication.]
    u* = 13L* · (u′ - u′_n)
    v* = 13L* · (v′ - v′_n)        ②
    where b is a fixed value and u′ and v′ are chromaticity coordinates; the X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
  4. The automatic lesion identification method according to claim 1, characterized in that the lesion recognition model is a convolutional neural network model, and training the lesion recognition model with the preprocessed fundus image data comprises:
    extracting 256×256 image blocks from the fundus images with a stride of 64 pixels to form initial training samples;
    enlarging the initial training samples by data augmentation;
    placing image blocks that contain lesions into a positive training set and image blocks that do not into a negative training set;
    resampling the positive samples multiple times by bootstrapping so that the ratio of positive to negative samples approaches 1:1; and
    training the convolutional neural network model on the positive and negative training sets.
  5. The automatic lesion identification method according to claim 4, characterized in that the data augmentation includes mirroring, random rotation by 90, 180, and 270 degrees, and color augmentation of the data.
  6. The automatic lesion identification method according to claim 4, characterized in that the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path, the down-sampling path comprising two down-sampling blocks and two dilated blocks, and the up-sampling path comprising two up-sampling blocks.
  7. The automatic lesion identification method according to claim 4, characterized in that the method further comprises:
    adjusting the ratio of positive and negative samples with a loss function that combines the Dice function and the cross-entropy function:
    L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y),
    where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples.
  8. An automatic lesion identification device, characterized in that the device comprises a memory and a processor, the memory storing an automatic lesion identification program that can be run on the processor, and the program, when executed by the processor, implements the following steps:
    collecting fundus image data, and performing preprocessing operations on the fundus image data;
    training a lesion recognition model with the preprocessed fundus image data; and
    uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  9. The automatic lesion identification device according to claim 8, characterized in that the preprocessing operations include an image cropping operation, which extracts the fundus region from the fundus image data by threshold-based image segmentation.
  10. The automatic lesion identification device according to claim 8 or 9, characterized in that the preprocessing operations include a normalization operation, which converts the fundus image data from the RGB color space to the LUV color space using the following formulas:
    [Formula ① appears as an image in the original publication.]
    u* = 13L* · (u′ - u′_n)
    v* = 13L* · (v′ - v′_n)        ②
    where b is a fixed value and u′ and v′ are chromaticity coordinates; the X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
  11. The automatic lesion identification device according to claim 8, characterized in that the lesion recognition model is a convolutional neural network model, and training the lesion recognition model with the preprocessed fundus image data comprises:
    extracting 256×256 image blocks from the fundus images with a stride of 64 pixels to form initial training samples;
    enlarging the initial training samples by data augmentation;
    placing image blocks that contain lesions into a positive training set and image blocks that do not into a negative training set;
    resampling the positive samples multiple times by bootstrapping so that the ratio of positive to negative samples approaches 1:1; and
    training the convolutional neural network model on the positive and negative training sets.
  12. The automatic lesion identification device according to claim 11, characterized in that the data augmentation includes mirroring, random rotation by 90, 180, and 270 degrees, and color augmentation of the data.
  13. The automatic lesion identification device according to claim 11, characterized in that the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path, the down-sampling path comprising two down-sampling blocks and two dilated blocks, and the up-sampling path comprising two up-sampling blocks.
  14. The automatic lesion identification device according to claim 11, characterized in that the automatic lesion identification program, when executed by the processor, further implements the following step:
    adjusting the ratio of positive and negative samples with a loss function that combines the Dice function and the cross-entropy function:
    L = -mean(w10 * y_{i,j} * log(x_{i,j}) + w11 * (1 - y_{i,j}) * log(1 - x_{i,j})) + w2 * dice(y),
    where x_{i,j} and y_{i,j} are the values at pixel coordinates (i, j), and w2, w10, and w11 are preset coefficients that adjust for the imbalance between positive and negative samples.
  15. A computer-readable storage medium, characterized in that an automatic lesion identification program is stored on the computer-readable storage medium, and the program can be executed by one or more processors to implement the following steps:
    collecting fundus image data, and performing preprocessing operations on the fundus image data;
    training a lesion recognition model with the preprocessed fundus image data; and
    uniformly generating image blocks from the fundus image requiring lesion identification with a preset step of N pixels, applying the lesion recognition model to each image block to obtain its probability value, and averaging the probability values of all image blocks to obtain the probability of a lesion in the input fundus image.
  16. The computer-readable storage medium according to claim 15, characterized in that the preprocessing operations include an image cropping operation, which extracts the fundus region from the fundus image data by threshold-based image segmentation.
  17. The computer-readable storage medium according to claim 15 or 16, characterized in that the preprocessing operations include a normalization operation, which converts the fundus image data from the RGB color space to the LUV color space using the following formulas:
    [Formula ① appears as an image in the original publication.]
    u* = 13L* · (u′ - u′_n)
    v* = 13L* · (v′ - v′_n)        ②
    where b is a fixed value and u′ and v′ are chromaticity coordinates; the X, Y, and Z values obtained from formula ① and the chromaticity coordinates u′ and v′ are substituted into formula ② to obtain the adjusted L channel value L* and the U and V channel values u* and v*, after which the fundus image data is converted back to the RGB color space to complete the normalization.
  18. The computer-readable storage medium according to claim 15, characterized in that the lesion recognition model is a convolutional neural network model, and training the lesion recognition model with the preprocessed fundus image data comprises:
    extracting 256×256 image blocks from the fundus images with a stride of 64 pixels to form initial training samples;
    enlarging the initial training samples by data augmentation;
    placing image blocks that contain lesions into a positive training set and image blocks that do not into a negative training set;
    resampling the positive samples multiple times by bootstrapping so that the ratio of positive to negative samples approaches 1:1; and
    training the convolutional neural network model on the positive and negative training sets.
  19. The computer-readable storage medium according to claim 18, characterized in that the data augmentation includes mirroring, random rotation by 90, 180, and 270 degrees, and color augmentation of the data.
  20. The computer-readable storage medium according to claim 18, characterized in that the architecture of the convolutional neural network consists of a down-sampling path and an up-sampling path, the down-sampling path comprising two down-sampling blocks and two dilated blocks, and the up-sampling path comprising two up-sampling blocks.
PCT/CN2019/116558 2019-01-23 2019-11-08 Automatic lesion identification method, device, and computer-readable storage medium WO2020151307A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910064338.6A 2019-01-23 2019-01-23 Automatic lesion identification method, device, and computer-readable storage medium
CN201910064338.6 2019-01-23

Publications (1)

Publication Number Publication Date
WO2020151307A1 (zh)

Family

ID=66944109

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116558 WO2020151307A1 (zh) 2019-01-23 2019-11-08 Automatic lesion identification method, device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN109902717A (zh)
WO (1) WO2020151307A1 (zh)


Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902717A (zh) * 2019-01-23 2019-06-18 平安科技(深圳)有限公司 Automatic lesion identification method, device, and computer-readable storage medium
CN110378885B (zh) * 2019-07-19 2023-07-04 王晓骁 Machine-learning-based method and system for automatic annotation of WSI lesion regions
CN110428377B (zh) * 2019-07-26 2023-06-30 北京康夫子健康技术有限公司 Data augmentation method, apparatus, device, and medium
CN111259986B (zh) * 2020-02-20 2023-10-31 中南大学 Classification method for ocular surface index data under free blinking conditions
CN111915609B (zh) * 2020-09-22 2023-07-14 平安科技(深圳)有限公司 Lesion detection and analysis method, apparatus, electronic device, and computer storage medium
CN112541906B (zh) * 2020-12-17 2022-10-25 上海鹰瞳医疗科技有限公司 Data processing method and apparatus, electronic device, and storage medium
CN113111960B (zh) * 2021-04-25 2024-04-26 北京文安智能技术股份有限公司 Image processing method and apparatus, and training method and system for an object detection model
CN114332128B (zh) * 2021-12-30 2022-07-26 推想医疗科技股份有限公司 Medical image processing method and apparatus, electronic device, and computer storage medium


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018200840A9 (en) * 2017-04-27 2018-11-29 Retinopathy Answer Limited System and method for automated funduscopic image analysis
CN108021916A (zh) * 2017-12-31 2018-05-11 南京航空航天大学 Attention-based deep learning method for classifying diabetic retinopathy
CN108520278A (zh) * 2018-04-10 2018-09-11 陕西师范大学 Random-forest-based pavement crack detection method and evaluation method
CN108665447A (zh) * 2018-04-20 2018-10-16 浙江大学 Glaucoma image detection method based on deep learning of fundus photographs
CN108846835A (zh) * 2018-05-31 2018-11-20 西安电子科技大学 Image change detection method based on depthwise separable convolutional networks
CN109902717A (zh) * 2019-01-23 2019-06-18 平安科技(深圳)有限公司 Automatic lesion identification method, device, and computer-readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHEN-ZHEN CAI; TANG Peng; HU Jian-bin; JIN Wei-dong: "Auto-detection of Hard Exudates Based on Deep Convolutional Neural Network", Computer Science, vol. 45, no. 11A, 30 November 2018, pages 203-207, XP055723673 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112016634B (zh) * 2020-09-30 2023-07-28 北京百度网讯科技有限公司 Medical image recognition method, apparatus, device, and storage medium
CN112016634A (zh) * 2020-09-30 2020-12-01 北京百度网讯科技有限公司 Medical image recognition method, apparatus, device, and storage medium
CN112580530A (zh) * 2020-12-22 2021-03-30 泉州装备制造研究所 Identity recognition method based on fundus images
CN112561918A (zh) * 2020-12-31 2021-03-26 中移(杭州)信息技术有限公司 Training method for a convolutional neural network and lesion segmentation method
CN113077434B (zh) * 2021-03-30 2023-01-24 零氪智慧医疗科技(天津)有限公司 Lung cancer identification method, apparatus, and storage medium based on multimodal information
CN113077434A (zh) * 2021-03-30 2021-07-06 零氪智慧医疗科技(天津)有限公司 Lung cancer identification method, apparatus, and storage medium based on multimodal information
CN113077464A (zh) * 2021-05-06 2021-07-06 吴国军 Medical image processing method, medical image recognition method, and apparatus
CN115578554A (zh) * 2021-06-21 2023-01-06 数坤(北京)网络科技股份有限公司 Vascular lesion identification method, apparatus, electronic device, and readable storage medium
CN115578554B (zh) * 2021-06-21 2024-02-02 数坤(上海)医疗科技有限公司 Vascular lesion identification method, apparatus, electronic device, and readable storage medium
CN113706514A (zh) * 2021-08-31 2021-11-26 平安科技(深圳)有限公司 Template-image-based lesion localization method, apparatus, device, and storage medium
CN113706514B (zh) * 2021-08-31 2023-08-11 平安科技(深圳)有限公司 Template-image-based lesion localization method, apparatus, device, and storage medium
CN117152128A (zh) * 2023-10-27 2023-12-01 首都医科大学附属北京天坛医院 Lesion identification method, apparatus, electronic device, and storage medium for neuroimaging
CN117152128B (zh) * 2023-10-27 2024-02-27 首都医科大学附属北京天坛医院 Lesion identification method, apparatus, electronic device, and storage medium for neuroimaging

Also Published As

Publication number Publication date
CN109902717A (zh) 2019-06-18


Legal Events

Code | Description
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19911554; Country of ref document: EP; Kind code of ref document: A1.
NENP | Non-entry into the national phase. Ref country code: DE.
32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 14.09.2021).
122 | Ep: PCT application non-entry in European phase. Ref document number: 19911554; Country of ref document: EP; Kind code of ref document: A1.