WO2019037654A1 - 3D image detection method and apparatus, electronic device, and computer-readable medium - Google Patents

3D image detection method and apparatus, electronic device, and computer-readable medium Download PDF

Info

Publication number
WO2019037654A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
sub
super
pixel grid
detecting
Prior art date
Application number
PCT/CN2018/100838
Other languages
English (en)
French (fr)
Inventor
李正龙
Original Assignee
京东方科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司
Priority to US16/335,098 (US10929643B2)
Publication of WO2019037654A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/2163Partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images

Definitions

  • the present disclosure relates to the field of image data processing technologies, and in particular, to a 3D image detection method, apparatus, electronic device, and computer readable medium.
  • Medical image data plays an extremely important role in the medical diagnosis process.
  • deep learning technology has been widely used in medical image processing.
  • Deep learning technology can effectively use a large amount of image data and learn to acquire knowledge, assisting doctors in reading and judging case images such as X-ray, DR, ultrasound, and other two-dimensional images; deep learning has been put to practical use in hospitals.
  • Embodiments of the present disclosure relate to a 3D image detecting method, apparatus, electronic device, and computer readable medium.
  • a 3D image detecting method including:
  • the size of the 3D image is C×H×W, where C, H, and W are the number of 2D images and the height and width of each 2D image, respectively.
  • Layering includes:
  • the performing intra-layer clustering on the 3D sub-image includes:
  • the LxM grids are used as initial values, and clustering is performed using a super pixel algorithm.
  • the superpixel grid is input into a neural network for detection, and the first classifier is used for detection.
  • the 3D sub-image forming the target super-pixel grid is detected using a second classifier, the second classifier having a higher accuracy than the first classifier.
  • a 3D image detecting apparatus comprising:
  • a layering module configured to layer the 3D image to obtain at least one 3D sub-image, wherein the 3D sub-image includes a plurality of 2D images;
  • a clustering module configured to perform intra-layer clustering on the 3D sub-image to obtain a super-pixel grid
  • a first detecting module configured to input the super pixel grid into a deep neural network for detection
  • the second detecting module is configured to detect, in response to detecting the target in the super pixel grid, the 3D sub-image forming the super pixel grid containing the target, and obtain and output the detection result.
  • the size of the 3D image is C×H×W, where C, H, and W are the number of 2D images and the height and width of each 2D image, respectively, and the layering module is further configured to split the 3D image into K 3D sub-images, C_i = C/K, each containing C_i×H×W pixels.
  • the first detecting module detects the super pixel grid by using a first classifier
  • the second detecting module uses the second classifier to detect the 3D sub-image from which the super-pixel grid containing the target was clustered, wherein the accuracy of the second classifier is greater than the accuracy of the first classifier.
  • an electronic device comprising a processor, a memory, and computer-executable instructions stored in the memory, such that when the computer-executable instructions are executed by the processor, the processor performs one or more steps of a 3D image detection method provided by at least one embodiment of the present disclosure.
  • a computer-readable medium having stored thereon computer-executable instructions that, when executed by a processor, implement one or more steps of the 3D image detection method provided by at least one of the above embodiments.
  • FIG. 1 is a schematic diagram of a 3D image detecting method provided in an embodiment of the present disclosure.
  • FIG. 2 shows a schematic diagram of step S12 in an embodiment of the present disclosure.
  • FIG. 3 shows a schematic diagram of a 3D image detecting apparatus provided in another embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a computer system of an electronic device provided in still another embodiment of the present disclosure.
  • although neural networks have a wide range of applications in the processing of 2D images, applying them to the processing of three-dimensional images may raise at least one of the following problems:
  • for three-dimensional images, the information of multiple adjacent images generally needs to be considered when interpreting lesions; when a neural network is used to judge lesions, the three-dimensional image is directly input into the neural network and an end-to-end training process is implemented.
  • for example, if the number of slices of CT data is 100 and the height and width of each image remain unchanged, the number of pixels of the CT data is 100 times that of a two-dimensional image; even if an image processor (GPU) performs all the processing, the amount of GPU memory used is much larger than for a 2D image of the same height and width. Therefore, for three-dimensional images, the amount of memory used for training with neural networks is too large.
  • the size of the training model is limited by the size of the GPU memory, and it is difficult to use a highly complex neural network structure.
  • a traditional convolutional neural network (CNN) usually consists of an input layer, a convolutional layer, a pooling layer, and a fully connected layer, that is, INPUT (input layer)-CONV (convolution layer)-POOL (pooling layer)-FC (fully connected layer)-OUTPUT (output layer).
  • the convolution layer performs feature extraction; the pooling layer reduces the dimension of the input feature map; the fully connected layer combines all the features and produces the output.
  • in addition to the traditional convolutional neural networks listed above, convolutional neural networks can be fully convolutional networks (FCN), segmentation networks (SegNet), dilated (atrous) convolution networks, the DeepLab (V1 & V2) deep neural network based on atrous convolution, the DeepLab (V3) deep neural network based on multi-scale atrous convolution, the multi-path refinement segmentation network RefineNet, and so on.
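  • As a minimal sketch of the INPUT-CONV-POOL-FC structure described above, the following numpy forward pass may help; the layer sizes, random weights, and two-class output are illustrative assumptions for demonstration, not the network of this disclosure:

```python
import numpy as np

def conv2d(x, kernels):
    """Valid convolution of an (H, W) image with (n, kH, kW) kernels -> (n, H-kH+1, W-kW+1)."""
    n, kh, kw = kernels.shape
    h, w = x.shape
    out = np.zeros((n, h - kh + 1, w - kw + 1))
    for k in range(n):
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return out

def maxpool2x2(x):
    """2x2 max pooling on (n, H, W) feature maps (H and W assumed even)."""
    n, h, w = x.shape
    return x.reshape(n, h // 2, 2, w // 2, 2).max(axis=(2, 4))

rng = np.random.default_rng(0)
image = rng.random((8, 8))                                  # INPUT: toy 8x8 grayscale image
feat = np.maximum(conv2d(image, rng.random((4, 3, 3))), 0)  # CONV + ReLU -> (4, 6, 6)
pooled = maxpool2x2(feat)                                   # POOL -> (4, 3, 3)
fc_w = rng.random((2, pooled.size))                         # FC: two output classes
logits = fc_w @ pooled.ravel()                              # OUTPUT: class scores
print(logits.shape)
```

Each stage mirrors one layer of the INPUT-CONV-POOL-FC chain; a real network would learn the kernel and FC weights instead of drawing them at random.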
  • the so-called image may be various types of images, such as medical images.
  • medical images may include ultrasound images, Computed Tomography (CT) images, Magnetic Resonance Imaging (MRI) images, Digital Subtraction Angiography (DSA) images, Positron Emission Computed Tomography (PET) images, etc.
  • the medical image may include a brain tissue nuclear magnetic resonance image, a spinal MRI image, a fundus image, a blood vessel image, a pancreatic CT image, and a lung CT image.
  • an image can be acquired by an image capture device.
  • the image capture device may include, for example, an ultrasound device, an X-ray device, a nuclear magnetic resonance device, a nuclear medicine device, a medical optical device, and a thermal imaging device.
  • 3D images, such as 3D medical images, include Computed Tomography (CT) images, Magnetic Resonance Imaging (MRI) images, 3D Digital Subtraction Angiography (3D DSA) images, Positron Emission Computed Tomography (PET) images, etc.
  • the image may also be a character image, an animal or plant image or a landscape image, etc.
  • the corresponding 3D image may be formed by a 3D camera, such as a 3D light field camera, a ToF camera, a multi-lens camera, an RGB-D camera, or the like.
  • the image may be a grayscale image or a color image.
  • a color image can be decomposed into R, G, B single-color-channel images.
  • FIG. 1 is a schematic diagram of a 3D image detection method provided by some embodiments of the present disclosure.
  • the 3D image detection method includes the following steps:
  • step S11 the 3D image is layered to obtain at least one 3D sub-image, wherein each 3D sub-image includes a plurality of 2D images.
  • step S12 intra-layer clustering is performed on the 3D sub-image to obtain a super-pixel grid.
  • step S13 the super pixel grid is input to the deep neural network for detection.
  • step S14, if a target is detected in the super-pixel grid, the 3D sub-image from which the super-pixel grid containing the target was clustered is detected, and the detection result is obtained and output.
  • the 3D image detection method provided by the embodiments of the present disclosure obtains a super-pixel grid by layering and clustering, performs low-accuracy detection on the super-pixel grid, and performs high-accuracy detection only on the 3D sub-images corresponding to super-pixel grids in which a target is detected, thereby reducing memory usage.
  • layering the 3D image includes: denoting the 3D image of size C×H×W by I_{C×H×W}, where C, H, and W are respectively the number of channels of the image (i.e., the number of 2D images composing the 3D image) and the height and width of the 2D images.
  • the 3D image I_{C×H×W} is split in sequence (for example, from top to bottom) into K 3D sub-images, where C_i = C/K and K is a natural number greater than 1, so that each 3D sub-image contains C_i×H×W pixels. That is, the 3D image is split along the channel dimension into groups of C_i images.
  • for example, a 3D image of size 100×320×420 consists of 100 images of 320×420, where 100 is the number of channels, 320 the image height, and 420 the image width.
  • with K = 4, each sub-image can be viewed as an image of size 320×420 in which every pixel is a 25-dimensional vector.
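  • The channel-wise split described above can be sketched in numpy as follows; the 100×320×420 size and K = 4 follow the example in the text, and the variable names are illustrative:

```python
import numpy as np

C, H, W = 100, 320, 420          # number of 2D slices, height, width
K = 4                            # number of layers; C must be divisible by K
volume = np.zeros((C, H, W))     # placeholder for the 3D image I_{C x H x W}

# Split along the channel axis into K sub-images of C_i = C / K slices each.
sub_images = np.split(volume, K, axis=0)
print(len(sub_images), sub_images[0].shape)  # 4 (25, 320, 420)
```

Each element of `sub_images` is one 3D sub-image of C_i×H×W pixels, ready for the intra-layer clustering of step S12.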
  • an operation of performing intra-layer clustering on a 3D sub-image in some embodiments of the present disclosure to obtain a super-pixel grid includes the following steps:
  • step S21 the 3D sub-image is divided into LxM grids in the height and width directions, where L and M are natural numbers greater than one.
  • L and M may be unequal or equal (i.e., LxL or MxM); the only requirement is that the 3D sub-image of height H and width W is divided into multiple grids.
  • the size of the grid (i.e., the values of L and/or M) and its shape (i.e., whether L and M are equal) can be chosen as needed.
  • step S22 LxM grids are used as initial values, and clustering is performed using a super pixel algorithm to obtain a super pixel grid.
  • the 3D sub-image is divided into LxL grids in the height and width directions, that is, the grid may be a square grid (which may also be a rectangular grid).
  • after clustering, each grid is represented by a single vector; the result can be seen as an LxL image in which each pixel is a C_i-dimensional vector.
  • embodiments of the present disclosure obtain a corresponding super-pixel grid by clustering 3D sub-images, and each super-pixel grid includes a plurality of grids generated by dividing in the high and wide directions.
  • the number of pixels of the 3D sub-image is reduced from C_i×H×W to LxL; that is, through the layering of step S11 and the intra-layer clustering of step S12, the redundancy of the 3D sub-image is reduced: related pixels are fused to achieve dimension reduction, so the number of pixels is reduced and the resolution is also reduced.
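  • The grid initialization of step S21 and the resulting dimension reduction can be sketched as follows; averaging each grid cell is used here as a simple stand-in for the full superpixel clustering of step S22, and the values of L and M are illustrative assumptions:

```python
import numpy as np

def init_grid(sub_image, L, M):
    """Divide a (C_i, H, W) 3D sub-image into L x M grids along height and width,
    representing each grid by the mean vector of its pixels (a C_i-dim vector)."""
    c_i, h, w = sub_image.shape
    grid = np.zeros((L, M, c_i))
    hs, ws = h // L, w // M          # cell size; assumes H, W divisible by L, M
    for i in range(L):
        for j in range(M):
            cell = sub_image[:, i * hs:(i + 1) * hs, j * ws:(j + 1) * ws]
            grid[i, j] = cell.mean(axis=(1, 2))
    return grid

sub_image = np.random.default_rng(1).random((25, 320, 420))
grid = init_grid(sub_image, L=16, M=21)   # C_i*H*W pixels -> 16x21 grid of vectors
print(grid.shape)                         # (16, 21, 25)
```

The 25×320×420 sub-image is thus reduced to a 16×21 grid whose entries serve as the initial values for the superpixel algorithm.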
  • Super-pixels can be generated by over-segmentation of various image segmentation methods. For example, SLIC (Simple Linear Iterative Cluster), SEEDS (Superpixels Extracted via Energy-Driven Sampling), LSC (Linear Spectral Clustering), etc.
  • the SLIC algorithm is used to generate superpixels as an example.
  • the general idea of the SLIC algorithm is: convert from the RGB color space to the CIE-Lab color space, and form, for each pixel, a 5-dimensional vector V[L, a, b, x, y] from its (L, a, b) color values and (x, y) coordinates.
  • the similarity of two pixels can then be measured by the distance between their vectors: the larger the distance, the smaller the similarity.
  • the SLIC algorithm first generates K seed points, and then, in the space surrounding each seed point, searches for the pixels nearest to that seed point and classifies them with it until all pixels are classified. The average vector of all pixels in each of the K superpixels is then computed to regenerate K cluster centers, which are again used to search for the most similar pixels around them; after all pixels are re-classified, K superpixels are obtained, the cluster centers are updated, and the process iterates until convergence.
  • for an image of N pixels divided into K superpixels, each superpixel covers about N/K pixels with an approximate side length S = (N/K)^0.5; taking each seed as a cluster center, the 2S*2S region around the cluster center is its search space, and the most similar points are searched in this space.
  • the cluster center is moved to the region with the smallest gradient in the 3*3 window, and the gradient is defined as
  • G(x,y) = ||V(x+1,y) - V(x-1,y)||^2 + ||V(x,y+1) - V(x,y-1)||^2
  • in the CIE-Lab color space, the values of L, a, and b are bounded, while the image size is not; if the image is large, the spatial distance (x, y) has too great an influence when measuring the vector distance.
  • therefore x and y are normalized in order to modulate the effect of the spatial distance (x, y).
  • the improved vector distance is measured as follows:
  • d_lab = [(Lk-Li)^2 + (ak-ai)^2 + (bk-bi)^2]^0.5
  • d_xy = [(xk-xi)^2 + (yk-yi)^2]^0.5
  • D = [d_lab^2 + (d_xy/S)^2 * m^2]^0.5
  • m is the weight used to adjust d_xy; it can be selected in the range 1-20, for example, set to 10.
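  • A minimal sketch of this normalized distance, assuming the standard SLIC form D = [d_lab^2 + (d_xy/S)^2 * m^2]^0.5 with grid interval S (the numeric pixel values below are illustrative):

```python
import math

def slic_distance(p, q, S, m=10):
    """SLIC distance between two pixels p, q given as (L, a, b, x, y) tuples.
    S is the grid interval; m weights the spatial term d_xy against the color term."""
    d_lab = math.sqrt(sum((p[i] - q[i]) ** 2 for i in range(3)))
    d_xy = math.sqrt((p[3] - q[3]) ** 2 + (p[4] - q[4]) ** 2)
    return math.sqrt(d_lab ** 2 + (d_xy / S) ** 2 * m ** 2)

# Two pixels with identical Lab color but 5 pixels apart, grid interval S = 10:
d = slic_distance((50, 0, 0, 0, 0), (50, 0, 0, 3, 4), S=10, m=10)
print(round(d, 2))  # 5.0
```

Dividing d_xy by S makes the spatial term independent of image size, and m trades off spatial compactness against color similarity, as described in the text.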
  • after the iteration converges, any isolated small region d is reclassified into the largest superpixel connected to it, to ensure the integrity of each superpixel.
  • the "detection" in step S13 and step S14 refers to classification using a neural network; the difference is that in step S13 the super-pixel grid is input into the deep neural network and detected with the first classifier, while in step S14 the 3D sub-images forming the super-pixel grid containing the target are detected one by one with the second classifier, whose accuracy is greater than that of the first classifier.
  • the result of the detection in step S13 is a determination of whether there is a target, such as a lesion, in the super-pixel grid. If a target is detected in a super-pixel grid, there is a lesion in the 3D sub-image from which that super-pixel grid was clustered, and further fine detection (step S14) is required; if no target is detected in the super-pixel grid, there is no lesion in the 3D sub-image from which it was clustered, the detection ends, and no finer detection is needed.
  • for example, for detecting pulmonary nodules in lung CT, the dimension-reduced image may first be detected to determine whether a pulmonary nodule may exist; the first classifier used in this process has a low missed-alarm rate while allowing the false-alarm rate to be higher.
  • ideally both the missed-alarm rate and the false-alarm rate are as small as possible, but in general the two cannot be optimal at the same time.
  • the smaller the false-alarm probability, the greater the missed-alarm probability, and the smaller the missed-alarm probability, the greater the false-alarm probability.
  • fixing one error probability and making the other as small as possible, the coarse detection in this embodiment guarantees a low missed-alarm rate as far as possible, that is, targets are not missed, at the cost of a higher false-alarm rate; the purpose of the first-level check (which may be called a rough check) is to select the super-pixel grids containing targets.
  • in step S14, after a target is detected, the 3D sub-image from which the super-pixel grid containing the target was clustered is further subjected to second-level detection (which may be called a fine check).
  • since the detection on super-pixels in step S13, after the clustering, operates on an image of lower resolution, the fine detection in step S14 is performed at the resolution of the 2D images composing the 3D sub-image, so the accuracy of the second classifier used is also higher than that of the first classifier.
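  • The two-stage coarse/fine scheme of steps S13-S14 can be sketched with placeholder classifiers; the thresholds and the classifier stubs below are illustrative assumptions standing in for the neural networks of the disclosure:

```python
def coarse_check(superpixel_grid):
    """First classifier: low missed-alarm rate, higher false-alarm rate allowed.
    Stub: flags the grid whenever any value exceeds a permissive threshold."""
    return max(max(row) for row in superpixel_grid) > 0.3

def fine_check(sub_image_slices):
    """Second, higher-accuracy classifier run on the full-resolution 2D slices.
    Stub: requires stronger per-slice evidence; returns indices of positive slices."""
    return [i for i, s in enumerate(sub_image_slices) if max(s) > 0.8]

def detect(superpixel_grid, sub_image_slices):
    if not coarse_check(superpixel_grid):       # step S13: cheap, low-resolution pass
        return []                               # no target: detection ends here
    return fine_check(sub_image_slices)         # step S14: fine, full-resolution pass

grid = [[0.1, 0.4], [0.2, 0.1]]                 # super-pixel grid (coarse pass fires)
slices = [[0.2, 0.9], [0.1, 0.3], [0.85, 0.2]]  # full-resolution 2D slices
print(detect(grid, slices))                     # [0, 2]
```

Most sub-images are rejected by the cheap coarse pass, so the expensive fine classifier runs only on the few candidates that survive, which is the source of the memory and runtime savings claimed in the text.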
  • the detection result obtained by performing the fine detection in step S14 in this embodiment may be the position or shape of the lesion on the 2D image, or may directly output whether or not there is a lesion.
  • the specific type of detection result can be adjusted by selecting the corresponding neural network structure.
  • a neural network for classification can output whether there is a lesion, or the type of lesion, through a SoftMax classifier, an LR classifier, etc.; a neural network for image segmentation or detection, such as R-CNN or SegNet, can be used to output the position or shape of the lesion on the 2D image.
  • the 3D image detection method provided by the embodiments of the present disclosure layers and clusters the 3D image and performs detection on the super-pixel grids obtained by clustering; only if a target is present is further fine detection performed. This reduces the demand on the computing power of the system: it not only occupies less memory, but also makes training simpler and the running time shorter, improving the efficiency of detection.
  • FIG. 3 also shows a schematic diagram of a 3D image detecting apparatus provided in another embodiment of the present disclosure.
  • the apparatus 300 includes a layering module 310, a clustering module 320, a first detecting module 330, and a second detecting module 340.
  • the layering module 310 is configured to layer the 3D image to obtain at least one 3D sub-image, wherein the 3D sub-image includes a plurality of 2D images;
  • the clustering module 320 is configured to perform intra-layer clustering on the 3D sub-image to obtain a super-pixel grid
  • the first detecting module 330 is configured to input the super pixel grid into the deep neural network for detection
  • the second detecting module 340 is configured to detect, when a target is detected in the super-pixel grid, the 3D sub-image before clustering of the super-pixel grid containing the target, and obtain and output the detection result.
  • the first detecting module 330 detects the super pixel grid by using the first classifier
  • the second detecting module 340 uses the second classifier to detect the 3D sub-image before clustering of the super-pixel grid containing the target, wherein the accuracy of the second classifier is greater than the accuracy of the first classifier.
  • the 3D image detecting device in this embodiment can achieve the same technical effects as the above-described 3D image detecting method, and details are not described herein again.
  • embodiments of the present disclosure also provide an electronic device including a processor, a memory, and computer-executable instructions stored in the memory; when the computer-executable instructions are executed by the processor, the processor performs one or more steps of the 3D image detection method described below:
  • the 3D sub-image forming the super pixel grid containing the target is detected, and the detection result is obtained and output.
  • referring to FIG. 4, there is shown a block diagram of a computer system 800 suitable for implementing an electronic device of an embodiment of the present disclosure.
  • the electronic device shown in FIG. 4 is merely an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the computer system 800 includes one or more processors 801 that can perform various operations in accordance with program instructions stored in the memory 802 (for example, program instructions stored in read-only memory (ROM) or disk storage are loaded into random access memory (RAM) for execution).
  • in the RAM, various programs and data required for the operation of the computer system 800 are also stored.
  • the processor 801 and the memory 802 are connected to each other through a bus 803.
  • An input/output (I/O) interface 804 is also coupled to bus 803.
  • a variety of components can be coupled to the I/O interface 804 to effect the input and output of information.
  • an input device 805 including a keyboard, a mouse, etc.
  • an output device 806 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like
  • a communication device 807 including a network interface card such as a LAN card, a modem, and the like.
  • the communication device 807 performs communication processing via a network such as the Internet.
  • a drive 808 is also connected to the I/O interface 804 as needed.
  • a removable medium 809 such as a magnetic disk, an optical disk, a flash memory or the like, is connected or mounted to the drive 808 as needed.
  • the processor 801 can be a central processing unit (CPU), a field programmable gate array (FPGA), a microcontroller unit (MCU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or another logic operation device having data processing capability and/or program execution capability.
  • the bus 803 may be a Front Side Bus (FSB), a QuickPath Interconnect (QPI), a direct media interface (DMI), a Peripheral Component Interconnect (PCI), a Peripheral Component Interconnect Express (PCI-E), a HyperTransport (HT), or the like.
  • an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer readable medium, the computer program comprising program code for performing the image processing method of at least one embodiment of the present disclosure.
  • the computer program can be downloaded and installed from the network via communication device 807, and/or installed from removable media 809.
  • the above-described functions defined in the system of the present disclosure are executed when the computer program is executed by the processor 801.
  • the computer readable medium shown in the present disclosure may be a computer readable signal medium or a computer readable medium or any combination of the two.
  • the computer readable medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer readable medium can be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus, or device.
  • a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying computer readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer readable signal medium can also be any computer readable medium other than a computer readable medium that can transmit, propagate or transport a program for use by or in connection with an instruction execution system, apparatus or device.
  • Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the foregoing.
  • each block of the flowchart or block diagrams can represent a module, a program segment, or a portion of code that includes one or more executable instructions.
  • the functions noted in the blocks may also occur in a different order than that illustrated in the drawings. For example, two successively represented blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts can be implemented by a dedicated hardware-based system that performs the specified function or operation, or can be used A combination of dedicated hardware and computer instructions is implemented.
  • the units described in the embodiments of the present disclosure may be implemented by software or by hardware.
  • the described unit may also be provided in the processor, for example, as a processor comprising a transmitting unit, an obtaining unit, a determining unit and a first processing unit.
  • the name of these units does not constitute a limitation on the unit itself in some cases.
  • the sending unit may also be described as “a unit that sends a picture acquisition request to the connected server”.
  • the present disclosure also provides a computer readable medium, which may be included in the apparatus described in the above embodiments, or may be separately present and not incorporated into the apparatus.
  • the computer readable medium described above carries one or more programs that, when executed by a processor of the device, cause the processor to perform one or more steps of the 3D image detection method described below:
  • the 3D sub-image forming the super pixel grid containing the target is detected, and the detection result is obtained and output.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

Provided are a 3D image detection method and apparatus, an electronic device, and a computer-readable medium. The 3D image detection method includes: layering a 3D image to obtain at least one 3D sub-image, wherein the 3D sub-image comprises a plurality of 2D images; performing intra-layer clustering on the 3D sub-image to obtain a super-pixel grid; inputting the super-pixel grid into a neural network for detection; and, in response to a target being detected in the super-pixel grid, detecting the 3D sub-image forming the super-pixel grid containing the target, and obtaining and outputting a detection result.

Description

3D image detection method and apparatus, electronic device, and computer-readable medium
Cross Reference
The present disclosure claims priority to Chinese Patent Application No. 201710731517.1, filed on August 23, 2017 and entitled "3D image detection method and apparatus, electronic device, and computer-readable medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of image data processing technologies, and in particular, to a 3D image detection method and apparatus, an electronic device, and a computer-readable medium.
Background
Medical image data plays an extremely important role in the medical diagnosis process. At present, deep learning technology is very widely applied in medical image processing.
Deep learning technology can effectively use a large amount of image data and learn to acquire knowledge, assisting doctors in reading and judging case images such as X-ray, DR, ultrasound, and other two-dimensional images; deep learning has been put to practical use in hospitals.
It should be noted that the information disclosed in the Background section above is only for enhancing the understanding of the background of the present disclosure, and therefore may include information that does not constitute prior art known to those of ordinary skill in the art.
发明内容
本公开的实施例涉及一种3D图像检测方法、装置、电子设备及计算机可读介质。
本公开的其他特性和优点将通过下面的详细描述变得清晰,或者部分地通过本公开的实践而习得。
根据本公开的一个方面,提供一种3D图像检测方法,包括:
对3D图像进行分层,得到至少一个3D子图像,其中所述3D子图像包含多张2D图像;
对所述3D子图像进行层内聚类,得到超像素网格;
将所述超像素网格输入到神经网络中进行检测;
响应于所述神经网络在所述超像素网格中检测到目标,对形成所述含有目标的超像素网格的所述3D子图像按照所述2D图像的分辨率进行检测,得到并输出检测结果。
In some embodiments, the size of the 3D image is C×H×W, where C, H and W are respectively the number of the 2D images and the height and width of the 2D images, and layering the 3D image includes:
splitting the 3D image into K 3D sub-images with C_i = C/K, where K is a natural number greater than 1 and each 3D sub-image contains C_i×H×W pixels.
In some embodiments, performing intra-layer clustering on the 3D sub-image includes:
dividing the 3D sub-image into L×M grid cells along the height and width directions, where L and M are natural numbers greater than 1;
clustering with a super-pixel algorithm using the L×M grid cells as initial values.
In some embodiments, inputting the super-pixel grid into a neural network for detection uses a first classifier for the detection.
In some embodiments, detecting the 3D sub-image that forms the super-pixel grid containing the target uses a second classifier, the precision of the second classifier being greater than that of the first classifier.
According to a second aspect of the present disclosure, a 3D image detection apparatus is also provided, including:
a layering module configured to layer a 3D image to obtain at least one 3D sub-image, where the 3D sub-image contains a plurality of 2D images;
a clustering module configured to perform intra-layer clustering on the 3D sub-image to obtain a super-pixel grid;
a first detection module configured to input the super-pixel grid into a deep neural network for detection;
a second detection module configured to, in response to a target being detected in the super-pixel grid, detect the 3D sub-image that forms the super-pixel grid containing the target, and obtain and output a detection result.
In some embodiments, the size of the 3D image is C×H×W, where C, H and W are respectively the number of the 2D images and the height and width of the 2D images, and the layering module is further configured to split the 3D image into K 3D sub-images with C_i = C/K, each 3D sub-image containing C_i×H×W pixels, where C_i and K are natural numbers greater than 1.
In some embodiments, the first detection module detects the super-pixel grid with a first classifier, and the second detection module detects the pre-clustering 3D sub-image of the super-pixel grid containing the target with a second classifier, where the precision of the second classifier is greater than that of the first classifier.
According to a third aspect of the present disclosure, an electronic device is also provided, including a processor; a memory; and computer-executable instructions stored in the memory which, when run by the processor, cause the processor to perform one or more steps of the 3D image detection method provided by at least one embodiment of the present disclosure.
According to a fourth aspect of the present disclosure, a computer readable medium is also provided, on which computer-executable instructions are stored; when executed by a processor, the computer-executable instructions implement one or more steps of the 3D image detection method provided by at least one of the above embodiments.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain its principles. Obviously, the drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a 3D image detection method provided in an embodiment of the present disclosure.
Fig. 2 is a schematic diagram of step S12 in an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a 3D image detection apparatus provided in another embodiment of the present disclosure.
Fig. 4 is a schematic structural diagram of a computer system of an electronic device provided in yet another embodiment of the present disclosure.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, so their repeated description will be omitted.
Furthermore, the described features, structures or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a full understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced while omitting one or more of the specific details, or other methods, components, apparatuses, steps, etc. may be employed. In other cases, well-known structures, methods, apparatuses, implementations, materials or operations are not shown or described in detail so as not to obscure aspects of the present disclosure.
Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor apparatuses and/or microcontroller apparatuses.
As far as the inventors of the present disclosure are aware, although neural networks are widely used for processing 2D images, applying them to three-dimensional images raises at least one of the following problems:
First, unlike two-dimensional images, interpreting lesions in three-dimensional images generally requires considering the information of multiple adjacent images. When a neural network is used to judge lesions, the three-dimensional image is fed directly into the network for end-to-end training. Taking a conventional CT image as an example, suppose the CT data has 100 slices; with the image height and width unchanged, the CT data has 100 times as many pixels as a two-dimensional image, so even if processing is performed entirely on a GPU, the amount of GPU memory used is far larger than for a two-dimensional image of the same size. Therefore, for three-dimensional images, training with a neural network uses too much memory, the size of the training model is constrained by the GPU memory, and it is difficult to use highly complex network structures.
Second, a 3D image is difficult to train on once fed into a neural network; when the number of connections is large, training takes more time, and converging to a local extremum becomes more difficult.
The concept of neural networks used in the present disclosure, or the currently more widely applied concept of deep learning (multi-layer neural networks with feature learning capability, such as convolutional neural networks), has shown good performance on many image problems such as object recognition, object detection and object classification, and is widely used in all kinds of image processing. Convolutional Neural Networks (CNNs), for example networks containing multiple convolutional layers, can detect features of different regions and scales of an image through different convolutional layers, which enables deep learning methods built on convolutional neural networks to classify and recognize images.
Convolutional neural networks with a variety of structures have been developed. A traditional convolutional neural network usually consists of an input layer, convolutional layers, pooling layers and fully connected layers, i.e., INPUT (input layer) - CONV (convolutional layer) - POOL (pooling layer) - FC (fully connected layer) - OUTPUT (output layer). The convolutional layers extract features; the pooling layers reduce the dimensionality of the input feature maps; the fully connected layers connect all the features and produce the output.
As described above, the present disclosure uses the convolutional neural network to illustrate the basic concepts of neural networks applied to image processing; this is merely schematic. In the field of machine learning, neural networks of many structures can be used for applications such as image processing. Even among convolutional networks, besides the traditional convolutional neural network listed above, there are fully convolutional networks (FCN), the segmentation network SegNet, dilated convolutions, the atrous-convolution-based deep neural networks DeepLab (V1 & V2), the multi-scale-convolution-based deep neural network DeepLab (V3), the multi-path segmentation network RefineNet, and so on.
In the present disclosure, the images referred to may be of various types, for example medical images. Classified by the acquiring device, medical images may include ultrasound images, X-ray computed tomography (CT) images, magnetic resonance imaging (MRI) images, digital subtraction angiography (DSA) and positron emission tomography (PET) images, among others. Classified by content, medical images may include brain MRI images, spinal cord MRI images, fundus images, blood vessel images, pancreatic CT images, lung CT images, and the like.
For example, the image may be acquired by an image acquisition device. When the image is a medical image, the image acquisition device may include, for example, ultrasound equipment, X-ray equipment, MRI equipment, nuclear medicine equipment, medical optical equipment and thermal imaging equipment.
In the present disclosure, 3D images, for example 3D medical images, include X-ray computed tomography (CT) images, magnetic resonance imaging (MRI) images, three-dimensional digital subtraction angiography (3D DSA) and positron emission tomography (PET) images, among others.
It should be noted that the image may also be an image of a person, an animal or plant, or a landscape, and the corresponding 3D image may be formed by a 3D camera, such as a 3D light-field camera, a ToF camera, a multi-lens camera or an RGB-D camera.
In the present disclosure, the image may be a grayscale image or a color image. A color image may be decomposed into R, G and B single-color-channel images.
Fig. 1 is a schematic diagram of a 3D image detection method provided by some embodiments of the present disclosure. The 3D image detection method includes the following steps:
As shown in Fig. 1, in step S11, a 3D image is layered to obtain at least one 3D sub-image, where each 3D sub-image contains a plurality of 2D images.
As shown in Fig. 1, in step S12, intra-layer clustering is performed on the 3D sub-image to obtain a super-pixel grid.
As shown in Fig. 1, in step S13, the super-pixel grid is input into a deep neural network for detection.
As shown in Fig. 1, in step S14, if a target is detected in the super-pixel grid, the pre-clustering 3D sub-image of the super-pixel grid containing the target is detected, and a detection result is obtained and output.
In the 3D image detection method provided by the embodiments of the present disclosure, super-pixel grids are obtained through layering and clustering, low-precision detection is performed on the super-pixel grids, and high-precision detection is performed only on super-pixel grids in which a target is detected, thereby reducing memory usage.
In some embodiments, layering the 3D image includes: denote the size of the 3D image I as C×H×W, where C, H and W are respectively the number of channels of the image (i.e., the number of 2D images composing the 3D image) and the height and width of the 2D images. The 3D image I is split in order (for example, from top to bottom) into K 3D sub-images I^1, ..., I^K, where C_i = C/K and K is a natural number greater than 1, so that each 3D sub-image I^i contains C_i×H×W pixels. That is, the 3D image is split along the channel dimension into groups of C_i images each. For example, a 100x320x420 3D image is 100 images of size 320x420, where 100 is the number of channels, 320 the image height and 420 the image width. If K is 4, the image is split into 4 3D sub-images, each of size 25x320x420, i.e., C_i = C/K = 100/4 = 25; each 3D sub-image thus contains 25 images of size 320x420, carrying a 25-dimensional vector at each pixel position.
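As an illustration, the channel-wise split described above can be sketched in a few lines of Python (the shapes and K = 4 follow the example in the text; the function name and the use of NumPy are our own assumptions, not part of the disclosure):

```python
import numpy as np

def split_3d_image(image: np.ndarray, k: int) -> list:
    """Split a C x H x W volume into K sub-volumes of C/K slices each."""
    c = image.shape[0]
    assert c % k == 0, "channel count C must be divisible by K"
    step = c // k
    return [image[i * step:(i + 1) * step] for i in range(k)]

# The example from the text: 100 slices of 320 x 420, K = 4.
volume = np.zeros((100, 320, 420), dtype=np.float32)
subs = split_3d_image(volume, 4)
print(len(subs), subs[0].shape)  # 4 (25, 320, 420)
```

Each returned sub-volume is a view into the original array, so the split itself costs no extra memory.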
Referring to Fig. 2, the operation of performing intra-layer clustering on a 3D sub-image to obtain a super-pixel grid in some embodiments of the present disclosure includes the following steps:
As shown in Fig. 2, in step S21, the 3D sub-image is divided into L×M grid cells along the height and width directions, where L and M are natural numbers greater than 1.
It should be noted that L and M may be unequal or equal (i.e., L×L or M×M); the purpose is simply to divide the 3D sub-image of height H and width W into multiple cells, and the size (i.e., the values of L and/or M) and shape (i.e., whether L and M are equal) of the grid may be chosen as needed.
As shown in Fig. 2, in step S22, clustering is performed with a super-pixel algorithm using the L×M grid cells as initial values, obtaining a super-pixel grid.
For example, the 3D sub-image is divided into L×L cells along the height and width directions, i.e., the cells may be square (or rectangular). After clustering is complete, each cell is characterized by a single vector: through clustering, the sub-image I^i of size C_i×H×W is represented as a super-pixel grid of size C_i×L×L, which can be regarded as an L×L image in which each pixel is a C_i-dimensional vector.
Thus, in embodiments of the present disclosure, each 3D sub-image is clustered into a corresponding super-pixel grid, and each super-pixel grid includes multiple cells produced by the division along the height and width directions. Taking the division of the 3D sub-image into L×L cells as an example, after layered clustering is complete, the number of pixels of the 3D sub-image is reduced from C_i×H×W to L×L. That is, performing intra-layer clustering on the sub-images through steps S11 and S12 reduces the information redundancy of the 3D sub-image and fuses related pixels, achieving dimensionality reduction: the number of pixels decreases, and so does the resolution.
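As a rough sketch of this dimensionality reduction, the cell-averaging below stands in for a full super-pixel clustering (the function name and the use of plain mean pooling are our own simplifications; the disclosure itself clusters with a super-pixel algorithm):

```python
import numpy as np

def grid_reduce(sub: np.ndarray, l: int) -> np.ndarray:
    """Reduce a Ci x H x W sub-image to an L x L grid of Ci-dimensional
    vectors by averaging each grid cell (a stand-in for clustering)."""
    ci, h, w = sub.shape
    assert h % l == 0 and w % l == 0, "H and W must be divisible by L"
    cells = sub.reshape(ci, l, h // l, l, w // l)
    return cells.mean(axis=(2, 4))   # shape (Ci, L, L)

sub = np.random.rand(25, 320, 420).astype(np.float32)
grid = grid_reduce(sub, 20)
print(grid.shape)  # (25, 20, 20)
```

In this toy version the pixel count drops from 25x320x420 to 20x20 positions, matching the reduction from C_i×H×W to L×L described above.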
Superpixels can be generated by over-segmentation with a variety of image segmentation methods, such as SLIC (Simple Linear Iterative Clustering), SEEDS (Superpixels Extracted via Energy-Driven Sampling) and LSC (Linear Spectral Clustering). Taking SLIC as a schematic example: the general idea of SLIC is to convert the image from the RGB color space to the CIE-Lab color space, so that the (L, a, b) color values and (x, y) coordinates of each pixel form a 5-dimensional vector V[L, a, b, x, y]; the similarity of two pixels can then be measured by the distance between their vectors: the larger the distance, the smaller the similarity.
The SLIC algorithm first generates K seed points, then searches the space around each seed point for the pixels closest to it and assigns them to the same class as that seed point, until all pixels have been assigned. The mean vector of all pixels in each of the K superpixels is then computed to obtain K new cluster centers; each center again searches its surroundings for its most similar pixels, K superpixels are obtained anew once all pixels are assigned, the cluster centers are updated, and the process iterates until convergence.
The SLIC algorithm accepts a parameter K specifying the number of superpixels to generate. If the original image has N pixels, each superpixel after segmentation has roughly N/K pixels, and the side length of each superpixel is roughly S = [N/K]^0.5. A cluster center is taken every S pixels, and its 2S x 2S neighborhood is used as its search space, within which its most similar points are sought.
Optionally, to avoid choosing unreasonable points such as edges and noise as cluster centers, each cluster center is moved within a 3x3 window to the position with the smallest gradient, where the gradient is defined as
G(x, y) = ||V(x+1, y) - V(x-1, y)||^2 + ||V(x, y+1) - V(x, y-1)||^2
This avoids the situation described above.
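A literal reading of this gradient, with the squared differences taken as squared vector norms as SLIC is usually stated, might look as follows (the helper name and array layout are our own assumptions):

```python
import numpy as np

def slic_gradient(v: np.ndarray, x: int, y: int) -> float:
    """G(x, y) = ||V(x+1, y) - V(x-1, y)||^2 + ||V(x, y+1) - V(x, y-1)||^2,
    where v is an H x W x 5 array of [L, a, b, x, y] pixel vectors."""
    gx = v[y, x + 1] - v[y, x - 1]
    gy = v[y + 1, x] - v[y - 1, x]
    return float(np.dot(gx, gx) + np.dot(gy, gy))

# Seeding would then move each cluster center, within its 3 x 3 window,
# to the neighboring position where this gradient is smallest.
```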
Because L, a and b are in the CIE-Lab color space, their magnitudes are bounded, whereas the image size is not; if the image is large, the spatial distance (x, y) will have an excessive influence when measuring the vector distance.
Optionally, to modulate the influence of the spatial distance (x, y), x and y are normalized. The improved vector distance measure is as follows:
d_lab = [(Lk - Li)^2 + (ak - ai)^2 + (bk - bi)^2]^0.5
d_xy = [(Xk - Xi)^2 + (Yk - Yi)^2]^0.5
Ds = d_lab + (m/S) * d_xy
where m is a weight for adjusting d_xy; it may be chosen between 1 and 20, for example set to 10.
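The distance measure above can be written out directly (a minimal sketch; the variable names follow the formulas, and m = 10 is the example weight from the text):

```python
import math

def slic_distance(p1, p2, s: float, m: float = 10.0) -> float:
    """Ds = d_lab + (m / S) * d_xy for pixel vectors p = (L, a, b, x, y)."""
    l1, a1, b1, x1, y1 = p1
    l2, a2, b2, x2, y2 = p2
    d_lab = math.sqrt((l2 - l1) ** 2 + (a2 - a1) ** 2 + (b2 - b1) ** 2)
    d_xy = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
    return d_lab + (m / s) * d_xy

# Two pixels with identical color, 5 pixels apart, with S = 10:
print(slic_distance((50, 0, 0, 0, 0), (50, 0, 0, 3, 4), s=10.0))  # 5.0
```

A larger m weights spatial proximity more heavily, yielding more compact superpixels; a smaller m lets color similarity dominate.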
To avoid cases where some small region d is labeled as belonging to a superpixel but is not connected to it, such a small region d is reassigned to the largest superpixel connected to it, so that every superpixel is complete.
In this embodiment, the "detection" in steps S13 and S14 refers to classification with a neural network. The difference is that in step S13 a first classifier is used when the super-pixel grid is input into the deep neural network for detection, whereas detecting the 3D sub-image in step S14 means detecting, image by image, the 3D sub-image that forms the super-pixel grid containing the target, using a second classifier whose precision is greater than that of the first classifier.
For example, the result of the detection in step S13 is a judgment of whether the target to be detected, for example a lesion, exists in the super-pixel grid. If a target is detected in the super-pixel grid, a lesion exists in the pre-clustering 3D sub-image from which this super-pixel grid was clustered, and further fine detection is needed, i.e., step S14 is performed; if no target is detected in the super-pixel grid, no lesion exists in the corresponding pre-clustering 3D sub-image, detection ends, and no finer detection is needed. In other words, fine detection is performed only on the pre-clustering 3D sub-images of super-pixel grids containing a target, while the other 3D sub-images, whose super-pixel grids contain no target, need no fine detection; training is therefore simpler and training time is greatly reduced.
Taking lung nodules as an example, if the target detected in step S13 is a lung nodule in the 3D image, the dimension-reduced 3D sub-images may first be detected to judge whether a lung nodule may exist. The first classifier used in this process has a low missed-detection rate, while a relatively high false-alarm rate is tolerated.
Although both the missed-detection rate and the false-alarm rate are best kept as small as possible, in general they cannot be optimal simultaneously: other conditions being equal, the smaller the false-alarm probability, the larger the missed-detection probability, and vice versa. Usually, given that one of the two error probabilities must not exceed some value (say 1%), the other is made as small as possible. In this embodiment, the coarse detection keeps the missed-detection rate as low as possible, i.e., no target is missed, at the cost of a somewhat higher false-alarm rate; but the purpose of the first-level check (which may be called the coarse check) is precisely to select the super-pixel grids containing targets.
In step S14, after a target is detected, second-level detection (which may be called the fine check) is performed on the pre-clustering 3D sub-image of the super-pixel grid containing the target. Since the detection in step S13, performed on the clustered superpixels, targets lower-resolution images, the fine detection in step S14 is, by contrast, performed at the resolution of the 2D images composing the 3D sub-image, so the precision of the second classifier it uses is also higher than that of the first classifier.
It should be noted that the detection result obtained by the fine detection of step S14 in this embodiment may be the position or shape of the lesion on the 2D images, or a direct output of whether a lesion exists. The specific type of detection result can be adjusted by choosing a corresponding neural network structure: for example, a SoftMax classifier or an LR classifier can make the network output whether a lesion exists and of which type; a neural network for image segmentation, such as R-CNN or SegNet, can be used to output the position or shape of the lesion on the 2D images.
By layering and clustering the 3D image, first performing layered detection on the resulting super-pixel grids, and performing fine detection only when a target exists, the 3D image detection method provided by the embodiments of the present disclosure reduces the demand on system computing power: it uses little memory, is simple to train, runs quickly, and improves detection efficiency.
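The coarse-to-fine flow of steps S11 to S14 can be summarized in a runnable Python sketch (the classifier stubs and the mean-pooling stand-in for clustering are our own placeholders, not the disclosure's trained networks):

```python
import numpy as np

def detect_3d(image, k, l, coarse, fine):
    """Two-stage detection: a cheap check on L x L super-pixel grids,
    then fine per-slice detection only where the coarse stage fires."""
    c = image.shape[0]
    results = []
    for i in range(k):
        sub = image[i * (c // k):(i + 1) * (c // k)]               # step S11: layering
        ci, h, w = sub.shape
        grid = sub.reshape(ci, l, h // l, l, w // l).mean(axis=(2, 4))  # step S12 stand-in
        if coarse(grid):                                            # step S13: coarse check
            results.extend(fine(slice2d) for slice2d in sub)        # step S14: fine check
    return results

# Toy stand-in classifiers: a "target" is any sufficiently bright region.
coarse = lambda g: bool(g.max() > 0.5)
fine = lambda s: bool(s.max() > 0.5)

vol = np.zeros((100, 320, 420), dtype=np.float32)
vol[30, 100:110, 200:210] = 5.0   # a bright "lesion" on slice 30 (sub-image 1)
print(sum(detect_3d(vol, k=4, l=20, coarse=coarse, fine=fine)))  # 1
```

Only the one sub-image whose low-resolution grid fires the coarse check pays the cost of full-resolution per-slice detection, which is the memory and runtime saving the method aims at.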
Fig. 3 is a schematic diagram of a 3D image detection apparatus provided in another embodiment of the present disclosure. The apparatus 300 includes a layering module 310, a clustering module 320, a first detection module 330 and a second detection module 340.
The layering module 310 is configured to layer a 3D image to obtain at least one 3D sub-image, where the 3D sub-image contains a plurality of 2D images;
the clustering module 320 is configured to perform intra-layer clustering on the 3D sub-image to obtain a super-pixel grid;
the first detection module 330 is configured to input the super-pixel grid into a deep neural network for detection;
the second detection module 340 is configured to, when a target is detected in the super-pixel grid, detect the pre-clustering 3D sub-image of the super-pixel grid containing the target, and obtain and output a detection result.
For the specific working processes of the layering module 310 and the clustering module 320, reference may be made to the corresponding descriptions of the above method, which are not repeated here.
In embodiments of the present disclosure, the first detection module 330 detects the super-pixel grid with a first classifier, and the second detection module 340 detects the pre-clustering 3D sub-image of the super-pixel grid containing the target with a second classifier, where the precision of the second classifier is greater than that of the first classifier.
It should be noted that, for the functions of the modules of the 3D image detection apparatus, reference may be made to the relevant descriptions in the above method embodiments, which are not repeated here.
The 3D image detection apparatus of this embodiment can achieve the same technical effects as the above 3D image detection method, which are not repeated here.
In another aspect, embodiments of the present disclosure also provide an electronic device including a processor, a memory, and computer-executable instructions stored in the memory which, when run by the processor, cause the processor to perform one or more steps of the following 3D image detection method:
layering a 3D image to obtain at least one 3D sub-image, where the 3D sub-image contains a plurality of 2D images;
performing intra-layer clustering on the 3D sub-image to obtain a super-pixel grid;
inputting the super-pixel grid into a neural network for detection;
in response to a target being detected in the super-pixel grid, detecting the 3D sub-image forming the super-pixel grid containing the target, and obtaining and outputting a detection result.
For further description of the one or more steps of this 3D image detection method, reference may be made to the 3D image detection method described above.
Referring now to Fig. 4, which is a schematic structural diagram of a computer system 400 suitable for implementing the electronic device of the embodiments of the present disclosure. The electronic device shown in Fig. 4 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 4, the computer system 800 includes one or more processors 801, which can perform various operations according to program instructions stored in a memory 802 (for example, program instructions stored in a memory 802 such as a read-only memory (ROM) or a conventional disk storage and loaded into a random access memory (RAM)). Various programs and data required for the operation of the computer system 800 are also stored in the memory 802. The processor 801 and the memory 802 are connected to each other through a bus 803. An input/output (I/O) interface 804 is also connected to the bus 803.
A variety of components may be connected to the I/O interface 804 for the input and output of information: for example, an input device 805 including a keyboard, a mouse, and the like; an output device 806 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; and a communication device 807 including a network interface card such as a LAN card or a modem. The communication device 807 performs communication processing via a network such as the Internet. A drive 808 is also connected to the I/O interface 804 as needed. A removable medium 809, such as a magnetic disk, an optical disk or a flash memory, is connected to or mounted on the drive 808 as needed.
The processor 801 may be a central processing unit (CPU), a field-programmable gate array (FPGA), a microcontroller unit (MCU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or another logic device with data processing capability and/or program execution capability.
The bus 803 may be a Front Side Bus (FSB), QuickPath Interconnect (QPI), Direct Media Interface (DMI), Peripheral Component Interconnect (PCI), Peripheral Component Interconnect Express (PCI-E), HyperTransport (HT), or the like.
According to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product including a computer program carried on a computer readable medium, the computer program containing program code for performing the image processing method of at least one embodiment of the present disclosure. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 807, and/or installed from the removable medium 809. When the computer program is executed by the processor 801, the above functions defined in the system of the present disclosure are performed. It should be noted that the computer readable medium shown in the present disclosure may be a computer readable signal medium, a computer readable medium, or any combination of the two. A computer readable medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer readable medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer readable medium may be any tangible medium containing or storing a program that can be used by or in combination with an instruction execution system, apparatus or device. A computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer readable signal medium may also be any computer readable medium other than the above, which can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system performing the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a sending unit, an acquiring unit, a determining unit and a first processing unit. The names of these units do not in some cases limit the units themselves; for example, the sending unit may also be described as "a unit that sends a picture acquisition request to the connected server".
In another aspect, the present disclosure also provides a computer readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer readable medium carries one or more programs which, when executed by a processor of the device, cause the processor to perform one or more steps of the following 3D image detection method:
layering a 3D image to obtain at least one 3D sub-image, where the 3D sub-image contains a plurality of 2D images;
performing intra-layer clustering on the 3D sub-image to obtain a super-pixel grid;
inputting the super-pixel grid into a neural network for detection;
in response to a target being detected in the super-pixel grid, detecting the 3D sub-image forming the super-pixel grid containing the target, and obtaining and outputting a detection result.
For further description of the one or more steps of this 3D image detection method, reference may be made to the 3D image detection method described above.
It should be clearly understood that the present disclosure describes how particular examples are formed and used, but its principles are not limited to any details of these examples. Rather, based on the teachings disclosed herein, these principles can be applied to many other embodiments.
Exemplary embodiments of the present disclosure have been specifically shown and described above. It should be understood that the present disclosure is not limited to the detailed structures, arrangements or implementation methods described here; rather, it is intended to cover various modifications and equivalent arrangements falling within the spirit and scope of the appended claims.

Claims (10)

  1. A 3D image detection method, comprising:
    layering a 3D image to obtain at least one 3D sub-image, wherein the 3D sub-image contains a plurality of 2D images;
    performing intra-layer clustering on the 3D sub-image to obtain a super-pixel grid;
    inputting the super-pixel grid into a neural network for detection;
    in response to a target being detected in the super-pixel grid, detecting the 3D sub-image forming the super-pixel grid containing the target, and obtaining and outputting a detection result.
  2. The 3D image detection method according to claim 1, wherein the size of the 3D image is C×H×W, where C, H and W are respectively the number of the 2D images and the height and width of the 2D images, and layering the 3D image comprises:
    splitting the 3D image into K 3D sub-images with C_i = C/K, the 3D sub-images containing C_i×H×W pixels, where C_i and K are natural numbers greater than 1.
  3. The 3D image detection method according to claim 1 or 2, wherein performing intra-layer clustering on the 3D sub-image comprises:
    dividing the 3D sub-image into L×M grid cells along the height and width directions, where L and M are natural numbers greater than 1;
    clustering with a super-pixel algorithm using the L×M grid cells as initial values.
  4. The 3D image detection method according to any one of claims 1 to 3, wherein inputting the super-pixel grid into a neural network for detection comprises detecting with a first classifier.
  5. The 3D image detection method according to claim 4, wherein detecting the 3D sub-image forming the super-pixel grid containing the target comprises detecting with a second classifier, the precision of the second classifier being greater than that of the first classifier.
  6. A 3D image detection apparatus, comprising:
    a layering module configured to layer a 3D image to obtain a plurality of 3D sub-images, wherein the 3D sub-images contain a plurality of 2D images;
    a clustering module configured to perform intra-layer clustering on the 3D sub-images to obtain at least one super-pixel grid;
    a first detection module configured to input the at least one super-pixel grid into a neural network for detection;
    a second detection module configured to, in response to a target being detected in the super-pixel grid, detect the 3D sub-image forming the super-pixel grid containing the target, and obtain and output a detection result.
  7. The 3D image detection apparatus according to claim 6, wherein the size of the 3D image is C×H×W, where C, H and W are respectively the number of the 2D images and the height and width of the 2D images, and layering the 3D image comprises: splitting the 3D image into K 3D sub-images with C_i = C/K, the 3D sub-images containing C_i×H×W pixels, where C_i and K are natural numbers greater than 1.
  8. The 3D image detection apparatus according to claim 6 or 7, wherein the first detection module detects the super-pixel grid with a first classifier, and the second detection module detects the pre-clustering 3D sub-image of the super-pixel grid containing the target with a second classifier, wherein the precision of the second classifier is greater than that of the first classifier.
  9. An electronic device, comprising:
    a processor;
    a memory;
    and computer-executable instructions stored in the memory which, when run by the processor, cause the processor to perform one or more steps of the method according to any one of claims 1-5.
  10. A computer readable medium having computer-executable instructions stored thereon which, when executed by a processor, implement one or more steps of the 3D image detection method according to any one of claims 1-5.
PCT/CN2018/100838 2017-08-23 2018-08-16 3D image detection method and apparatus, electronic device and computer readable medium WO2019037654A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/335,098 US10929643B2 (en) 2017-08-23 2018-08-16 3D image detection method and apparatus, electronic device, and computer readable medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710731517.1A 2017-08-23 2017-08-23 3D image detection method and apparatus, electronic device and computer readable medium
CN201710731517.1 2017-08-23

Publications (1)

Publication Number Publication Date
WO2019037654A1 true WO2019037654A1 (zh) 2019-02-28

Family

ID=61153667

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/100838 WO2019037654A1 (zh) 3D image detection method and apparatus, electronic device and computer readable medium

Country Status (3)

Country Link
US (1) US10929643B2 (zh)
CN (1) CN107688783B (zh)
WO (1) WO2019037654A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107688783B (zh) 3D image detection method and apparatus, electronic device and computer readable medium
CN108921196A (zh) Semantic segmentation method based on an improved fully convolutional neural network
CN110084181B (zh) Remote sensing image ship target detection method based on a sparse MobileNetV2 network
CN113344947B (zh) Superpixel aggregation segmentation method
CN114511513B (zh) Three-dimensional detection and segmentation method for cerebral aneurysms based on a deep convolutional neural network
CN118097310B (zh) Method for digitally detecting concrete surface defects

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408610A (zh) Method and system for anatomical object detection using marginal space deep neural networks
CN106780460A (zh) Automatic lung nodule detection system for chest CT images
CN107067039A (zh) Superpixel-based fast ship target detection method for SAR images
CN107688783A (zh) 3D image detection method and apparatus, electronic device and computer readable medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5305204A (en) * 1989-07-19 1994-04-19 Kabushiki Kaisha Toshiba Digital image display apparatus with automatic window level and window width adjustment
CN104616289A (zh) Method and system for removing bone tissue from 3D CT images
US9679192B2 (en) * 2015-04-24 2017-06-13 Adobe Systems Incorporated 3-dimensional portrait reconstruction from a single photo
CN105957063B (zh) CT image liver segmentation method and system based on multi-scale weighted similarity measures
US11137462B2 (en) * 2016-06-10 2021-10-05 Board Of Trustees Of Michigan State University System and method for quantifying cell numbers in magnetic resonance imaging (MRI)
CN106296653B (zh) Method and system for segmenting hemorrhage regions in brain CT images based on semi-supervised learning
AU2017324069B2 (en) * 2016-09-06 2019-12-19 Elekta, Inc. Neural network for generating synthetic medical images

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110517244A (zh) Positioning method and system based on DSA images
CN110517244B (zh) Positioning method and system based on DSA images
CN115035381A (zh) SN-YOLOv5 lightweight object detection network and crop-picking detection method
CN115063573A (zh) Multi-scale object detection method based on an attention mechanism
CN116563615A (zh) Inappropriate image classification method based on an improved multi-scale attention mechanism
CN116563615B (zh) Inappropriate image classification method based on an improved multi-scale attention mechanism

Also Published As

Publication number Publication date
US10929643B2 (en) 2021-02-23
US20190347468A1 (en) 2019-11-14
CN107688783B (zh) 2020-07-07
CN107688783A (zh) 2018-02-13

Similar Documents

Publication Publication Date Title
WO2019037654A1 (zh) 3D image detection method and apparatus, electronic device and computer readable medium
US11861829B2 (en) Deep learning based medical image detection method and related device
US11929174B2 (en) Machine learning method and apparatus, program, learned model, and discrimination apparatus using multilayer neural network
Dutande et al. LNCDS: A 2D-3D cascaded CNN approach for lung nodule classification, detection and segmentation
CN109685060B (zh) Image processing method and apparatus
WO2020108525A1 (zh) Image segmentation method and apparatus, diagnosis system, storage medium, and computer device
CN111429421B (zh) Model generation method, medical image segmentation method, apparatus, device, and medium
US20210383534A1 (en) System and methods for image segmentation and classification using reduced depth convolutional neural networks
CN111028242A (zh) Automatic tumor segmentation system and method, and electronic device
Guo et al. Dual attention enhancement feature fusion network for segmentation and quantitative analysis of paediatric echocardiography
WO2021128825A1 (zh) Three-dimensional target detection and model training method and apparatus, device, and storage medium
WO2023137914A1 (zh) Image processing method and apparatus, electronic device, and storage medium
Tan et al. Automated vessel segmentation in lung CT and CTA images via deep neural networks
WO2020168648A1 (zh) Image segmentation method and apparatus, and computer readable storage medium
Hu et al. PolyBuilding: Polygon transformer for building extraction
Duan et al. A novel GA-based optimized approach for regional multimodal medical image fusion with superpixel segmentation
WO2020110774A1 (ja) Image processing apparatus, image processing method, and program
CN110648331A (zh) Detection method for medical image segmentation, and medical image segmentation method and apparatus
CN117710760B (zh) Method for detecting chest X-ray lesions with a residual attention neural network
Han et al. BiRPN-YOLOvX: A weighted bidirectional recursive feature pyramid algorithm for lung nodule detection
KR101923962B1 (ko) Method for supporting viewing of medical images, and apparatus using the same
US11416994B2 (en) Method and system for detecting chest x-ray thoracic diseases utilizing multi-view multi-scale learning
Selvadass et al. SAtUNet: Series atrous convolution enhanced U‐Net for lung nodule segmentation
CN111341438B (zh) Image processing method and apparatus, electronic device, and medium
CN111209946B (zh) Three-dimensional image processing method, image processing model training method, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18848963

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.08.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18848963

Country of ref document: EP

Kind code of ref document: A1