WO2020168648A1 - An image segmentation method, apparatus, and computer-readable storage medium (一种图像分割方法、装置及计算机可读存储介质) - Google Patents


Info

Publication number: WO2020168648A1
Application number: PCT/CN2019/088975
Authority: WIPO (PCT)
Prior art keywords: dimensional, image, images, layer, output
Other languages: English (en), French (fr)
Inventors: 马进 (Ma Jin), 王健宗 (Wang Jianzong)
Original assignee: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Classifications

    • G06F 18/24 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Classification techniques
    • G06T 7/11 — Physics; Computing; Image data processing; Image analysis; Segmentation; Edge detection; Region-based segmentation
    • G06T 2207/10012 — Indexing scheme for image analysis or image enhancement; Image acquisition modality; Still image; Stereo images
    • G06T 2207/20081 — Indexing scheme for image analysis or image enhancement; Special algorithmic details; Training; Learning
    • G06T 2207/20084 — Indexing scheme for image analysis or image enhancement; Special algorithmic details; Artificial neural networks [ANN]
    • Y02T 10/40 — Climate change mitigation technologies related to transportation; Internal combustion engine [ICE] based vehicles; Engine management systems

Definitions

  • This application relates to the field of image segmentation technology, and in particular to an image segmentation method, device and computer-readable storage medium.
  • Cartilage degradation can often predict osteoarthritis, which is a major cause of work disability.
  • Cartilage image segmentation after a knee MRI (Magnetic Resonance Imaging) scan has therefore become one of the important options for quantitative assessment and analysis of cartilage degradation.
  • At present, cartilage image recognition is performed by an imaging physician who examines each layer of the image separately, which is very time-consuming and labor-intensive.
  • Moreover, intra-observer and inter-observer variability is large, which greatly affects the recognition result. It can be seen that, for the purpose of reducing labor costs and improving recognition accuracy, automatic image segmentation programs have great potential in research and production.
  • the embodiments of the present application provide an image segmentation method, which can complete three-dimensional image segmentation through a simpler two-dimensional network structure and lower resource requirements, and achieve an effect similar to a three-dimensional network.
  • In a first aspect, an embodiment of the present application provides an image segmentation method, which includes:
  • segmenting the three-dimensional image to be segmented into multiple three-dimensional grid images; performing two-dimensional image conversion processing on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each three-dimensional grid image; inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results of the multiple three-dimensional grid images; and generating the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  • In a second aspect, an embodiment of the present application provides an image segmentation device that includes units for executing the method of the first aspect. The image segmentation device includes:
  • a segmentation unit, used to segment the three-dimensional image to be segmented into multiple three-dimensional grid images;
  • a conversion unit, configured to perform two-dimensional image conversion processing on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
  • a classification unit, configured to input the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images;
  • a generating unit, configured to generate the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  • In a third aspect, an embodiment of the present application provides an image segmentation device including a processor, a memory, and a communication module, where the memory is used to store program code and the processor is used to call the program code to execute the method of the first aspect and any of its optional implementations.
  • In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores a computer program; the computer program includes program instructions that, when executed by a processor, perform the method of the first aspect above.
  • The embodiments of the application preprocess the three-dimensional image data to be processed to obtain two-dimensional image data, so that two-dimensional image data is input to the image classification model when performing image segmentation on the three-dimensional image to be segmented. This reduces the amount of memory and computation required by the image classification model, reduces the training time and difficulty of the image segmentation model, and achieves an effect similar to that of a three-dimensional network.
  • FIG. 1 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of converting a three-dimensional image to be segmented into two-dimensional image groups, provided by an embodiment of this application;
  • FIG. 3 is a schematic block diagram of an image segmentation device provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an image segmentation device provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application. As shown in the figure, the method may include:
  • The image segmentation device divides the three-dimensional image to be segmented into multiple three-dimensional grid images, and performs two-dimensional image conversion processing on the multiple three-dimensional grid images to obtain the two-dimensional image group corresponding to each of the multiple three-dimensional grid images.
  • The above three-dimensional image to be segmented may be a three-dimensional ultrasound image, a magnetic resonance imaging (MRI) image, a computed tomography (CT) image, or the like. It is worth noting that the three-dimensional image to be segmented should be of the same type as the three-dimensional training sample images used when training the image classification model.
  • the image classification model used to perform image segmentation processing on the three-dimensional image is composed of three two-dimensional convolutional neural networks. Therefore, the input of the aforementioned image classification model should be a two-dimensional image.
  • the image to be processed is a three-dimensional image. Therefore, when performing image segmentation, the original three-dimensional image to be segmented needs to be preprocessed to obtain a two-dimensional image that can be processed by the above-mentioned image classification model.
  • The acquired three-dimensional image to be segmented is first divided into a number of three-dimensional grid images according to a preset size. Then, for the first of these three-dimensional grid images, three two-dimensional slice images are extracted, each passing through the center of the grid image and parallel to one of its three mutually perpendicular faces.
  • These three two-dimensional slice images are used as the two-dimensional image group corresponding to that three-dimensional grid image, and the same method is used to obtain the two-dimensional image groups corresponding to the other three-dimensional grid images.
  • FIG. 2 is a schematic diagram of converting a three-dimensional image to be segmented into two-dimensional image groups, provided by an embodiment of this application.
  • The original three-dimensional image to be segmented is the cube shown in Figure 2(a).
  • The three-dimensional image to be segmented is cut according to the preset size to obtain the three-dimensional grid images shown in Figure 2(b).
  • For each grid image, the slices passing through the origin O and parallel to the xOy, yOz, and xOz planes are extracted to obtain a two-dimensional image group as shown in Figure 2(c).
  • This process is equivalent to cutting the image to be divided into "pixels" with a preset size, and then intercepting three slice images of the "pixels" to represent the "pixels".
  • the above-mentioned "pixels" include several real pixels.
  • The three-dimensional image to be segmented may be an irregular three-dimensional image. Therefore, when dividing an irregular three-dimensional image, boundary grid cells that lack pixels can be zero-filled.
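As an illustration only, the preprocessing described above (cutting the volume into preset-size grid cubes, zero-filling the boundary, and taking the three orthogonal central slices) can be sketched in NumPy; the function name, the cube size, and the axis-to-plane mapping are assumptions for this sketch, not details fixed by the application:

```python
import numpy as np

def extract_slices(volume, cube=28):
    """Split a 3-D volume into cube-shaped grid images (zero-padding the
    boundary) and take, for each cube, the three central slices parallel
    to three mutually perpendicular planes."""
    # Zero-pad so every dimension is a multiple of the cube size,
    # as suggested for irregular boundary grids.
    pad = [(0, (-s) % cube) for s in volume.shape]
    vol = np.pad(volume, pad, mode="constant", constant_values=0)
    groups = {}  # grid coordinate -> (slice_xy, slice_yz, slice_xz)
    c = cube // 2
    for i in range(0, vol.shape[0], cube):
        for j in range(0, vol.shape[1], cube):
            for k in range(0, vol.shape[2], cube):
                g = vol[i:i+cube, j:j+cube, k:k+cube]
                groups[(i//cube, j//cube, k//cube)] = (
                    g[:, :, c],   # central slice parallel to one face
                    g[c, :, :],   # central slice parallel to a second face
                    g[:, c, :],   # central slice parallel to the third face
                )
    return groups
```

Each dictionary entry is one two-dimensional image group, keyed by the grid coordinate needed later when the segmented image is reassembled.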
  • The image segmentation device inputs the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images.
  • Each two-dimensional image group consists of slice images cut from a uniquely determined three-dimensional grid image, that is, there is a one-to-one correspondence between two-dimensional image groups and three-dimensional grid images. Therefore, the classification result obtained by inputting the two-dimensional image group corresponding to a three-dimensional grid image into the trained image classification model can be used as the classification result of that three-dimensional grid image.
  • the image classification model is used to classify each group of two-dimensional images. Therefore, before using the image classification model to classify each group of two-dimensional images, the network structure of the image classification model needs to be constructed. Then obtain the training sample set of the above-mentioned image classification model. Then, use the training set samples to train the network structure of the image classification model to obtain the trained image classification model.
  • The construction of the image classification model's network structure may specifically include: constructing three two-dimensional convolutional neural networks, each of which includes three convolutional layers, one sampling layer, and one output layer; and connecting the three output layers of the three two-dimensional convolutional neural networks to a softmax classifier to obtain the network structure of the image classification model.
  • The output of the convolutional layer of the above convolutional neural network is:

    x^l = f(W^l * x^(l-1) + b^l · E^l)    (1)

  • where l denotes the l-th layer, * denotes the convolution operation, W^l denotes the weights, b^l denotes the bias, and E^l denotes an all-ones matrix whose size matches the output of the current layer.
  • The output of the sampling layer of the above convolutional neural network is:

    x^l = f(β^l · down_{m×n}(x^(l-1)) + b^l)    (2)

  • where down_{m×n}(·) denotes down-sampling over m×n windows and b^l is the bias parameter.
  • The output of the above output layer is:

    y = f(W^l · x^(l-1) + b^l)    (3)

  • The input of the above classifier is obtained by splicing the outputs of the output layers of the above three convolutional neural networks.
  • The probability that the k-th training case output by the above classifier belongs to the u-th category is:

    P(t^(k) = u | x^(k); θ) = exp(θ_u^T x^(k)) / Σ_{v=1}^{K} exp(θ_v^T x^(k))    (4)

  • where θ is the parameter matrix of the softmax layer, of size K × s_l; K denotes the number of categories and s_l is the number of feature maps output by layer l.
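The softmax probability of formula (4) can be sketched numerically as follows; `softmax_prob` is an illustrative name, and the max-subtraction is a standard numerical-stability trick not mentioned in the application:

```python
import numpy as np

def softmax_prob(theta, x):
    """P(t = u | x; theta) for each category u, where theta is the
    K x s_l softmax parameter matrix and x the spliced feature vector."""
    z = theta @ x
    z -= z.max()          # subtract the max logit for numerical stability
    e = np.exp(z)
    return e / e.sum()    # probabilities over the K categories
```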
  • The size of the two-dimensional image input to each two-dimensional convolutional neural network in the above image classification model is 28×28.
  • The size of the convolution kernel of the first convolutional layer is 5×5, the number of convolution kernels is N (in this embodiment, N is set to 28), and the convolution stride is 1, so N feature maps of size (28-5+1)×(28-5+1), i.e. 24×24, are extracted after the first convolutional layer.
  • The pooling window size of the following pooling layer (sampling layer) is 2×2; after average pooling, the size of the above N feature maps becomes 12×12.
  • The size of the convolution kernel of the second convolutional layer is 5×5, the number of convolution kernels is 2N, and the stride is 1, so 2N feature maps of size 8×8 are extracted.
  • The size of the convolution kernel of the third convolutional layer is 5×5, the number of convolution kernels is 4N, and the stride is 1, so 4N feature maps of size 4×4 are extracted after the third convolutional layer. Finally, after the fully connected layer (that is, the output layer), 64N feature maps of size 1×1 are obtained.
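The dimension bookkeeping above can be checked with a few lines of arithmetic, assuming "valid" convolutions (no padding) and the single 2×2 pooling step the text describes:

```python
def conv_out(size, kernel=5, stride=1):
    # "valid" convolution: output = (size - kernel) // stride + 1
    return (size - kernel) // stride + 1

N = 28                      # number of kernels in the first layer
size = 28                   # input slice is 28 x 28
size = conv_out(size)       # conv1: N maps of 24 x 24
assert size == 24
size //= 2                  # 2 x 2 average pooling: 12 x 12
size = conv_out(size)       # conv2: 2N maps of 8 x 8
size = conv_out(size)       # conv3: 4N maps of 4 x 4
features = 4 * N * size * size  # flattened by the fully connected layer
assert features == 64 * N   # 64N values of size 1 x 1
```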
  • the result calculated by the output layer in the convolutional neural network uses Softmax as the regression model for probability distribution, and the probability that the k-th training case meets the u-th category can be calculated using the above formula (4) .
  • Formula (5) is used as the loss function of the convolutional neural network:

    J(θ) = -(1/m) Σ_{k=1}^{m} Σ_{u=1}^{K} 1{t^(k) = u} · log P(t^(k) = u | x^(k); θ) + (λ/2) ‖θ‖²    (5)

  • where t^(k) denotes the ground-truth label, θ denotes all the weight and bias parameters, x^(k) denotes the spliced output of the fully connected layers of the three planes, and the weight decay parameter λ is set to 10^(-2).
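A numerical sketch of the loss in formula (5): cross-entropy over m training cases plus an L2 weight-decay term with λ = 10⁻². The function name and the array layout are illustrative assumptions:

```python
import numpy as np

def loss(theta, X, t, lam=1e-2):
    """Cross-entropy over m training cases plus L2 weight decay.
    theta: K x s_l parameters, X: m x s_l spliced features,
    t: integer ground-truth labels of length m."""
    Z = X @ theta.T                        # m x K logits
    Z -= Z.max(axis=1, keepdims=True)      # numerical stability
    logp = Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))
    m = X.shape[0]
    ce = -logp[np.arange(m), t].mean()     # average negative log-likelihood
    return ce + 0.5 * lam * np.sum(theta ** 2)  # plus weight decay
```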
  • a training sample set needs to be obtained in order to use the above-mentioned training sample set to train the image classification model.
  • obtaining the above-mentioned training sample set may specifically include: firstly, obtaining a three-dimensional training sample image; then, sampling the above-mentioned three-dimensional training sample image to obtain a plurality of three-dimensional grid training sample images. Next, perform the two-dimensional image conversion on the multiple three-dimensional grid training sample images to obtain a two-dimensional training sample image group corresponding to each three-dimensional grid training sample image in the multiple three-dimensional grid training sample images. Finally, mark each group of two-dimensional training sample image groups to obtain the above-mentioned training sample set.
  • The three-dimensional training sample images are images belonging to the same category as the three-dimensional image to be segmented.
  • For example, if the three-dimensional image to be segmented is a brain MRI image, the three-dimensional training sample images should also be brain MRI images.
  • Each two-dimensional training sample image group in the training sample set needs to be labeled with a category. Therefore, prior to sampling the three-dimensional training sample image, existing image segmentation techniques can be used to segment the different types of regions of the three-dimensional training sample image to obtain a sample segmented image. Then, after obtaining the two-dimensional training sample image group corresponding to each three-dimensional grid training sample image, each two-dimensional training sample image group can be labeled according to the sample segmented image to obtain the training sample set.
  • the above-mentioned image classification model is trained through the training sample set. Since the above-mentioned image classification model is composed of three two-dimensional convolutional neural networks, the training method of training the convolutional neural network can be used to train the above-mentioned image classification model.
  • the training process of convolutional neural network is divided into two stages.
  • the first stage is the stage of data propagation from low-level to high-level, that is, the forward propagation stage.
  • the other stage is the stage in which the error is propagated and trained from the high level to the bottom level when the results obtained by the forward propagation do not match the expectations, that is, the back propagation stage.
  • During training, the three two-dimensional slice images of each two-dimensional training sample image group in the training sample set are respectively input into the three two-dimensional convolutional neural networks in the image classification model.
  • This propagation process is the forward propagation process. In forward propagation, the two-dimensional image data input to the convolutional neural network is processed by the convolution and pooling operations of the multiple convolutional layers to extract feature vectors, and the feature vectors are passed into the fully connected layer to obtain the classification result. When the output result matches the expectation, the result is output.
  • the back propagation process is performed. Find the error between the result and the expected value, and then return the error layer by layer, calculate the error of each layer, and then update the weight.
  • the main purpose of this process is to adjust the network weights through training samples and expected values.
  • The error transfer process can be understood as follows. Data passes from the input layer through the convolutional layers, down-sampling layer, and fully connected layer to the output layer, and the transfer of data between layers inevitably causes information loss, which leads to errors.
  • The error contributed by each layer differs, so after computing the total error of the network, the error must be passed back into the network to determine how much of the total error each layer's weights should bear.
  • The error is sent back through the network, and the errors of the fully connected layer, the down-sampling layer, and the convolutional layers are obtained in turn.
  • the optimization algorithm is used to update the weights according to the errors of each layer and the parameters returned. Repeat the above training process until the loss function reaches the optimal level, then end the training.
  • the L-BFGS algorithm is used to optimize the parameters (such as weight parameters, bias parameters, etc.) in the convolutional neural network to make the loss function reach a minimum.
  • The L-BFGS algorithm is an improvement of the quasi-Newton algorithm. Its basic idea is to save and use only the curvature information of the last m iterations to construct an approximation of the Hessian matrix.
  • the L-BFGS algorithm has a fast execution speed, and since each iteration can guarantee the positive definiteness of the approximate matrix, the algorithm is robust.
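As a hedged illustration of the optimizer (not the application's actual training code), SciPy's L-BFGS-B implementation exposes the limited-memory parameter m as `maxcor`; here it minimizes a toy quadratic standing in for the network loss:

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective standing in for the network loss: f(w) = ||w - a||^2.
a = np.array([1.0, -2.0, 3.0])

def f(w):
    return np.sum((w - a) ** 2)

def grad(w):
    return 2.0 * (w - a)

# L-BFGS-B keeps only the last `maxcor` (m) curvature pairs to build
# a low-memory approximation of the Hessian, as described above.
res = minimize(f, x0=np.zeros(3), jac=grad, method="L-BFGS-B",
               options={"maxcor": 10})
```

On this convex objective the optimizer converges to the minimizer `a` in a handful of iterations.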
  • The image segmentation device generates the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  • Each three-dimensional grid image segmented from the three-dimensional image to be segmented may be regarded as a relatively large "pixel" of that image. Therefore, once the classification result of each three-dimensional grid image is obtained through the image classification model, the classification label of each "pixel" in the three-dimensional image to be segmented is known, and the segmented image can be generated from the coordinate position of each three-dimensional grid image in the three-dimensional image to be segmented together with its classification label.
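The reassembly step above can be sketched as painting each grid cube's label over its voxel block; the function name and the grid-coordinate keying are illustrative assumptions matching the earlier preprocessing sketch:

```python
import numpy as np

def assemble_segmentation(labels, shape, cube=28):
    """Build the segmented image by writing each grid image's class
    label over the voxels it covers. `labels` maps grid coordinates
    (i, j, k) to integer class labels."""
    seg = np.zeros(shape, dtype=np.int64)
    for (i, j, k), lab in labels.items():
        # NumPy slicing clips at the array boundary, so cubes that were
        # zero-padded during preprocessing are handled automatically.
        seg[i*cube:(i+1)*cube, j*cube:(j+1)*cube, k*cube:(k+1)*cube] = lab
    return seg
```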
  • In summary, the three-dimensional image to be segmented is divided into multiple three-dimensional grid images, and two-dimensional image conversion processing is performed on them to obtain the two-dimensional image group corresponding to each three-dimensional grid image.
  • The two-dimensional image groups corresponding to the multiple three-dimensional grid images are input into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images.
  • The segmented image of the three-dimensional image is then generated according to the classification results of the multiple three-dimensional grid images.
  • Two-dimensional image data is thus obtained by preprocessing the three-dimensional image data to be processed, so that two-dimensional image data is input to the image classification model when performing image segmentation on the three-dimensional image to be segmented. This reduces the amount of memory and computation required by the model, reduces the training time and difficulty of the image segmentation model, and at the same time achieves an effect similar to that of a three-dimensional network.
  • FIG. 3 is a schematic block diagram of an image segmentation apparatus provided by an embodiment of the present application.
  • The image segmentation device 300 of this embodiment includes: a segmentation unit 310, a first conversion unit 320, a classification unit 330, and a generating unit 340.
  • The segmentation unit 310 is configured to segment the three-dimensional image to be segmented into multiple three-dimensional grid images;
  • the first conversion unit 320 is configured to perform two-dimensional image conversion processing on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
  • the classification unit 330 is configured to input the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images;
  • the generating unit 340 is configured to generate the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  • the foregoing image segmentation device further includes:
  • the acquiring unit is used to acquire the training sample set of the above-mentioned image classification model
  • the training unit is used to train the image classification model using the training sample set to obtain the trained image classification model.
  • the aforementioned image classification model includes three two-dimensional convolutional neural networks, and each of the aforementioned three two-dimensional convolutional neural networks includes three convolutional layers, one sampling layer and one output layer;
  • the three output layers of the above three two-dimensional convolutional neural networks are connected to a softmax classifier.
  • The output of the convolutional layer of the above convolutional neural network is x^l = f(W^l * x^(l-1) + b^l · E^l), where l denotes the l-th layer, * denotes the convolution operation, W^l denotes the weights, b^l denotes the bias, and E^l denotes an all-ones matrix whose size matches the output of the current layer.
  • The output of the sampling layer of the above convolutional neural network is x^l = f(β^l · down_{m×n}(x^(l-1)) + b^l).
  • The output of the above output layer is y = f(W^l · x^(l-1) + b^l).
  • The input of the above classifier is obtained by splicing the outputs of the output layers of the above three convolutional neural networks.
  • The probability that the k-th training case output by the above classifier belongs to the u-th category is P(t^(k) = u | x^(k); θ) = exp(θ_u^T x^(k)) / Σ_{v=1}^{K} exp(θ_v^T x^(k)), where θ is the parameter matrix of the softmax layer, of size K × s_l; K denotes the number of categories and s_l is the number of feature maps output by layer l.
  • The loss function of the convolutional neural network is J(θ) = -(1/m) Σ_{k=1}^{m} Σ_{u=1}^{K} 1{t^(k) = u} · log P(t^(k) = u | x^(k); θ) + (λ/2) ‖θ‖², where t^(k) denotes the ground-truth label, θ denotes all the weight and bias parameters, x^(k) denotes the spliced output of the fully connected layers of the three planes, and the weight decay parameter λ is set to 10^(-2).
  • the acquiring unit is also used to acquire three-dimensional image training samples
  • the image segmentation device further includes:
  • The second conversion unit is configured to perform the two-dimensional image conversion on the multiple three-dimensional grid training sample images to obtain the two-dimensional training sample image group corresponding to each three-dimensional grid training sample image;
  • the marking unit is used for marking each training sample image group to obtain the training sample set.
  • the above-mentioned training unit includes:
  • the input unit is used to input the above-mentioned training samples into the above-mentioned image classification model for forward propagation to obtain the classification result of the training sample set;
  • the update unit is used to perform back propagation according to the classification result and the loss function of the training sample set to update the weight parameter of the image classification model.
  • The first conversion unit is configured to obtain three two-dimensional images from a first three-dimensional grid image, where the first three-dimensional grid image is any one of the multiple three-dimensional grid images;
  • the three two-dimensional images are slice images that pass through the center of the three-dimensional grid image and are respectively parallel to its three mutually perpendicular faces; the three two-dimensional images are taken as the two-dimensional image group corresponding to the first three-dimensional grid image, and the two-dimensional image groups of the other three-dimensional grid images are obtained in the same manner.
  • The classification result includes a classification label corresponding to each of the multiple three-dimensional grid images;
  • the generating unit is configured to generate the segmented image of the three-dimensional image to be segmented according to the coordinate positions of the multiple three-dimensional grid images in the three-dimensional image to be segmented and their classification labels.
  • In summary, the three-dimensional image to be segmented is divided into multiple three-dimensional grid images, and two-dimensional image conversion processing is performed on them to obtain the two-dimensional image group corresponding to each three-dimensional grid image.
  • The two-dimensional image groups corresponding to the multiple three-dimensional grid images are input into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images.
  • The segmented image of the three-dimensional image is then generated according to the classification results of the multiple three-dimensional grid images.
  • Two-dimensional image data is thus obtained by preprocessing the three-dimensional image data to be processed, so that two-dimensional image data is input to the image classification model when performing image segmentation on the three-dimensional image to be segmented. This reduces the amount of memory and computation required by the model, reduces the training time and difficulty of the image segmentation model, and at the same time achieves an effect similar to that of a three-dimensional network.
  • FIG. 4 is a schematic structural diagram of an image segmentation device 400 provided by an embodiment of the present application.
  • the image segmentation device 400 includes a processor, a memory, a communication interface, and one or more programs.
  • the above-mentioned one or more programs are different from the above-mentioned one or more application programs, and the above-mentioned one or more programs are stored in the above-mentioned memory and configured to be executed by the above-mentioned processor.
  • The above program includes instructions for performing the following steps: dividing the three-dimensional image to be segmented into multiple three-dimensional grid images; performing two-dimensional image conversion processing on the multiple three-dimensional grid images to obtain the two-dimensional image group corresponding to each three-dimensional grid image; inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images; and generating the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  • the so-called processor may be a central processing unit (Central Processing Unit, CPU), and the processor may also be other general processors, digital signal processors (Digital Signal Processors, DSPs), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • a computer-readable storage medium stores a computer program.
  • When executed by a processor, the above computer program realizes: dividing the three-dimensional image to be segmented into multiple three-dimensional grid images; performing two-dimensional image conversion processing on the multiple three-dimensional grid images to obtain the two-dimensional image group corresponding to each three-dimensional grid image; inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model to obtain the classification results of the multiple three-dimensional grid images; and generating the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  • the aforementioned computer-readable storage medium may be the internal storage unit of the aforementioned terminal in any of the aforementioned embodiments, such as the hard disk or memory of the terminal.
  • the above-mentioned computer-readable storage medium may also be an external storage device of the above-mentioned terminal, such as a plug-in hard disk equipped on the terminal, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card (Flash Card), etc.
  • the aforementioned computer-readable storage medium may also include both an internal storage unit of the aforementioned terminal and an external storage device.
  • the aforementioned computer-readable storage medium is used to store the aforementioned computer program and other programs and data required by the aforementioned terminal.
  • the aforementioned computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
  • the disclosed system, server, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the above-mentioned units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the above integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes a number of instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application disclose an image segmentation method, apparatus, and computer-readable storage medium, relating to image processing. The method includes: partitioning a three-dimensional image to be segmented into multiple three-dimensional grid images; performing two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the three-dimensional grid images; inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids; and generating a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images. Through the embodiments of the present application, three-dimensional image segmentation can be accomplished with a simpler two-dimensional network structure and lower resource requirements, while achieving results comparable to a three-dimensional network.

Description

An image segmentation method, apparatus, and computer-readable storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on February 18, 2019, with application number 201910124587.X and entitled "An image segmentation method, apparatus, and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of image segmentation, and in particular to an image segmentation method, apparatus, and computer-readable storage medium.
Background
With the development of image segmentation technology, it has been widely applied in the medical field. For example, cartilage degradation is often a precursor of osteoarthritis and has become a major cause of work disability. In treatment and examination, cartilage image segmentation after knee magnetic resonance imaging (Magnetic Resonance Imaging, MRI) scans has become one of the important options for quantitative assessment of cartilage degradation. Usually, cartilage images are identified by radiologists comparing the images layer by layer, which is undoubtedly very time- and labor-consuming. In addition, intra-observer and inter-observer variability is large, which greatly affects the identification results. Therefore, with the aim of reducing labor costs while improving recognition accuracy and quality, automatic image segmentation programs have great potential in both research and production.
Convolutional neural networks have been broadly extended so that they can be applied to three-dimensional image segmentation. However, these truly three-dimensional convolutional neural networks require enormous memory and massive training time, which limits their application in the field of image segmentation.
Summary
The embodiments of the present application provide an image segmentation method that can accomplish three-dimensional image segmentation with a simpler two-dimensional network structure and lower resource requirements, while achieving results comparable to a three-dimensional network.
In a first aspect, an embodiment of the present application provides an image segmentation method, including:
partitioning a three-dimensional image to be segmented into multiple three-dimensional grid images;
performing two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids;
generating a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
In a second aspect, an embodiment of the present application provides an image segmentation apparatus, which includes units for executing the method of the first aspect, including:
a partitioning unit, configured to partition a three-dimensional image to be segmented into multiple three-dimensional grid images;
a conversion unit, configured to perform two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
a classification unit, configured to input the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids;
a generation unit, configured to generate a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
In a third aspect, an embodiment of the present application provides an image segmentation device, including a processor, a memory, and a communication module, where the memory is configured to store program code, and the processor is configured to call the program code to execute the method of the first aspect and any optional implementation thereof.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, execute the method of the first aspect.
In the embodiments of the present application, two-dimensional image data are obtained by preprocessing the three-dimensional image data to be processed, so that the image segmentation model takes two-dimensional image data as input to segment the three-dimensional image. This reduces the memory and computation required by the image classification model, thereby reducing the training time and difficulty of the image segmentation model, while achieving results comparable to a three-dimensional network.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below.
Fig. 1 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of converting a three-dimensional image to be segmented into two-dimensional image groups, provided by an embodiment of the present application;
Fig. 3 is a schematic block diagram of an image segmentation apparatus provided by an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an image segmentation device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings of the embodiments.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application. As shown in the figure, the method may include:
101: The image segmentation apparatus partitions a three-dimensional image to be segmented into multiple three-dimensional grid images, and performs two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images.
In the embodiments of the present application, the three-dimensional image to be segmented may be a three-dimensional ultrasound image, a magnetic resonance imaging (Magnetic Resonance Imaging, MRI) image, a computed tomography (Computed Tomography, CT) image, or the like. It is worth noting that the three-dimensional image to be segmented should be of the same type as the three-dimensional training sample images used to train the image classification model.
In the embodiments of the present application, the image classification model used to segment the three-dimensional image is composed of three two-dimensional convolutional neural networks. Therefore, the input of the image classification model should be two-dimensional images. However, the image to be processed is three-dimensional; hence, when performing segmentation, the original three-dimensional image to be segmented needs to be preprocessed to obtain two-dimensional images that can be processed by the image classification model.
Specifically, the acquired three-dimensional image to be segmented is partitioned into a number of three-dimensional grid images according to a preset size. Then, from a first three-dimensional grid image among these grid images, three two-dimensional slice images are extracted, each passing through the center of the grid image and parallel to one of its three mutually perpendicular surfaces. These three two-dimensional slice images serve as the two-dimensional image group corresponding to this three-dimensional grid. The same method is used to obtain the two-dimensional image groups corresponding to the other three-dimensional grid images.
As shown in Fig. 2, Fig. 2 is a schematic diagram of converting a three-dimensional image to be segmented into two-dimensional image groups, provided by an embodiment of the present application. Suppose the original three-dimensional image to be segmented is the cube shown in Fig. 2(a). After the image is acquired, it is cut according to a preset size to obtain the three-dimensional grid images shown in Fig. 2(b). Then, from the three-dimensional grid shown in Fig. 2(b), the images passing through the origin O and parallel to the xOy, yOz, and xOz planes are extracted, yielding the group of two-dimensional images shown in Fig. 2(c). This process is equivalent to cutting the image to be segmented into "pixels" of a preset size and then extracting three slice images of each "pixel" to represent it; in fact, each such "pixel" contains a number of real pixels.
It can be understood that the three-dimensional image to be segmented may be an irregular solid image; therefore, when partitioning an irregular image, the positions in boundary grids that have no pixels can be padded with 0.
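The partitioning and slice-extraction procedure described above can be sketched as follows. This is a minimal NumPy illustration, not code from the patent; the cell size of 28 and the function names are assumptions chosen for this example.

```python
import numpy as np

def extract_slices(cell):
    """Return the three orthogonal slices through the center of a cubic cell."""
    c = cell.shape[0] // 2
    return cell[c, :, :], cell[:, c, :], cell[:, :, c]

def partition_volume(volume, size=28):
    """Partition a 3D volume into size^3 grid cells (boundary cells are
    zero-padded) and return, for each cell, its grid index mapped to its
    three center slices."""
    pad = [(0, (-s) % size) for s in volume.shape]  # pad so each axis divides evenly
    vol = np.pad(volume, pad, mode="constant")      # positions with no pixels become 0
    groups = {}
    for i in range(0, vol.shape[0], size):
        for j in range(0, vol.shape[1], size):
            for k in range(0, vol.shape[2], size):
                cell = vol[i:i + size, j:j + size, k:k + size]
                groups[(i // size, j // size, k // size)] = extract_slices(cell)
    return groups

# A 50x60x70 volume is padded to 56x84x84 and split into 2*3*3 = 18 cells,
# each represented by three 28x28 slices.
groups = partition_volume(np.random.rand(50, 60, 70), size=28)
```

Each dictionary value is the two-dimensional image group fed to the three network branches; the grid index is kept so the per-cell labels can later be mapped back to voxel coordinates.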
102: The image segmentation apparatus inputs the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids.
In the embodiments of the present application, since a two-dimensional image group consists of slice images extracted from a uniquely determined three-dimensional grid, there is a one-to-one correspondence between two-dimensional image groups and three-dimensional grid images. Therefore, the classification result obtained by inputting the two-dimensional image group corresponding to each three-dimensional grid into the trained image classification model can be taken as the classification result of that three-dimensional grid image.
In the embodiments of the present application, since the image classification model is used to classify each group of two-dimensional images, the network structure of the image classification model needs to be constructed before classification. A training sample set for the image classification model is then obtained, and the network structure is trained using the training sample set to obtain the trained image classification model.
As an optional implementation, constructing the network structure of the image classification model may specifically include: constructing three two-dimensional convolutional neural networks, where each of the three convolutional neural networks includes three convolutional layers, one sampling layer, and one output layer; and connecting a softmax classifier after the three output layers of the three two-dimensional convolutional neural networks, to obtain the network structure of the image classification model.
Among the three convolutional neural networks of the image classification model, the output of a convolutional layer of the convolutional neural network is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l · 1 )   (1)
where l denotes the l-th layer, X_j^l denotes the j-th feature map output by layer l, X_i^{l-1} denotes the i-th feature map input from the previous layer, * denotes the convolution operation, W_{ij}^l denotes the weights, b_j^l denotes the bias, and 1 denotes an all-ones matrix whose size matches the output of the current layer.
The output of the sampling layer of the convolutional neural network is:
    X_j^l(x, y) = (1/S^2) Σ_{m=0}^{S-1} Σ_{n=0}^{S-1} X_j^{l-1}(S·x + m, S·y + n) + b_j^l   (2)
where X_j^l(x, y) denotes the value at position (x, y) of the j-th feature map of layer l, S denotes the sampling coefficient, which is set to 2 in this scheme, m and n denote the sampling-layer offsets, and b_j^l is the bias parameter.
The output of the output layer is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l )   (3)
where the convolution kernel W_{ij}^l has the same size as the feature maps of the previous layer, so that each output feature map has size 1×1.
The input of the classifier is obtained by concatenating the outputs of the output layers of the three convolutional neural networks.
The probability output by the classifier that the k-th training case belongs to the u-th class is:
    P(y^{(k)} = u | h^{(k)}; θ) = exp(θ_u^T h^{(k)}) / Σ_{v=1}^{K} exp(θ_v^T h^{(k)})   (4)
where θ is the parameter matrix of the softmax layer, of size K × n^L, K denotes the number of classes, and n^L denotes the number of feature maps output by layer l, i.e., the length of the concatenated feature vector h^{(k)} of the k-th training case.
Preferably, the two-dimensional images input to the two-dimensional convolutional neural networks of the image classification model have size 28×28. The convolution kernels of the first convolutional layer have size 5×5, the number of kernels is N (in this embodiment N is 28), and the convolution stride is 1; after the first convolutional layer, N feature maps are extracted, each of size (28-5+1)×(28-5+1). The pooling window of the following pooling layer (sampling layer) has size 2×2; after pooling, the size of each of the N feature maps becomes 12×12. The kernels of the second convolutional layer have size 5×5, the number of kernels is 2N, and the stride is 1; after the second convolutional layer, 2N feature maps are extracted, each of size 8×8. The kernels of the third convolutional layer have size 5×5, the number of kernels is 4N, and the stride is 1; after the third convolutional layer, 4N feature maps are extracted, each of size 4×4. Finally, after the fully connected layer (i.e., the output layer), 64N feature maps of size 1×1 are obtained.
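The feature-map sizes quoted above (28 → 24 → 12 → 8 → 4 → 1) can be checked with a small helper. This is only an arithmetic sketch of the preferred configuration of one branch, not an implementation of the network; the function name and the (size, channels) representation are assumptions made for this example.

```python
def branch_shapes(inp=28, kernel=5, pool=2, n=28):
    """Trace (spatial size, number of feature maps) through one 2D branch:
    conv1 -> pooling (sampling layer) -> conv2 -> conv3 -> output layer."""
    shapes = [(inp, 0)]               # raw input image, no feature maps yet
    s = inp - kernel + 1              # conv1, stride 1, no padding: N maps
    shapes.append((s, n))
    s //= pool                        # 2x2 pooling halves each dimension
    shapes.append((s, n))
    s = s - kernel + 1                # conv2: 2N maps
    shapes.append((s, 2 * n))
    s = s - kernel + 1                # conv3: 4N maps
    shapes.append((s, 4 * n))
    flat = s * s * 4 * n              # output layer: kernel = map size -> 1x1 maps
    shapes.append((1, flat))
    return shapes

# branch_shapes() -> [(28, 0), (24, 28), (12, 28), (8, 56), (4, 112), (1, 1792)]
# i.e. the final 1792 maps equal 64N with N = 28, matching the text.
```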
In the embodiments of the present application, the convolutional neural network applies softmax as the regression model to assign probabilities to the results computed by the output layer; the probability that the k-th training case belongs to the u-th class can be calculated using formula (4) above.
In the embodiments of the present application, formula (5) is used as the loss function of the convolutional neural network:
    L(Ω_T) = − Σ_k Σ_{u=1}^{K} t_u^{(k)} · log P( y^{(k)} = u | I_1^{(k)}, I_2^{(k)}, I_3^{(k)}; Ω_T ) + (λ/2) · ||Ω_T||^2   (5)
where t^{(k)} denotes the ground-truth label, I_1^{(k)}, I_2^{(k)}, I_3^{(k)} denote the three original input images of the two-dimensional image group, Ω_T denotes all the parameter and bias settings, h^{(k)} denotes the combined output of the fully connected layers of the three planes, and the weight decay parameter λ is set to 10^-2.
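As a numeric sketch of the loss described above, the following computes softmax class probabilities in the style of formula (4) and a summed cross-entropy with a 10^-2 weight-decay term in the style of formula (5). The toy feature vectors, parameter matrix, and function names are invented for illustration and are not from the patent.

```python
import numpy as np

def softmax_probs(theta, h):
    """Class probabilities for one concatenated feature vector h (formula (4) style)."""
    z = theta @ h
    z -= z.max()                       # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss(theta, H, T, lam=1e-2):
    """Cross-entropy over training cases plus (lam/2)*||theta||^2 weight decay
    (formula (5) style); T holds one-hot ground-truth labels."""
    ce = 0.0
    for h, t in zip(H, T):
        p = softmax_probs(theta, h)
        ce -= np.log(p[np.argmax(t)])  # -log probability of the true class
    return ce + 0.5 * lam * np.sum(theta ** 2)

rng = np.random.default_rng(0)
theta = rng.normal(size=(3, 8))        # K = 3 classes, 8-dim concatenated features
H = rng.normal(size=(5, 8))            # 5 training cases
T = np.eye(3)[rng.integers(0, 3, 5)]   # one-hot labels
total = loss(theta, H, T)
```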
Further, after the network structure of the three-dimensional image segmentation model has been constructed, a training sample set needs to be obtained so that the image classification model can be trained with it.
In the embodiments of the present application, obtaining the training sample set may specifically include: first, obtaining a three-dimensional training sample image; then, sampling the three-dimensional training sample image to obtain multiple three-dimensional grid training sample images; next, performing the two-dimensional image conversion on the multiple three-dimensional grid training sample images to obtain the two-dimensional training sample image group corresponding to each of them; finally, labeling each group of two-dimensional training sample images to obtain the training sample set.
The three-dimensional training sample image is an image of the same category as the three-dimensional image to be segmented. For example, if the three-dimensional image to be segmented is a brain MRI image, the three-dimensional training sample image should also be a brain MRI image.
Since each group of two-dimensional training images in the training sample set needs to be labeled with a class, an existing image segmentation technique can be used, before sampling the three-dimensional training sample image, to segment its regions of different classes and obtain a sample segmented image. Thus, after obtaining the two-dimensional training sample image group corresponding to each three-dimensional grid training sample image, each group can be labeled according to the sample segmented image to obtain the training sample set.
After the training sample set is obtained, the image classification model is trained with it. Since the image classification model is composed of three two-dimensional convolutional neural networks, the training methods for convolutional neural networks can be used to train it.
The training process of a convolutional neural network is divided into two stages. The first stage is the stage in which data propagate from lower levels to higher levels, i.e., forward propagation. The other stage is the stage in which, when the result of forward propagation does not match the expectation, the error is propagated from higher levels back to lower levels, i.e., backpropagation.
In this embodiment, since the convolutional neural network model requires a large number of weights and bias terms, a small amount of noise needs to be added during weight initialization to break symmetry and avoid zero gradients; normally distributed values can be used for initialization. The bias terms are initialized with a small positive number to avoid neuron outputs being constantly 0.
After weight initialization, the three two-dimensional slice images of each two-dimensional training sample image group in the training sample set are input into the three two-dimensional convolutional neural networks of the image classification model, respectively. This is the forward propagation process, in which the input two-dimensional image data pass through the convolution and pooling of the networks' convolutional layers, feature vectors are extracted and passed to the fully connected layers, and the classification result is obtained. When the output matches the expectation, the result is output. Since the image classification model contains three convolutional neural networks, the outputs of three fully connected layers (output layers) are obtained; therefore, the three outputs need to be merged as the classification result, which is then input to the classifier to obtain the classification probability result of this training pass.
When the classification probability result output by the image classification model does not match the expectation, backpropagation is performed. The error between the result and the expectation is computed and then propagated back layer by layer; the error of each layer is calculated, and the weights are updated. The main purpose of this process is to adjust the network weights through the training samples and the expected values. The error propagation process can be understood as follows: data flow from the input layer to the output layer through the convolutional layers, downsampling layers, and fully connected layers; as data pass between layers, loss is inevitably introduced, which produces errors. Since each layer contributes a different error, after the total error of the network is obtained, the error is fed back into the network to determine how much of the total error each layer should bear. When the error is larger than the expected value, it is passed back into the network, and the errors of the fully connected layer, the downsampling layer, and the convolutional layer are obtained in turn. Finally, an optimization algorithm is used to update the weights according to the errors of the layers and the back-propagated parameters. This training process is repeated until the loss function reaches its optimum, at which point training ends.
In the embodiments of the present application, the L-BFGS algorithm is used to optimize the parameters of the convolutional neural network (such as the weight parameters and bias parameters) so that the loss function reaches a minimum. L-BFGS is an improvement of the quasi-Newton algorithm; its basic idea is to keep and use only the curvature information of the most recent m iterations to construct an approximation of the Hessian matrix. L-BFGS executes quickly, and since the positive definiteness of the approximate matrix is guaranteed at each iteration, the algorithm is robust.
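The L-BFGS update described above is available off the shelf. As a hedged illustration (using SciPy's implementation rather than anything specified in the patent), minimizing a simple quadratic stand-in for the network loss shows the limited-memory parameter m, exposed by SciPy as `maxcor`:

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective standing in for the network loss: f(w) = ||w - 3||^2,
# whose minimum is at w = 3 in every coordinate.
def f(w):
    return np.sum((w - 3.0) ** 2)

def grad(w):
    return 2.0 * (w - 3.0)

res = minimize(f, x0=np.zeros(10), jac=grad, method="L-BFGS-B",
               options={"maxcor": 10})  # keep curvature info of the last m = 10 iterations
```

In a real training loop, `f` and `grad` would evaluate loss (5) and its gradients over the training sample set, with `w` the flattened weights and biases.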
103: The image segmentation apparatus generates a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
In the embodiments of the present application, the multiple three-dimensional grid images cut from the three-dimensional image to be segmented can be regarded as relatively large "pixels" of that image. Therefore, once the classification result of each three-dimensional grid is obtained through the image classification model, i.e., the classification label of each "pixel" of the three-dimensional image to be segmented, the segmented image can be generated according to the coordinate positions of the three-dimensional grid images within the three-dimensional image to be segmented and the classification label of each grid image.
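Reassembling the per-cell labels into a full-resolution segmentation can be sketched as follows; this NumPy snippet, with names assumed for the example, broadcasts each grid cell's label back over its voxels and crops away the zero-padded border.

```python
import numpy as np

def assemble_segmentation(labels, cell=28, shape=None):
    """labels: 3D array of per-cell class labels, indexed by grid coordinates.
    Returns a voxel-level label volume, cropped to `shape` when the original
    image was not an exact multiple of the cell size."""
    seg = np.repeat(np.repeat(np.repeat(labels, cell, axis=0),
                              cell, axis=1), cell, axis=2)
    if shape is not None:
        seg = seg[:shape[0], :shape[1], :shape[2]]  # drop padded boundary voxels
    return seg

cell_labels = np.array([[[0, 1], [1, 2]],
                        [[2, 0], [0, 1]]])           # a 2x2x2 grid of class labels
seg = assemble_segmentation(cell_labels, cell=28, shape=(50, 40, 56))
```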
It can be seen that, in the embodiments of the present application, the three-dimensional image to be segmented is partitioned into multiple three-dimensional grid images; two-dimensional image conversion is performed on them to obtain the two-dimensional image group corresponding to each grid image; these groups are then input into the trained image classification model to obtain the classification results of the multiple three-dimensional grids; finally, the segmented image of the three-dimensional image is generated according to these classification results. By preprocessing the three-dimensional image data into two-dimensional image data, the image segmentation model takes two-dimensional input to segment the three-dimensional image, reducing the memory and computation required by the image classification model, thereby reducing the training time and difficulty of the image segmentation model, while achieving results comparable to a three-dimensional network.
An embodiment of the present application further provides an image segmentation apparatus, which includes units for executing any of the foregoing methods. Specifically, referring to Fig. 3, Fig. 3 is a schematic block diagram of an image segmentation apparatus provided by an embodiment of the present application. The image segmentation apparatus 300 of this embodiment includes: a partitioning unit 310, a first conversion unit 320, a classification unit 330, and a generation unit 340.
The partitioning unit 310 is configured to partition a three-dimensional image to be segmented into multiple three-dimensional grid images;
the first conversion unit 320 is configured to perform two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
the classification unit 330 is configured to input the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids;
the generation unit 340 is configured to generate a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
Optionally, the image segmentation apparatus further includes:
an obtaining unit, configured to obtain a training sample set for the image classification model;
a training unit, configured to train the image classification model with the training sample set to obtain the trained image classification model.
Further, the image classification model includes three two-dimensional convolutional neural networks, each of which includes three convolutional layers, one sampling layer, and one output layer;
a softmax classifier is connected after the three output layers of the three two-dimensional convolutional neural networks.
Further, the output of a convolutional layer of the convolutional neural network is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l · 1 )   (1)
where l denotes the l-th layer, X_j^l denotes the j-th feature map output by layer l, X_i^{l-1} denotes the i-th feature map input from the previous layer, * denotes the convolution operation, W_{ij}^l denotes the weights, b_j^l denotes the bias, and 1 denotes an all-ones matrix whose size matches the output of the current layer.
The output of the sampling layer of the convolutional neural network is:
    X_j^l(x, y) = (1/S^2) Σ_{m=0}^{S-1} Σ_{n=0}^{S-1} X_j^{l-1}(S·x + m, S·y + n) + b_j^l   (2)
where X_j^l(x, y) denotes the value at position (x, y) of the j-th feature map of layer l, S denotes the sampling coefficient, which is set to 2 in this scheme, m and n denote the sampling-layer offsets, and b_j^l is the bias parameter.
The output of the output layer is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l )   (3)
where the convolution kernel W_{ij}^l has the same size as the feature maps of the previous layer, so that each output feature map has size 1×1.
The input of the classifier is obtained by concatenating the outputs of the output layers of the three convolutional neural networks.
The probability output by the classifier that the k-th training case belongs to the u-th class is:
    P(y^{(k)} = u | h^{(k)}; θ) = exp(θ_u^T h^{(k)}) / Σ_{v=1}^{K} exp(θ_v^T h^{(k)})   (4)
where θ is the parameter matrix of the softmax layer, of size K × n^L, K denotes the number of classes, and n^L denotes the number of feature maps output by layer l, i.e., the length of the concatenated feature vector h^{(k)} of the k-th training case.
Further, the loss function of the convolutional neural network is:
    L(Ω_T) = − Σ_k Σ_{u=1}^{K} t_u^{(k)} · log P( y^{(k)} = u | I_1^{(k)}, I_2^{(k)}, I_3^{(k)}; Ω_T ) + (λ/2) · ||Ω_T||^2   (5)
where t^{(k)} denotes the ground-truth label, I_1^{(k)}, I_2^{(k)}, I_3^{(k)} denote the three original input images of the two-dimensional image group, Ω_T denotes all the parameter and bias settings, h^{(k)} denotes the combined output of the fully connected layers of the three planes, and the weight decay parameter λ is set to 10^-2.
Further, the obtaining unit is also configured to obtain three-dimensional image training samples;
the image segmentation apparatus further includes:
a collection unit, configured to sample the three-dimensional image training samples to obtain multiple three-dimensional grid training sample images;
a second conversion unit, configured to perform the two-dimensional image conversion on the multiple three-dimensional grid training sample images to obtain the two-dimensional training sample image group corresponding to each of them;
a labeling unit, configured to label each training sample image group to obtain the training sample set.
Further, the training unit includes:
an input unit, configured to input the training sample set into the image classification model for forward propagation to obtain classification results for the training sample set;
an update unit, configured to perform backpropagation according to the classification results of the training sample set and the loss function, so as to update the weight parameters of the image classification model.
Further, the first conversion unit is configured to obtain three two-dimensional images from a first three-dimensional grid image, where the first three-dimensional grid image is any one of the multiple three-dimensional grid images, and the three two-dimensional images are slice images that pass through the center of the three-dimensional grid image and are respectively parallel to its three mutually perpendicular surfaces; to take the three two-dimensional images as the two-dimensional image group corresponding to the first three-dimensional grid image; and to obtain, in the same manner as for the first three-dimensional grid image, the two-dimensional image groups of the other three-dimensional grid images.
Further, the classification results include a classification label corresponding to each of the multiple three-dimensional grid images;
the generation unit is configured to generate the segmented image of the three-dimensional image to be segmented according to the coordinate positions of the multiple three-dimensional grid images within the three-dimensional image to be segmented and the classification labels of the multiple three-dimensional grid images.
It can be seen that, in the embodiments of the present application, the three-dimensional image to be segmented is partitioned into multiple three-dimensional grid images; two-dimensional image conversion is performed on them to obtain the two-dimensional image group corresponding to each grid image; these groups are then input into the trained image classification model to obtain the classification results of the multiple three-dimensional grids; finally, the segmented image of the three-dimensional image is generated according to these classification results. By preprocessing the three-dimensional image data into two-dimensional image data, the image segmentation model takes two-dimensional input to segment the three-dimensional image, reducing the memory and computation required by the image classification model, thereby reducing the training time and difficulty of the image segmentation model, while achieving results comparable to a three-dimensional network.
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of an image segmentation device 400 provided by an embodiment of the present application. As shown in Fig. 4, the image segmentation device 400 includes a processor, a memory, a communication interface, and one or more programs, where the one or more programs are different from the one or more application programs, are stored in the memory, and are configured to be executed by the processor.
The program includes instructions for performing the following steps: partitioning a three-dimensional image to be segmented into multiple three-dimensional grid images; performing two-dimensional image conversion on the multiple three-dimensional grid images to obtain the two-dimensional image group corresponding to each of the multiple three-dimensional grid images; inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model to obtain the classification results of the multiple three-dimensional grids; and generating the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
It should be understood that, in the embodiments of the present application, the processor may be a central processing unit (Central Processing Unit, CPU); it may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or any conventional processor.
Another embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, realizes: partitioning a three-dimensional image to be segmented into multiple three-dimensional grid images; performing two-dimensional image conversion on the multiple three-dimensional grid images to obtain the two-dimensional image group corresponding to each of them; inputting the two-dimensional image groups into the trained image classification model to obtain the classification results of the multiple three-dimensional grids; and generating the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
The computer-readable storage medium may be an internal storage unit of the terminal of any of the foregoing embodiments, such as the hard disk or memory of the terminal. It may also be an external storage device of the terminal, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, or a flash card (Flash Card) equipped on the terminal. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the terminal. It is used to store the computer program and other programs and data required by the terminal, and may also be used to temporarily store data that has been or will be output.
In the several embodiments provided in this application, it should be understood that the disclosed system, server, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may also be electrical, mechanical, or other forms of connection.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes a number of instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks, optical discs, and other media that can store program code.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements shall all fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. An image segmentation method, characterized by comprising:
    partitioning a three-dimensional image to be segmented into multiple three-dimensional grid images;
    performing two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
    inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids;
    generating a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  2. The method according to claim 1, characterized in that, before inputting the two-dimensional image groups corresponding to the multiple three-dimensional grid images into the trained image classification model, the method further comprises:
    obtaining a training sample set for the image classification model;
    training the image classification model with the training sample set to obtain the trained image classification model.
  3. The method according to claim 1, characterized in that the image classification model comprises three two-dimensional convolutional neural networks, each of which comprises three convolutional layers, one sampling layer, and one output layer;
    a softmax classifier is connected after the three output layers of the three two-dimensional convolutional neural networks.
  4. The method according to claim 3, characterized in that the output of a convolutional layer of the convolutional neural network is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l · 1 )   (1)
    where l denotes the l-th layer, X_j^l denotes the j-th feature map output by layer l, X_i^{l-1} denotes the i-th feature map input from the previous layer, * denotes the convolution operation, W_{ij}^l denotes the weights, b_j^l denotes the bias, and 1 denotes an all-ones matrix whose size matches the output of the current layer;
    the output of the sampling layer of the convolutional neural network is:
    X_j^l(x, y) = (1/S^2) Σ_{m=0}^{S-1} Σ_{n=0}^{S-1} X_j^{l-1}(S·x + m, S·y + n) + b_j^l   (2)
    where X_j^l(x, y) denotes the value at position (x, y) of the j-th feature map of layer l, S denotes the sampling coefficient, which is set to 2 in this scheme, m and n denote the sampling-layer offsets, and b_j^l is the bias parameter;
    the output of the output layer is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l )   (3)
    where the convolution kernel W_{ij}^l has the same size as the feature maps of the previous layer, so that each output feature map has size 1×1;
    the input of the classifier is obtained by concatenating the outputs of the output layers of the three convolutional neural networks;
    the probability output by the classifier that the k-th training case belongs to the u-th class is:
    P(y^{(k)} = u | h^{(k)}; θ) = exp(θ_u^T h^{(k)}) / Σ_{v=1}^{K} exp(θ_v^T h^{(k)})   (4)
    where θ is the parameter matrix of the softmax layer, of size K × n^L, K denotes the number of classes, and n^L denotes the number of feature maps output by layer l, i.e., the length of the concatenated feature vector h^{(k)} of the k-th training case.
  5. The method according to claim 3, characterized in that the loss function of the convolutional neural network is:
    L(Ω_T) = − Σ_k Σ_{u=1}^{K} t_u^{(k)} · log P( y^{(k)} = u | I_1^{(k)}, I_2^{(k)}, I_3^{(k)}; Ω_T ) + (λ/2) · ||Ω_T||^2   (5)
    where t^{(k)} denotes the ground-truth label, I_1^{(k)}, I_2^{(k)}, I_3^{(k)} denote the three original input images of the two-dimensional image group, Ω_T denotes all the parameter and bias settings, h^{(k)} denotes the combined output of the fully connected layers of the three planes, and the weight decay parameter λ is set to 10^-2.
  6. The method according to claim 2, characterized in that obtaining the training sample set for the image classification model comprises:
    obtaining three-dimensional image training samples;
    sampling the three-dimensional image training samples to obtain multiple three-dimensional grid training sample images;
    performing the two-dimensional image conversion on the multiple three-dimensional grid training sample images to obtain the two-dimensional training sample image group corresponding to each of them;
    labeling each training sample image group to obtain the training sample set.
  7. The method according to claim 6, characterized in that training the image classification model with the training sample set comprises:
    inputting the training sample set into the image classification model for forward propagation to obtain classification results for the training sample set;
    performing backpropagation according to the classification results of the training sample set and the loss function, so as to update the weight parameters of the image classification model.
  8. The method according to any one of claims 1-6, characterized in that performing two-dimensional image conversion on the multiple three-dimensional grid images to obtain the two-dimensional image group corresponding to each of the multiple three-dimensional grid images comprises:
    obtaining three two-dimensional images from a first three-dimensional grid image, where the first three-dimensional grid image is any one of the multiple three-dimensional grid images, and the three two-dimensional images are slice images that pass through the center of the three-dimensional grid image and are respectively parallel to its three mutually perpendicular surfaces;
    taking the three two-dimensional images as the two-dimensional image group corresponding to the first three-dimensional grid image;
    obtaining, in the same manner as for the first three-dimensional grid image, the two-dimensional image groups of the other three-dimensional grid images among the multiple three-dimensional grid images.
  9. The method according to claim 1, characterized in that the classification results comprise a classification label corresponding to each of the multiple three-dimensional grid images;
    generating the segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images comprises:
    generating the segmented image of the three-dimensional image to be segmented according to the coordinate positions of the multiple three-dimensional grid images within the three-dimensional image to be segmented and the classification labels of the multiple three-dimensional grid images.
  10. An image segmentation apparatus, characterized by comprising:
    a partitioning unit, configured to partition a three-dimensional image to be segmented into multiple three-dimensional grid images;
    a first conversion unit, configured to perform two-dimensional image conversion on the multiple three-dimensional grid images to obtain a two-dimensional image group corresponding to each of the multiple three-dimensional grid images;
    a classification unit, configured to input the two-dimensional image groups corresponding to the multiple three-dimensional grid images into a trained image classification model to obtain classification results for the multiple three-dimensional grids;
    a generation unit, configured to generate a segmented image of the three-dimensional image according to the classification results of the multiple three-dimensional grid images.
  11. The image segmentation apparatus according to claim 10, characterized by further comprising:
    an obtaining unit, configured to obtain a training sample set for the image classification model;
    a training unit, configured to train the image classification model with the training sample set to obtain the trained image classification model.
  12. The image segmentation apparatus according to claim 10, characterized in that the image classification model comprises three two-dimensional convolutional neural networks, each of which comprises three convolutional layers, one sampling layer, and one output layer;
    a softmax classifier is connected after the three output layers of the three two-dimensional convolutional neural networks.
  13. The image segmentation apparatus according to claim 12, characterized in that the output of a convolutional layer of the convolutional neural network is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l · 1 )   (1)
    where l denotes the l-th layer, X_j^l denotes the j-th feature map output by layer l, X_i^{l-1} denotes the i-th feature map input from the previous layer, * denotes the convolution operation, W_{ij}^l denotes the weights, b_j^l denotes the bias, and 1 denotes an all-ones matrix whose size matches the output of the current layer;
    the output of the sampling layer of the convolutional neural network is:
    X_j^l(x, y) = (1/S^2) Σ_{m=0}^{S-1} Σ_{n=0}^{S-1} X_j^{l-1}(S·x + m, S·y + n) + b_j^l   (2)
    where X_j^l(x, y) denotes the value at position (x, y) of the j-th feature map of layer l, S denotes the sampling coefficient, which is set to 2 in this scheme, m and n denote the sampling-layer offsets, and b_j^l is the bias parameter;
    the output of the output layer is:
    X_j^l = f( Σ_i X_i^{l-1} * W_{ij}^l + b_j^l )   (3)
    where the convolution kernel W_{ij}^l has the same size as the feature maps of the previous layer, so that each output feature map has size 1×1;
    the input of the classifier is obtained by concatenating the outputs of the output layers of the three convolutional neural networks;
    the probability output by the classifier that the k-th training case belongs to the u-th class is:
    P(y^{(k)} = u | h^{(k)}; θ) = exp(θ_u^T h^{(k)}) / Σ_{v=1}^{K} exp(θ_v^T h^{(k)})   (4)
    where θ is the parameter matrix of the softmax layer, of size K × n^L, K denotes the number of classes, and n^L denotes the number of feature maps output by layer l, i.e., the length of the concatenated feature vector h^{(k)} of the k-th training case.
  14. The image segmentation apparatus according to claim 12, characterized in that the loss function of the convolutional neural network is:
    L(Ω_T) = − Σ_k Σ_{u=1}^{K} t_u^{(k)} · log P( y^{(k)} = u | I_1^{(k)}, I_2^{(k)}, I_3^{(k)}; Ω_T ) + (λ/2) · ||Ω_T||^2   (5)
    where t^{(k)} denotes the ground-truth label, I_1^{(k)}, I_2^{(k)}, I_3^{(k)} denote the three original input images of the two-dimensional image group, Ω_T denotes all the parameter and bias settings, h^{(k)} denotes the combined output of the fully connected layers of the three planes, and the weight decay parameter λ is set to 10^-2.
  15. The image segmentation apparatus according to claim 11, characterized in that the obtaining unit is further configured to obtain three-dimensional image training samples;
    the image segmentation apparatus further comprises:
    a collection unit, configured to sample the three-dimensional image training samples to obtain multiple three-dimensional grid training sample images;
    a second conversion unit, configured to perform the two-dimensional image conversion on the multiple three-dimensional grid training sample images to obtain the two-dimensional training sample image group corresponding to each of them;
    a labeling unit, configured to label each training sample image group to obtain the training sample set.
  16. The image segmentation apparatus according to claim 15, characterized in that the training unit comprises:
    an input unit, configured to input the training sample set into the image classification model for forward propagation to obtain classification results for the training sample set;
    an update unit, configured to perform backpropagation according to the classification results of the training sample set and the loss function, so as to update the weight parameters of the image classification model.
  17. The image segmentation apparatus according to any one of claims 10-16, characterized in that the first conversion unit is configured to: obtain three two-dimensional images from a first three-dimensional grid image, where the first three-dimensional grid image is any one of the multiple three-dimensional grid images, and the three two-dimensional images are slice images that pass through the center of the three-dimensional grid image and are respectively parallel to its three mutually perpendicular surfaces; take the three two-dimensional images as the two-dimensional image group corresponding to the first three-dimensional grid image; and obtain, in the same manner as for the first three-dimensional grid image, the two-dimensional image groups of the other three-dimensional grid images among the multiple three-dimensional grid images.
  18. The image segmentation apparatus according to claim 10, characterized in that the classification results comprise a classification label corresponding to each of the multiple three-dimensional grid images;
    the generation unit is configured to generate the segmented image of the three-dimensional image to be segmented according to the coordinate positions of the multiple three-dimensional grid images within the three-dimensional image to be segmented and the classification labels of the multiple three-dimensional grid images.
  19. An image segmentation device, characterized in that the image segmentation device comprises a processor, a memory, and a communication module, where the memory is configured to store program code, and the processor is configured to call the program code to execute the method according to any one of claims 1-9.
  20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method according to any one of claims 1-9.
PCT/CN2019/088975 2019-02-18 2019-05-29 An image segmentation method, apparatus, and computer-readable storage medium WO2020168648A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910124587.X 2019-02-18
CN201910124587.XA CN109978888B (zh) 2019-02-18 2019-02-18 An image segmentation method, apparatus, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2020168648A1 true WO2020168648A1 (zh) 2020-08-27

Family

ID=67077046

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088975 WO2020168648A1 (zh) 2019-02-18 2019-05-29 一种图像分割方法、装置及计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN109978888B (zh)
WO (1) WO2020168648A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915609A (zh) * 2020-09-22 2020-11-10 平安科技(深圳)有限公司 病灶检测分析方法、装置、电子设备及计算机存储介质
CN112102284A (zh) * 2020-09-14 2020-12-18 推想医疗科技股份有限公司 图像分割模型的训练样本的标记方法、训练方法及装置
CN115937229A (zh) * 2022-12-29 2023-04-07 深圳优立全息科技有限公司 一种基于超体素和图割算法的三维自动分割方法及装置
CN112102284B (zh) * 2020-09-14 2024-05-28 推想医疗科技股份有限公司 图像分割模型的训练样本的标记方法、训练方法及装置

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443839A (zh) * 2019-07-22 2019-11-12 艾瑞迈迪科技石家庄有限公司 一种骨骼模型空间配准方法及装置
US20240152747A1 (en) * 2022-11-08 2024-05-09 UnitX, Inc. Three-dimensional spatial-channel deep learning neural network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050251021A1 (en) * 2001-07-17 2005-11-10 Accuimage Diagnostics Corp. Methods and systems for generating a lung report
CN106355194A (zh) * 2016-08-22 2017-01-25 广东华中科技大学工业技术研究院 一种基于激光成像雷达的无人艇水面目标处理方法
CN106803251A (zh) * 2017-01-12 2017-06-06 西安电子科技大学 由ct影像确定主动脉缩窄处压力差的装置与方法
CN107424145A (zh) * 2017-06-08 2017-12-01 广州中国科学院软件应用技术研究所 基于三维全卷积神经网络的核磁共振图像的分割方法
CN107563983A (zh) * 2017-09-28 2018-01-09 上海联影医疗科技有限公司 图像处理方法以及医学成像设备
CN108603922A (zh) * 2015-11-29 2018-09-28 阿特瑞斯公司 自动心脏体积分割
CN108664848A (zh) * 2017-03-30 2018-10-16 杭州海康威视数字技术股份有限公司 图像目标的识别方法及装置
CN109118564A (zh) * 2018-08-01 2019-01-01 湖南拓视觉信息技术有限公司 一种基于融合体素的三维点云标记方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8290305B2 (en) * 2009-02-13 2012-10-16 Harris Corporation Registration of 3D point cloud data to 2D electro-optical image data
US8494285B2 (en) * 2010-12-09 2013-07-23 The Hong Kong University Of Science And Technology Joint semantic segmentation of images and scan data
CN108573491A (zh) * 2017-03-10 2018-09-25 南京大学 一种基于机器学习的三维超声图像分割方法
US10751548B2 (en) * 2017-07-28 2020-08-25 Elekta, Inc. Automated image segmentation using DCNN such as for radiation therapy
CN108717568B (zh) * 2018-05-16 2019-10-22 陕西师范大学 一种基于三维卷积神经网络的图像特征提取与训练方法


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102284A (zh) * 2020-09-14 2020-12-18 推想医疗科技股份有限公司 图像分割模型的训练样本的标记方法、训练方法及装置
CN112102284B (zh) * 2020-09-14 2024-05-28 推想医疗科技股份有限公司 图像分割模型的训练样本的标记方法、训练方法及装置
CN111915609A (zh) * 2020-09-22 2020-11-10 平安科技(深圳)有限公司 病灶检测分析方法、装置、电子设备及计算机存储介质
CN111915609B (zh) * 2020-09-22 2023-07-14 平安科技(深圳)有限公司 病灶检测分析方法、装置、电子设备及计算机存储介质
CN115937229A (zh) * 2022-12-29 2023-04-07 深圳优立全息科技有限公司 一种基于超体素和图割算法的三维自动分割方法及装置
CN115937229B (zh) * 2022-12-29 2023-08-04 深圳优立全息科技有限公司 一种基于超体素和图割算法的三维自动分割方法及装置

Also Published As

Publication number Publication date
CN109978888A (zh) 2019-07-05
CN109978888B (zh) 2023-07-28

Similar Documents

Publication Publication Date Title
WO2020168648A1 (zh) 一种图像分割方法、装置及计算机可读存储介质
CN110321920B (zh) 图像分类方法、装置、计算机可读存储介质和计算机设备
CN108921851B (zh) 一种基于3d对抗网络的医学ct图像分割方法
CN111429421B (zh) 模型生成方法、医学图像分割方法、装置、设备及介质
CN111028327B (zh) 一种三维点云的处理方法、装置及设备
CN109377500B (zh) 基于神经网络的图像分割方法及终端设备
JP2019076699A (ja) 偽陽性低減での小結節検出
CN110738235B (zh) 肺结核判定方法、装置、计算机设备及存储介质
TW202125415A (zh) 三維目標檢測及模型的訓練方法、設備、儲存媒體
CN111291825A (zh) 病灶分类模型训练方法、装置、计算机设备和存储介质
WO2019037654A1 (zh) 3d图像检测方法、装置、电子设备及计算机可读介质
CN113177592B (zh) 一种图像分割方法、装置、计算机设备及存储介质
CN112102230A (zh) 超声切面识别方法、系统、计算机设备和存储介质
CN111144449B (zh) 图像处理方法、装置、存储介质及电子设备
CN109961435B (zh) 脑图像获取方法、装置、设备及存储介质
CN113724185A (zh) 用于图像分类的模型处理方法、装置及存储介质
CN115147426B (zh) 基于半监督学习的模型训练与图像分割方法和系统
Wu et al. Semiautomatic segmentation of glioma on mobile devices
CN116188478A (zh) 图像分割方法、装置、电子设备及存储介质
CN115984257A (zh) 一种基于多尺度transformer的多模态医学图像融合方法
CN111126424A (zh) 一种基于卷积神经网络的超声图像分类方法
CN116310194A (zh) 一种配电站房三维模型重建方法、系统、设备和存储介质
JP7337303B2 (ja) 学習装置、及び学習方法
CN109584194A (zh) 基于卷积变分概率模型的高光谱图像融合方法
CN112581513B (zh) 锥束计算机断层扫描图像特征提取与对应方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19916343

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19916343

Country of ref document: EP

Kind code of ref document: A1