WO2023232137A1 - Training method for image processing model, image processing method and apparatus - Google Patents

Training method for image processing model, image processing method and apparatus

Info

Publication number
WO2023232137A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
region growing
image processing
module
pooling
Application number
PCT/CN2023/097972
Other languages
English (en)
French (fr)
Inventor
王纯亮
董嘉慧
张超
赵清华
毛益进
刘伟
Original Assignee
北京阅影科技有限公司
Application filed by 北京阅影科技有限公司
Publication of WO2023232137A1

Classifications

    • G — Physics; G06 — Computing; Calculating or Counting
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06T7/11 Region-based segmentation (image analysis; segmentation; edge detection)
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V10/764 Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V10/82 Image or video recognition or understanding using pattern recognition or machine learning, using neural networks

Definitions

  • The present disclosure relates to the field of artificial intelligence, and specifically to a training method for an image processing model based on a differentiable region growing module, an image processing method, a device, and a computer-readable medium.
  • Convolutional neural networks (CNN) are widely used for such image processing tasks.
  • New network frameworks have been proposed in some situations to constrain structural topology and connectivity. For example, higher-order neighborhood information can be obtained by using a series of graph convolutions instead of traditional convolutional layers. It has also been proposed to use an attention network to aggregate CNN features, and to integrate graph convolutional networks into a unified CNN architecture, constructing a new graph network that jointly learns to represent global image features, including connectivity, together with local appearance.
  • A new connectivity-aware similarity metric (clDice) based on centerline extraction has also been proposed, which ensures the connectivity of blood vessel segments by calculating the overlap between the morphological skeleton of the processed blood vessel mask and the gold standard mask.
  • However, clDice cannot achieve satisfactory results on small blood vessels.
  • An object of the present disclosure is to provide a training method, an image processing method, a device and a computer-readable medium for an image processing model based on a differentiable region growing module.
  • Embodiments of the present disclosure provide a training method for an image processing model.
  • The method includes: obtaining a training data set, the training data set including a plurality of sample images and a real label image corresponding to each sample image; and training the image processing model based on the training data set to obtain a trained image processing model, wherein the image processing model is composed of a differentiable region growing module connected to a deep learning network module for performing image processing, the differentiable region growing module is used to perform a region growing operation to obtain a connected domain feature image of the sample image, and training the image processing model is based at least on the connected domain feature image.
  • The combination operation includes any one of the following: performing a pixel-wise multiplication between the seed function after the max pooling expansion operation and the corresponding pixels of the input image; and performing a pixel-wise minimum operation between the seed function after the max pooling expansion operation and the corresponding pixels of the input image.
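  • As an illustration only (not part of the original disclosure), the two combination operations can be sketched in PyTorch as follows, assuming binary (N, 1, H, W) float tensors; the tensor names and sizes are assumptions:

      import torch
      import torch.nn.functional as F

      # Seed function and input image as (N, 1, H, W) float tensors in [0, 1].
      seed = torch.zeros(1, 1, 8, 8)
      seed[0, 0, 4, 4] = 1.0                    # a single seed point
      image = torch.rand(1, 1, 8, 8).round()    # binary mask standing in for the input image

      # Max pooling expansion: 3x3 kernel, stride 1, padding 1 keeps the spatial size.
      dilated = F.max_pool2d(seed, kernel_size=3, stride=1, padding=1)

      # Combination option 1: pixel-wise multiplication restricts growth to the mask.
      combined_mul = dilated * image

      # Combination option 2: pixel-wise minimum damps responses beyond breakpoints.
      combined_min = torch.minimum(dilated, image)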
  • In some embodiments, the differentiable region growing module is connected after the output layer of the deep learning network module, and training the image processing model includes: using the deep learning network module to perform image processing prediction on the sample image and output the prediction result image from the output layer; inputting the prediction result image as a first input image to the differentiable region growing module; inputting the real label image of the sample image as a second input image to the differentiable region growing module; using the differentiable region growing module, performing a region growing operation based on the first input image and the seed function to obtain a first region growing result of the seed function as a first connected domain feature image of the sample image; using the differentiable region growing module, performing a region growing operation based on the second input image and the seed function to obtain a second region growing result of the seed function as a second connected domain feature image of the sample image; and calculating a target loss function value based on the first connected domain feature image and the second connected domain feature image, and adjusting parameters of the deep learning network module based on the target loss function value.
  • In the target loss function, X represents the real label image, Y represents the prediction result image, and S is the seed function; g(X,S) is the connected domain feature image grown from the real label image, and g(Y,S) is the connected domain feature image grown from the prediction result image. The loss function L_c penalizes disconnected domains more heavily.
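  • The formula for L_c itself is not reproduced in this text. Assuming a standard Dice form over the two connected domain feature images, consistent with the definitions above, it would read:

      L_c = 1 - 2*|g(X,S) ∩ g(Y,S)| / (|g(X,S)| + |g(Y,S)|)

    where |·| sums over pixels. Any pixel disconnected from the seed points is absent from g(·,S), so disconnections directly reduce the overlap term, which is why this loss penalizes disconnected domains more heavily than a plain Dice loss. This is a reconstruction, not the patent's stated formula.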
  • In some embodiments, the seed function is generated based on any one of an equal spacing strategy, a pooling-unpooling strategy, and a breakpoint pooling strategy.
  • Generating a seed function based on the equal spacing strategy includes: constructing an image with the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
  • Generating a seed function based on the pooling-unpooling strategy includes: performing a max pooling operation on the real label image to obtain one or more local maxima; performing an unpooling operation on the result image after the max pooling operation to restore the one or more local maxima to their actual positions in the real label image; and using the result image after the unpooling operation as the seed function.
  • Generating a seed function based on the breakpoint pooling strategy includes: subtracting the prediction result image from the real label image; performing a max pooling expansion operation on the subtracted image; multiplying the image obtained after the max pooling expansion operation with the prediction result image to obtain an intersection image; performing a max pooling operation on the intersection image to obtain one or more local maxima; performing an unpooling operation on the result image after the max pooling operation to restore the one or more local maxima to their actual positions in the intersection image; and using the result image after the unpooling operation as the seed function.
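  • As an illustration only, the three seed function generation strategies can be sketched in PyTorch as follows, for binary (N, 1, H, W) tensors; the spacing, kernel sizes, and function names are illustrative assumptions rather than values fixed by this disclosure:

      import torch
      import torch.nn.functional as F

      def seeds_equal_spacing(shape, spacing=8):
          # Equal spacing strategy: seed points on a regular grid, rest background.
          seed = torch.zeros(shape)
          seed[..., ::spacing, ::spacing] = 1.0
          return seed

      def seeds_pool_unpool(label, kernel=9):
          # Pooling-unpooling strategy: keep one local maximum per kernel window,
          # then restore each maximum to its actual position in the label image.
          pooled, idx = F.max_pool2d(label, kernel, stride=kernel, return_indices=True)
          return F.max_unpool2d(pooled, idx, kernel, stride=kernel,
                                output_size=label.shape[-2:])

      def seeds_breakpoint_pool(pred, label, kernel=9):
          # Breakpoint pooling strategy: place seeds near breakpoints of the prediction.
          diff = (label - pred).abs()                           # |I_gt - I_pre|
          dilated = F.max_pool2d(diff, 3, stride=1, padding=1)  # max pooling expansion
          intersection = dilated * pred                         # points adjacent to breakpoints
          pooled, idx = F.max_pool2d(intersection, kernel, stride=kernel,
                                     return_indices=True)
          return F.max_unpool2d(pooled, idx, kernel, stride=kernel,
                                output_size=pred.shape[-2:])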
  • The max pooling expansion operation includes: using a pooling kernel to perform a max pooling operation with a stride of 1 on the seed function. When the sample image is a one-dimensional image, the size of the pooling kernel is 3; when the sample image is a two-dimensional image, 3*3; when the sample image is a three-dimensional image, 3*3*3; and when the sample image is a four-dimensional image, 3*3*3*3.
  • In other embodiments, the input end of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module, and the output end of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer.
  • In this case, training the image processing model includes: inputting the first feature image generated by the first intermediate layer as a third input image to the differentiable region growing module; using the differentiable region growing module, performing a region growing operation based on the third input image and the seed function to obtain a third region growing result of the seed function as a third connected domain feature image of the sample image; inputting the third connected domain feature image to the second intermediate layer to be fused with the second feature image generated by the second intermediate layer; using the deep learning network module to perform image processing prediction based on the fused feature image; calculating a target loss function value based on the prediction result; and adjusting parameters of the deep learning network module based on the target loss function value.
  • The target loss function is one of a cross-entropy loss function, a Dice loss function and a focal loss function.
  • In this case, the seed function is generated based on any one of an equal spacing strategy and a pooling-unpooling strategy.
  • Generating a seed function based on the equal spacing strategy includes: constructing an image with the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
  • Generating a seed function based on the pooling-unpooling strategy includes: receiving a third feature image from a third intermediate layer of the deep learning network module; performing a max pooling operation on the third feature image to obtain one or more local maxima; performing an unpooling operation on the result image after the max pooling operation to restore the one or more local maxima to their actual positions in the third feature image; and using the result image after the unpooling operation as the seed function.
  • a convolution layer is further used to perform a convolution operation on the third feature image.
  • Inputting the third connected domain feature image to the second intermediate layer to be fused with the second feature image generated by the second intermediate layer includes: performing a pixel-by-pixel superposition operation on the third connected domain feature image and the second feature image generated by the second intermediate layer to obtain a fused feature image.
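  • As an illustration only, this parallel embedding and fusion can be sketched as follows; the sub-module names are hypothetical, and region_grow(image, seed, t) stands for the iterative max pooling expansion and combination operation described in this disclosure (a concrete sketch of such a helper appears later in this document). The sketch assumes the grown image and the second feature image have matching shapes; otherwise a convolution can align them, as noted above:

      import torch.nn as nn

      class ParallelRegionGrowing(nn.Module):
          # Hypothetical wiring of the differentiable region growing module
          # between two intermediate layers of a deep learning network.
          def __init__(self, front, middle, back, seed_fn, t=10):
              super().__init__()
              self.front, self.middle, self.back = front, middle, back
              self.seed_fn, self.t = seed_fn, t

          def forward(self, x):
              feat1 = self.front(x)                     # first intermediate layer output
              seed = self.seed_fn(feat1)                # e.g. equal-spacing seed function
              grown = region_grow(feat1, seed, self.t)  # connected domain feature image
              feat2 = self.middle(feat1)                # second intermediate layer output
              fused = feat2 + grown                     # pixel-by-pixel superposition
              return self.back(fused)                   # remaining layers of the network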
  • Embodiments of the present disclosure also provide a training device for an image processing model.
  • The device includes: an image acquisition component for acquiring a training data set, the training data set including a plurality of sample images and a real label image corresponding to each sample image; and a training component for training the image processing model based on the training data set to obtain a trained image processing model, wherein the image processing model is composed of a differentiable region growing module connected to a deep learning network module for performing image processing, the differentiable region growing module is used to perform a region growing operation to obtain a connected domain feature image of the sample image, and training the image processing model is performed based at least on the connected domain feature image.
  • Embodiments of the present disclosure also provide a method for image processing, which includes: acquiring an image to be processed; and performing an image processing operation on the image to be processed based on the deep learning network module in a trained image processing model to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold, wherein the trained image processing model is obtained by the training method of the image processing model described in any one of the preceding items.
  • Embodiments of the present disclosure also provide a method for image processing, including: acquiring an image to be processed; and performing an image processing operation on the image to be processed based on a trained image processing model to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold, wherein the trained image processing model is composed of a differentiable region growing module connected to a deep learning network module for performing image processing, the input end of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module and the output end of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer, the differentiable region growing module is used to perform a region growing operation to obtain the connected domain feature image of the image to be processed, and the image processing operation performed on the image to be processed based on the trained image processing model is based at least on the connected domain feature image.
  • the trained image processing model is obtained based on any of the above image processing model training methods.
  • The region growing operation includes: taking the feature image generated by an intermediate layer of the deep learning network module for the image to be processed as the input image; performing a max pooling expansion operation on the seed function; performing a combination operation on the seed function after the max pooling expansion operation and the input image; and repeating the above steps until an iteration number threshold is reached, obtaining the region growing result of the seed function as the connected domain feature image of the image to be processed.
  • The max pooling expansion operation includes: using a pooling kernel to perform a max pooling operation with a stride of 1 on the seed function. When the image to be processed is a one-dimensional image, the size of the pooling kernel is 3; when the image to be processed is a two-dimensional image, 3*3; when the image to be processed is a three-dimensional image, 3*3*3; and when the image to be processed is a four-dimensional image, 3*3*3*3.
  • The combination operation includes any one of the following: performing a pixel-wise multiplication between the seed function after the max pooling expansion operation and the corresponding pixels of the input image; and performing a pixel-wise minimum operation between the seed function after the max pooling expansion operation and the corresponding pixels of the input image.
  • Performing an image processing operation on the image to be processed includes: inputting a first feature image generated by the first intermediate layer for the image to be processed as an input image to the differentiable region growing module; using the differentiable region growing module, performing a region growing operation based on the input image and the seed function to obtain the region growing result of the seed function as the connected domain feature image of the image to be processed; inputting the connected domain feature image to the second intermediate layer to be fused with the second feature image generated by the second intermediate layer; and using the deep learning network module to perform image prediction based on the fused feature image to obtain a processed image with connectivity.
  • Here, the seed function is generated based on any one of an equal spacing strategy and a pooling-unpooling strategy.
  • Generating a seed function based on the equal spacing strategy includes: constructing an image with the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
  • Generating a seed function based on the pooling-unpooling strategy includes: receiving a third feature image from a third intermediate layer of the deep learning network module; performing a max pooling operation on the third feature image to obtain one or more local maxima; performing an unpooling operation on the result image after the max pooling operation to restore the one or more local maxima to their actual positions in the third feature image; and using the result image after the unpooling operation as the seed function.
  • a convolution layer is further used to perform a convolution operation on the third feature image.
  • Inputting the connected domain feature image to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer includes: superimposing the connected domain feature image and the second feature image generated by the second intermediate layer pixel by pixel to obtain a fused feature image.
  • Embodiments of the present disclosure also provide a device for image processing, including: an image acquisition component for acquiring an image to be processed; a processing component for performing an image processing operation on the image to be processed based on the deep learning network module in a trained image processing model to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold; and an output component configured to output the processed image with connectivity, wherein the trained image processing model is obtained based on the training method of the image processing model described in any one of the preceding items.
  • Embodiments of the present disclosure also provide a device for image processing, including: an image acquisition component for acquiring an image to be processed; a processing component for performing an image processing operation on the image to be processed based on a trained image processing model to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold; and an output component configured to output the processed image with connectivity, wherein the trained image processing model is composed of a differentiable region growing module connected to a deep learning network module for performing image processing, the input end of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module and the output end of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer, the differentiable region growing module is used to perform a region growing operation to obtain a connected domain feature image, and the image processing operation performed by the processing component on the image to be processed based on the trained image processing model is based at least on the connected domain feature image.
  • the trained image processing model is obtained based on the training method of the image processing model according to any one of the aforementioned methods.
  • Embodiments of the present disclosure also provide an electronic device, including a memory and a processor, wherein the memory stores program code readable by the processor, and when the processor executes the program code, any one of the methods described above is performed.
  • Embodiments of the present disclosure also provide a computer-readable storage medium having computer-executable instructions stored thereon, and the computer-executable instructions are used to perform any one of the methods described above.
  • Figure 1 shows a schematic diagram of the application architecture of an image processing model training method and an image processing method based on a trained image processing model according to an embodiment of the present disclosure
  • Figure 2 shows a schematic diagram of a traditional UNet network architecture for image segmentation processing
  • FIG. 3 is a schematic diagram illustrating the region growing operation of the differentiable region growing module 300, taking the real label image associated with the sample image as an example;
  • Figure 4 shows a schematic structural diagram of an image processing model 400 based on a differentiable region growing module according to one embodiment of the present disclosure
  • Figure 5 shows three automatic seed function generation strategies according to an embodiment of the present disclosure: the equal spacing strategy, the pooling-unpooling strategy and the breakpoint pooling strategy;
  • Figure 6 shows the effects of the seed functions generated by the three different seed function generation strategies;
  • Figure 7 shows a schematic structural diagram of an image processing model 700 based on a differentiable region growing module according to another embodiment of the present disclosure
  • FIG. 8 shows a flowchart of a training method 800 for training an image processing model according to an embodiment of the present disclosure
  • Figure 9 is a flowchart illustrating example implementation details of training the image processing model in conjunction with the image processing model 400 shown in Figure 4;
  • Figure 10 is a flowchart illustrating example implementation details of training the image processing model in conjunction with the image processing model 700 shown in Figure 7;
  • Figure 11 is a flow chart of an image processing method based on the trained image processing model 400
  • Figure 12 is a flow chart describing another image processing method based on the trained image processing model 700;
  • Figure 13 shows a training device for an image processing model according to an embodiment of the present disclosure
  • Figure 14 shows a schematic structural diagram of an image processing device according to an embodiment of the present disclosure.
  • Figure 15 shows a schematic diagram of a storage medium according to an embodiment of the present disclosure.
  • The present disclosure proposes an improved image processing model and its training method, which, by embedding a novel differentiable region growing module into a traditional deep learning network, can solve the problem of discontinuous connections in tissue structures that traditional deep learning image processing models produce in medical images, and ensure the connectivity of the tissue structure.
  • The training method of the image processing model and the image processing method according to the embodiments of the present disclosure are applicable not only to medical images but also to non-medical images with region connectivity requirements, and the present disclosure imposes no restriction in this respect.
  • Figure 1 shows a schematic diagram of the application architecture of the image processing model training method and the image processing method based on the trained image processing model according to an embodiment of the present disclosure, including a server 100 and a terminal device 200.
  • the terminal device 200 may be, for example, a medical device.
  • the user may view the processing results of the medical image based on the terminal device 200 .
  • the terminal device 200 and the server 100 can be connected through the Internet to realize communication with each other.
  • the Internet described above uses standard communications technologies and/or protocols.
  • The network is usually the Internet, but can also be any other network, including but not limited to a Local Area Network (LAN), Metropolitan Area Network (MAN), Wide Area Network (WAN), a mobile, wired or wireless network, a private network, or any combination of virtual private networks.
  • Data exchanged over the network is represented using technologies and/or formats including HyperText Markup Language (HTML), Extensible Markup Language (XML), etc. In addition, conventional encryption technologies such as Secure Socket Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN) and Internet Protocol Security (IPsec) can be used to encrypt all or some of the links.
  • In other embodiments, customized and/or dedicated data communication technologies may also be used in place of, or in addition to, the above-described data communication technologies.
  • the server 100 can provide various network services for the terminal device 200, where the server 100 can be a server, a server cluster including several servers, or a cloud computing center.
  • The server 100 may include a processor 110 (Central Processing Unit, CPU), a memory 120, an input device 130, an output device 140, and the like.
  • the input device 130 may include a keyboard, a mouse, a touch screen, etc.
  • the output device 140 may include a display device, such as a liquid crystal display (Liquid Crystal Display, LCD), a cathode ray tube (Cathode Ray Tube, CRT), etc.
  • Memory 120 may include read-only memory (ROM) and random access memory (RAM), and provides program instructions and data stored in memory 120 to processor 110 .
  • the memory 120 may be used to store the training method of the image processing model or the program of the image processing method in the embodiment of the present disclosure.
  • the processor 110 calls the program instructions stored in the memory 120, and the processor 110 is configured to execute the steps of any image processing model training method or image processing method in the embodiments of the present disclosure according to the obtained program instructions.
  • the training method of the image processing model or the image processing method is executed by the server 100 side.
  • The terminal device 200 can send the collected medical images to the server 100, the server 100 performs deep learning image processing on the medical images, and the result can be returned to the terminal device 200.
  • the application architecture shown in Figure 1 is explained by taking the application on the server 100 side as an example.
  • The method in the embodiments of the present disclosure can also be executed by the terminal device 200. For example, the terminal device 200 can obtain a trained image processing model from the server 100 side, process medical images based on the trained image processing model, and obtain processing results, which is not limited in the embodiments of the present disclosure.
  • Figure 2 shows a schematic diagram of a traditional UNet network architecture for image segmentation processing.
  • The UNet network architecture includes a U-shaped network architecture and skip-layer connections.
  • The UNet network architecture is a symmetrical network architecture, including left and right paths. The path on the left can be regarded as an encoder, which can also be called the downsampling processing path. It includes five convolution sub-modules, each of which includes two convolution layers and a ReLU layer; the convolution layers uniformly use 3×3 convolution kernels. Each sub-module is followed by a downsampling layer implemented by max pooling. The convolution sub-modules are used to extract features, and the max pooling layers are used to reduce the dimensions; the resolution of the output feature image is halved after each max pooling layer.
  • the feature map output by the last convolution sub-module is directly input to the decoder on the right without going through max pooling.
  • The path on the right can be regarded as a decoder, which can also be called the upsampling processing path. It has a structure basically symmetrical to the encoder, and performs 3×3 convolution and upsampling on the input feature maps to gradually restore the details and spatial dimensions of the object.
  • Feature fusion is also used in the network: as shown by the dotted arrows in Figure 2, the features from the earlier downsampling part of the network and the features from the later upsampling part are concatenated and fused through skip-layer connections to obtain more accurate contextual information and achieve better processing results.
  • the UNet network model finally outputs a segmentation map of the target image.
  • the pixel value of each pixel in the segmentation map can be a label representing its category.
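  • As an illustration only, the UNet pattern described above can be condensed into a two-level PyTorch sketch (the architecture in Figure 2 uses five convolution sub-modules; the channel widths and names here are illustrative):

      import torch
      import torch.nn as nn

      def double_conv(cin, cout):
          # Two 3x3 convolutions, each followed by a ReLU; padding keeps the size.
          return nn.Sequential(
              nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
              nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
          )

      class TinyUNet(nn.Module):
          # Each max pooling halves the resolution; the skip-layer connection
          # concatenates encoder features with the upsampled decoder features.
          def __init__(self, cin=1, cout=2, width=16):
              super().__init__()
              self.enc1 = double_conv(cin, width)
              self.enc2 = double_conv(width, width * 2)
              self.pool = nn.MaxPool2d(2)
              self.up = nn.ConvTranspose2d(width * 2, width, 2, stride=2)
              self.dec1 = double_conv(width * 2, width)
              self.head = nn.Conv2d(width, cout, 1)   # per-pixel class scores

          def forward(self, x):
              e1 = self.enc1(x)                       # full-resolution features
              e2 = self.enc2(self.pool(e1))           # half-resolution features
              d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
              return self.head(d1)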
  • embodiments of the present disclosure propose a novel region growing module and propose to embed it into a traditional deep learning network for image processing.
  • By embedding the region growing module as a special layer into the deep learning network, the region growing module (or layer) can directly participate in the training and prediction processes of the network (participation in prediction is optional, but participation in training is required).
  • the region growing module here serves as a special layer that allows "gradient” to pass through, thereby ensuring the training of the network, so it can be called the “differentiable region growing module” below.
  • The novel differentiable region growing module can perform a region growing operation to obtain the connected domain features of the sample image, so that a traditional deep learning network for image processing in which the module is embedded can obtain features about region connectivity from the differentiable region growing module, thereby achieving image processing while ensuring connectivity.
  • The following takes the UNet network architecture shown in Figure 2 as an example of a traditional deep learning network for image processing, to describe how the novel differentiable region growing module of the embodiments of the present disclosure can be embedded into it to form a novel image processing model, and how to train the image processing model with the embedded differentiable region growing module so that it can better learn the connected domain features of an image.
  • image processing here can be, for example, various image processing processes such as image transformation, image recognition, image classification, image segmentation, etc., and the present disclosure is not limited to this.
  • The novel differentiable region growing module proposed by this disclosure is designed to be connected to a traditional deep learning network module in a parallel or serial manner. It can perform a differentiable expansion operation on the seed function based on the feature image of a sample image received from an intermediate layer of the deep learning network module, the real label image associated with the sample image, or the prediction result image generated for the sample image and received from the output layer of the deep learning network module, so that the seed points are restricted to grow within connected areas of the sample image (for example, tissue areas such as blood vessels, small blood vessels and organs), thereby obtaining the connected domain features of the sample image.
  • A schematic process of the region growing operation of the differentiable region growing module 300 according to an embodiment of the present disclosure will now be described with reference to FIG. 3.
  • The novel differentiable region growing module of the embodiments of the present disclosure performs a region growing operation on the seed function based on a received image associated with the sample image, to obtain the connected domain feature image of the sample image.
  • The image associated with the sample image is one of: the feature image generated for the sample image by an intermediate layer of the deep learning network module in which the region growing module is embedded, the prediction result image generated for the sample image by the output layer of the deep learning network module, and the real label image of the sample image.
  • FIG. 3 schematically illustrates the region growing operation of the differentiable region growing module 300, taking the real label image associated with the sample image as an example.
  • the real label image shown in Figure 3 is a binary image, in which black pixels represent pixels belonging to the tissue area and white pixels represent background pixels.
  • the region growing operation performed by the differentiable region growing module 300 is an iterative process. After each expansion of the seed point X, the expansion result is combined with the real label image, so that the seed point X is restricted to grow within the tissue area.
  • the hyperparameter t represents the number of iterations to be performed. The larger the t setting value, the more likely it is to ensure that the output of the region growing operation contains regional connectivity features that are close to the real label image.
  • The novel differentiable region growing module proposed by the embodiments of the present disclosure can use the max pooling expansion operation and the combination operation to realize region growing.
  • Embodiments of the present disclosure propose a max-pooling expansion operation implemented based on a max-pooling layer.
  • The operation of the region growing module is implemented by utilizing a special network layer (here, for example, the max pooling layer), so that the "gradient" is allowed to pass through, thus ensuring the trainability of the network.
  • A max pooling dilation operation based on a max pooling layer may include performing a max pooling operation with a stride of 1 on the seed function, using a pooling kernel of size N*N (e.g., for a two-dimensional image).
  • The pooling kernel here can have different dimensions based on the dimensions of the image: when the image is a one-dimensional image, the size of the pooling kernel is N; when the image is a two-dimensional image, N*N; when the image is a three-dimensional image, N*N*N; and when the image is a four-dimensional image, N*N*N*N. For example, N can take the value 3.
  • the image after the max pooling dilation operation has the same dimensions as the input sample image. Since the pooling layer in the convolutional network is easier to implement, the expansion operation can also achieve higher computational efficiency compared to other means.
  • the combination operation here may include a multiplication operation and a minimum value operation.
  • The multiplication operation includes multiplying each pixel in the seed function after the max pooling expansion operation by the corresponding pixel in the received image associated with the sample image (for example, the real label image in Figure 3).
  • The minimum value operation includes taking, for each pixel in the seed function after the max pooling expansion operation, the minimum of that pixel and the corresponding pixel in the received image associated with the sample image (for example, the real label image in Figure 3).
  • The multiplication operation limits the seed points to grow within the connected tissue area, and the minimum value operation reduces the response values of the tissue structure beyond a breakpoint. Both of them can increase the breakpoint penalty, so that the region growing result contains the connected domain features of the sample image.
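  • As an illustration only, the whole iterative operation can be sketched for two-dimensional inputs as follows (the function name and the default iteration count t are assumptions; for one- or three-dimensional images, max_pool1d or max_pool3d with kernels of size 3 and 3*3*3 play the same role):

      import torch
      import torch.nn.functional as F

      def region_grow(image, seed, t=20, combine="mul"):
          # Differentiable region growing on (N, C, H, W) tensors: t rounds of
          # max pooling expansion (3x3 kernel, stride 1, padding 1), each
          # followed by a combination with the input image, so that the seed
          # points can only grow inside connected foreground regions.
          for _ in range(t):
              seed = F.max_pool2d(seed, kernel_size=3, stride=1, padding=1)
              # multiplication limits growth to the mask; minimum damps the
              # responses beyond a breakpoint; both increase the breakpoint penalty
              seed = seed * image if combine == "mul" else torch.minimum(seed, image)
          return seed  # region growing result = connected domain feature image

    Every operation in the loop is differentiable, which is what lets the gradient pass through the module during training.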
  • FIG. 4 shows a schematic structural diagram of an image processing model 400 based on a differentiable region growing module according to an embodiment of the present disclosure.
  • The image processing model 400 includes a differentiable region growing module 400A and a deep learning network module 400B for image processing, wherein the differentiable region growing module 400A is connected after the deep learning network module 400B.
  • The differentiable region growing module 400A here may be the differentiable region growing module 300 described in FIG. 3.
  • The deep learning network module 400B here can be the UNet model as described above, which performs prediction on the input sample image to obtain the predicted category of each pixel in the sample image, thereby achieving segmentation of the image area.
  • The deep learning network module 400B here can also be any other deep learning network suitable for image processing, such as LinkNet, ResNet, VGG Net, etc., and these processing networks can also be adaptively adjusted according to the actual situation.
  • The traditional deep learning network module 400B for image processing operates at the level of single pixels, and its output prediction result image often has disconnected or separated pixels or pixel areas. Therefore, one embodiment of the present disclosure proposes, after connecting the differentiable region growing module 400A to the output layer of the deep learning network module 400B, to perform region growing of the seed function based on both the prediction output image of the deep learning network module 400B and the real label image, and to construct a new coDice loss function, thereby discarding all pixels disconnected from the seed function and increasing the penalty for disconnection.
  • In the structure of the image processing model 400 shown in FIG. 4, in order to give full play to the role of the differentiable region growing module 400A in increasing the disconnection penalty, the present disclosure also proposes three seed function generation strategies for the differentiable region growing module 400A.
  • Figure 5 shows three automatic seed function generation strategies according to an embodiment of the present disclosure: the equal spacing strategy, the pooling-unpooling strategy, and the breakpoint pooling strategy.
  • (A) of FIG. 5 is a schematic flow chart for constructing a seed function based on the equal spacing strategy.
  • the equal-spacing strategy construction process may include: constructing an image with the same dimensions as the sample image, setting pixel points at predetermined intervals in the image as seed points, setting the remaining pixel points as background pixels, and using the set image as the seed function.
  • (B) of FIG. 5 is a schematic flow chart for constructing a seed function based on the pooling-unpooling strategy.
  • the pooling and unpooling operation here is a max pooling and unpooling operation.
  • the max pooling operation and the unpooling operation work by obtaining one or more local maxima during pooling and then restoring the one or more local maxima in the image during unpooling.
  • Since the local maxima are restored to their actual positions and the other pixel values in each sub-region are set to zero, a single seed point can be selected within a square of a given kernel size while other candidate points are suppressed, ensuring that roughly equally spaced seed points are generated on all tissue structures.
  • The pooling-unpooling strategy of the embodiments of the present disclosure may include: performing a max pooling operation on the real label image to obtain one or more local maxima; performing an unpooling operation on the result image after the max pooling operation to restore the one or more local maxima to their actual positions in the real label image; and using the result image after the unpooling operation as the seed function.
  • (C) of FIG. 5 is a schematic flow chart for constructing a seed function based on the breakpoint pooling strategy.
  • I_seeds = maxunpool(dilate(I_gt - I_pre) · I_pre)  (3)
  • I_pre and I_gt are the prediction result and the real label, respectively.
  • The prediction result image and the real label image are first subtracted, and then a max pooling expansion operation is performed on the subtracted image (for example, the max pooling operation shown in (C) of Figure 5) to get the intersection points immediately adjacent to the breakpoints.
  • the subtraction process here can be a pixel-by-pixel subtraction of the predicted result image and the real label image, and the absolute value of the negative difference is taken.
  • the max pooling expansion operation here may be the max pooling expansion based on the max pooling layer as described above.
  • The max pooling dilation of the image obtained by subtracting the prediction result image and the real label image can be implemented based on a max pooling layer with a kernel size of N*N (for example, for a two-dimensional image) and a stride of 1, where N can, for example, take the value 3.
  • the image after the max pooling dilation operation has the same dimensions as the input sample image. Since the pooling layer in the convolutional network is easier to implement, the expansion operation can achieve higher computational efficiency compared to other methods.
  • the image after the maximum pooling expansion operation is further multiplied by the prediction result image to obtain an image including the intersection points immediately adjacent to the breakpoints.
  • The intersection points are then filtered using the max pooling and unpooling operations to ensure that the seeds appear near the breakpoints, maximizing the role of the differentiable region growing module.
  • The max pooling and unpooling operations here are the same as the max pooling and unpooling operations described in connection with (B) of Figure 5.
  • Specifically, the max pooling operation is first performed on the intersection image to obtain one or more local maxima of the intersection image, and then an unpooling operation is performed to restore the one or more local maxima to their actual positions in the intersection image.
  • Figure 6 shows the effects of the seed functions generated by the three different seed function generation strategies.
  • the leftmost image is a binarized real label image, where white represents vascular tissue pixels and black represents background pixels.
  • Figure 6 (A) is a rendering of the seed function constructed based on the equal spacing strategy; Figure 6 (B) is a rendering of the seed function constructed based on the pooling-unpooling strategy; and Figure 6 (C) is a rendering of the seed function constructed based on the breakpoint pooling strategy.
  • The seed functions constructed by the pooling-unpooling strategy and the breakpoint pooling strategy can ensure that the seed points appear in the tissue area, and the seed function constructed by the breakpoint pooling strategy can further ensure that the seed points appear near the breakpoints, maximizing the role of the differentiable region growing module.
  • the generation of the seed function can be achieved by using a special network layer (for example, a pooling layer), so that the "gradient" can be allowed to pass through, thus ensuring the training of the network.
  • FIG. 7 shows a schematic structural diagram of an image processing model 700 based on a differentiable region growing module according to another embodiment of the present disclosure.
  • An image processing model 700 includes a differentiable region growing module 700A and a deep learning network module 700B for image processing, wherein the differentiable region growing module 700A is connected between intermediate layers of the deep learning network module 700B.
  • The differentiable region growing module 700A here may be the differentiable region growing module 300 described in FIG. 3.
  • The deep learning network module 700B here can be the UNet model as described above, which processes the input sample image to obtain the predicted category of each pixel in the sample image, thereby achieving segmentation processing of the image area.
  • The deep learning network module 700B here can also be any other deep learning network suitable for image processing, such as LinkNet, ResNet, VGG Net, etc., and these processing networks can also be adaptively adjusted according to the actual situation.
  • The input end of the differentiable region growing module 700A is connected after a first intermediate layer of the deep learning network module 700B and receives a feature image as input from that intermediate layer, and the output end of the differentiable region growing module 700A is connected to a second intermediate layer of the deep learning network module 700B, so that the output of the differentiable region growing module 700A can be fused with the feature image of the second intermediate layer.
  • The seed function for the differentiable region growing module 700A can be generated based on the two seed generation strategies mentioned above: the equal spacing strategy and the pooling-unpooling strategy.
  • the equal-spacing strategy construction process may include: constructing an image with the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the set image as the seed function.
  • The pooling-unpooling strategy may include: receiving a third feature image from a third intermediate layer of the deep learning network module; performing a max pooling operation on the third feature image to obtain one or more local maxima of the third feature image; performing an unpooling operation on the result image after the max pooling operation to restore the one or more local maxima to their actual positions in the third feature image; and using the result image after the unpooling operation as the seed function.
  • The seed point map of the seed function and the image with which it is combined have the same dimensions.
  • However, the feature images generated by the third intermediate layer and the first intermediate layer may have different dimensions; therefore, in this case, a further convolution operation needs to be performed on the third feature image generated by the third intermediate layer.
  • a convolution layer is also used to perform a convolution operation on the third feature image.
  • The terms "first intermediate layer", "second intermediate layer" and "third intermediate layer" used here serve only to distinguish different intermediate layers; they neither number these intermediate layers nor limit their order.
  • In FIG. 7, the first intermediate layer is shown as the front intermediate layer, the second intermediate layer as the rear intermediate layer, and the third intermediate layer as the middle intermediate layer, but the present disclosure is not limited thereto: the three intermediate layers may be ordered differently. For example, the second intermediate layer may be the front intermediate layer, the first intermediate layer may be the middle intermediate layer, and the third intermediate layer may be the rear intermediate layer.
  • The differentiable region growing module 700A implements the expansion of the seed points by utilizing the region growing operation, and after each expansion, the expansion result is combined with the feature image received from the first intermediate layer of the deep learning network module 700B, as described above; the combination operation allows the seed points of the seed function to be restricted to grow within the tissue area of the sample image.
  • FIG. 8 shows a flowchart of a training method 800 for training an image processing model according to an embodiment of the present disclosure.
  • the image processing model training method 800 may be executed by a server, and the server may be the server 100 shown in FIG. 1 .
  • a training data set is obtained, which includes a plurality of sample images and a real label image corresponding to each sample image.
  • the sample image here may be a medical image including a tissue image.
  • the sample image here can also be any other suitable image except medical images, and the present disclosure does not limit this.
  • sample images here may be obtained through medical imaging technology, or may be obtained through network downloading, or may be obtained through other means, and the embodiments of the present disclosure are not limited to this.
  • the real label image here is a label image that labels the area or category to which each pixel in the corresponding sample image belongs.
  • In step S803, the image processing model is trained based on the training data set to obtain a trained image processing model.
  • The image processing model here may be the image processing model 400 shown above with reference to FIG. 4 or the image processing model 700 shown with reference to FIG. 7; both image processing models are composed of the novel differentiable region growing module described above connected to a traditional deep learning network module for performing image processing.
  • the differentiable region growing module 400A is connected after the output layer of the deep learning network module 400B, and receives the prediction result image from the output layer.
  • The prediction result image here is the prediction result output by the deep learning network module 400B for the input sample image; it is an image with the same dimensions as the sample image, in which each pixel is marked with the area or category label to which the corresponding pixel in the sample image belongs.
  • For the image processing model 700, the differentiable region growing module 700A is connected between two intermediate layers of the deep learning network module 700B; it receives a feature image from one intermediate layer of the deep learning network module 700B, performs a region growing operation as discussed above based on the feature image, and returns the connectivity features of the tissue region obtained by the region growing operation to the deep learning network module 700B.
  • In either case, the differentiable region growing module can be utilized to perform the region growing operation described above on the seed function, based on the feature image, the prediction image and/or the real label image received from one layer of the deep learning network module (e.g., an intermediate layer or the output layer), to obtain a region growing result that contains the region connectivity features of the sample image. This can increase the penalty for region disconnection during the training process of the image processing model, thereby increasing the region connectivity in its output prediction image.
  • In step S901, the deep learning network module is used to perform image processing prediction on the sample image, and the prediction result image is output from the output layer.
  • The deep learning network module here is a traditional deep learning network module used for image processing (for example, the UNet model mentioned above).
  • The process of generating prediction results from sample images is a well-known technique in the field and will not be described in detail here.
  • The traditional deep learning network module for image processing performs image processing at the pixel level, often ignoring the connectivity of regions, and disconnected pixels or pixel areas will appear in its output prediction results, thereby affecting subsequent analysis steps.
  • Therefore, one embodiment of the present disclosure proposes to connect the novel differentiable region growing module 400A described above to the output layer of the traditional deep learning network module 400B, to perform region growing of the seed function based on the prediction output image of the deep learning network module 400B and the real label image respectively, and to construct the new coDice loss function, thereby discarding all pixels disconnected from the seed function and increasing the penalty for disconnection, as described in steps S903-S911 below.
  • In step S903, the prediction result image is input as the first input image to the differentiable region growing module.
  • In step S905, the real label image of the sample image is input as the second input image to the differentiable region growing module.
  • In step S907, the differentiable region growing module is used to perform a region growing operation based on the first input image and the seed function, and a first region growing result of the seed function is obtained as the first connected domain feature image of the sample image.
  • In step S909, the differentiable region growing module is used to perform a region growing operation based on the second input image and the seed function, and a second region growing result of the seed function is obtained as the second connected domain feature image of the sample image.
  • The differentiable region growing module performs region growing of the seed function based on the received images associated with the sample image (here, for example, the prediction result image generated by the deep learning network module 400B for the sample image and the real label image of the sample image); the region growing results will therefore contain features about the connectivity of the sample image regions.
  • because a traditional deep learning network module for image processing often identifies pixels that originally belong to one region as belonging to another region disconnected from it, there will be a difference between the first connected domain feature image obtained by the region growing operation on the prediction result image of the traditional deep learning network module and the second connected domain feature image obtained by the region growing operation based on the ground-truth label image.
  • one goal of training the image processing model 400 is therefore to construct a new loss function and to train the image processing model 400 with the optimization objective of minimizing the difference between the two connected domain feature images.
  • in step S911, a target loss function value is calculated based on the first connected domain feature image and the second connected domain feature image, and parameters of the deep learning network module are adjusted based on the target loss function value.
  • in the target loss function Lc = 1 − softcoDice, X represents the ground-truth label image, Y represents the prediction result image, S is the seed function, g(X,S) is the first connected domain feature image, and g(Y,S) is the second connected domain feature image.
  • the parameters of the image processing model 400 shown in FIG. 4 can be adjusted based on this loss so that, as the iterative training continues, the target loss function is ultimately minimized.
  • the differentiable region growing module obtains the region-connectivity features of the two images based on the prediction result image and the ground-truth label image respectively, and the new coDice loss function is constructed, which increases the penalty for region disconnection during training of the image processing model, thereby increasing the region connectivity in its output predicted images.
  • the differentiable region growing module 700A is connected in parallel between two intermediate layers of the deep learning network module 700B; it receives the feature image from one intermediate layer of the deep learning network module 700B (for example, the first intermediate layer), performs a region growing operation based on the received feature image, and returns the output obtained after the region growing operation to another intermediate layer (e.g., the second intermediate layer).
  • in step S1001, the first feature image generated by the first intermediate layer is input to the differentiable region growing module as a third input image.
  • the first feature image here is a feature image generated by the first intermediate layer of the deep learning network module 700B for the input sample image.
  • in step S1003, the differentiable region growing module is used to perform a region growing operation based on the third input image and the seed function, and a third region growing result of the seed function is obtained as the third connected domain feature image of the sample image.
  • example steps for utilizing the differentiable region growing module to perform a region growing operation of a seed function based on the received image associated with the sample image have been described in detail in connection with FIG. 3 and will not be repeated here.
  • since the region growing is performed based on a feature image of the sample image, the region growing results here will contain features about the connectivity of regions in the sample image.
  • in step S1005, the third connected domain feature image is input to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer.
  • the feature fusion may include performing a pixel-by-pixel addition operation on the third connected domain feature image and the second feature image; other feature fusion techniques can also be adopted, such as pixel-by-pixel multiplication, and this disclosure does not limit this.
  • in step S1007, the deep learning network module is used to perform image processing prediction based on the fused feature image.
  • in step S1009, the target loss function value is calculated based on the prediction result.
  • in step S1011, the parameters of the deep learning network module are adjusted based on the target loss function value.
  • the loss function here is a loss function designed for the traditional deep learning network module 700B; it can be a cross-entropy loss function, a Dice loss function, a focal loss function, etc., and this disclosure does not limit this.
  • embodiments of the present disclosure also provide an image processing method based on the trained image processing model.
  • the image processing methods 1100 and 1200 based on these two trained image processing models will be described below in conjunction with the image processing model 400 and the image processing model 700 trained by the above method.
  • Figure 11 describes an image processing method 1100 based on the trained image processing model 400.
  • in step S1101, the image to be processed is obtained.
  • the image to be processed here may be a medical image including a tissue image.
  • the image to be processed here can also be any other suitable image other than a medical image, and the present disclosure does not limit this.
  • the image to be processed here may be obtained through medical imaging technology, may be obtained through network downloading, or may be obtained through other means, and the embodiments of the present disclosure are not limited to this.
  • in step S1103, based on the deep learning network module in the trained image processing model, image processing operations are performed on the image to be processed to obtain a processed image with connectivity, in which the number of connected domains is less than a predetermined threshold.
  • traditional deep learning network modules for image processing perform image processing at the pixel level and often ignore the connectivity of regions, so disconnected pixels or pixel regions appear in their output prediction results.
  • ideally, the number of connected domains in the prediction result should be 1, whereas the number of connected domains in the prediction result obtained by a traditional deep learning network module for image processing may be about 50-60; image processing based on the trained deep learning network module 400B of the present disclosure can effectively reduce the number of connected domains, for example to 20-30.
  • FIG. 12 depicts another image processing method 1200 based on the trained image processing model 700.
  • in step S1201, an image to be processed is obtained.
  • the image to be processed here may be a medical image including a tissue image.
  • the image to be processed here can also be any other suitable image other than a medical image, and the present disclosure does not limit this.
  • the image to be processed here may be obtained through medical imaging technology, may be obtained through network downloading, or may be obtained through other means, and the embodiments of the present disclosure are not limited to this.
  • in step S1203, based on the trained image processing model, an image processing operation is performed on the image to be processed to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold.
  • the differentiable region growing module 700A in the image processing model 700 is embedded as a special layer among the intermediate layers of the neural network module 700B; therefore, in the prediction stage after training is completed, image prediction processing needs to be performed based on the entire image processing model 700.
  • the differentiable region growing module 700A is used to perform a region growing operation to obtain the connected domain feature image of the image to be processed, and performing image processing operations on the image to be processed based on the trained image processing model 700 is based at least on the connected domain feature image of the image to be processed.
  • the differentiable region growing module 700A performing a region growing operation to obtain the connected domain feature image of the image to be processed includes: receiving an input image and a seed function, the input image being a feature image generated by an intermediate layer of the deep learning network module 700B for the image to be processed; performing a max-pooling dilation operation on the seed function based on the input image; combining the dilated seed function with the input image; and repeating these steps until an iteration-count threshold is reached, to obtain the region growing result of the seed function as the connected domain feature image of the image to be processed.
  • the image processing model 700 performing an image processing operation on the image to be processed includes: inputting the first feature image generated by the first intermediate layer for the image to be processed as an input image to the differentiable region growing module 700A; using the differentiable region growing module 700A to perform a region growing operation based on the input image and the seed function, obtaining the region growing result of the seed function as the connected domain feature image of the image to be processed; inputting the connected domain feature image to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer; and using the deep learning network module 700B to perform image prediction based on the fused feature image to obtain a processed image with connectivity.
  • Figure 13 shows a training device 1300 for an image processing model according to an embodiment of the present disclosure, including, for example, an image acquisition component 1301 and a training component 1303.
  • the image acquisition component 1301 is used to acquire a training data set, which includes a plurality of sample images and a ground-truth label image corresponding to each sample image.
  • the training component 1303 is used to train the image processing model based on the training data set to obtain a trained image processing model.
  • the image processing model is formed by connecting a differentiable region growing module and a deep learning network module used to perform image processing; the differentiable region growing module is used to perform a region growing operation to obtain a connected domain feature image of the sample image, and the image processing model is trained based at least on the connected domain feature image.
  • FIG. 14 shows a schematic structural diagram of an image processing device 1400 according to an embodiment of the present disclosure.
  • the image processing device 1400 includes at least an image acquisition part 1401, a processing part 1403, and an output part 1405.
  • the image acquisition component 1401, the processing component 1403, and the output component 1405 are related medical devices; they can be integrated in the same medical device or divided into multiple devices that are connected and communicate with each other and are included in a medical system; for example, the image acquisition component 1401 may be an endoscope, and the processing component 1403 and the output component 1405 may be computer components communicating with the endoscope, or the like.
  • the image acquisition component 1401 is used to acquire an image to be processed.
  • the processing component 1403 is, for example, used to execute the steps of the image processing method shown in FIG. 11 or FIG. 12 .
  • the processing component 1403 can perform image processing operations on the image to be processed based on the deep learning network module in the trained image processing model, to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold.
  • the processing component 1403 can also perform image processing operations on the image to be processed based on the trained image processing model to obtain a processed image with connectivity, in which the number of connected domains is less than a predetermined threshold.
  • the output component 1405 is used to output the processed image with connectivity.
  • an electronic device of another exemplary embodiment is also provided in the embodiment of the present disclosure.
  • the electronic device in the embodiments of the present disclosure may include a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the program, it can implement the steps of the image processing model training method or the image processing method in the above embodiments.
  • taking the server 100 of FIG. 1 as the electronic device, the processor in the electronic device is the processor 110 in the server 100, and the memory in the electronic device is the memory 120 in the server 100.
  • Embodiments of the present disclosure also provide a computer-readable storage medium.
  • Figure 15 shows a schematic diagram 1500 of a storage medium according to an embodiment of the present disclosure.
  • computer-executable instructions 1501 are stored on the computer-readable storage medium 1500.
  • when the computer-executable instructions are run by a processor, the training method for the image processing model and the image processing method according to the embodiments of the present disclosure described with reference to the above figures may be executed.
  • the computer-readable storage medium includes, but is not limited to, volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache).
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • Embodiments of the present disclosure also provide a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the training method for the image processing model and the image processing method according to the embodiments of the present disclosure.

Abstract

Disclosed are a training method for an image processing model, an image processing method, and an apparatus. The method includes: acquiring (S801) a training data set, the training data set including a plurality of sample images and a ground-truth label image corresponding to each sample image; and training (S803) the image processing model based on the training data set to obtain a trained image processing model, wherein the image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing, the differentiable region growing module is configured to perform a region growing operation to obtain a connected domain feature image of a sample image, and the training of the image processing model is performed based at least on the connected domain feature image.

Description

Training method for an image processing model, image processing method, and apparatus

Technical Field

The present disclosure relates to the field of artificial intelligence, and in particular to a training method for an image processing model based on a differentiable region growing module, an image processing method, an apparatus, and a computer-readable medium.

Background

In clinical disease diagnosis and treatment, image processing can help physicians better understand the tissue-structure information in images. In recent years, the application of convolutional neural networks (CNNs) has significantly improved image processing results. However, common CNN networks are designed to encourage processing of individual pixels, ignoring the connectivity between tissue structures, which in turn affects subsequent analysis steps.
Based on the CNN framework, new network architectures have been proposed in some cases to constrain structural topology and connectivity. For example, replacing traditional convolutional layers with a series of graph convolutions can capture higher-order neighborhood information. In addition, it has been proposed to use attention networks to obtain aggregated CNN features. It has also been proposed to integrate graph convolutional networks into a unified CNN architecture to build a new graph network that jointly learns global image features, including connectivity, and local appearance. A connectivity-aware similarity metric based on centerline extraction (clDice) has also been proposed, which guarantees the connectivity of vessel segments by computing the overlap between the morphological skeleton of the processed vessel mask and the gold-standard mask. However, clDice does not achieve satisfactory results on small vessels.
Therefore, an improved training method for an image processing model is needed that can solve the problem of disconnected tissue structures in medical images and guarantee the connectivity of tissue structures.

Summary

The present disclosure has been made in view of the above problems. An object of the present disclosure is to provide a training method for an image processing model based on a differentiable region growing module, an image processing method, an apparatus, and a computer-readable medium.
An embodiment of the present disclosure provides a training method for an image processing model, the method including: acquiring a training data set, the training data set including a plurality of sample images and a ground-truth label image corresponding to each sample image; and training the image processing model based on the training data set to obtain a trained image processing model, wherein the image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing, the differentiable region growing module is configured to perform a region growing operation to obtain a connected domain feature image of a sample image, and the image processing model is trained based at least on the connected domain feature image.
For example, according to an embodiment of the present disclosure, the differentiable region growing module performing a region growing operation to obtain a connected domain feature image of a sample image includes: receiving an input image and a seed function, the input image being one of a feature image generated by an intermediate layer of the deep learning network module for the sample image, a prediction result image generated by an output layer of the deep learning network module for the sample image, and the ground-truth label image of the sample image; performing a max-pooling dilation operation on the seed function based on the input image; performing a combining operation between the seed function after the max-pooling dilation operation and the input image; and repeating the above steps until an iteration-count threshold is reached, to obtain a region growing result of the seed function as the connected domain feature image of the sample image.
For example, according to an embodiment of the present disclosure, the combining operation includes any one of: multiplying each pixel of the seed function after the max-pooling dilation operation by the corresponding pixel of the input image; and taking, for each pixel of the seed function after the max-pooling dilation operation, the minimum of that pixel and the corresponding pixel of the input image.
For example, according to an embodiment of the present disclosure, the differentiable region growing module is connected after the output layer of the deep learning network module, and training the image processing model includes: performing image processing prediction on a sample image using the deep learning network module, and outputting the prediction result image from the output layer; inputting the prediction result image to the differentiable region growing module as a first input image; inputting the ground-truth label image of the sample image to the differentiable region growing module as a second input image; using the differentiable region growing module, performing a region growing operation based on the first input image and the seed function to obtain a first region growing result of the seed function as a first connected domain feature image of the sample image; using the differentiable region growing module, performing a region growing operation based on the second input image and the seed function to obtain a second region growing result of the seed function as a second connected domain feature image of the sample image; and calculating a target loss function value based on the first connected domain feature image and the second connected domain feature image, and adjusting parameters of the deep learning network module based on the target loss function value.
For example, according to an embodiment of the present disclosure, the target loss function is the loss function Lc defined as follows:
softcoDice = 2∑(g(X,S)·g(Y,S)) / (∑g(X,S) + ∑g(Y,S))      (1)
Lc=1-softcoDice      (2)
where X denotes the ground-truth label image, Y denotes the prediction result image, S is the seed function, g(X,S) is the first connected domain feature image, g(Y,S) is the second connected domain feature image, and the loss function Lc imposes a heavier penalty on disconnected domains.
For example, according to an embodiment of the present disclosure, the seed function is generated based on any one of an equal-spacing strategy, a pooling-unpooling strategy, and a breakpoint-pooling strategy.
For example, according to the method of an embodiment of the present disclosure, generating the seed function based on the equal-spacing strategy includes: constructing an image of the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
For example, according to an embodiment of the present disclosure, generating the seed function based on the pooling-unpooling strategy includes: performing a max-pooling operation on the ground-truth label image to obtain one or more local maxima; performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the ground-truth label image; and using the image resulting from the unpooling operation as the seed function.
For example, according to an embodiment of the present disclosure, generating the seed function based on the breakpoint-pooling strategy includes: subtracting the prediction result image and the ground-truth label image; performing the max-pooling dilation operation on the subtracted image; multiplying the image obtained after the max-pooling dilation operation by the prediction result image to obtain a crossing-point image; performing a max-pooling operation on the crossing-point image to obtain one or more local maxima; performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the crossing-point image; and using the image resulting from the unpooling operation as the seed function.
For example, according to an embodiment of the present disclosure, the max-pooling dilation operation includes: performing a max-pooling operation with a stride of 1 on the seed function using a pooling kernel.
For example, according to the method of an embodiment of the present disclosure, when the sample image is one-dimensional, the size of the pooling kernel is 3; when the sample image is two-dimensional, the size of the pooling kernel is 3*3; when the sample image is three-dimensional, the size of the pooling kernel is 3*3*3; and when the sample image is four-dimensional, the size of the pooling kernel is 3*3*3*3.
For example, according to an embodiment of the present disclosure, the input of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module and the output of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer, and training the image processing model includes: inputting a first feature image generated by the first intermediate layer to the differentiable region growing module as a third input image; using the differentiable region growing module, performing a region growing operation based on the third input image and the seed function to obtain a third region growing result of the seed function as a third connected domain feature image of the sample image; inputting the third connected domain feature image to the second intermediate layer for fusion with a second feature image generated by the second intermediate layer; performing image processing prediction based on the fused feature image using the deep learning network module; calculating a target loss function value based on the prediction result; and adjusting parameters of the deep learning network module based on the target loss function value.
For example, according to an embodiment of the present disclosure, the target loss function is one of a cross-entropy loss function, a Dice loss function, and a focal loss function.
For example, according to the method of an embodiment of the present disclosure, the seed function is generated based on either an equal-spacing strategy or a pooling-unpooling strategy.
For example, according to an embodiment of the present disclosure, generating the seed function based on the equal-spacing strategy includes: constructing an image of the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
For example, according to an embodiment of the present disclosure, generating the seed function based on the pooling-unpooling strategy includes: receiving a third feature image from a third intermediate layer of the deep learning network module; performing a max-pooling operation on the third feature image to obtain one or more local maxima; performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the third feature image; and using the image resulting from the unpooling operation as the seed function.
For example, according to an embodiment of the present disclosure, before the max-pooling operation is performed on the third feature image, a convolution operation is further performed on the third feature image using a convolutional layer.
For example, according to the method of an embodiment of the present disclosure, inputting the third connected domain feature image to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer includes: performing a pixel-by-pixel addition operation on the third connected domain feature image and the second feature image generated by the second intermediate layer to obtain a fused feature image.
An embodiment of the present disclosure further provides a training apparatus for an image processing model, the apparatus including: an image acquisition component configured to acquire a training data set, the training data set including a plurality of sample images and a ground-truth label image corresponding to each sample image; and a training component configured to train the image processing model based on the training data set to obtain a trained image processing model, wherein the image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing, the differentiable region growing module is configured to perform a region growing operation to obtain a connected domain feature image of a sample image, and the image processing model is trained based at least on the connected domain feature image.
An embodiment of the present disclosure further provides a method for image processing, including: acquiring an image to be processed; and performing an image processing operation on the image to be processed based on the deep learning network module in a trained image processing model, to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold, wherein the trained image processing model is obtained based on the training method for an image processing model according to any one of the preceding items.
An embodiment of the present disclosure further provides a method for image processing, including: acquiring an image to be processed; and performing an image processing operation on the image to be processed based on a trained image processing model, to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold, wherein the trained image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing, the input of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module and the output of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer, the differentiable region growing module is configured to perform a region growing operation to obtain a connected domain feature image of the image to be processed, and the image processing operation performed on the image to be processed based on the trained image processing model is performed based at least on the connected domain feature image.
For example, according to an embodiment of the present disclosure, the trained image processing model is obtained based on the training method for an image processing model according to any one of the above items.
For example, according to an embodiment of the present disclosure, the differentiable region growing module performing a region growing operation to obtain a connected domain feature image of the image to be processed includes: receiving an input image and a seed function, the input image being a feature image generated by an intermediate layer of the deep learning network module for the image to be processed; performing a max-pooling dilation operation on the seed function based on the input image; performing a combining operation between the seed function after the max-pooling dilation operation and the input image; and repeating the above steps until an iteration-count threshold is reached, to obtain a region growing result of the seed function as the connected domain feature image of the image to be processed.
For example, according to an embodiment of the present disclosure, the max-pooling dilation operation includes: performing a max-pooling operation with a stride of 1 on the seed function using a pooling kernel.
For example, according to the method of an embodiment of the present disclosure, when the image to be processed is one-dimensional, the size of the pooling kernel is 3; when the image to be processed is two-dimensional, the size of the pooling kernel is 3*3; when the image to be processed is three-dimensional, the size of the pooling kernel is 3*3*3; and when the image to be processed is four-dimensional, the size of the pooling kernel is 3*3*3*3.
For example, according to an embodiment of the present disclosure, the combining operation includes any one of: multiplying each pixel of the seed function after the max-pooling dilation operation by the corresponding pixel of the input image; and taking, for each pixel of the seed function after the max-pooling dilation operation, the minimum of that pixel and the corresponding pixel of the input image.
For example, according to an embodiment of the present disclosure, performing the image processing operation on the image to be processed includes: inputting a first feature image generated by the first intermediate layer for the image to be processed to the differentiable region growing module as an input image; using the differentiable region growing module, performing a region growing operation based on the input image and a seed function to obtain a region growing result of the seed function as the connected domain feature image of the image to be processed; inputting the connected domain feature image to the second intermediate layer for fusion with a second feature image generated by the second intermediate layer; and performing image prediction based on the fused feature image using the deep learning network module, to obtain a processed image with connectivity.
For example, according to an embodiment of the present disclosure, the seed function is generated based on either an equal-spacing strategy or a pooling-unpooling strategy.
For example, according to an embodiment of the present disclosure, generating the seed function based on the equal-spacing strategy includes: constructing an image of the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
For example, according to an embodiment of the present disclosure, generating the seed function based on the pooling-unpooling strategy includes: receiving a third feature image from a third intermediate layer of the deep learning network module; performing a max-pooling operation on the third feature image to obtain one or more local maxima; performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the third feature image; and using the image resulting from the unpooling operation as the seed function.
For example, according to an embodiment of the present disclosure, before the max-pooling operation is performed on the third feature image, a convolution operation is further performed on the third feature image using a convolutional layer.
For example, according to an embodiment of the present disclosure, inputting the connected domain feature image to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer includes: performing a pixel-by-pixel addition operation on the connected domain feature image and the second feature image generated by the second intermediate layer to obtain a fused feature image.
An embodiment of the present disclosure further provides an apparatus for image processing, including: an image acquisition component configured to acquire an image to be processed; a processing component configured to perform an image processing operation on the image to be processed based on the deep learning network module in a trained image processing model, to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold; and an output component configured to output the processed image with connectivity, wherein the trained image processing model is obtained based on the training method for an image processing model according to any one of the preceding items.
An embodiment of the present disclosure further provides an apparatus for image processing, including: an image acquisition component configured to acquire an image to be processed; a processing component configured to perform an image processing operation on the image to be processed based on a trained image processing model to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold; and an output component configured to output the processed image with connectivity, wherein the trained image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing, the input of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module and the output of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer, the differentiable region growing module is configured to perform a region growing operation to obtain a connected domain feature image, and the image processing operation performed by the processing component on the image to be processed based on the trained image processing model is performed based at least on the connected domain feature image.
For example, according to an embodiment of the present disclosure, the trained image processing model is obtained based on the training method for an image processing model according to any one of the preceding methods.
An embodiment of the present disclosure further provides an electronic device including a memory and a processor, wherein processor-readable program code is stored on the memory, and when the processor executes the program code, any one of the preceding methods is performed.
An embodiment of the present disclosure further provides a computer-readable storage medium having computer-executable instructions stored thereon, the computer-executable instructions being used to perform any one of the preceding methods.
Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings of the embodiments are briefly introduced below. Obviously, the drawings described below relate only to some embodiments of the present disclosure and do not limit the present disclosure.
FIG. 1 shows a schematic diagram of an application architecture for the training method for an image processing model and the method of performing image processing based on a trained image processing model according to embodiments of the present disclosure;
FIG. 2 shows a schematic diagram of a traditional UNet network architecture for image segmentation;
FIG. 3 is a schematic diagram illustrating the region growing operation of the differentiable region growing module 300, taking the ground-truth label image associated with a sample image as an example;
FIG. 4 shows a schematic structural diagram of an image processing model 400 based on a differentiable region growing module according to an embodiment of the present disclosure;
FIG. 5 shows three automatic seed function generation strategies according to embodiments of the present disclosure: the equal-spacing strategy, the pooling-unpooling strategy, and the breakpoint-pooling strategy;
FIG. 6 shows the effects of the seed functions generated by three different seed function generation strategies;
FIG. 7 shows a schematic structural diagram of an image processing model 700 based on a differentiable region growing module according to another embodiment of the present disclosure;
FIG. 8 shows a flowchart of a training method 800 for training an image processing model according to an embodiment of the present disclosure;
FIG. 9 is a flowchart illustrating example implementation details of training the image processing model, in conjunction with the image processing model 400 shown in FIG. 4;
FIG. 10 is a flowchart illustrating example implementation details of training the image processing model, in conjunction with the image processing model 700 shown in FIG. 7;
FIG. 11 is a flowchart of an image processing method based on the trained image processing model 400;
FIG. 12 is a flowchart describing another image processing method based on the trained image processing model 700;
FIG. 13 shows a training apparatus for an image processing model according to an embodiment of the present disclosure;
FIG. 14 shows a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure; and
FIG. 15 shows a schematic diagram of a storage medium according to an embodiment of the present disclosure.
Detailed Description

The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by a person of ordinary skill in the art without creative effort also fall within the protection scope of the present disclosure.
The terms used in this specification are general terms currently in wide use in the art, selected in consideration of the functions of the present disclosure; however, these terms may vary according to the intention of a person of ordinary skill in the art, precedent, or new technology in the art. In addition, specific terms may be selected by the applicant, in which case their detailed meanings will be described in the detailed description of the present disclosure. Therefore, the terms used in the specification should not be understood as mere names, but should be interpreted based on their meanings and the overall description of the present disclosure.
Although the present disclosure makes various references to certain modules in a system according to embodiments of the present disclosure, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the system and method may use different modules.
Flowcharts are used in the present disclosure to illustrate the operations performed by a system according to embodiments of the present disclosure. It should be understood that the preceding or following operations are not necessarily performed precisely in order; instead, the various steps may be processed in reverse order or simultaneously as needed. Other operations may also be added to these processes, or one or more steps may be removed from them.
When performing image processing, traditional deep learning networks are often designed to encourage image processing on the basis of individual pixels. However, for medical images used in clinical disease diagnosis and treatment, processing based on individual pixels often ignores the connectivity between tissue structures and tends to produce disconnected tissue results, which in turn affects subsequent analysis steps.
Therefore, the present disclosure proposes an improved image processing model and a training method thereof. By embedding a novel differentiable region growing module into a traditional deep learning network, the problem of disconnected tissue structures in medical images caused by traditional deep learning image processing models can be solved, and the connectivity of tissue structures can be guaranteed.
Of course, it should be understood that the training method for an image processing model and the image processing method according to the embodiments of the present disclosure are applicable not only to medical images but also to the processing of non-medical images with region-connectivity requirements, and the present disclosure places no limitation on this.
FIG. 1 shows a schematic diagram of the application architecture of the training method for an image processing model and the method of performing image processing based on a trained image processing model according to embodiments of the present disclosure, which includes a server 100 and a terminal device 200.
The terminal device 200 may be, for example, a medical device; for example, a user may view the processing results of medical images via the terminal device 200.
The terminal device 200 and the server 100 may be connected via the Internet to communicate with each other. For example, the Internet uses standard communication technologies and/or protocols. The network is typically the Internet but may be any network, including but not limited to any combination of a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired, or wireless network, a private network, or a virtual private network. In some embodiments, technologies and/or formats including Hyper Text Markup Language (HTML) and Extensible Markup Language (XML) are used to represent data exchanged over the network. In addition, conventional encryption technologies such as Secure Socket Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), and Internet Protocol Security (IPsec) may be used to encrypt all or some of the links. In other embodiments, custom and/or dedicated data communication technologies may be used in place of, or in addition to, the above data communication technologies.
The server 100 may provide various network services for the terminal device 200; the server 100 may be a single server, a server cluster consisting of several servers, or a cloud computing center.
For example, the server 100 may include a processor 110 (Central Processing Unit, CPU), a memory 120, an input device 130, an output device 140, and so on. The input device 130 may include a keyboard, a mouse, a touch screen, etc., and the output device 140 may include a display device such as a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT).
The memory 120 may include a read-only memory (ROM) and a random access memory (RAM), and provides the processor 110 with the program instructions and data stored in the memory 120. In embodiments of the present disclosure, the memory 120 may be used to store the program of the training method for an image processing model or the image processing method of the embodiments of the present disclosure.
By invoking the program instructions stored in the memory 120, the processor 110 is configured to execute, according to the obtained program instructions, the steps of any one of the training methods for an image processing model or the image processing methods in the embodiments of the present disclosure.
For example, in embodiments of the present disclosure, the training method for an image processing model or the image processing method is executed on the server 100 side. For example, for the image processing method, the terminal device 200 may send an acquired medical image to the server 100, the server 100 performs deep learning image processing on the medical image, and the result may be returned to the terminal device 200.
The application architecture shown in FIG. 1 is described using execution on the server 100 side as an example. Of course, the methods in the embodiments of the present disclosure may also be executed by the terminal device 200; for example, the terminal device 200 may obtain the trained image processing model from the server 100 side and process a medical image based on the trained model to obtain a processing result, and the embodiments of the present disclosure place no limitation on this.
The embodiments of the present disclosure are schematically described by taking application to the architecture shown in FIG. 1 as an example. Of course, it should be understood that the application architecture diagram in the embodiments of the present disclosure is intended to describe the technical solutions more clearly and does not constitute a limitation on them; the technical solutions provided are equally applicable to similar problems in other application architectures and business applications.
Traditional deep learning network modules usually process images at the pixel level, achieving fine-grained inference by making dense predictions and inferences for each pixel, so that each pixel is labeled with its corresponding class.
FIG. 2 shows a schematic diagram of a traditional UNet network architecture for image segmentation.
As can be seen from FIG. 2, the UNet architecture consists of a U-shaped network and skip connections. It is a symmetric architecture containing a left path and a right path. The left path can be regarded as an encoder, also called the downsampling path; it includes five convolutional sub-modules, each of which contains two convolutional layers and ReLU layers, where the convolutional layers uniformly use 3×3 kernels. Each sub-module is followed by a downsampling layer implemented by max pooling. The convolutional sub-modules extract features, the max-pooling layers reduce the dimensionality, and the resolution of the feature image output after each max-pooling layer is halved. The feature map output by the last convolutional sub-module is fed directly to the decoder on the right without max pooling. The right path can be regarded as a decoder, also called the upsampling path, with a structure essentially symmetric to the encoder; it applies 3×3 convolutions and upsampling to the input feature maps, progressively restoring the details and spatial dimensions of objects. In addition, feature fusion is used in the network: as shown by the dashed arrows in FIG. 2, the features of the earlier downsampling part of the network are concatenated and fused with the later upsampled features via skip connections to obtain more accurate contextual information and better processing results. The UNet model finally outputs a segmentation map of the target image, in which the value of each pixel can be a label indicating its class.
As described above, since traditional deep learning networks for image processing understand and process images at the pixel level, their output prediction results often contain disconnected or separated pixels or pixel regions. For medical images containing tissue images, these disconnected or separated pixels or pixel regions often belong to the same tissue structure. Image processing methods based on traditional deep learning networks therefore ignore the connectivity between tissue structures, which affects subsequent diagnosis and treatment steps.
On this basis, the embodiments of the present disclosure propose a novel region growing module and propose embedding it into a traditional deep learning network for image processing. By introducing this region growing module into the deep learning network as a special layer, the module (or layer) can directly participate in the training and prediction processes of the network (optionally, it does not necessarily participate in prediction, but it always participates in training). As a special layer, the region growing module allows gradients to pass through, thereby guaranteeing the training of the network, and is therefore referred to below as the "differentiable region growing module".
The novel differentiable region growing module can perform a region growing operation to obtain the connected domain features of a sample image, so that a traditional deep learning network for image processing into which the module is embedded can obtain features about region connectivity from it, thereby achieving image processing on the basis of guaranteed connectivity.
Taking the UNet architecture shown in FIG. 2 as an example of a traditional deep learning network for image processing, the following describes how the novel differentiable region growing module of the embodiments of the present disclosure is embedded into it to form a novel image processing model, and how the image processing model with the embedded module is trained so that it can better learn the connected domain features of images.
It should be understood that the image processing here may be, for example, various image processing tasks such as image transformation, image recognition, image classification, and image segmentation, and the present disclosure places no limitation on this.
In addition, as understood by those skilled in the art, any traditional deep learning network suitable for image processing may be adopted, such as LinkNet, ResNet, and VGG Net, and these networks may also be adapted according to actual circumstances.
The novel differentiable region growing module proposed by the present disclosure is designed to be connected to a traditional deep learning network module in parallel or in series, and can perform a differentiable dilation operation on the seed function based on a feature image of a sample image received from an intermediate layer of the deep learning network module, a ground-truth label image associated with the sample image, or a prediction result image generated for the sample image and received from the output layer of the deep learning network module, so that the seed points are confined to grow within connected regions of the sample image (for example, tissue regions such as vessels, small vessels, and organs), thereby obtaining the connected domain features of the sample image.
The schematic process of the region growing operation of the differentiable region growing module 300 according to an embodiment of the present disclosure is now described with reference to FIG. 3.
The novel differentiable region growing module of the embodiments of the present disclosure receives an image associated with a sample image and performs a region growing operation on a seed function based on the received image to obtain a connected domain feature image of the sample image.
For example, the image associated with the sample image is one of: a feature image generated for the sample image by an intermediate layer of the deep learning network module into which the region growing operation is embedded, a prediction result image generated for the sample image by the output layer of the deep learning network module, and the ground-truth label image of the sample image.
FIG. 3 schematically illustrates the region growing operation of the differentiable region growing module 300, taking the ground-truth label image associated with a sample image as an example.
The ground-truth label image shown in FIG. 3 is a binary image, in which the black pixels represent pixels belonging to the tissue region and the white pixels represent background pixels.
As shown in FIG. 3, the region growing operation performed by the differentiable region growing module 300 is an iterative process. After each dilation of the seed points X, the dilated result is combined with the ground-truth label image, so that the seed points X are confined to grow within the tissue region. The hyperparameter t denotes the number of iterations performed; the larger t is set, the better the output of the region growing operation is guaranteed to contain region-connectivity features close to those of the ground-truth label image.
Unlike traditional methods that implement region growing based on similar attribute intensity, gray level, texture color, and the like, the novel differentiable region growing module proposed in the embodiments of the present disclosure can implement region growing using a max-pooling dilation operation and a combining operation.
The embodiments of the present disclosure propose a max-pooling dilation operation implemented based on a max-pooling layer. As described above, implementing the operations of the region growing module with a special network layer (here, a max-pooling layer) allows gradients to pass through, thereby guaranteeing the training of the network.
For example, the max-pooling dilation operation based on a max-pooling layer may include performing a max-pooling operation with a stride of 1 on the seed function using a pooling kernel of size N*N (for example, for a two-dimensional image). It should be understood that the pooling kernel may have different dimensionality depending on the dimensionality of the image: when the image is one-dimensional, the kernel size is N; when the image is two-dimensional, the kernel size is N*N; when the image is three-dimensional, the kernel size is N*N*N; when the image is four-dimensional, the kernel size is N*N*N*N; and so on. For example, N here may take the value 3. The image after the max-pooling dilation operation has the same dimensions as the input sample image. Since pooling layers are easy to implement in convolutional networks, this dilation operation can also achieve higher computational efficiency than dilation by other means.
According to embodiments of the present disclosure, the combining operation here may include a multiplication operation and a minimum operation.
For example, the multiplication operation includes multiplying each pixel of the seed function after the max-pooling dilation operation by the corresponding pixel of the received image associated with the sample image (for example, the ground-truth label image in FIG. 3).
For example, the minimum operation includes taking, for each pixel of the seed function after the max-pooling dilation operation, the minimum of that pixel and the corresponding pixel of the received image associated with the sample image (for example, the ground-truth label image in FIG. 3).
In this way, the multiplication operation confines the seed points to grow within the connected tissue region, and the minimum operation lowers the response values of tissue structures beyond a breakpoint; both can increase the penalty for breakpoints, so that the image after region growing contains the connected domain features of the sample image.
FIG. 4 shows a schematic structural diagram of an image processing model 400 based on a differentiable region growing module according to an embodiment of the present disclosure.
As shown in FIG. 4, the image processing model 400 according to an embodiment of the present disclosure includes a differentiable region growing module 400A and a deep learning network module 400B for image processing, where the differentiable region growing module 400A is connected after the deep learning network module 400B.
For example, the differentiable region growing module 400A here may be the differentiable region growing module 300 described in FIG. 3. For example, the deep learning network module 400B here may be the UNet model described above, which predicts the class of each pixel in an input sample image, thereby achieving segmentation of image regions. Of course, the deep learning network module 400B may also be any other deep learning network suitable for image processing, such as LinkNet, ResNet, and VGG Net, and these networks may be adapted according to actual circumstances.
As described above, the traditional deep learning network module 400B for image processing performs processing at the level of individual pixels, and its output prediction result images often contain disconnected or separated pixels or pixel regions. Therefore, an embodiment of the present disclosure proposes connecting the differentiable region growing module 400A after the output layer of the deep learning network module 400B, performing region growing of the seed function by combining the predicted output image of the deep learning network module 400B with the ground-truth label image, and constructing a new coDice loss function, thereby discarding all pixels disconnected from the seed function and increasing the penalty for disconnection.
According to the embodiments of the present disclosure, in the structure of the image processing model 400 shown in FIG. 4, in order to fully exploit the role of the differentiable region growing module 400A in increasing the penalty for disconnection in the image processing model 400, the embodiments of the present disclosure further propose three seed function generation strategies for the differentiable region growing module 400A.
FIG. 5 shows three automatic seed function generation strategies according to embodiments of the present disclosure: the equal-spacing strategy, the pooling-unpooling strategy, and the breakpoint-pooling strategy.
FIG. 5(A) is a schematic flowchart of constructing a seed function based on the equal-spacing strategy.
The equal-spacing construction process may include: constructing an image of the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
FIG. 5(B) is a schematic flowchart of constructing a seed function based on the pooling-unpooling strategy.
For example, in the embodiments of the present disclosure, the pooling-unpooling operation here is a max-pooling-unpooling operation. As known to those skilled in the art, the max-pooling and unpooling operations obtain one or more local maxima during pooling and then, during unpooling, restore the actual positions of those local maxima in the image, setting the other pixel values in each sub-region to zero. This selects a single seed point within each square of the given kernel size and suppresses other candidate points, ensuring that equally spaced seed points are generated across all tissue structures.
As shown in FIG. 5(B), the pooling-unpooling strategy of the embodiments of the present disclosure may include: performing a max-pooling operation on the ground-truth label image to obtain one or more local maxima; performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the ground-truth label image; and using the image resulting from the unpooling operation as the seed function.
FIG. 5(C) is a schematic flowchart of constructing a seed function based on the breakpoint-pooling strategy.
The breakpoint-pooling strategy shown in FIG. 5(C) can be summarized as the following formula (3):
I_seeds = maxunpool(dilate(I_gt − I_pre) × I_pre)      (3)
where I_pre and I_gt are the prediction result and the ground-truth label, respectively.
As shown in FIG. 5(C), the prediction result image and the ground-truth label image are first subtracted, and the max-pooling dilation operation (shown as max pooling in FIG. 5(C)) is then performed on the subtracted image to obtain the crossing points immediately adjacent to the breakpoints. For example, the subtraction here may be a pixel-by-pixel subtraction of the prediction result image and the ground-truth label image, taking the absolute value of negative differences.
For example, the max-pooling dilation operation here may be the max-pooling dilation based on a max-pooling layer as described above. For example, the max-pooling dilation of the image obtained by subtracting the prediction result image and the ground-truth label image may be implemented based on a max-pooling layer with a kernel size of N*N (for example, for a two-dimensional image) and a stride of 1, where N may take, for example, the value 3. The image after the max-pooling dilation operation has the same dimensions as the input sample image. Since pooling layers are easy to implement in convolutional networks, this dilation can achieve higher computational efficiency than dilation by other means.
Subsequently, the image after the max-pooling dilation operation is further multiplied by the prediction result image to obtain an image including the crossing points immediately adjacent to the breakpoints. Filtering the crossing points with the max-pooling-unpooling operation ensures that seeds appear near the breakpoints, maximizing the effect of the differentiable region growing module.
For example, the max-pooling-unpooling operation here is the same as the max-pooling and unpooling operations described in FIG. 5(B): a max-pooling operation is first performed on the crossing-point image to obtain one or more local maxima in it, and an unpooling operation is then performed to restore the actual positions of the one or more local maxima in the crossing-point image.
FIG. 6 shows the effects of the seed functions generated by the three different seed function generation strategies.
The leftmost image is the binarized ground-truth label image, in which white represents vessel-tissue pixels and black represents background pixels. FIG. 6(A) shows the effect of the seed function constructed based on the equal-spacing strategy, FIG. 6(B) that of the seed function constructed based on the pooling-unpooling strategy, and FIG. 6(C) that of the seed function constructed based on the breakpoint-pooling strategy.
As can be seen from FIG. 6, the seed functions constructed by the pooling-unpooling strategy and the breakpoint-pooling strategy ensure that seed points appear within the tissue region, and the seed function constructed by the breakpoint-pooling strategy better ensures that seed points appear near breakpoints, maximizing the effect of the differentiable region growing module. In addition, for both the pooling-unpooling strategy and the breakpoint-pooling strategy, the generation of the seed function can be implemented with special network layers (for example, pooling layers), allowing gradients to pass through and thereby guaranteeing the training of the network.
In addition to connecting the differentiable region growing module after the output layer of the deep learning network module, another embodiment in which the differentiable region growing module is embedded between intermediate layers of the deep learning network module is described below.
FIG. 7 shows a schematic structural diagram of an image processing model 700 based on a differentiable region growing module according to another embodiment of the present disclosure.
As shown in FIG. 7, the image processing model 700 according to another embodiment of the present disclosure includes a differentiable region growing module 700A and a deep learning network module 700B for image processing, where the differentiable region growing module 700A is connected between intermediate layers of the deep learning network module 700B.
For example, the differentiable region growing module 700A here may be the differentiable region growing module 300 described in FIG. 3. For example, the deep learning network module 700B here may be the UNet model described above, which processes an input sample image to obtain the predicted class of each pixel in the sample image, thereby achieving segmentation of image regions. Of course, the deep learning network module 700B may also be any other deep learning network suitable for image processing, such as LinkNet, ResNet, and VGG Net, and these networks may be adapted according to actual circumstances.
As shown in FIG. 7, the input of the differentiable region growing module 700A is connected after a first intermediate layer of the deep learning network module 700B and receives a feature image from that intermediate layer as input, and the output of the differentiable region growing module 700A is connected to a second intermediate layer of the deep learning network module 700B, so that the output of the differentiable region growing module 700A can be fused with the feature image of the second intermediate layer.
According to the embodiments of the present disclosure, in the structure of the image processing model 700 shown in FIG. 7, the seed function of the differentiable region growing module 700A can be generated based on the two seed generation strategies mentioned above, namely the equal-spacing strategy and the pooling-unpooling strategy.
For example, the equal-spacing construction process may include: constructing an image of the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
For example, the pooling-unpooling strategy may include: receiving a third feature image from a third intermediate layer of the deep learning network module; performing a max-pooling operation on the third feature image to obtain one or more local maxima in the third feature image; performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the third feature image; and using the image resulting from the unpooling operation as the seed function.
In addition, as described above with reference to FIG. 3, the seed-point map of the seed function and the image with which it is combined have the same dimensions. However, the feature images produced by the third intermediate layer and the first intermediate layer may have different dimensions. In this case, a further convolution operation therefore needs to be performed on the third feature image generated by the third intermediate layer. For example, as shown in FIG. 7, before the max-pooling operation is performed on the third feature image, a convolutional layer is also used to perform a convolution operation on the third feature image.
It should be understood that the terms "first intermediate layer", "second intermediate layer", and "third intermediate layer" are used here only to distinguish between different intermediate layers, and are not numbers of these intermediate layers nor do they limit their order. For example, although FIG. 7 shows the "first intermediate layer" as the earlier intermediate layer, the "second intermediate layer" as the later intermediate layer, and the "third intermediate layer" as in between, the present disclosure is not limited thereto; the three intermediate layers may have a different order, for example, the "second intermediate layer" may be the earlier intermediate layer, the "first intermediate layer" may be the middle one, and the "third intermediate layer" may be the later one.
In this way, the differentiable region growing module 700A dilates the seed points using the region growing operation and, after each dilation, performs the combining operation described above between the dilation result and the feature image received from the first intermediate layer of the deep learning network module 700B, so that the seed points of the seed function are confined to grow within the tissue region of the sample image. By fusing the output of the differentiable region growing module 700A with the feature image produced by the second intermediate layer of the deep learning network module 700B, the penalty for region disconnection can be increased during training of the image processing model, thereby increasing the region connectivity in its output predicted images.
The training method for an image processing model and the image processing method provided by at least one embodiment of the present disclosure are described below in a non-limiting manner through several examples or embodiments. As described below, different features of these examples or embodiments may be combined with one another where they do not conflict, thereby yielding new examples or embodiments, which also fall within the protection scope of the present disclosure.
FIG. 8 shows a flowchart of a training method 800 for training an image processing model according to an embodiment of the present disclosure. For example, the training method 800 may be executed by a server, which may be the server 100 shown in FIG. 1.
First, in step S801, a training data set is acquired, the training data set including a plurality of sample images and a ground-truth label image corresponding to each sample image.
For example, the sample images here may be medical images including tissue images. Of course, the sample images may also be any other suitable images other than medical images, and the present disclosure places no limitation on this.
For example, the sample images here may be obtained through medical imaging technology, by downloading over a network, or through other means, and the embodiments of the present disclosure place no limitation on this.
For example, the ground-truth label image here is a label image in which the region or class to which each pixel of the corresponding sample image belongs is annotated.
In step S803, the image processing model is trained based on the training data set to obtain a trained image processing model.
For example, the image processing model here may be the image processing model 400 shown in FIG. 4 or the image processing model 700 shown in FIG. 7; both are formed by connecting the novel differentiable region growing module described above with a traditional deep learning network module for performing image processing.
For example, in the example in which the image processing model is the image processing model 400, the differentiable region growing module 400A is connected after the output layer of the deep learning network module 400B and receives the prediction result image from that output layer. The prediction result image here is the prediction output by the deep learning network module 400B for the input sample; it is an image of the same dimensions as the sample image, in which each pixel is labeled with the region or class to which the corresponding pixel of the sample image belongs.
For example, in the example in which the image processing model is the image processing model 700, the differentiable region growing module 700A is connected between two intermediate layers of the deep learning network module 700B, receives a feature image from one intermediate layer of the deep learning network module 700B, performs the region growing operation discussed above based on that feature image, and returns the connectivity features of the tissue region obtained by the region growing operation to the deep learning network module 700B.
In the training of both image processing models, the differentiable region growing module can be used to perform the region growing operation described above on the seed function, based on a feature image, a prediction image, and/or a ground-truth label image received from one layer of the deep learning network module (for example, an intermediate layer or the output layer), to obtain a region growing result containing the region-connectivity features of the sample image; this makes it possible to increase the penalty for region disconnection during training of the image processing model, thereby increasing the region connectivity in its output predicted images.
Example implementation details of training the image processing model in step S803 of FIG. 8 above are described below with reference to FIG. 9, in conjunction with the image processing model 400 shown in FIG. 4.
As shown in FIG. 9, in step S901, the deep learning network module is used to perform image processing prediction on a sample image, and the prediction result image is output from the output layer.
For example, the deep learning network module here is a traditional deep learning network module for image processing (for example, the UNet model described above); its process of generating prediction results from sample images is a well-known technique in the art and is not described in detail here.
As described above, traditional deep learning network modules for image processing perform image processing at the pixel level and often ignore the connectivity of regions, so disconnected pixels or pixel regions appear in their output prediction results, affecting subsequent analysis steps.
On this basis, referring to FIG. 4, an embodiment of the present disclosure proposes connecting the novel differentiable region growing module 400A described above after the output layer of the traditional deep learning network module 400B, performing region growing of the seed function based on the predicted output image of the deep learning network module 400B and the ground-truth label image respectively, and constructing a new coDice loss function, thereby discarding all pixels disconnected from the seed function and increasing the penalty for disconnection, as described in steps S903-S911 below.
In step S903, the prediction result image is input to the differentiable region growing module as a first input image.
In step S905, the ground-truth label image of the sample image is input to the differentiable region growing module as a second input image.
In step S907, using the differentiable region growing module, a region growing operation is performed based on the first input image and the seed function to obtain a first region growing result of the seed function as a first connected domain feature image of the sample image.
In step S909, using the differentiable region growing module, a region growing operation is performed based on the second input image and the seed function to obtain a second region growing result of the seed function as a second connected domain feature image of the sample image.
Example steps of using the differentiable region growing module to perform the region growing operation of the seed function based on the received images associated with the sample image (for example, here the prediction result image generated by the deep learning network module 400B for the sample image and the ground-truth label image of the sample image) have been described in detail in connection with FIG. 3 and are not repeated here. It should be understood that since the region growing is performed based on images of the sample image, the region growing results here will contain features about the region connectivity of the sample image.
As described above, since a traditional deep learning network module for image processing often identifies pixels that originally belong to one region as belonging to another region disconnected from it, there will be a difference between the first connected domain feature image obtained by the region growing operation on the prediction result image of the traditional deep learning network module and the second connected domain feature image obtained by the region growing operation based on the ground-truth label image.
Therefore, one goal of training the image processing model 400 is to construct a new loss function and to train the image processing model 400 with the optimization objective of minimizing the difference between these two connected domain feature images.
In step S911, a target loss function value is calculated based on the first connected domain feature image and the second connected domain feature image, and parameters of the deep learning network module are adjusted based on the target loss function value.
For example, to reduce the difference between the first connected domain feature image and the second connected domain feature image above, the target loss function Lc here may be constructed as follows:
softcoDice = 2∑(g(X,S)·g(Y,S)) / (∑g(X,S) + ∑g(Y,S))      (1)
Lc=1-softcoDice         (2)
where X denotes the ground-truth label image, Y denotes the prediction result image, S is the seed function, g(X,S) is the first connected domain feature image, and g(Y,S) is the second connected domain feature image.
In the presence of topological errors, coDice can decrease more than ordinary Dice, thereby imposing a heavier penalty on disconnected domains.
Based on the above target loss function, the parameters of the image processing model 400 shown in FIG. 4 can be adjusted so that, as iterative training continues, the target loss function is ultimately minimized.
In this way, by connecting the differentiable region growing module (for example, the differentiable region growing module 400A in FIG. 4) after the output layer of the traditional deep learning network module (for example, the deep learning network module 400B in FIG. 4), using the differentiable region growing module to obtain the region-connectivity features of the prediction result image and the ground-truth label image respectively, and constructing the new coDice loss function, the penalty for region disconnection can be increased during training of the image processing model, thereby increasing the region connectivity in its output predicted images.
Example implementation details of training the image processing model in step S803 of FIG. 8 above are described below with reference to FIG. 10, in conjunction with the image processing model 700 shown in FIG. 7.
In the image processing model 700 shown in FIG. 7, the differentiable region growing module 700A is connected in parallel between two intermediate layers of the deep learning network module 700B; it receives a feature image from one intermediate layer of the deep learning network module 700B (for example, the first intermediate layer), performs a region growing operation based on the received feature image, and returns the output obtained after the region growing operation to another intermediate layer (for example, the second intermediate layer).
As shown in FIG. 10, in step S1001, the first feature image generated by the first intermediate layer is input to the differentiable region growing module as a third input image.
The first feature image here is the feature image produced by the first intermediate layer of the deep learning network module 700B for the input sample image.
In step S1003, using the differentiable region growing module, a region growing operation is performed based on the third input image and the seed function to obtain a third region growing result of the seed function as a third connected domain feature image of the sample image.
Example steps of using the differentiable region growing module to perform the region growing operation of the seed function based on the received image associated with the sample image (for example, here the feature image generated by the first intermediate layer for the sample image) have been described in detail in connection with FIG. 3 and are not repeated here.
It should be understood that since the region growing is performed based on a feature image of the sample image, the region growing result here will contain features about the region connectivity in the sample image.
In step S1005, the third connected domain feature image is input to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer.
For example, the feature fusion here may include performing a pixel-by-pixel addition operation on the third connected domain feature image and the second feature image. Of course, other feature fusion techniques, such as pixel-by-pixel multiplication, may also be adopted, and the present disclosure places no limitation on this.
In step S1007, the deep learning network module is used to perform image processing prediction based on the fused feature image.
In step S1009, a target loss function value is calculated based on the prediction result.
In step S1011, the parameters of the deep learning network module are adjusted based on the target loss function value.
For example, the loss function here is a loss function designed for the traditional deep learning network module 700B. For example, depending on the type of the deep learning network module 700B, the loss function may be a cross-entropy loss function, a Dice loss function, a focal loss function, or the like; the present disclosure places no limitation on this and does not elaborate further.
Based on the image processing model trained as above, the embodiments of the present disclosure further provide methods of performing image processing based on the trained image processing model.
The image processing methods 1100 and 1200 based on these two trained image processing models are described below in conjunction with the image processing model 400 and the image processing model 700 trained by the methods described above.
FIG. 11 describes an image processing method 1100 based on the trained image processing model 400.
In step S1101, an image to be processed is acquired.
For example, the image to be processed here may be a medical image including a tissue image. Of course, the image to be processed may also be any other suitable image other than a medical image, and the present disclosure places no limitation on this.
For example, the image to be processed here may be obtained through medical imaging technology, by downloading over a network, or through other means, and the embodiments of the present disclosure place no limitation on this.
In step S1103, based on the deep learning network module in the trained image processing model, an image processing operation is performed on the image to be processed to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold.
As described above, since the parameters of the deep learning network module 400B have already been adjusted during the training stage of the image processing model 400 with the objective of minimizing the difference between the region-connectivity features obtained from the prediction result image and from the ground-truth label image, image processing can be performed based only on the deep learning network module 400B after training is completed.
As described above, traditional deep learning network modules for image processing perform image processing at the pixel level and often ignore the connectivity of regions, so disconnected pixels or pixel regions appear in their output prediction results. For example, ideally the number of connected domains in the prediction result should be 1, whereas the number of connected domains in the prediction result obtained by a traditional deep learning network module for image processing may be roughly 50-60. Image processing based on the trained deep learning network module 400B of the present disclosure can effectively reduce the number of connected domains, for example to 20-30.
FIG. 12 describes another image processing method 1200 based on the trained image processing model 700.
In step S1201, an image to be processed is acquired.
For example, the image to be processed here may be a medical image including a tissue image. Of course, the image to be processed may also be any other suitable image other than a medical image, and the present disclosure places no limitation on this.
For example, the image to be processed here may be obtained through medical imaging technology, by downloading over a network, or through other means, and the embodiments of the present disclosure place no limitation on this.
In step S1203, based on the trained image processing model, an image processing operation is performed on the image to be processed to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold.
Unlike the image processing model 400, the differentiable region growing module 700A in the image processing model 700 is embedded as a special layer among the intermediate layers of the neural network module 700B. Therefore, in the prediction stage after training is completed, image prediction processing needs to be performed based on the entire image processing model 700.
As in the training process, in the prediction stage the differentiable region growing module 700A is used to perform a region growing operation to obtain the connected domain feature image of the image to be processed, and the image processing operation performed on the image to be processed based on the trained image processing model 700 is performed based at least on the connected domain feature image of the image to be processed.
For example, the differentiable region growing module 700A performing a region growing operation to obtain the connected domain feature image of the image to be processed includes: receiving an input image and a seed function, the input image being a feature image generated by an intermediate layer of the deep learning network module 700B for the image to be processed; performing a max-pooling dilation operation on the seed function based on the input image; performing a combining operation between the seed function after the max-pooling dilation operation and the input image; and repeating the above steps until an iteration-count threshold is reached, to obtain a region growing result of the seed function as the connected domain feature image of the image to be processed.
For example, the image processing model 700 performing an image processing operation on the image to be processed includes: inputting the first feature image generated by the first intermediate layer for the image to be processed to the differentiable region growing module 700A as an input image; using the differentiable region growing module 700A, performing a region growing operation based on the input image and the seed function to obtain a region growing result of the seed function as the connected domain feature image of the image to be processed; inputting the connected domain feature image to the second intermediate layer for fusion with the second feature image generated by the second intermediate layer; and using the deep learning network module 700B to perform image prediction based on the fused feature image to obtain a processed image with connectivity.
Likewise, since traditional deep learning network modules for image processing perform image processing at the pixel level and often ignore the connectivity of regions, disconnected pixels or pixel regions appear in their output prediction results. For example, ideally the number of connected domains in the prediction result should be 1, whereas the number of connected domains in the prediction result obtained by a traditional deep learning network module for image processing may be roughly 50-60. Image processing based on the trained image processing model 700 of the present disclosure can effectively reduce the number of connected domains, for example to 20-30.
FIG. 13 shows a training apparatus 1300 for an image processing model according to an embodiment of the present disclosure, which includes, for example, an image acquisition component 1301 and a training component 1303.
The image acquisition component 1301 is configured to acquire a training data set, the training data set including a plurality of sample images and a ground-truth label image corresponding to each sample image. The training component 1303 is configured to train the image processing model based on the training data set to obtain a trained image processing model. The image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing; the differentiable region growing module is configured to perform a region growing operation to obtain a connected domain feature image of a sample image; and the image processing model is trained based at least on the connected domain feature image.
FIG. 14 shows a schematic structural diagram of an image processing apparatus 1400 according to an embodiment of the present disclosure. The image processing apparatus 1400 includes at least an image acquisition component 1401, a processing component 1403, and an output component 1405.
In the embodiments of the present disclosure, the image acquisition component 1401, the processing component 1403, and the output component 1405 are related medical devices; they may be integrated in the same medical device, or divided among multiple devices that are connected and communicate with one another and are included in a medical system for use. For example, for digestive tract disease diagnosis, the image acquisition component 1401 may be an endoscope, and the processing component 1403 and the output component 1405 may be computer components communicating with the endoscope, and so on.
For example, the image acquisition component 1401 is configured to acquire an image to be processed. The processing component 1403 is configured, for example, to execute the steps of the image processing method shown in FIG. 11 or FIG. 12. For example, when the image processing model has the structure of the image processing model 400 described in FIG. 4, the processing component 1403 may perform an image processing operation on the image to be processed based on the deep learning network module in the trained image processing model, to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold. As another example, when the image processing model has the structure of the image processing model 700 described in FIG. 7, the processing component 1403 may perform an image processing operation on the image to be processed based on the trained image processing model, to obtain a processed image with connectivity, the number of connected domains in the processed image being less than a predetermined threshold. The output component 1405 is configured to output the processed image with connectivity.
Based on the above embodiments, an electronic device of another exemplary implementation is also provided in the embodiments of the present disclosure. In some possible implementations, the electronic device in the embodiments of the present disclosure may include a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, may implement the steps of the training method for an image processing model or the image processing method in the above embodiments.
For example, taking the server 100 in FIG. 1 of the present disclosure as the electronic device, the processor in the electronic device is the processor 110 in the server 100, and the memory in the electronic device is the memory 120 in the server 100.
The embodiments of the present disclosure further provide a computer-readable storage medium. FIG. 15 shows a schematic diagram 1500 of a storage medium according to an embodiment of the present disclosure. As shown in FIG. 15, computer-executable instructions 1501 are stored on the computer-readable storage medium 1500. When the computer-executable instructions 1501 are run by a processor, the training method for an image processing model and the image processing method according to the embodiments of the present disclosure described with reference to the above figures may be executed. The computer-readable storage medium includes, but is not limited to, volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like.
The embodiments of the present disclosure further provide a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to execute the training method for an image processing model and the image processing method according to the embodiments of the present disclosure.
Those skilled in the art can understand that various variations and improvements may be made to what is disclosed in the present disclosure. For example, the various devices or components described above may be implemented by hardware, or by software, firmware, or a combination of some or all of the three.
In addition, although the present disclosure makes various references to certain units in a system according to embodiments of the present disclosure, any number of different units may be used and run on a client and/or server. The units are merely illustrative, and different aspects of the system and method may use different units.
A person of ordinary skill in the art can understand that all or part of the steps in the above methods may be completed by instructing the relevant hardware through a program, and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. For example, all or part of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, the modules/units in the above embodiments may be implemented in the form of hardware or in the form of software functional modules. The present disclosure is not limited to any specific combination of hardware and software.
Unless otherwise defined, all terms (including technical and scientific terms) used here have the same meanings as commonly understood by a person of ordinary skill in the art to which the present disclosure belongs. It should also be understood that terms such as those defined in ordinary dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant art, and should not be interpreted in an idealized or extremely formalized sense unless explicitly so defined here.
The above is a description of the present disclosure and should not be considered a limitation thereof. Although several exemplary embodiments of the present disclosure have been described, those skilled in the art will readily understand that many modifications may be made to the exemplary embodiments without departing from the novel teachings and advantages of the present disclosure. Therefore, all such modifications are intended to be included within the scope of the present disclosure as defined by the claims. It should be understood that the above is a description of the present disclosure and should not be considered limited to the specific embodiments disclosed, and modifications to the disclosed embodiments as well as other embodiments are intended to be included within the scope of the appended claims. The present disclosure is defined by the claims and their equivalents.
This application claims priority to Chinese Patent Application No. 202210626310.9 filed on June 2, 2022, the entire disclosure of which is incorporated herein by reference as part of this application.

Claims (10)

  1. A training method for an image processing model, the method comprising:
    acquiring a training data set, the training data set comprising a plurality of sample images and a ground-truth label image corresponding to each sample image; and
    training the image processing model based on the training data set to obtain a trained image processing model,
    wherein the image processing model is formed by connecting a differentiable region growing module and a deep learning network module for performing image processing, the differentiable region growing module being configured to perform a region growing operation to obtain a connected domain feature image of a sample image,
    wherein the training of the image processing model is performed based at least on the connected domain feature image.
  2. The method according to claim 1, wherein the differentiable region growing module performing a region growing operation to obtain a connected domain feature image of a sample image comprises:
    receiving an input image and a seed function, the input image being one of a feature image generated by an intermediate layer of the deep learning network module for the sample image, a prediction result image generated by an output layer of the deep learning network module for the sample image, and the ground-truth label image of the sample image;
    performing a max-pooling dilation operation on the seed function based on the input image;
    performing a combining operation between the seed function after the max-pooling dilation operation and the input image; and
    repeating the above steps until an iteration-count threshold is reached, to obtain a region growing result of the seed function as the connected domain feature image of the sample image.
  3. The method according to claim 2, wherein the combining operation comprises any one of:
    multiplying each pixel of the seed function after the max-pooling dilation operation by the corresponding pixel of the input image; and
    taking, for each pixel of the seed function after the max-pooling dilation operation, the minimum of that pixel and the corresponding pixel of the input image.
  4. The method according to any one of claims 2-3, wherein the differentiable region growing module is connected after the output layer of the deep learning network module, and wherein training the image processing model comprises:
    performing image processing on a sample image using the deep learning network module, and outputting the prediction result image from the output layer;
    inputting the prediction result image to the differentiable region growing module as a first input image;
    inputting the ground-truth label image of the sample image to the differentiable region growing module as a second input image;
    using the differentiable region growing module, performing a region growing operation based on the first input image and the seed function to obtain a first region growing result of the seed function as a first connected domain feature image of the sample image;
    using the differentiable region growing module, performing a region growing operation based on the second input image and the seed function to obtain a second region growing result of the seed function as a second connected domain feature image of the sample image; and
    calculating a target loss function value based on the first connected domain feature image and the second connected domain feature image, and adjusting parameters of the deep learning network module based on the target loss function value.
  5. The method according to claim 4, wherein the target loss function is the loss function Lc defined as follows:
    softcoDice = 2∑(g(X,S)·g(Y,S)) / (∑g(X,S) + ∑g(Y,S))   (1)
    Lc=1-softcoDice   (2)
    where X denotes the ground-truth label image, Y denotes the prediction result image, S is the seed function, g(X,S) is the first connected domain feature image, g(Y,S) is the second connected domain feature image, and
    the loss function Lc imposes a heavier penalty on disconnected domains.
  6. The method according to claim 4, wherein the seed function is generated based on any one of an equal-spacing strategy, a pooling-unpooling strategy, and a breakpoint-pooling strategy.
  7. The method according to claim 6, wherein generating the seed function based on the equal-spacing strategy comprises:
    constructing an image of the same dimensions as the sample image, setting pixels at predetermined intervals in the image as seed points, setting the remaining pixels as background pixels, and using the resulting image as the seed function.
  8. The method according to claim 6, wherein generating the seed function based on the pooling-unpooling strategy comprises:
    performing a max-pooling operation on the ground-truth label image to obtain one or more local maxima; and
    performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the ground-truth label image, and using the image resulting from the unpooling operation as the seed function.
  9. The method according to claim 6, wherein generating the seed function based on the breakpoint-pooling strategy comprises:
    subtracting the prediction result image and the ground-truth label image;
    performing the max-pooling dilation operation on the subtracted image;
    multiplying the image obtained after the max-pooling dilation operation by the prediction result image to obtain a crossing-point image;
    performing a max-pooling operation on the crossing-point image to obtain one or more local maxima; and
    performing an unpooling operation on the image resulting from the max-pooling operation to restore the actual positions of the one or more local maxima in the crossing-point image, and using the image resulting from the unpooling operation as the seed function.
  10. The method according to any one of claims 2-3, wherein the input of the differentiable region growing module is connected to a first intermediate layer of the deep learning network module and the output of the differentiable region growing module is connected to a second intermediate layer different from the first intermediate layer, and
    training the image processing model comprises:
    inputting a first feature image generated by the first intermediate layer to the differentiable region growing module as a third input image;
    using the differentiable region growing module, performing a region growing operation based on the third input image and the seed function to obtain a third region growing result of the seed function as a third connected domain feature image of the sample image;
    inputting the third connected domain feature image to the second intermediate layer for fusion with a second feature image generated by the second intermediate layer;
    performing image processing prediction based on the fused feature image using the deep learning network module;
    calculating a target loss function value based on a prediction result; and
    adjusting parameters of the deep learning network module based on the target loss function value.
PCT/CN2023/097972 2022-06-02 2023-06-02 Training method for image processing model, image processing method and apparatus WO2023232137A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210626310.9 2022-06-02
CN202210626310.9A CN117237260A (zh) 2022-06-02 2022-06-02 Training method for image processing model, image processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2023232137A1 true WO2023232137A1 (zh) 2023-12-07

Family

ID=89025710

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/097972 WO2023232137A1 (zh) 2022-06-02 2023-06-02 Training method for image processing model, image processing method and apparatus

Country Status (2)

Country Link
CN (1) CN117237260A (zh)
WO (1) WO2023232137A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815563A (zh) * 2020-06-10 2020-10-23 三峡大学 Retinal optic disc segmentation method combining U-Net with region-growing PCNN
WO2022087853A1 (zh) * 2020-10-27 2022-05-05 深圳市深光粟科技有限公司 Image segmentation method, apparatus, and computer-readable storage medium
CN114092439A (zh) * 2021-11-18 2022-02-25 深圳大学 Multi-organ instance segmentation method and system
CN114298971A (zh) * 2021-11-23 2022-04-08 中国科学院深圳先进技术研究院 Coronary artery segmentation method, system, terminal, and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118279705A (zh) * 2024-05-31 2024-07-02 浙江大华技术股份有限公司 Target recognition method, commodity recognition method, device, and storage medium

Also Published As

Publication number Publication date
CN117237260A (zh) 2023-12-15


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23815314

Country of ref document: EP

Kind code of ref document: A1