CN109166130B - Image processing method and image processing device

Info

Publication number
CN109166130B
CN109166130B CN201810885716.2A
Authority
CN
China
Prior art keywords
image
segmentation
target
data
feature map
Prior art date
Legal status
Active
Application number
CN201810885716.2A
Other languages
Chinese (zh)
Other versions
CN109166130A (en)
Inventor
夏清
Current Assignee
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN201810885716.2A priority Critical patent/CN109166130B/en
Publication of CN109166130A publication Critical patent/CN109166130A/en
Application granted granted Critical
Publication of CN109166130B publication Critical patent/CN109166130B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/10: Segmentation; Edge detection
    • G06T 7/11: Region-based segmentation
    • G06T 7/136: Segmentation; Edge detection involving thresholding
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10072: Tomographic images
    • G06T 2207/10088: Magnetic resonance imaging [MRI]
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20132: Image cropping
    • G06T 2207/20221: Image fusion; Image merging
    • G06T 2207/30048: Heart; Cardiac

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the application discloses an image processing method and an image processing device. The method includes: converting original image data, which contains feature elements, into target image data that meets a target resolution threshold; inputting the target image data into a first segmentation processing module for image segmentation to obtain a first segmented image; cropping the original image data according to the first segmented image to obtain image region data that meets target image parameters; inputting the image region data into a second segmentation processing module for image segmentation to obtain a second segmented image; and restoring the second segmented image to the original resolution space of the original image data to obtain the target segmentation result for the feature elements. The method can improve both the processing efficiency and the accuracy of atrial segmentation in magnetic resonance images.

Description

Image processing method and image processing device
Technical Field
The present invention relates to the field of image processing, and in particular, to an image processing method, an image processing apparatus, and a computer-readable storage medium.
Background
Atrial fibrillation is currently one of the most common heart rhythm disorders: its prevalence in the general population reaches 2 percent, its incidence in the elderly is higher still, and it carries a non-negligible fatality rate, seriously threatening human health. Effective treatments for atrial fibrillation are currently lacking, primarily because of limited insight into the anatomy of the atria. Magnetic resonance techniques can generate three-dimensional images of the different structures inside the heart, and the difference between healthy and fibrotic tissue is more pronounced in gadolinium-enhanced magnetic resonance images, which are therefore often used to assist in planning targeted surgical ablation treatment for atrial fibrillation. Accurate segmentation of the atrium is thus key to understanding and analyzing atrial fibrosis, contributing to the development of effective treatments and to advance planning of surgery.
However, because the contrast of gadolinium-enhanced magnetic resonance images is low, the distinction between atrial tissue and the surrounding tissue is not obvious, and direct segmentation of the atrium, especially the left atrium, is challenging. In current practice segmentation is mainly performed manually, which usually consumes a lot of time and yields low segmentation accuracy.
Disclosure of Invention
The embodiments of the application provide an image processing method and an image processing device, which can improve both the processing efficiency and the accuracy of atrial segmentation in magnetic resonance images.
A first aspect of an embodiment of the present application provides an image processing method, including:
converting original image data into target image data that meets a target resolution threshold, wherein the original image data contains feature elements;
inputting the target image data into a first segmentation processing module for image segmentation to obtain a first segmented image;
cropping the original image data according to the first segmented image to obtain image region data that meets target image parameters;
inputting the image region data into a second segmentation processing module for image segmentation to obtain a second segmented image;
and restoring the second segmented image to the original resolution space of the original image data to obtain the target segmentation result for the feature elements.
In an alternative embodiment, converting the original image data into target image data that meets the target resolution threshold includes:
converting the original image data into first image data that meets a first resolution threshold;
and downsampling the first image data to obtain the target image data that meets the target resolution threshold.
In an optional implementation, before the cropping of the original image data according to the first segmented image to obtain image region data that meets target image parameters, the method further includes:
obtaining the barycentric coordinates of the feature elements in the first segmented image;
and the cropping of the original image data according to the first segmented image to obtain image region data that meets target image parameters includes:
restoring the first segmented image to the original resolution space with the barycentric coordinates as the center, and cropping out region data that meets the target image size.
In an alternative embodiment, the target image parameter comprises a target image size threshold.
In an alternative embodiment, the raw image data comprises a gadolinium enhanced magnetic resonance image.
In an alternative embodiment, the first segmentation processing module comprises a first neural network structure and the second segmentation processing module comprises a second neural network structure.
In an alternative embodiment, the method of training the first neural network structure includes:
converting first original training data into first training data that meets a first resolution threshold, the first original training data containing first feature elements;
downsampling the first training data to obtain target training data that meets a second resolution threshold;
inputting the target training data into a first training module to obtain a first feature map;
generating a first target feature map according to the first feature map, wherein the resolution of the first target feature map is the second resolution threshold;
fusing the first feature map and the first target feature map to obtain first probability distribution information;
and updating the network parameters in the first training module according to the first probability distribution information to obtain the trained first neural network structure.
In an alternative embodiment, the first probability distribution information includes a probability that the element in the first feature map is the left atrium and/or a probability that the element is not the left atrium.
In an alternative embodiment, the training method of the second neural network structure includes:
acquiring the barycentric coordinates of second feature elements in original segmentation data;
cropping the original segmentation data with the barycentric coordinates as the center to obtain training region data;
inputting the training region data into a second training module to obtain a second feature map;
generating a second target feature map according to the second feature map, wherein the resolution of the second target feature map is the same as that of the training region data;
fusing the second feature map and the second target feature map to obtain second probability distribution information;
and updating the network parameters in the second training module according to the second probability distribution information to obtain the trained second neural network structure.
In an alternative embodiment, the second probability distribution information includes a probability that the element in the second feature map is the left atrium and/or a probability that the element is not the left atrium.
A second aspect of the embodiments of the present application provides an image processing apparatus, including:
an image conversion module, configured to convert original image data into target image data that meets a target resolution threshold, wherein the original image data contains feature elements;
a first segmentation processing module, configured to perform image segmentation on the target image data to obtain a first segmented image;
a cropping module, configured to crop the original image data according to the first segmented image to obtain image region data that meets target image parameters;
a second segmentation processing module, configured to perform image segmentation on the image region data to obtain a second segmented image;
and a restoring module, configured to restore the second segmented image to the original resolution space to obtain the target segmentation result for the feature elements.
A third aspect of the embodiments of the present application provides another image processing apparatus, including a processor and a memory, where the memory is configured to store one or more programs configured to be executed by the processor, the programs including instructions for performing some or all of the steps described in any method of the first aspect of the embodiments of the present application.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium for storing a computer program for electronic data exchange, wherein the computer program causes a computer to perform some or all of the steps as described in any one of the methods of the first aspect of embodiments of the present application.
In the embodiments of the application, original image data containing feature elements is converted into target image data that meets a target resolution threshold; the target image data is input into a first segmentation processing module for image segmentation to obtain a first segmented image; the original image data is then cropped according to the first segmented image to obtain image region data that meets target image parameters; the image region data is input into a second segmentation processing module for image segmentation to obtain a second segmented image; and finally the second segmented image is restored to the original resolution space of the original image data to obtain the target segmentation result for the feature elements. This can improve both the processing efficiency and the accuracy of atrial segmentation in magnetic resonance images.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic flowchart of an image processing method disclosed in an embodiment of the present application;
fig. 2 is a schematic flowchart of a training method of a neural network structure in an image processing method disclosed in an embodiment of the present application;
FIG. 3 is a flowchart illustrating a training method of a neural network structure in another image processing method disclosed in the embodiments of the present application;
fig. 4 is a schematic structural diagram of an image processing apparatus disclosed in an embodiment of the present application;
fig. 5 is a schematic structural diagram of another image processing apparatus disclosed in the embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present invention better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The image processing apparatus to which embodiments of the present application relate may be an electronic device, including a terminal device such as a mobile phone, a laptop computer, or a tablet computer having a touch-sensitive surface (e.g., a touch-screen display and/or a touch pad). It should also be understood that in some embodiments the device is not a portable communication device but a desktop computer having a touch-sensitive surface (e.g., a touch-screen display and/or a touchpad).
The concept of deep learning in the embodiments of the present application stems from the study of artificial neural networks: a multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute classes or features), thereby discovering distributed feature representations of the data.
Deep learning is a representation-learning-based method within machine learning. An observation (e.g., an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of particular shapes, and so on. Tasks such as face recognition or facial expression recognition are easier to learn from examples when specific representations are used. The benefit of deep learning is that it replaces manual feature engineering with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction. Deep learning is a new field in machine learning research; its motivation is to build neural networks that simulate the human brain's mechanisms for analyzing and learning from data such as images, sound, and text.
In machine learning, a convolutional neural network (CNN) is a deep feedforward artificial neural network that has been successfully applied to image recognition. Convolutional neural networks are increasingly widely used in the image field; a CNN mainly comprises convolutional layers, pooling layers, fully connected layers, loss layers, and the like. Convolutional neural networks include one-dimensional, two-dimensional, and three-dimensional variants: one-dimensional CNNs are typically applied to sequence data; two-dimensional CNNs to the recognition of images and text; and three-dimensional CNNs mainly to medical image and video data recognition. The neural network mentioned in the embodiments of the present application may be a three-dimensional convolutional neural network. Many deep learning frameworks (e.g., MXNet, Caffe) are now available, which makes training such models straightforward.
The following describes embodiments of the present application in detail.
Referring to fig. 1, fig. 1 is a schematic flowchart of an image processing method disclosed in an embodiment of the present application. As shown in fig. 1, the image processing method may be executed by the image processing apparatus and includes the following steps:
101. Convert original image data into target image data that meets a target resolution threshold, where the original image data contains feature elements.
The raw image data mentioned in the embodiments of the present application may be three-dimensional images of the heart obtained by various medical imaging devices, such as three-dimensional images of the different structures inside the heart generated by magnetic resonance techniques. Magnetic resonance imaging (MRI) can determine the chemical structure of a substance and the density distribution of a given component without destroying the sample; beyond physics and chemistry, it has rapidly been applied in medicine, biological engineering, and other fields, becoming one of the most powerful methods for analyzing the complex structure of biological macromolecules and for diagnosing disease.
The differences between healthy and fibrotic tissue are more pronounced in gadolinium-enhanced magnetic resonance images, which are therefore often used to assist in planning targeted surgical ablation treatment for atrial fibrillation. The raw image data in this application may be a gadolinium-enhanced magnetic resonance image.
The feature element may be understood as the segmentation target of the image processing; for example, if the original image data is a gadolinium-enhanced magnetic resonance image containing a heart, the feature element may be the left atrium, that is, the method implements left-atrium segmentation.
Before the image processing is performed by the deep learning model, the original image data may be preprocessed and converted into the target image data meeting the target resolution threshold, and then step 102 is performed.
The input original image data are unified into the same resolution, so that the image processing efficiency can be improved, and the subsequent convolution processing is facilitated.
Specifically, the step 101 may include:
(1) converting the original image data into first image data that meets the first resolution threshold;
(2) downsampling the first image data to obtain the target image data that meets the target resolution threshold.
Specifically, the input original image data may be unified to the first resolution threshold by image cropping and/or image padding. For example, the pre-stored first resolution threshold may be 576 × 576 × 96; that is, this step converts the original image data to a resolution of 576 × 576 × 96, yielding the first image data and facilitating subsequent unified processing.
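For illustration, a minimal sketch of such crop-and-pad unification; the function name, center alignment, and zero padding are assumptions rather than the patent's prescribed procedure:

```python
import numpy as np

def crop_or_pad(volume, target_shape):
    """Center-crop and/or zero-pad a 3D volume to target_shape."""
    out = np.zeros(target_shape, dtype=volume.dtype)
    src, dst = [], []
    for in_dim, out_dim in zip(volume.shape, target_shape):
        copy = min(in_dim, out_dim)
        src_start = (in_dim - copy) // 2   # crop from the center if too large
        dst_start = (out_dim - copy) // 2  # pad symmetrically if too small
        src.append(slice(src_start, src_start + copy))
        dst.append(slice(dst_start, dst_start + copy))
    out[tuple(dst)] = volume[tuple(src)]
    return out

first_image = crop_or_pad(np.random.rand(640, 640, 88), (576, 576, 96))
```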
the down-sampling (or called down-sampling) in the embodiment of the present application can be understood as a way of reducing an image, and the main purposes of the down-sampling are two: 1. fitting the image to the size of the display area; 2. a thumbnail of the corresponding image is generated. For an image I with size M × N, S-times down-sampling is carried out to obtain a resolution image with size (M/S) × (N/S), wherein S is a common divisor of M and N, if an image in a matrix form is considered, the image in a window of S × S of an original image is changed into a pixel, and the value of the pixel is the average value of all pixels in the window, in this case, the number of pixels is reduced to the square multiple of the original S.
The image processing apparatus may store the target resolution threshold in advance, for example a resolution of 144 × 144 × 48; the first image data may be downsampled to obtain target image data with a resolution of 144 × 144 × 48, after which step 102 is performed. This reduces video memory consumption and facilitates the subsequent convolution processing.
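A minimal sketch of the window-averaging downsampling just described; the per-axis factors (4, 4, 2), matching the 576 × 576 × 96 to 144 × 144 × 48 example, are an assumption:

```python
import numpy as np

def downsample(volume, factors):
    """Average-pool downsampling: each fx × fy × fz window of the input
    becomes one voxel holding the window's mean. Each dimension must be
    divisible by the corresponding factor."""
    fx, fy, fz = factors
    x, y, z = volume.shape
    return volume.reshape(x // fx, fx, y // fy, fy, z // fz, fz).mean(axis=(1, 3, 5))

target_image = downsample(np.random.rand(576, 576, 96), (4, 4, 2))  # -> 144×144×48
```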
The video memory is also called a frame buffer, and is used for storing rendering data processed or to be extracted by the video card chip. As with the memory of a computer, video memory is the means used to store graphics information to be processed.
102. Input the target image data into the first segmentation processing module for image segmentation to obtain a first segmented image.
Through the first segmentation processing module, a rough segmentation of the downsampled low-resolution target image data can be performed, and the positions of the feature elements can be estimated from the segmentation result. The first segmentation processing module may include a fully convolutional neural network structure. The convolutional neural network in the embodiments of the present application is a feedforward neural network whose artificial neurons respond to surrounding units, allowing large-scale image processing; it includes convolutional layers and pooling layers. In the embodiments of the present application, a three-dimensional convolutional neural network may be used, which is mainly applied to medical image and video data recognition.
103. Crop the original image data according to the first segmented image to obtain image region data that meets the target image parameters.
The original image data is three-dimensional image data, and the resolution space in which it exists is referred to as the original resolution space. Specifically, the center of the first segmented image obtained in step 102 may be restored to the original resolution space of the original image data, where the target image parameters include a target image size threshold, and the original image data is then cropped according to the target image parameters and the center of the first segmented image to obtain image region data containing the feature elements.
For example, the target image size threshold may be 240 × 160 × 96; that is, a three-dimensional image region of size 240 × 160 × 96 around the feature elements may be obtained. Performing step 104 after this further cropping improves the accuracy of the image processing.
Optionally, before cropping the original image data according to the first segmented image to obtain image region data that meets the target image parameters, the method further includes:
obtaining the barycentric coordinates of the feature elements in the first segmented image;
and the cropping of the original image data according to the first segmented image to obtain image region data that meets the target image parameters includes:
restoring the first segmented image to the original resolution space with the barycentric coordinates as the center, and cropping the original image data to obtain image region data that meets the target image size threshold.
Specifically, the first segmentation processing module may use the neural network to output probability maps of the background and the foreground, and apply a simple binary test to the two volumetric probability distributions, assigning each voxel to foreground or background according to the corresponding probability. The resulting binary mask can be used for rough localization: for example, the center of gravity of the predicted mask (which may be understood as the left atrium in the first segmented image) is computed and taken as the center of the left atrium; this center is then restored to the original resolution space of the original image data, a region of size 240 × 160 × 96 is cropped around it, and the region is fed to the second segmentation processing module in step 104.
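A sketch of this localization step under stated assumptions; the helper name, the per-axis scale handling, and the boundary clamping are illustrative additions, not the patent's exact procedure:

```python
import numpy as np

def locate_and_crop(prob_fg, prob_bg, volume, scale, crop_size=(240, 160, 96)):
    """Binarize the two-channel probability output, take the foreground
    barycenter, map it back to the restored resolution space, and crop a
    fixed-size region around it."""
    mask = prob_fg > prob_bg                          # simple binary test per voxel
    coords = np.argwhere(mask)                        # indices of foreground voxels
    center = coords.mean(axis=0) * np.asarray(scale)  # barycenter, restored by scale
    starts = [int(round(c - s / 2)) for c, s in zip(center, crop_size)]
    # Clamp so the crop window stays inside the volume (an added safeguard).
    starts = [min(max(st, 0), dim - s)
              for st, dim, s in zip(starts, volume.shape, crop_size)]
    region = volume[tuple(slice(st, st + s) for st, s in zip(starts, crop_size))]
    return region, starts
```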
104. Input the image region data into the second segmentation processing module for image segmentation to obtain a second segmented image.
The second segmentation processing module can further segment the image region data to obtain the second segmented image. In this step, the trained convolutional neural network precisely segments the cropped image region data, so a segmentation result of higher accuracy can be obtained.
105. Restore the second segmented image to the original resolution space to obtain the target segmentation result for the feature elements.
The precisely segmented second segmented image is restored to the original resolution space, yielding the target segmentation result for the feature elements.
In a typical processing approach, three-dimensional magnetic resonance data is used directly as input, which consumes a large amount of video memory, takes a long time to compute, and places high demands on the computing equipment. In the embodiments of the present application, segmentation is divided into two steps, rough localization and fine segmentation, realized by training two similar neural networks; this reduces video memory consumption and shortens computation time. In a specific implementation, a complete segmentation takes only 2 seconds and 2.6 GB of video memory, so the method can be deployed on an ordinary computer and is simple to implement.
In the embodiments of the present application, a fully convolutional neural network can thus perform fully automatic segmentation of the left atrium in gadolinium-enhanced magnetic resonance images, greatly improving segmentation accuracy compared with traditional methods.
In summary, original image data containing feature elements is converted into target image data that meets a target resolution threshold; the target image data is input into a first segmentation processing module for image segmentation to obtain a first segmented image; the original image data is cropped according to the first segmented image to obtain image region data that meets target image parameters; the image region data is input into a second segmentation processing module for image segmentation to obtain a second segmented image; and finally the second segmented image is restored to the original resolution space to obtain the target segmentation result for the feature elements.
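Putting steps 101 to 105 together, a hypothetical end-to-end sketch; it composes the illustrative helpers from the earlier sketches and takes the two trained segmentation networks as callables, and none of these names come from the patent itself:

```python
import numpy as np

def segment_left_atrium(raw, coarse_net, fine_net,
                        first_res=(576, 576, 96), scale=(4, 4, 2),
                        crop_size=(240, 160, 96)):
    """End-to-end sketch of steps 101-105, composing the helpers above."""
    unified = crop_or_pad(raw, first_res)        # step 101a: unify resolution
    coarse_in = downsample(unified, scale)       # step 101b: downsample
    prob_fg, prob_bg = coarse_net(coarse_in)     # step 102: coarse segmentation
    region, starts = locate_and_crop(prob_fg, prob_bg, unified,
                                     scale, crop_size)   # step 103: crop
    fine_mask = fine_net(region)                 # step 104: fine segmentation
    # Step 105: paste the fine result back into the restored resolution space.
    result = np.zeros(unified.shape, dtype=fine_mask.dtype)
    slices = tuple(slice(st, st + s) for st, s in zip(starts, crop_size))
    result[slices] = fine_mask
    return result
```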
Referring to fig. 2, fig. 2 is a schematic flowchart of a training method for a neural network structure disclosed in an embodiment of the present application. Through this method, the first neural network structure can be trained and used by the first segmentation processing module to implement its function. The subject performing the steps of the embodiments of the present application may be an image processing apparatus for medical image processing. As shown in fig. 2, the training method includes the following steps:
201. Convert first original training data into first training data that meets a first resolution threshold, where the first original training data contains first feature elements.
The neural network structure may be trained with existing segmentation data pairs (images + masks) before the image processing method shown in fig. 1 is performed. The first feature element may be understood as the segmentation target of the image processing; for example, in the embodiments of the present application, the first original training data is a gadolinium-enhanced magnetic resonance image containing a heart and the first feature element is the left atrium, that is, the training method is used to implement left-atrium segmentation.
Before the neural network structure is trained, the first original training data may be preprocessed and converted into first training data that meets the first resolution threshold, after which step 202 is performed.
Specifically, the input data may be unified to the first resolution threshold by image cropping and/or image padding, for example converted to a resolution of 576 × 576 × 96, to obtain the first training data. Unifying the input first original training data to the same resolution improves image processing efficiency and facilitates the subsequent convolution processing.
202. Downsample the first training data to obtain target training data that meets a second resolution threshold.
Specifically, the first training data may be downsampled; the image processing apparatus may pre-store the second resolution threshold, for example a resolution of 144 × 144 × 48, and obtain target training data of resolution 144 × 144 × 48 after downsampling, after which step 203 is performed. This reduces video memory consumption and facilitates the subsequent convolution processing.
203. Input the target training data into a first training module to obtain a first feature map.
In the embodiments of the present application, a three-dimensional fully convolutional neural network structure based on V-Net or 3D U-Net may be used for three-dimensional image segmentation. A convolutional network has two main operations. One is convolution, used for feature extraction; multiple convolutional layers are usually stacked to obtain deeper feature maps. The other is pooling, which compresses the input feature map: on the one hand it shrinks the feature map and simplifies the network's computation, and on the other hand it compresses the features, extracting the main ones. A pooling layer does not mix information across feature channels (channels for short) but operates within each channel, whereas a convolutional layer can interact across channels and then generate new feature channels in the next layer.
V-Net is a fully convolutional neural network that uses convolution operations to extract features at different scales from the data and reduces the resolution by applying appropriate strides. The left part of the network is the encoding structure of a standard convolutional network, which captures context from local to global; the right part decodes the signal back to its original size and outputs two channels representing the probabilities of foreground and background, respectively.
The left side of the network is divided into several stages operating at different resolutions, each stage consisting of one or two convolutional layers and learning a residual function; that is, the input of each stage is added to the output of the stage's last convolutional layer. The convolutions in each layer use kernels of size 5 × 5 × 5, and the pooling operation is implemented by a convolution of size 2 × 2 × 2 with a stride of 2. Furthermore, at each stage of the encoding path, the number of feature channels doubles while the resolution halves. At the end of each layer, batch normalization and parameterized rectified linear units (PReLU) are used.
The right side of the network is a symmetric counterpart of the left: it extracts features and expands the spatial support to output a two-channel foreground/background probability distribution. Like the left part, each stage of the right part contains one or two convolutional layers and also learns a residual function. The convolutions in each layer again use 5 × 5 × 5 kernels, and the un-pooling operation is realized by a deconvolution operation.
The downsampled target training data undergoes multiple rounds of convolution, pooling, batch normalization, and PReLU. Batch normalization is an algorithm, frequently used in deep networks, that accelerates neural network training, speeds up convergence, and improves stability. PReLU (Parametric Rectified Linear Unit) is a rectified linear unit (ReLU) with learned correction parameters; ReLU is an activation function commonly used in artificial neural networks, generally a nonlinear function represented by a ramp function and its variants, which simplifies the computation. Through the first training module, several feature maps may be generated. Specifically, continuing the example in which the second resolution threshold is 144 × 144 × 48, first feature maps can be obtained at resolutions of 144 × 144 × 48, 72 × 72 × 24, 36 × 36 × 12, 18 × 18 × 6, and 9 × 9 × 3, with the number of feature channels increasing from 8 to 128.
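For illustration, one left-side (encoder) stage as described above might be sketched as follows. PyTorch is used purely as an example framework, since the patent names frameworks such as MXNet and Caffe only by way of example; the class name and channel counts are assumptions:

```python
import torch
import torch.nn as nn

class EncoderStage(nn.Module):
    """One encoder stage: 5×5×5 convolutions with a learned residual,
    batch normalization, PReLU, then a 2×2×2 stride-2 convolution that
    halves the resolution and doubles the channel count."""
    def __init__(self, channels, n_convs=2):
        super().__init__()
        self.convs = nn.Sequential(*[
            nn.Sequential(
                nn.Conv3d(channels, channels, kernel_size=5, padding=2),
                nn.BatchNorm3d(channels),
                nn.PReLU(channels),
            ) for _ in range(n_convs)
        ])
        # "Pooling" realized as a strided convolution, per the description.
        self.down = nn.Sequential(
            nn.Conv3d(channels, channels * 2, kernel_size=2, stride=2),
            nn.BatchNorm3d(channels * 2),
            nn.PReLU(channels * 2),
        )

    def forward(self, x):
        y = self.convs(x) + x          # residual: input added to conv output
        return self.down(y)

stage = EncoderStage(8)                            # 8 feature channels in, 16 out
out = stage(torch.randn(1, 8, 144, 144, 48))       # -> (1, 16, 72, 72, 24)
```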
204. Generate a first target feature map according to the first feature map, where the resolution of the first target feature map is the second resolution threshold.
Specifically, the first feature map may be gradually restored by deconvolution operations to obtain the first target feature map, whose resolution is the second resolution threshold; that is, the feature maps are restored to the same resolution as the downsampled target training data, such as the aforementioned 144 × 144 × 48, after which step 205 is performed.
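The resolution-restoring deconvolution can be illustrated with a transposed convolution; again a PyTorch sketch whose layer parameters are assumptions:

```python
import torch
import torch.nn as nn

# A 2×2×2 transposed convolution with stride 2 doubles each spatial
# dimension, undoing one stride-2 downsampling step.
up = nn.ConvTranspose3d(in_channels=128, out_channels=64, kernel_size=2, stride=2)
x = torch.randn(1, 128, 9, 9, 3)   # deepest feature map from the example above
print(up(x).shape)                 # -> torch.Size([1, 64, 18, 18, 6])
```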
205. Fuse the first feature map and the first target feature map to obtain first probability distribution information.
The first probability distribution information is finally obtained through a softmax layer and may comprise two output volumes whose resolution is the second resolution threshold. The first probability distribution information may include the probability that an element in the first feature map is the left atrium and/or the probability that the element is not the left atrium.
In a specific implementation with the second resolution threshold of 144 × 144 × 48, two output volumes of resolution 144 × 144 × 48 may be obtained, each representing the probability distribution of whether an element in a feature map is the left atrium.
Softmax can be understood as normalization; it is normally used in the last layer of the network for final classification and normalization. The input and output dimensions of softmax are the same: if the current picture classification has one hundred classes, the output of the softmax layer is a one-hundred-dimensional vector, whose first value is the probability that the current picture belongs to the first class, whose second value is the probability that it belongs to the second class, and so on; the one hundred entries sum to 1.
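A minimal numerical illustration of softmax; subtracting the maximum is a standard numerical-stability trick added here, not something the description discusses:

```python
import numpy as np

def softmax(x):
    """Normalize a vector of scores into probabilities that sum to 1."""
    e = np.exp(x - x.max())
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # [0.659 0.242 0.099]
```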
Step 206 may be performed after obtaining the first probability distribution information described above.
206. Update the network parameters in the first training module according to the first probability distribution information to obtain the trained first neural network structure.
Specifically, the network parameters may be updated with a back-propagation algorithm using DICE, IoU, or another loss function until the model converges; the converged model is the first neural network structure described above, used for coarse segmentation. In statistics, statistical decision theory, and economics, a loss function maps an event (an element of a sample space) to a real number expressing the economic or opportunity cost associated with that event; more generally, in statistics a loss function measures the degree of loss or error, such loss being related to a "wrong" estimate, such as a cost or a loss of equipment.
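As an illustration, a common soft formulation of the DICE loss named above; this particular formulation and the smoothing term are assumptions, since the patent only lists DICE and IoU as loss-function options:

```python
import torch

def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P ∩ G| / (|P| + |G|), computed on the
    foreground probability map `probs` and the 0/1 ground-truth `target`."""
    inter = (probs * target).sum()
    denom = probs.sum() + target.sum()
    return 1.0 - (2.0 * inter + eps) / (denom + eps)
```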
The learning process of the back-propagation (BP) algorithm consists of a forward-propagation phase and a back-propagation phase. In the forward phase, the input information passes from the input layer through the hidden layers, is processed layer by layer, and is transmitted to the output layer. If the expected output cannot be obtained at the output layer, the sum of squared errors between the output and the expectation is taken as the objective function and the process turns to back-propagation: the partial derivatives of the objective function with respect to the weights of each neuron layer are computed layer by layer, forming the gradient of the objective function with respect to the weight vector, which serves as the basis for modifying the weights; the network's learning is completed during this weight-modification process. When the error reaches the expected value, network learning ends. Through this learning rule of the back-propagation algorithm, the trained first neural network structure can be obtained more accurately.
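A hypothetical single training iteration showing the two phases just described; the stand-in `net`, the SGD optimizer, the learning rate, and the reuse of the Dice-loss sketch above are illustrative assumptions:

```python
import torch
import torch.nn as nn

net = nn.Conv3d(1, 1, kernel_size=3, padding=1)   # stand-in for the real network
optimizer = torch.optim.SGD(net.parameters(), lr=1e-2)

def train_step(volume, mask):
    optimizer.zero_grad()
    probs = torch.sigmoid(net(volume))  # forward propagation, layer by layer
    loss = dice_loss(probs, mask)       # Dice loss from the sketch above
    loss.backward()                     # backward pass: gradients w.r.t. weights
    optimizer.step()                    # weight update along the negative gradient
    return loss.item()
```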
The first neural network structure may be used in the embodiment shown in fig. 1, that is, the first segmentation processing module may include the first neural network structure to implement the segmentation processing function thereof.
In the embodiments of the present application, first original training data containing first feature elements is converted into first training data that meets a first resolution threshold; the first training data is downsampled to obtain target training data that meets a second resolution threshold; the target training data is input into a first training module to obtain a first feature map; a first target feature map whose resolution is the second resolution threshold is generated from the first feature map; the two feature maps are fused to obtain first probability distribution information; and the network parameters in the first training module are updated according to the first probability distribution information to obtain the trained first neural network structure. This improves the accuracy of the neural network and yields a neural network structure particularly suitable for left-atrium segmentation.
Referring to fig. 3, fig. 3 is a schematic flowchart of another training method for a neural network structure disclosed in the embodiments of the present application. Through this method, the second neural network structure can be trained and used by the second segmentation processing module to implement its function. The subject performing the steps of the embodiments of the present application may be an image processing apparatus for medical image processing. As shown in fig. 3, the training method includes the following steps:
301. Obtain the barycentric coordinates of a second feature element in the original segmentation data.
The original segmentation data is segmentation data used for training the neural network; the existing segmentation data pairs include images and masks, may be image data that has already undergone segmentation processing, and may come from a training data set.
The second feature element may be understood as the segmentation target of the image processing; for example, in the embodiments of the present application, the original segmentation data is a gadolinium-enhanced magnetic resonance image containing a heart and the second feature element is the left atrium, that is, the training method is used to implement left-atrium segmentation.
The mask here can be understood as a special gray-scale image with the same resolution as the image data, in which the luminance value of each pixel represents the class of the pixel at the corresponding position in the image data; it is often used to denote different segmentation targets. In particular, in the present application a mask value of 1 means the corresponding pixel belongs to the left atrium, and a mask value of 0 means it belongs to the background. The barycentric coordinates of the left atrium can be calculated from the mask given in the existing segmentation data pairs, after which step 302 is performed.
302. Crop the original segmentation data with the barycentric coordinates as the center to obtain training region data.
After the barycentric coordinates are obtained, a fixed-size region completely covering the left atrium may be cropped from the original segmentation data around that center. The resolution of the training region data may be a preset third resolution threshold, such as 240 × 160 × 96; obtaining training region data of resolution 240 × 160 × 96 reduces memory consumption and facilitates the subsequent convolution processing.
303. Input the training region data into a second training module to obtain a second feature map.
A three-dimensional fully convolutional neural network structure based on V-Net or 3D U-Net again applies multiple rounds of convolution, pooling, batch normalization, and PReLU to the training region data (the cropped three-dimensional magnetic resonance image) to generate feature maps. For example, feature maps with resolutions of 240 × 160 × 96, 120 × 80 × 48, 60 × 40 × 24, 30 × 20 × 12, and 15 × 10 × 6 may be generated.
For step 303, reference may be made to the detailed description of step 203 in the embodiment shown in fig. 2, which is not repeated here.
304. Generate a second target feature map according to the second feature map, where the resolution of the second target feature map is the same as the resolution of the training region data.
Specifically, the second feature map may be gradually restored by deconvolution operations to the resolution of the training region data, such as the fourth resolution threshold of 240 × 160 × 96, after which step 305 is performed.
For step 304, reference may be made to the detailed description of step 204 in the embodiment shown in fig. 2, which is not repeated here.
305. Fuse the second feature map and the second target feature map to obtain second probability distribution information.
The second probability distribution information is finally obtained through a softmax layer and may comprise two output volumes whose resolution is the fourth resolution threshold. The second probability distribution information may include the probability that an element in the second feature map is the left atrium and/or the probability that the element is not the left atrium.
In a specific implementation with the fourth resolution threshold of 240 × 160 × 96, two output volumes of resolution 240 × 160 × 96 may be obtained, each representing the probability distribution of whether an element in a feature map is the left atrium. Step 306 may be performed after the second probability distribution information is obtained.
For step 305, reference may be made to the detailed description of step 205 in the embodiment shown in fig. 2, which is not repeated here.
306. Update the network parameters in the second training module according to the second probability distribution information to obtain the trained second neural network structure.
Specifically, the network parameters may be updated with a back-propagation algorithm using DICE, IoU, or another loss function until the model converges; the converged model is the second neural network structure, used for fine segmentation.
The second neural network structure may be used in the embodiment shown in fig. 1, that is, the second segmentation processing module may include the second neural network structure to implement the segmentation processing function thereof.
In the embodiments of the present application, the barycentric coordinates of a second feature element in original segmentation data are obtained; the original segmentation data is cropped with the barycentric coordinates as the center to obtain training region data; the training region data is input into a second training module to obtain a second feature map; a second target feature map is generated from the second feature map; the two feature maps are fused to obtain second probability distribution information; and the network parameters in the second training module are updated according to the second probability distribution information to obtain the trained second neural network structure. This improves the accuracy of the neural network and yields a neural network structure particularly suitable for left-atrium segmentation.
Through the steps described in fig. 2 and fig. 3, a first neural network structure (network 1) for rough left-atrium segmentation and a second neural network structure (network 2) for fine left-atrium segmentation can be obtained. For a completely new gadolinium-enhanced magnetic resonance image, left-atrium segmentation can then be performed by applying the two networks in the image processing method shown in fig. 1: before the steps of the embodiment shown in fig. 1, the training methods shown in fig. 2 and fig. 3 are executed to obtain network 1 and network 2, which are applied to the first and second segmentation processing modules, respectively, completing the image processing process of the embodiment shown in fig. 1.
In a specific implementation, the training data comprise 100 sets of magnetic resonance data, where each set is acquired with a clinical whole-body magnetic resonance scanner and comprises the original magnetic resonance image data and the corresponding left atrial chamber labels. The original resolution of these data may be 0.625 × 0.625 × 0.625 cubic millimeters, with 47 sets of 576 × 576 × 88 voxels and 53 sets of 640 × 640 × 88 voxels; due to memory limitations, it is difficult to apply a neural network directly to segment such high-resolution images. In fact, the left atrial chamber, and even the whole heart, occupies only a small part of the magnetic resonance image, and most other locations in the image are irrelevant tissue or nothing at all, so the segmentation can be divided into two steps: first locating the atrium, and then segmenting the atrial chamber from a very small cropped volume, which allows the network to be trained on an ordinary home computer. To give the input data the same size and make it suitable for the V-Net architecture, this embodiment uniformly crops and pads all images to a fixed resolution, for example 576 × 576 × 96.
Therefore, a fully convolutional neural network is used to perform fully automatic segmentation of the left atrium in gadolinium-enhanced magnetic resonance images, greatly improving segmentation accuracy compared with traditional methods. Whereas using three-dimensional magnetic resonance data directly as input consumes a large amount of video memory, takes a long time to compute, and places high demands on the computing equipment, dividing the segmentation into rough localization and fine segmentation and training two similar networks reduces video memory consumption and computation time: a single segmentation takes only 2 seconds and 2.6 GB of video memory, so the method can be deployed on an ordinary home computer and is simple to implement.
The embodiments of the present application are applicable to the medical field. For example, after obtaining a gadolinium-enhanced magnetic resonance image of a patient, a cardiologist can quickly and automatically segment the patient's left atrium using this method, make a preliminary judgment of whether an abnormality is present from the three-dimensional structure of the left atrium, and further plan an operation, such as the intervention path of a catheter, based on that structure.
After obtaining the three-dimensional structure of the left atrium, a cardiologist can combine it with the segmentation of fibrotic tissue to understand and study the causes of cardiac tissue fibrosis, so as to fundamentally prevent or delay fibrosis and reduce patient morbidity and mortality.
The above description has introduced the solution of the embodiment of the present application mainly from the perspective of the method-side implementation process. It is to be understood that the image processing apparatus includes hardware structures and/or software modules corresponding to the respective functions in order to implement the above-described functions. Those of skill in the art would appreciate that the invention can be implemented in hardware or a combination of hardware and computer software, with the exemplary elements and algorithm steps described in connection with the embodiments disclosed herein. Whether a function is performed as hardware or computer software drives hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The embodiment of the present application may perform division of functional modules on the image processing apparatus according to the above method, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in fig. 4, the image processing apparatus 400 includes:
an image conversion module 410, configured to convert original image data into target image data that meets a target resolution threshold, where the original image data contains feature elements;
a first segmentation processing module 420, configured to perform image segmentation on the target image data to obtain a first segmented image;
a cropping module 430, configured to crop the original image data according to the first segmented image to obtain image region data that meets target image parameters;
a second segmentation processing module 440, configured to perform image segmentation on the image region data to obtain a second segmented image;
and a restoring module 450, configured to restore the second segmented image to the original resolution space of the original image data to obtain the target segmentation result for the feature elements.
Optionally, the image conversion module 410 includes a first conversion unit 411 and a second conversion unit 412,
the first conversion unit 411 is configured to convert the original image data into first image data meeting a first resolution threshold;
the second conversion unit 412 is configured to perform downsampling on the first image data to obtain the target image data meeting the target resolution threshold.
Optionally, the restoring module 450 is further configured to, before the cropping module 430 crops the original image data according to the first segmented image and obtains image region data meeting target image parameters, obtain barycentric coordinates of the feature elements in the first segmented image;
the cropping module 430 is further configured to restore the first segmentation image to the original resolution space with the barycentric coordinates as the center, and to crop out area data that meets the target image size.
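One possible reading of the barycentre-centred crop follows, with border padding added so the region always has the target size; the 0.5 threshold, crop size, and padding behaviour are assumptions of this sketch.

```python
import numpy as np


def crop_about_barycentre(volume, coarse_mask, scale, size=(144, 144, 96)):
    # Barycentric coordinates of the feature elements in the coarse
    # mask, mapped into the original resolution space (scale is the
    # per-axis ratio original_shape / coarse_shape).
    idx = np.argwhere(coarse_mask > 0.5)
    centre = np.round(idx.mean(axis=0) * np.asarray(scale)).astype(int)

    lo = centre - np.array(size) // 2
    hi = lo + np.array(size)

    # Clamp to the volume and pad whatever falls outside its borders,
    # so the cropped region always matches the target image size.
    pad = [(max(0, -l), max(0, h - d)) for l, h, d in zip(lo, hi, volume.shape)]
    lo_c = np.maximum(lo, 0)
    hi_c = np.minimum(hi, volume.shape)
    region = volume[lo_c[0]:hi_c[0], lo_c[1]:hi_c[1], lo_c[2]:hi_c[2]]
    return np.pad(region, pad, mode="constant"), lo_c
```

The returned corner coordinate lo_c is what the restoring module would later need in order to write the second segmentation image back into the original resolution space.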
Optionally, the target image parameter includes a target image size threshold.
Optionally, the raw image data comprises a gadolinium enhanced magnetic resonance image.
Optionally, the first segmentation processing module 420 includes a first neural network structure, and the second segmentation processing module 440 includes a second neural network structure.
In an alternative embodiment, the modules of the image processing apparatus 400 may also be used for training the first neural network structure;
the first converting unit 411 is further configured to convert the first original training data into first training data meeting a first resolution threshold, where the first original training data includes a first feature element;
the second converting unit 412 is further configured to perform downsampling on the first training data to obtain target training data meeting a second resolution threshold;
the image processing apparatus 400 further comprises a first training module 460, configured to:
obtaining a first feature map according to the target training data;
generating a first target feature map according to the first feature map, wherein the resolution of the first target feature map is the second resolution threshold;
fusing the first feature map and the first target feature map to obtain first probability distribution information;
updating the network parameters in the first training module according to the first probability distribution information to obtain the trained first neural network structure.
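The training loop for the first neural network structure can be pictured as below. This is a deliberately small PyTorch stand-in: the layer sizes are arbitrary, the fusion is a simple channel concatenation, and nothing here is claimed to match the actual network in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FirstSegNet(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        # Produces the first feature map from the target training data.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # Produces the first target feature map on the same grid as the
        # input (the second resolution threshold).
        self.head = nn.Conv3d(channels, channels, 3, padding=1)
        # Fusion and projection to two classes: left atrium / not.
        self.classify = nn.Conv3d(2 * channels, 2, kernel_size=1)

    def forward(self, x):
        feat = self.encoder(x)                 # first feature map
        target_feat = self.head(feat)          # first target feature map
        fused = torch.cat([feat, target_feat], dim=1)
        # First probability distribution information, per voxel.
        return F.softmax(self.classify(fused), dim=1)


def train_step(net, optimiser, volume, label):
    # One update of the network parameters from the probabilities.
    probs = net(volume)
    loss = F.nll_loss(torch.log(probs + 1e-8), label)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```

In use, volume would be an (N, 1, D, H, W) float tensor of target training data and label an (N, D, H, W) integer tensor with 1 for left-atrium voxels and 0 elsewhere.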
Optionally, the first probability distribution information includes a probability that an element in the first feature map is the left atrium and/or a probability that the element is not the left atrium.
In an alternative embodiment, the modules of the image processing apparatus 400 may also be used for training the second neural network structure;
the restoring module 450 is further configured to:
acquiring barycentric coordinates of second feature elements in the original segmentation data;
cropping the original segmentation data with the barycentric coordinates as the center to obtain training area data;
the image processing apparatus 400 further comprises a second training module 470, configured to:
obtaining a second feature map according to the training area data;
generating a second target feature map according to the second feature map, wherein the resolution of the second target feature map is the same as that of the training area data;
fusing the second feature map and the second target feature map to obtain second probability distribution information;
updating the network parameters in the second training module according to the second probability distribution information to obtain the trained second neural network structure.
Optionally, the second probability distribution information includes a probability that an element in the second feature map is the left atrium and/or a probability that the element is not the left atrium.
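Preparing training data for the second neural network structure mirrors the inference-time crop, except that the barycentre is taken from the ground-truth (original segmentation) data itself. Sizes and names below are again illustrative assumptions.

```python
import numpy as np


def make_second_stage_pair(image, label, size=(144, 144, 96)):
    # Barycentric coordinates of the second feature elements in the
    # original segmentation data.
    centre = np.argwhere(label > 0).mean(axis=0).astype(int)

    lo = np.maximum(centre - np.array(size) // 2, 0)
    hi = np.minimum(lo + np.array(size), image.shape)
    region = tuple(slice(l, h) for l, h in zip(lo, hi))

    # Both crops stay at the original resolution, which is why the
    # second target feature map can have the same resolution as the
    # training area data.
    return image[region], label[region]
```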
By implementing the image processing apparatus 400 shown in fig. 4, original image data containing feature elements can be converted into target image data meeting a target resolution threshold; the target image data is input into the first segmentation processing module for image segmentation to obtain a first segmentation image; the original image data is cropped according to the first segmentation image to obtain image area data meeting target image parameters; the image area data is input into the second segmentation processing module for image segmentation to obtain a second segmentation image; and finally the second segmentation image is restored to the original resolution space of the original image data to obtain the target segmentation result of the feature elements. This can improve both the processing efficiency and the accuracy of atrial segmentation in magnetic resonance images.
Referring to fig. 5, fig. 5 is a schematic structural diagram of another image processing apparatus disclosed in an embodiment of the present application. As shown in fig. 5, the image processing apparatus 500 includes a processor 501 and a memory 502, and may further include a bus 503 through which the processor 501 and the memory 502 are connected to each other. The bus 503 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 5, but this does not mean that there is only one bus or one type of bus. The image processing apparatus 500 may further include an input/output device 504, which may include a display screen such as a liquid crystal display. The memory 502 is used to store one or more programs containing instructions; the processor 501 is configured to call the instructions stored in the memory 502 to perform some or all of the method steps described above in the embodiments of figs. 1, 2, and 3, and may thereby implement the functions of the modules of the image processing apparatus 400 in fig. 4.
By implementing the image processing apparatus 500 shown in fig. 5, the same flow described above for the apparatus 400 can be carried out: original image data containing feature elements is converted into target image data meeting a target resolution threshold, segmented coarsely to obtain a first segmentation image, cropped according to that image to obtain image area data meeting target image parameters, segmented finely to obtain a second segmentation image, and finally restored to the original resolution space of the original image data to obtain the target segmentation result of the feature elements, which can likewise improve the processing efficiency and accuracy of atrial segmentation in magnetic resonance images.
Embodiments of the present application also provide a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any of the image processing methods described in the above method embodiments.
It should be noted that, for simplicity of description, the above method embodiments are described as a series or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division into modules (or units) is only one logical division, and other divisions are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of devices or modules through interfaces, and may be electrical or take other forms.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module.
The integrated module, if implemented as a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product stored in a memory and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned memory includes various media capable of storing program code, such as a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware. The program may be stored in a computer-readable memory, which may include a flash disk, a read-only memory, a random access memory, a magnetic disk, an optical disk, and the like.
The embodiments of the present invention have been described in detail above; specific examples are used herein to explain the principles and implementations of the present invention, and the description of the embodiments is intended only to help in understanding the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (16)

1. An image processing method, characterized in that the method comprises:
converting original image data into target image data which accords with a target resolution threshold, wherein the original image data comprises feature elements;
inputting the target image data into a first segmentation processing module for image segmentation to obtain a first segmentation image; the first segmentation processing module comprises a first neural network structure;
the training method of the first neural network structure comprises the following steps:
converting first original training data into first training data meeting a first resolution threshold, the first original training data comprising first feature elements;
down-sampling the first training data to obtain target training data meeting a second resolution threshold;
inputting the target training data into a first training module to obtain a first feature map;
generating a first target feature map according to the first feature map, wherein the resolution of the first target feature map is the second resolution threshold;
fusing the first feature map and the first target feature map to obtain first probability distribution information; the first probability distribution information includes a probability that an element in the first feature map is the left atrium and/or a probability that the element is not the left atrium;
updating network parameters in the first training module according to the first probability distribution information to obtain the trained first neural network structure;
obtaining barycentric coordinates of the feature elements in the first segmentation image;
restoring the first segmentation image to the original resolution space with the barycentric coordinates as the center, and cropping out image area data that meets the target image size;
inputting the image area data into a second segmentation processing module for image segmentation to obtain a second segmentation image; the second segmentation processing module comprises a second neural network structure;
and restoring the second segmentation image to the original resolution space of the original image data to obtain a target segmentation result of the feature elements.
2. The image processing method of claim 1, wherein the converting the raw image data into the target image data that meets a target resolution threshold comprises:
converting the original image data into first image data meeting a first resolution threshold;
and performing downsampling on the first image data to obtain the target image data which accords with the target resolution threshold.
3. The image processing method according to claim 1 or 2, wherein the target image parameter comprises a target image size threshold.
4. An image processing method as claimed in claim 1 or 2, characterized in that the raw image data comprises a gadolinium enhanced magnetic resonance image.
5. An image processing method as claimed in claim 3, characterized in that the raw image data comprises a gadolinium enhanced magnetic resonance image.
6. The image processing method of claim 1, wherein the training method of the second neural network structure comprises:
acquiring barycentric coordinates of second feature elements in the original segmentation data;
cropping the original segmentation data with the barycentric coordinates as the center to obtain training area data;
inputting the training area data into a second training module to obtain a second feature map;
generating a second target feature map according to the second feature map, wherein the resolution of the second target feature map is the same as that of the training area data;
fusing the second feature map and the second target feature map to obtain second probability distribution information;
and updating the network parameters in the second training module according to the second probability distribution information to obtain the trained second neural network structure.
7. The image processing method according to claim 6, wherein the second probability distribution information includes a probability that an element in the second feature map is the left atrium and/or a probability that the element is not the left atrium.
8. An image processing apparatus characterized by comprising:
the image conversion module is used for converting original image data into target image data meeting a target resolution threshold, wherein the original image data comprises feature elements;
the first segmentation processing module is used for carrying out image segmentation on the target image data to obtain a first segmentation image; the first segmentation processing module comprises a first neural network structure;
a first conversion unit, configured to convert first original training data into first training data meeting a first resolution threshold, where the first original training data includes first feature elements;
a second conversion unit, configured to perform downsampling on the first training data to obtain target training data meeting a second resolution threshold;
the image processing apparatus further comprises a first training module configured to:
obtaining a first feature map according to the target training data;
generating a first target feature map according to the first feature map, wherein the resolution of the first target feature map is the second resolution threshold;
fusing the first feature map and the first target feature map to obtain first probability distribution information;
updating network parameters in the first training module according to the first probability distribution information to obtain the trained first neural network structure; the first probability distribution information includes a probability that an element in the first feature map is the left atrium and/or a probability that the element is not the left atrium;
the restoring module is used for obtaining barycentric coordinates of the feature elements in the first segmentation image;
the cropping module is used for restoring the first segmentation image to the original resolution space with the barycentric coordinates as the center, and cropping out image area data that meets the target image size;
the second segmentation processing module is used for carrying out image segmentation on the image area data to obtain a second segmentation image; the second segmentation processing module comprises a second neural network structure;
the restoring module is further configured to restore the second segmentation image to the original resolution space of the original image data to obtain a target segmentation result of the feature elements.
9. The image processing apparatus according to claim 8, wherein the image conversion module includes a first conversion unit and a second conversion unit, wherein:
the first conversion unit is used for converting the original image data into first image data which accords with a first resolution threshold;
the second conversion unit is configured to perform downsampling on the first image data to obtain the target image data meeting the target resolution threshold.
10. The image processing apparatus according to claim 8 or 9, wherein the target image parameter comprises a target image size threshold.
11. An image processing apparatus as claimed in claim 8 or 9, characterized in that the raw image data comprises a gadolinium enhanced magnetic resonance image.
12. The image processing apparatus of claim 10, wherein the raw image data comprises a gadolinium enhanced magnetic resonance image.
13. The image processing apparatus of claim 8, wherein the image processing apparatus is further configured to train the second neural network structure;
the restoring module is further used for obtaining barycentric coordinates of second feature elements in the original segmentation data;
cropping the original segmentation data with the barycentric coordinates as the center to obtain training area data;
the image processing apparatus further comprises a second training module configured to:
obtaining a second feature map according to the training area data;
generating a second target feature map according to the second feature map, wherein the resolution of the second target feature map is the same as that of the training area data;
fusing the second feature map and the second target feature map to obtain second probability distribution information;
and updating the network parameters in the second training module according to the second probability distribution information to obtain the trained second neural network structure.
14. The image processing apparatus according to claim 13, wherein the second probability distribution information includes a probability that an element in the second feature map is the left atrium and/or a probability that the element is not the left atrium.
15. An image processing apparatus comprising a processor and a memory for storing one or more programs configured for execution by the processor, the programs comprising instructions for performing the method of any of claims 1-7.
16. A computer-readable storage medium for storing a computer program for electronic data exchange, wherein the computer program causes a computer to perform the method according to any one of claims 1-7.
CN201810885716.2A 2018-08-06 2018-08-06 Image processing method and image processing device Active CN109166130B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810885716.2A CN109166130B (en) 2018-08-06 2018-08-06 Image processing method and image processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810885716.2A CN109166130B (en) 2018-08-06 2018-08-06 Image processing method and image processing device

Publications (2)

Publication Number Publication Date
CN109166130A CN109166130A (en) 2019-01-08
CN109166130B 2021-06-22

Family

ID=64895116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810885716.2A Active CN109166130B (en) 2018-08-06 2018-08-06 Image processing method and image processing device

Country Status (1)

Country Link
CN (1) CN109166130B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109166130B (en) * 2018-08-06 2021-06-22 北京市商汤科技开发有限公司 Image processing method and image processing device
CN109745062B (en) 2019-01-30 2020-01-10 腾讯科技(深圳)有限公司 CT image generation method, device, equipment and storage medium
CN109829920B (en) * 2019-02-25 2021-06-15 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium
CN109919932A * 2019-03-08 2019-06-21 广州视源电子科技股份有限公司 Target object identification method and device
CN109978886B (en) * 2019-04-01 2021-11-09 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110033005A (en) * 2019-04-08 2019-07-19 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110136135B (en) * 2019-05-17 2021-07-06 深圳大学 Segmentation method, device, equipment and storage medium
CN111684488A (en) * 2019-05-22 2020-09-18 深圳市大疆创新科技有限公司 Image cropping method and device and shooting device
CN110363774B (en) * 2019-06-17 2021-12-21 上海联影智能医疗科技有限公司 Image segmentation method and device, computer equipment and storage medium
CN110427946B (en) * 2019-07-04 2021-09-03 天津车之家数据信息技术有限公司 Document image binarization method and device and computing equipment
CN112444784B (en) * 2019-08-29 2023-11-28 北京市商汤科技开发有限公司 Three-dimensional target detection and neural network training method, device and equipment
CN110738635A * 2019-09-11 2020-01-31 深圳先进技术研究院 Feature tracking method and device
CN110874860B (en) * 2019-11-21 2023-04-25 哈尔滨工业大学 Target extraction method of symmetrical supervision model based on mixed loss function
CN111091541B (en) * 2019-12-12 2020-08-21 哈尔滨市科佳通用机电股份有限公司 Method for identifying fault of missing nut in cross beam assembly of railway wagon
CN111445478B (en) * 2020-03-18 2023-09-08 吉林大学 Automatic intracranial aneurysm region detection system and detection method for CTA image
CN111407245B (en) * 2020-03-19 2021-11-02 南京昊眼晶睛智能科技有限公司 Non-contact heart rate and body temperature measuring method based on camera
CN111833325A (en) * 2020-07-09 2020-10-27 合肥多彩谱色科技有限公司 Colloidal gold reagent strip detection method and system based on deep learning
CN111914698B (en) * 2020-07-16 2023-06-27 北京紫光展锐通信技术有限公司 Human body segmentation method, segmentation system, electronic equipment and storage medium in image
CN111832493A (en) * 2020-07-17 2020-10-27 平安科技(深圳)有限公司 Image traffic signal lamp detection method and device, electronic equipment and storage medium
CN111862223B (en) * 2020-08-05 2022-03-22 西安交通大学 Visual counting and positioning method for electronic element
CN113808147A (en) * 2021-09-14 2021-12-17 北京航星永志科技有限公司 Image processing method, device and system and computer equipment
CN114241505B (en) * 2021-12-20 2023-04-07 苏州阿尔脉生物科技有限公司 Method and device for extracting chemical structure image, storage medium and electronic equipment
CN116071375B (en) * 2023-03-10 2023-09-26 福建自贸试验区厦门片区Manteia数据科技有限公司 Image segmentation method and device, storage medium and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106960195A * 2017-03-27 2017-07-18 深圳市丰巨泰科电子有限公司 People counting method and device based on deep learning
CN107016681A * 2017-03-29 2017-08-04 浙江师范大学 Brain MRI lesion segmentation method based on fully convolutional networks
CN107808132A * 2017-10-23 2018-03-16 重庆邮电大学 Scene image classification method fusing topic models
WO2018050207A1 * 2016-09-13 2018-03-22 Brainlab Ag Optimized semi-robotic alignment workflow
CN108268870A * 2018-01-29 2018-07-10 重庆理工大学 Multi-scale feature fusion ultrasound image semantic segmentation method based on adversarial learning
CN109166130A * 2018-08-06 2019-01-08 北京市商汤科技开发有限公司 Image processing method and image processing apparatus

Also Published As

Publication number Publication date
CN109166130A (en) 2019-01-08

Similar Documents

Publication Publication Date Title
CN109166130B (en) Image processing method and image processing device
Liang et al. MCFNet: Multi-layer concatenation fusion network for medical images fusion
CN107492099B (en) Medical image analysis method, medical image analysis system, and storage medium
CN109872306B (en) Medical image segmentation method, device and storage medium
US10496898B2 (en) State detection using machine-learning model trained on simulated image data
Chen et al. Fsrnet: End-to-end learning face super-resolution with facial priors
CN109919928B (en) Medical image detection method and device and storage medium
WO2020133636A1 (en) Method and system for intelligent envelope detection and warning in prostate surgery
CN109754403A Automatic tumor segmentation method and system in CT images
Cai et al. Semi-supervised natural face de-occlusion
CN107437252B (en) Method and device for constructing classification model for macular lesion region segmentation
CN113688862B (en) Brain image classification method based on semi-supervised federal learning and terminal equipment
CN110648331B (en) Detection method for medical image segmentation, medical image segmentation method and device
CN113554665A (en) Blood vessel segmentation method and device
CN111369564B (en) Image processing method, model training method and model training device
CN111951281A (en) Image segmentation method, device, equipment and storage medium
CN112164002A (en) Training method and device for face correction model, electronic equipment and storage medium
CN114863225B (en) Image processing model training method, image processing model generation device, image processing model equipment and image processing model medium
CN110570394A (en) medical image segmentation method, device, equipment and storage medium
CN115115772A (en) Key structure reconstruction method and device based on three-dimensional image and computer equipment
CN113392791A (en) Skin prediction processing method, device, equipment and storage medium
CN115375548A (en) Super-resolution remote sensing image generation method, system, equipment and medium
CN113850796A (en) Lung disease identification method and device based on CT data, medium and electronic equipment
CN110008922A (en) Image processing method, unit, medium for terminal device
CN113724185A (en) Model processing method and device for image classification and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant