WO2020133124A1 - Method for acquiring medical sagittal images, neural network training method, and computer device - Google Patents

Method for acquiring medical sagittal images, neural network training method, and computer device

Info

Publication number
WO2020133124A1
WO2020133124A1 (PCT/CN2018/124565)
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
sagittal
image
dimensional
medical image
Prior art date
Application number
PCT/CN2018/124565
Other languages
English (en)
French (fr)
Inventor
孙永年
蔡佩颖
谢佳茹
黄诗婷
黄榆涵
Original Assignee
孙永年
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 孙永年 filed Critical 孙永年
Priority to PCT/CN2018/124565 priority Critical patent/WO2020133124A1/zh
Publication of WO2020133124A1 publication Critical patent/WO2020133124A1/zh

Links

Images

Classifications

    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B8/00Diagnosis using ultrasonic, sonic or infrasonic waves
    • A61B8/08Detecting organic movements or changes, e.g. tumours, cysts, swellings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Definitions

  • the invention relates to a neural network and a computer device, and in particular to a neural network and a computer device for medical sagittal images.
  • Medical examination equipment includes ultrasound equipment, computed tomography scanners, and other devices. Taking prenatal fetal growth examination as an example, ultrasound plays an extremely important role in prenatal diagnosis. Physicians can measure the fetal nuchal translucency and growth parameters in first-trimester ultrasound images as a screening test for early Down syndrome, fetal genetic defects, or hypoplasia. However, fetal ultrasound images usually suffer from excessive noise, blurred boundaries, and other shortcomings; moreover, in first-trimester images the fetus is not yet fully developed and is attached to the endometrium, which makes its boundaries indistinct, so measurement and evaluation still rely largely on the experience of professional clinical staff and are prone to human error.
  • In addition, the nuchal translucency thickness needs to be measured on the mid-sagittal plane (MSP) of the fetus, but finding this correct observation plane in an ultrasound image is a very time-consuming and difficult task.
  • In view of the above, an object of the present invention is to provide a neural network capable of finding the sagittal plane, as well as related methods and computer devices.
  • a neural network training method for obtaining medical sagittal images includes: applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask; obtaining an expected result based on the three-dimensional medical image and the expected sagittal mask; obtaining a ground-truth result based on the three-dimensional medical image and a ground-truth sagittal mask; applying a second neural network to the expected result and the ground-truth result; generating loss function data according to the output of the second neural network; and adjusting the parameters of the first neural network or the second neural network according to the loss function data.
  • the three-dimensional medical image is an ultrasound image, a computed tomography image, a panoramic radiography image, or a magnetic resonance image.
  • the expected result is generated by a combining operation on the three-dimensional medical image and the expected sagittal mask,
  • and the ground-truth result is generated by a combining operation on the three-dimensional medical image and the ground-truth sagittal mask.
  • the first neural network is a convolutional neural network, which includes multiple convolutional layers, a flat layer, a reshaping layer, and multiple deconvolution layers.
  • the second neural network is a convolutional neural network, which includes multiple convolutional layers and a flat layer.
  • the first neural network and the second neural network form a generative adversarial network.
  • the loss function data includes first loss function data and second loss function data.
  • the first loss function data is used to adjust the first neural network; the second loss function data is used to adjust the second neural network.
  • the training method further includes creating a three-dimensional medical image from multiple two-dimensional medical images.
  • the training method further includes: generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
  • a method for obtaining medical sagittal images includes: using a first neural network on a three-dimensional medical image to generate an expected sagittal mask; and generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
  • the step of generating the two-dimensional sagittal image includes: generating sagittal plane description data according to the expected sagittal mask; and performing a coordinate transformation on the three-dimensional medical image according to the sagittal plane description data to generate the two-dimensional sagittal image.
  • the three-dimensional medical image is an ultrasound image, a computed tomography image, a panoramic radiograph image, or a magnetic resonance image.
  • the first neural network is a convolutional neural network, which includes multiple convolutional layers, a flat layer, a reshaping layer, and multiple deconvolution layers.
  • the obtaining method further includes creating a three-dimensional medical image from multiple two-dimensional medical images.
  • a computer device performs the method described above.
  • the computer device includes a processing core and a storage element.
  • the storage element stores the program code of the above method.
  • the processing core is coupled to the storage element and executes the program code to perform the above method.
  • the neural network is used to detect the sagittal plane, for example to detect the fetal mid-sagittal plane from ultrasound images.
  • the neural network can be regarded as a filter that learns the feature points in medical images and their positions in three-dimensional space and generates a three-dimensional mask containing plane position information; after post-processing conversion, the mid-sagittal plane image is finally obtained.
  • FIG. 1 is a block diagram of a system for processing medical sagittal images according to an embodiment.
  • FIG. 2A is a block diagram of an embodiment of a neural network training method for acquiring medical sagittal images.
  • FIG. 2B is a block diagram of a combined operation according to an embodiment.
  • FIG. 2C is a block diagram of 3D medical image generation according to an embodiment.
  • FIG. 3 is a schematic diagram of a first neural network according to an embodiment.
  • FIG. 4 is a schematic diagram of a second neural network according to an embodiment.
  • FIG. 5 is a block diagram of an embodiment of a method for acquiring medical sagittal images.
  • FIG. 6 is a schematic diagram of image coordinate transformation according to an embodiment.
  • FIG. 7A is a schematic diagram of the sagittal plane of one embodiment.
  • FIGS. 7B to 7D show ultrasound images of medical sagittal images.
  • FIG. 1 is a block diagram of a system for processing medical sagittal images according to an embodiment.
  • the system includes a medical imaging device 1, a computer device 2, and an output device 3.
  • the computer device 2 includes a processing core 21, a storage element 22, and a plurality of input/output interfaces 23, 24.
  • the processing core 21 is coupled to the storage element 22 and the input/output interface 23, 24.
  • the input/output interface 23 can receive the medical image 11 generated by the medical imaging device 1, the input/output interface 24 communicates with the output device 3, and the computer device 2 can output the processing result to the output device 3 through the input/output interface 24.
  • the storage element 22 stores the program code for the processing core 21 to execute.
  • the storage element 22 includes a non-volatile memory and a volatile memory.
  • the non-volatile memory is, for example, a hard disk, a flash memory, a solid state disk, an optical disk, or the like.
  • Volatile memory is, for example, dynamic random access memory, static random access memory, or the like.
  • the program code is stored in a non-volatile memory, and the processing core 21 may load the program code from the non-volatile memory to the volatile memory, and then execute the program code.
  • the processing core 21 is, for example, a processor, a controller, etc.
  • the processor includes one or more cores.
  • the processor may be a central processor or a graphics processor, and the processing core 21 may also be a processor or a graphics processor core.
  • the processing core 21 may also be a processing module, and the processing module includes multiple processors, for example, including a central processor and a graphics processor.
  • the medical imaging device 1 can generate medical images 11 and is, for example, ultrasound examination equipment, a computed tomography scanner, panoramic X-ray radiography equipment, or magnetic resonance equipment.
  • the medical image 11 generated by the medical imaging device 1 can be first transmitted to the storage medium, and then input from the storage medium to the input/output interface 23.
  • the input/output interface 23 is, for example, a peripheral transmission port, and the storage medium is, for example, a non-volatile memory.
  • the medical imaging device 1 may be connected to the input/output interface 23 in a wired or wireless manner, and the medical image 11 is transmitted from the medical imaging device 1 to the input/output interface 23 through the connection method.
  • the input/output interface 23 is, for example, a communication port.
  • the computer device 2 can perform a neural network training method for obtaining a medical sagittal image.
  • the storage element 22 stores the program code, model and trained parameters of the training method.
  • the processing core 21 executes these program codes to perform the training method.
  • the training method includes: using the first neural network on the 3D medical image to generate the expected sagittal mask; obtaining the expected result based on the 3D medical image and the expected sagittal mask; obtaining the ground-truth result based on the 3D medical image and the ground-truth sagittal mask; using the second neural network on the expected result and the ground-truth result; generating loss function data according to the output of the second neural network; and adjusting the parameters of the first neural network or the second neural network according to the loss function data.
  • the computer device 2 can perform a method of acquiring a medical sagittal image.
  • the storage element 22 stores the relevant program code, model and parameters of the acquisition method.
  • the processing core 21 executes these program codes to perform the acquisition method.
  • the acquisition method includes: using the first neural network on the three-dimensional medical image to generate an expected sagittal mask; and generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
  • the output device 3 is a device capable of outputting images, such as a display, a projector, a printer, and so on.
  • the computer device 2 can obtain the medical sagittal image by outputting the generated two-dimensional sagittal image to the output device 3.
  • the first neural network is a neural network trained by the neural network training method for obtaining medical sagittal images, and it treats sagittal detection as filtering: the sagittal plane is filtered out of the three-dimensional medical image and a three-dimensional binary mask is generated, and the filtered information retains not only the required features on the sagittal plane but also their position information.
  • using the trained first neural network, a plane can be found in the spatial volume of the three-dimensional medical image that accurately cuts the target in the image into two halves while still exhibiting the required features.
  • compared with the intuitive approach of enumerating all candidate slices and classifying them, the present method overcomes that approach's heavy time cost and inefficiency, and also overcomes the distortion of the true three-dimensional position that results from judging only in two-dimensional image terms.
  • FIG. 2A is a block diagram of an embodiment of a neural network training method for acquiring medical sagittal images.
  • the actual program code and data of the first neural network 41, the expected result generation 42, the ground-truth result generation 43, the second neural network 44, and the loss function calculation 45 can be stored in the storage element 22 of FIG. 1 and provided to the processing core 21 for execution and processing;
  • the three-dimensional medical image 51, the expected sagittal mask 52, the expected result 53, the ground-truth sagittal mask 54, the ground-truth result 55, and the loss function data 56, 57 may be stored in or loaded into the storage element 22 to be processed by the processing core 21.
  • the first neural network 41 is applied to the three-dimensional medical image 51 to generate the expected sagittal mask 52.
  • the expected result generation 42 obtains the expected result 53 based on the three-dimensional medical image 51 and the expected sagittal mask 52.
  • the ground-truth result generation 43 obtains the ground-truth result 55 based on the three-dimensional medical image 51 and the ground-truth sagittal mask 54.
  • the second neural network 44 is applied to the expected result 53 and the ground-truth result 55 to produce an output.
  • the loss function calculation 45 uses a loss function on the output of the second neural network 44 to produce the loss function data 56, 57, according to which the parameters of the first neural network 41 and the second neural network 44 are then adjusted.
  • the training method can use deep learning to automatically detect the sagittal plane.
  • for example, the first neural network 41 and the second neural network 44 can be the two sub-networks of a generative adversarial network, with the first neural network 41 acting as the generator and the second neural network 44 acting as the critic.
  • the output loss of the critic can be used to adjust or optimize the generator and the critic respectively.
  • the three-dimensional medical image 51 is used as an input image.
  • the three-dimensional medical image 51 is an ultrasound image, a computed tomography image, a panoramic radiography image, a magnetic resonance image, or the like.
  • the ultrasound image is, for example, a whole-body ultrasound image or a local ultrasound image
  • the local ultrasound image is, for example, an ultrasound image of a head, an ultrasound image of a neck, an ultrasound image of a head and neck, or an ultrasound image of another part, or the like.
  • the three-dimensional medical image 51 may be the medical image 11 generated by the medical imaging device 1 of FIG. 1,
  • i.e., the medical image 11 itself is a three-dimensional image; alternatively, the three-dimensional medical image 51 is generated from the medical image 11, for example when the medical image 11 is a plurality of two-dimensional images that represent the target object at different cross-sections or different coordinate planes, and the three-dimensional medical image 51 is created from these two-dimensional images.
  • the first neural network 41 takes the three-dimensional medical image 51 as input and processes it, and the processing result is used as the expected sagittal mask 52.
  • the first neural network 41 is designed as a filter that learns the feature points in the three-dimensional medical image 51 and their positions in three-dimensional space, and generates a three-dimensional expected sagittal mask 52 containing plane position information. The dimensions and scale of the expected sagittal mask 52 may be the same as those of the three-dimensional medical image 51.
  • the first neural network 41 can take a cropped volume from the three-dimensional medical image 51 as input and output a three-dimensional mask as an expected sagittal mask 52.
  • the three-dimensional mask is, for example, a three-dimensional binary mask (3D binary mask).
  • the input and output of the first neural network 41 have the same dimensions and scale.
  • the position information of the sagittal plane is embedded in the three-dimensional mask: if a voxel lies on the sagittal plane, the corresponding mask value is 1; otherwise, if the voxel is excluded from the sagittal plane, the corresponding mask value is 0.
  • the expected result generation 42 is to generate the expected result 53 based on the three-dimensional medical image 51 and the expected sagittal mask 52.
  • the generation method is, for example, a combination operation.
  • the ground-truth result generation 43 produces the ground-truth result 55 based on the three-dimensional medical image 51 and the ground-truth sagittal mask 54.
  • the generation method is, for example, a combining operation, and the ground-truth sagittal mask 54 may have the same dimensions and scale as the three-dimensional medical image 51.
  • the three-dimensional medical image 51, the expected sagittal mask 52, and the ground-truth sagittal mask 54 may have the same dimensions and scale.
  • the combined operation is shown in FIG. 2B.
  • the expected result generation 42 and the ground-truth result generation 43 can use the same combining operation, and the program code of the combining operation can be shared by the expected result generation 42 and the ground-truth result generation 43.
  • the combining operation includes a multiplication 46 and a concatenation 47.
  • before the data are fed into the second neural network 44, the three-dimensional mask and the three-dimensional medical image 51 first undergo the combining operation, which includes element-wise multiplication of the matrices, because the intensity plane must be obtained from the original three-dimensional medical image 51.
  • the result of the multiplication 46 is then concatenated with the three-dimensional medical image 51, and the output of the concatenation 47 is two-channel data sent to the second neural network 44.
  • in the expected result generation 42, the three-dimensional mask is the expected sagittal mask 52, and the output of the concatenation 47 serves as the expected result 53.
  • in the ground-truth result generation 43, the three-dimensional mask is the ground-truth sagittal mask 54, and the output of the concatenation 47 serves as the ground-truth result 55.
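  • For illustration, the combining operation can be sketched in NumPy as follows (a minimal sketch; the function and array names are illustrative, not from the patent):

        import numpy as np

        def combine(volume, mask):
            # Element-wise multiplication (46) extracts the intensity plane
            # from the original three-dimensional medical image.
            intensity_plane = volume * mask
            # Concatenation (47) with the original volume yields the
            # two-channel data that is sent to the second neural network.
            return np.stack([intensity_plane, volume], axis=0)

        # volume, mask: e.g. 80x80x80 arrays -> output shape (2, 80, 80, 80)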
  • the second neural network 44 takes the expected result 53 and the ground-truth result 55 as input and processes them, and the processing result is output to the loss function calculation 45.
  • the loss function data 56, 57 include the first loss function data 56 and the second loss function data 57.
  • the first loss function data 56 are used to adjust the first neural network 41,
  • and the second loss function data 57 are used to adjust the second neural network 44.
  • the filter weights of the first neural network 41 and the second neural network 44 are trained using the loss function data 56, 57.
  • the loss function can be WGAN-GP or a modified version based on WGAN-GP, and the loss function data 56, 57 can be generated using the following formulas.
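  • Written out (see the corresponding formulas in the description below), the losses are:

        L_ce = w·E[-y·log(x) - (1-y)·log(1-x)]
        L_G  = -(1-w)·E[C(x')] + L_ce
        L_C  = E[C(x')] - E[C(y')] + λ·E[(‖∇C(x̂')‖₂ - 1)²]

    where x and y are the expected and ground-truth sagittal masks, x' and y' the expected and ground-truth results, x̂' = αx' + (1-α)y' with a random weight α ∈ (0, 1), C the critic, E the expectation, λ the gradient-penalty weight, and w the weight trading off the cross-entropy and adversarial losses.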
  • FIG. 2C is a block diagram of 3D medical image generation according to an embodiment.
  • a plurality of 2D medical images 58 may be used to create the 3D medical image 51 of FIG. 2A.
  • the two-dimensional medical image 58 is a two-dimensional ultrasonic image
  • the three-dimensional medical image 51 is a three-dimensional ultrasonic image.
  • the actual program code and data of the three-dimensional medical image generation 48 can also be stored in the storage element 22 of FIG. 1 and provided to the processing core 21 for execution and processing.
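  • As a minimal sketch (assuming equally spaced parallel slices; the names are illustrative), the three-dimensional medical image generation 48 can be as simple as stacking the two-dimensional images along a new axis:

        import numpy as np

        # slices: the 2D medical images 58 (placeholder data for illustration)
        slices = [np.zeros((80, 80)) for _ in range(80)]
        volume = np.stack(slices, axis=-1)   # 80x80x80 three-dimensional medical image 51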
  • FIG. 3 is a schematic diagram of a first neural network according to an embodiment.
  • the first neural network 41 is, for example, a convolutional neural network, which includes a plurality of convolutional layers 411, a flat layer 412, a reshaping layer 414, and a plurality of deconvolution layers 415.
  • the first half of the first neural network 41 can be regarded as an encoder, and the second half can be regarded as a decoder.
  • for example, the convolutional layers 411 and the flat layer 412 can be regarded as the encoder,
  • and the reshaping layer 414 and the deconvolution layers 415 can be regarded as the decoder.
  • the final output of the deconvolution layer 415 serves as the expected sagittal mask 52.
  • each convolutional layer 411 uses a kernel of the same size and a stride of the same size, for example a kernel of size 3×3×3 and a stride of size 1×1×1.
  • the kernel can also be called a filter.
  • the thickness of the feature map, i.e., the number of channels, increases gradually with the depth of the convolutional layers 411; for example, after the first convolutional layer 411a, the channel count doubles with each subsequent convolutional layer 411.
  • in FIG. 3, the number of input channels of the first convolutional layer 411a is 1; after the output of the first convolutional layer 411a, through the first to fourth convolutional layers 411a to 411d,
  • the number of channels gradually increases from 4 to 8, 16, and 32.
  • the size of the data volume changes from 80³ to 40³, 20³, 10³, and 5³ in sequence.
  • the convolutional layers 411 may also use kernels of different sizes, or only some of the convolutional layers 411 may use kernels of the same size; not all convolutional layers 411 need to use kernels of the same size.
  • the convolutional layers 411 may also use strides of different sizes, or only some of the convolutional layers 411 may use strides of the same size; not all convolutional layers 411 need to use strides of the same size.
  • the number of channels can also vary in other ways and is not limited to doubling.
  • the output of a convolutional layer 411 can be further processed before entering the next convolutional layer 411 or the flat layer 412, for example by a rectified linear unit layer (ReLU layer) or a pooling layer; the rectification layer uses, for example, a leaky rectification function (leaky ReLU), and the pooling layer is, for example, a max pooling layer.
  • for example, the output of each convolutional layer 411 can be processed by a rectification layer and a max pooling layer before entering the next convolutional layer 411, and the last convolutional layer 411 is output to the flat layer 412; each rectification layer uses a leaky rectification function, each pooling layer is a max pooling layer, and the max pooling layers use a 2×2×2 kernel with a 2×2×2 stride.
  • alternatively, only some of the outputs of the convolutional layers 411 may be further processed before entering the next convolutional layer 411 or the flat layer 412; it is not necessary for every convolutional layer's output to pass through a rectification layer or a pooling layer.
  • the flat layer 412 may be followed by two fully connected layers 413 and then by the reshaping layer 414.
  • the fully connected layers 413 may also be followed by a rectification layer, which uses, for example, a leaky rectification function.
  • the reshaping layer 414 is connected to the deconvolution layers 415.
  • the data sizes from the flat layer 412 through the reshaping layer 414 are 4000, 500, and 4000.
  • each deconvolution layer 415 uses a kernel of the same size and a stride of the same size, for example a kernel of size 3×3×3 and a stride of size 2×2×2.
  • the number of channels decreases gradually with the depth of the deconvolution layers 415, for example halving with each deconvolution layer 415 until the last layer.
  • the number of input channels of the first deconvolution layer 415a is 32; after the output of the first deconvolution layer 415a, through the first to fourth deconvolution layers 415a to 415d,
  • the number of channels gradually decreases from 32 to 16, 8, and 4,
  • and the number of output channels of the last deconvolution layer 415d is 1.
  • the size of the data volume changes from 5³ to 10³, 20³, 40³, and 80³ in sequence.
  • the deconvolution layers 415 may also use kernels of different sizes, or only some of the deconvolution layers 415 may use kernels of the same size; not all deconvolution layers 415 need to use kernels of the same size.
  • the deconvolution layers 415 may also use strides of different sizes, or only some of the deconvolution layers 415 may use strides of the same size; not all deconvolution layers 415 need to use strides of the same size.
  • the number of channels can also vary in other ways and is not limited to halving.
  • the output of a deconvolution layer 415 can be further processed before entering the next deconvolution layer 415, for example by a rectified linear unit layer (ReLU layer).
  • the rectification layer uses, for example, a leaky rectification function (leaky ReLU).
  • for example, except for the last layer, the output of each deconvolution layer 415 can be processed by a rectification layer before entering the next deconvolution layer 415, while the last deconvolution layer 415d is followed by a sigmoid layer; each rectification layer uses a leaky rectification function.
  • alternatively, only some of the outputs of the deconvolution layers 415 may be further processed before entering the next deconvolution layer 415; it is not necessary for every deconvolution layer's output to pass through a rectification layer.
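  • For illustration only, the encoder-decoder just described can be sketched in PyTorch; the channel counts (1→4→8→16→32→16→8→4→1), the 80³ input, the kernel sizes, and the strides follow the figure, while the padding and output-padding values are assumptions chosen so the data sizes match (80³ → 5³ → 80³):

        import torch
        import torch.nn as nn

        class Generator(nn.Module):
            # Sketch of the first neural network 41: a 3D encoder-decoder.
            def __init__(self):
                super().__init__()
                enc, in_ch = [], 1
                for out_ch in (4, 8, 16, 32):                    # conv layers 411a-411d
                    enc += [nn.Conv3d(in_ch, out_ch, 3, stride=1, padding=1),
                            nn.LeakyReLU(0.2),
                            nn.MaxPool3d(2, stride=2)]           # 80^3 -> 40^3 -> ... -> 5^3
                    in_ch = out_ch
                self.encoder = nn.Sequential(*enc, nn.Flatten()) # flat layer 412: 32*5^3 = 4000
                self.fc = nn.Sequential(nn.Linear(4000, 500), nn.LeakyReLU(0.2),
                                        nn.Linear(500, 4000), nn.LeakyReLU(0.2))  # layers 413
                dec, in_ch = [], 32
                for out_ch in (16, 8, 4):                        # deconv layers 415a-415c
                    dec += [nn.ConvTranspose3d(in_ch, out_ch, 3, stride=2,
                                               padding=1, output_padding=1),
                            nn.LeakyReLU(0.2)]
                    in_ch = out_ch
                dec += [nn.ConvTranspose3d(4, 1, 3, stride=2, padding=1, output_padding=1),
                        nn.Sigmoid()]                            # last layer 415d + sigmoid layer
                self.decoder = nn.Sequential(*dec)

            def forward(self, volume):
                x = self.fc(self.encoder(volume))
                x = x.view(-1, 32, 5, 5, 5)                      # reshaping layer 414
                return self.decoder(x)                           # expected sagittal mask 52

        mask = Generator()(torch.randn(1, 1, 80, 80, 80))        # shape (1, 1, 80, 80, 80)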
  • FIG. 4 is a schematic diagram of a second neural network according to an embodiment.
  • the architecture of the second neural network is similar to the encoder portion of the first neural network.
  • the second neural network 44 is, for example, a convolutional neural network, which includes a plurality of convolutional layers 441 and a flat layer 442.
  • the output of the flat layer 442 is provided to the loss function for calculation.
  • each convolutional layer 441 uses a kernel of the same size and a stride of the same size, for example a kernel of size 3×3×3 and a stride of size 1×1×1.
  • the number of channels increases gradually with the depth of the convolutional layers 441; for example, after the first convolutional layer 441a, the channel count doubles with each subsequent convolutional layer 441. In FIG. 4, the number of input channels of the first convolutional layer 441a is 2; after the output of the first convolutional layer 441a, through the first to fourth convolutional layers 441a to 441d, the number of channels gradually increases from 4 to 8, 16, and 32.
  • the size of the data volume changes from 80³ to 40³, 20³, 10³, and 5³ in sequence.
  • the convolutional layers 441 may use kernels of different sizes, or only some of the convolutional layers 441 may use kernels of the same size; not all convolutional layers 441 need to use kernels of the same size.
  • the convolutional layers 441 may also use strides of different sizes, or only some of the convolutional layers 441 may use strides of the same size; not all convolutional layers 441 need to use strides of the same size.
  • the number of channels can also vary in other ways and is not limited to doubling.
  • the output of a convolutional layer 441 can be further processed before entering the next convolutional layer 441 or the flat layer 442.
  • the further processing is, for example, a rectification layer, a sigmoid layer, or a pooling layer.
  • the pooling layer is, for example, a max pooling layer.
  • for example, the output of each convolutional layer 441a to 441c can be processed by a rectification layer and a max pooling layer before entering the next convolutional layer 441b to 441d, while the last convolutional layer 441d is processed by a sigmoid layer and a max pooling layer and then output to the flat layer 442.
  • each rectification layer uses a leaky rectification function, each pooling layer is a max pooling layer, and the max pooling layers use a 2×2×2 kernel with a 2×2×2 stride.
  • alternatively, only some of the outputs of the convolutional layers 441 may be further processed before entering the next convolutional layer 441 or the flat layer 442; it is not necessary for every convolutional layer's output to pass through a rectification layer, a sigmoid layer, or a pooling layer.
  • the final output of the second neural network 44 may be not a single value but a latent vector, which represents the distribution of real or fake masks. In FIG. 4, the data size after the flat layer 442 is 4000.
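  • Continuing the illustrative PyTorch sketch above (same assumptions), the second neural network mirrors the encoder, with a two-channel input and a 4000-dimensional latent vector as output:

        import torch
        import torch.nn as nn

        class Critic(nn.Module):
            # Sketch of the second neural network 44: the GAN critic.
            def __init__(self):
                super().__init__()
                layers, in_ch = [], 2                         # two-channel combined input
                for i, out_ch in enumerate((4, 8, 16, 32)):   # conv layers 441a-441d
                    layers += [nn.Conv3d(in_ch, out_ch, 3, stride=1, padding=1),
                               nn.Sigmoid() if i == 3 else nn.LeakyReLU(0.2),
                               nn.MaxPool3d(2, stride=2)]     # 80^3 -> ... -> 5^3
                    in_ch = out_ch
                self.net = nn.Sequential(*layers, nn.Flatten())  # flat layer 442 -> 4000

            def forward(self, pair):
                return self.net(pair)                         # latent vector, not a scalar

        latent = Critic()(torch.randn(1, 2, 80, 80, 80))      # shape (1, 4000)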
  • FIG. 5 is a block diagram of an embodiment of a method for acquiring medical sagittal images.
  • the first neural network 41 can be used to obtain medical sagittal images after being trained by the aforementioned method.
  • the actual program code and data of the first neural network 41 and the two-dimensional sagittal image generation 49 can be stored in the storage element 22 of FIG. 1 and provided to the processing core 21 for execution and processing; the three-dimensional medical image 51, the expected sagittal mask 52, and the two-dimensional sagittal image 59 can be stored in or loaded into the storage element 22 for processing by the processing core 21.
  • the first neural network 41 is applied to the three-dimensional medical image 51 to generate the expected sagittal mask 52.
  • in the two-dimensional sagittal image generation 49, the two-dimensional sagittal image 59 is generated based on the expected sagittal mask 52 and the three-dimensional medical image 51.
  • the step of generating the two-dimensional sagittal image includes: generating sagittal plane description data according to the expected sagittal mask 52; and performing a coordinate transformation on the three-dimensional medical image 51 according to the sagittal plane description data to generate the two-dimensional sagittal image 59.
  • the sagittal plane description data is, for example, a plane expression in three-dimensional space.
  • FIG. 6 is a schematic diagram of image coordinate transformation according to an embodiment.
  • in FIG. 6, the symbol I denotes the initial sagittal plane, whose plane expression is z = 0; the normal vector P of the initial sagittal plane I is (0, 0, 1).
  • a RANdom SAmple Consensus (RANSAC) algorithm is applied to the non-zero voxels of the expected sagittal mask 52, and a result plane E can then be estimated, with plane expression ax + by + cz + d = 0 and normal vector Q = (a, b, c), where a² + b² + c² = 1.
  • the symbol M represents the transformation that maps each pixel between the initial sagittal plane I and the result plane E: the coordinate position of each pixel (p, q) of the initial sagittal plane I in two-dimensional image coordinates, i.e., of the voxel (p, q, 0) of the initial sagittal plane I in three-dimensional image coordinates,
  • is transformed to the coordinate position of a voxel (i, j, k) on the result plane E; the intensity value at the coordinate position of the voxel (i, j, k) is mapped to the coordinate position of the corresponding pixel (p, q) of the result plane E, and the result plane E then forms the final two-dimensional image, which can be recorded in a two-dimensional manner. The transformation M consists of a rotation matrix R and a translation matrix T and can be expressed as M = TR.
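  • A hedged sketch of this plane estimation in NumPy (the helper name, iteration count, and tolerance are illustrative; the patent only specifies that RANSAC is applied to the non-zero mask voxels):

        import numpy as np

        def ransac_plane(voxels, iters=200, tol=1.0):
            # Fit ax + by + cz + d = 0 to the non-zero mask voxels (N x 3 array):
            # sample 3 voxels, build a plane, and keep the one with the most inliers.
            best, best_inliers = None, -1
            rng = np.random.default_rng(0)
            for _ in range(iters):
                p0, p1, p2 = voxels[rng.choice(len(voxels), 3, replace=False)]
                n = np.cross(p1 - p0, p2 - p0)
                if np.linalg.norm(n) < 1e-9:        # degenerate (collinear) sample
                    continue
                n = n / np.linalg.norm(n)           # unit normal Q = (a, b, c)
                d = -np.dot(n, p0)
                inliers = np.sum(np.abs(voxels @ n + d) < tol)
                if inliers > best_inliers:
                    best, best_inliers = (n, d), inliers
            return best                             # ((a, b, c), d), a^2 + b^2 + c^2 = 1

        Q, d = ransac_plane(np.argwhere(mask > 0))  # mask: the expected sagittal mask 52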
  • the rotation angle θ and the rotation axis u can be obtained from the inner and outer products of the normal vectors P and Q: θ = cos⁻¹(P·Q) and u = (P×Q)/‖P×Q‖. According to Rodrigues' rotation formula, the rotation matrix R about the axis u = (u_x, u_y, u_z) by the angle θ is R = I + (sin θ)K + (1 − cos θ)K², where K is the cross-product matrix of u and ‖K‖₂ = 1.
  • the parameter d of the translation matrix T is the displacement (offset) from the original position along the unit normal vector: starting from the initial point (x, y, z), the new point (x', y', z') is reached by moving by the displacement d along the normal vector Q.
  • the translation matrix T can therefore be derived as the homogeneous translation by d·Q, as sketched below.
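  • A minimal NumPy sketch of the whole transformation M = TR (the sign of the translation is an assumption tied to the plane expression ax + by + cz + d = 0; Q and d are assumed to come from the RANSAC fit above):

        import numpy as np

        def plane_transform(Q, d):
            # Rotation from P = (0, 0, 1) to the unit normal Q via Rodrigues' formula.
            P = np.array([0.0, 0.0, 1.0])
            theta = np.arccos(np.clip(np.dot(P, Q), -1.0, 1.0))   # inner product
            axis = np.cross(P, Q)                                 # outer product
            if np.linalg.norm(axis) < 1e-9:
                R3 = np.eye(3)                                    # P already parallel to Q
            else:
                u = axis / np.linalg.norm(axis)
                K = np.array([[0, -u[2], u[1]],
                              [u[2], 0, -u[0]],
                              [-u[1], u[0], 0]])                  # cross-product matrix of u
                R3 = np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)
            M = np.eye(4)
            M[:3, :3] = R3
            M[:3, 3] = -d * Q    # translation T by the displacement d along the unit normal
            return M

        # Map pixel (p, q) of the initial plane I, i.e. voxel (p, q, 0), into the volume:
        M = plane_transform(np.array([0.6, 0.0, 0.8]), d=-12.0)
        i, j, k, _ = M @ np.array([10.0, 20.0, 0.0, 1.0])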
  • FIG. 7A is a schematic diagram of a sagittal plane according to an embodiment.
  • Mark 61 represents a sagittal plane
  • mark 62 represents a median plane
  • mark 63 represents a coronal plane
  • mark 64 represents a horizontal (transverse) plane.
  • FIGS. 7B to 7D show ultrasound images of medical sagittal images.
  • Ultrasound scanning has the advantages of low cost, real-time, and non-invasiveness, which can be used for prenatal inspection.
  • the obstetrician and gynecologist will measure the growth parameters of the fetus on the mid-sagittal plane (MSP), such as the nuchal translucency thickness (NTT), the maxillary length, and the fronto-maxillary facial angle (FMF angle), to assess whether the fetus has chromosomal abnormalities or developmental delay.
  • using the aforementioned acquisition method, the mid-sagittal plane of a fetal three-dimensional ultrasound image can be automatically detected and a two-dimensional sagittal image can be generated.
  • the measurement of the nuchal translucency thickness is shown in FIG. 7B,
  • the measurement of the maxillary length is shown in FIG. 7C,
  • and the measurement of the fronto-maxillary angle is shown in FIG. 7D.
  • the neural network training method, the medical sagittal image acquisition method, and the computer device of the present disclosure use a neural network to detect the sagittal plane, for example to detect the fetal mid-sagittal plane from ultrasound images.
  • the neural network can be regarded as a filter that learns the feature points in medical images and their positions in three-dimensional space and generates a three-dimensional mask containing plane position information; after post-processing conversion, the mid-sagittal plane image is finally obtained.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Pathology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Radiology & Medical Imaging (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Ultra Sonic Daignosis Equipment (AREA)

Abstract

A training method for a neural network that obtains medical sagittal images includes: applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask; obtaining an expected result based on the three-dimensional medical image and the expected sagittal mask; obtaining a ground-truth result based on the three-dimensional medical image and a ground-truth sagittal mask; applying a second neural network to the expected result and the ground-truth result; generating loss function data according to the output of the second neural network; and adjusting the parameters of the first neural network or the second neural network according to the loss function data.

Description

Method for acquiring medical sagittal images, neural network training method, and computer device
Technical Field
The present invention relates to a neural network and a computer device, and in particular to a neural network and a computer device for medical sagittal images.
Background Art
Medical examination equipment includes ultrasound examination equipment, computed tomography scanners, and other devices. Taking prenatal fetal growth examination as an example, ultrasound plays an extremely important role in prenatal diagnosis. Physicians can measure the fetal nuchal translucency and growth parameters in first-trimester ultrasound images as a screening test for early Down syndrome, fetal genetic defects, or hypoplasia. However, fetal ultrasound images usually suffer from excessive noise, blurred boundaries, and other shortcomings; moreover, in first-trimester images the fetus is not yet fully developed and is attached to the endometrium, making its boundaries indistinct, so current ultrasound measurement and evaluation still rely largely on the experience of professional clinical staff and are prone to human error. In addition, the nuchal translucency thickness must be measured on the fetal mid-sagittal plane (Middle Sagittal Plane, MSP), and finding this correct observation plane in an ultrasound image is a very time-consuming and difficult task.
Therefore, how to provide a neural network and a computer device capable of finding the sagittal plane has become an important issue.
Summary of the Invention
In view of the above, an object of the present invention is to provide a neural network capable of finding the sagittal plane, as well as related methods and computer devices.
A training method for a neural network that obtains medical sagittal images includes: applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask; obtaining an expected result based on the three-dimensional medical image and the expected sagittal mask; obtaining a ground-truth result based on the three-dimensional medical image and a ground-truth sagittal mask; applying a second neural network to the expected result and the ground-truth result; generating loss function data according to the output of the second neural network; and adjusting the parameters of the first neural network or the second neural network according to the loss function data.
In one embodiment, the three-dimensional medical image is an ultrasound image, a computed tomography image, a panoramic radiography image, or a magnetic resonance image.
In one embodiment, the expected result is generated by a combining operation on the three-dimensional medical image and the expected sagittal mask, and the ground-truth result is generated by a combining operation on the three-dimensional medical image and the ground-truth sagittal mask.
In one embodiment, the first neural network is a convolutional neural network that includes a plurality of convolutional layers, a flat layer, a reshaping layer, and a plurality of deconvolution layers.
In one embodiment, the second neural network is a convolutional neural network that includes a plurality of convolutional layers and a flat layer.
In one embodiment, the first neural network and the second neural network form a generative adversarial network.
In one embodiment, the loss function data include first loss function data and second loss function data; the first loss function data are used to adjust the first neural network, and the second loss function data are used to adjust the second neural network.
In one embodiment, the training method further includes creating the three-dimensional medical image from a plurality of two-dimensional medical images.
In one embodiment, the training method further includes: generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
A method for obtaining medical sagittal images includes: applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask; and generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
In one embodiment of the method for obtaining medical sagittal images, the step of generating the two-dimensional sagittal image includes: generating sagittal plane description data according to the expected sagittal mask; and performing a coordinate transformation on the three-dimensional medical image according to the sagittal plane description data to generate the two-dimensional sagittal image.
In one embodiment of the method for obtaining medical sagittal images, the three-dimensional medical image is an ultrasound image, a computed tomography image, a panoramic radiography image, or a magnetic resonance image.
In one embodiment of the method for obtaining medical sagittal images, the first neural network is a convolutional neural network that includes a plurality of convolutional layers, a flat layer, a reshaping layer, and a plurality of deconvolution layers.
In one embodiment of the method for obtaining medical sagittal images, the method further includes creating the three-dimensional medical image from a plurality of two-dimensional medical images.
A computer device performs the method described above.
In one embodiment, the computer device includes a processing core and a storage element; the storage element stores the program code of the above method, and the processing core is coupled to the storage element and executes the program code to perform the above method.
As described above, in the training method for a neural network that obtains medical sagittal images, the method for obtaining medical sagittal images, and the computer device of the present disclosure, a neural network is used to detect the sagittal plane, for example to detect the fetal mid-sagittal plane from ultrasound images. The neural network can be regarded as a filter that learns the feature points in medical images and their positions in three-dimensional space and generates a three-dimensional mask containing plane position information; after post-processing conversion, the mid-sagittal plane image is finally obtained.
Brief Description of the Drawings
FIG. 1 is a block diagram of a system for processing medical sagittal images according to an embodiment.
FIG. 2A is a block diagram of a training method for a neural network that obtains medical sagittal images according to an embodiment.
FIG. 2B is a block diagram of a combining operation according to an embodiment.
FIG. 2C is a block diagram of three-dimensional medical image generation according to an embodiment.
FIG. 3 is a schematic diagram of a first neural network according to an embodiment.
FIG. 4 is a schematic diagram of a second neural network according to an embodiment.
FIG. 5 is a block diagram of a method for obtaining medical sagittal images according to an embodiment.
FIG. 6 is a schematic diagram of image coordinate transformation according to an embodiment.
FIG. 7A is a schematic diagram of sagittal planes according to an embodiment.
FIGS. 7B to 7D show ultrasound images of medical sagittal images.
Detailed Description of the Embodiments
The method for acquiring medical sagittal images, the neural network training method, and the computer device according to preferred embodiments of the present invention will be described below with reference to the accompanying drawings, in which the same elements are denoted by the same reference numerals.
As shown in FIG. 1, which is a block diagram of a system for processing medical sagittal images according to an embodiment, the system includes a medical imaging device 1, a computer device 2, and an output device 3. The computer device 2 includes a processing core 21, a storage element 22, and a plurality of input/output interfaces 23, 24. The processing core 21 is coupled to the storage element 22 and the input/output interfaces 23, 24. The input/output interface 23 can receive the medical image 11 generated by the medical imaging device 1, the input/output interface 24 communicates with the output device 3, and the computer device 2 can output processing results to the output device 3 through the input/output interface 24.
The storage element 22 stores program code for the processing core 21 to execute. The storage element 22 includes non-volatile memory and volatile memory; the non-volatile memory is, for example, a hard disk, flash memory, a solid-state drive, an optical disc, and so on, while the volatile memory is, for example, dynamic random access memory, static random access memory, and so on. For example, the program code is stored in the non-volatile memory, and the processing core 21 can load it from the non-volatile memory into the volatile memory and then execute it.
The processing core 21 is, for example, a processor, a controller, or the like, and the processor includes one or more cores. The processor may be a central processing unit or a graphics processor, and the processing core 21 may also be a core of a processor or graphics processor. Alternatively, the processing core 21 may be a processing module that includes multiple processors, for example a central processing unit and a graphics processor.
The medical imaging device 1 can generate the medical image 11 and is, for example, ultrasound examination equipment, a computed tomography scanner, panoramic X-ray radiography equipment, or magnetic resonance equipment. The medical image 11 generated by the medical imaging device 1 can first be transferred to a storage medium and then input from the storage medium to the input/output interface 23; in this case the input/output interface 23 is, for example, a peripheral transmission port, and the storage medium is, for example, non-volatile memory. Alternatively, the medical imaging device 1 may be connected to the input/output interface 23 in a wired or wireless manner, and the medical image 11 is transmitted from the medical imaging device 1 to the input/output interface 23 through this connection; in this case the input/output interface 23 is, for example, a communication port.
The computer device 2 can perform the training method for a neural network that obtains medical sagittal images. The storage element 22 stores the program code, model, and trained parameters of the training method, and the processing core 21 executes the program code to perform the training method, which includes: applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask; obtaining an expected result based on the three-dimensional medical image and the expected sagittal mask; obtaining a ground-truth result based on the three-dimensional medical image and a ground-truth sagittal mask; applying a second neural network to the expected result and the ground-truth result; generating loss function data according to the output of the second neural network; and adjusting the parameters of the first neural network or the second neural network according to the loss function data.
The computer device 2 can perform the method of obtaining medical sagittal images. The storage element 22 stores the program code, model, and parameters of the acquisition method, and the processing core 21 executes the program code to perform the acquisition method, which includes: applying the first neural network to a three-dimensional medical image to generate an expected sagittal mask; and generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
The output device 3 is a device capable of outputting images, such as a display, a projector, a printer, and so on. When performing the method of obtaining medical sagittal images, the computer device 2 can output the generated two-dimensional sagittal image to the output device 3.
The first neural network is a neural network trained by the training method for obtaining medical sagittal images, and it treats sagittal plane detection as filtering: the sagittal plane is filtered out of the three-dimensional medical image and a three-dimensional binary mask is generated, and the filtered information retains not only the required features on the sagittal plane but also their position information. Using the trained first neural network, a plane can be found in the spatial volume of the three-dimensional medical image that accurately cuts the target in the image into left and right halves while still exhibiting the required features. Compared with the intuitive approach of enumerating all candidate slices and classifying them, the present method overcomes that approach's heavy time cost and inefficiency, and also overcomes the problem that the intuitive approach judges position only in two-dimensional image terms, distorting the true position in three-dimensional space.
As shown in FIG. 2A, which is a block diagram of a training method for a neural network that obtains medical sagittal images according to an embodiment, the actual program code and data of the first neural network 41, the expected result generation 42, the ground-truth result generation 43, the second neural network 44, and the loss function calculation 45 can be stored in the storage element 22 of FIG. 1 and provided to the processing core 21 for execution and processing; the three-dimensional medical image 51, the expected sagittal mask 52, the expected result 53, the ground-truth sagittal mask 54, the ground-truth result 55, and the loss function data 56, 57 can be stored in or loaded into the storage element 22 for processing by the processing core 21.
The first neural network 41 is applied to the three-dimensional medical image 51 to generate the expected sagittal mask 52; the expected result generation 42 obtains the expected result 53 based on the three-dimensional medical image 51 and the expected sagittal mask 52; the ground-truth result generation 43 obtains the ground-truth result 55 based on the three-dimensional medical image 51 and the ground-truth sagittal mask 54; the second neural network 44 is applied to the expected result 53 and the ground-truth result 55 to produce an output; the loss function calculation 45 uses a loss function on the output of the second neural network 44 to produce the loss function data 56, 57; and the parameters of the first neural network 41 and the second neural network 44 are then adjusted according to the loss function data 56, 57.
The training method can use deep learning to automatically detect the sagittal plane. For example, the first neural network 41 and the second neural network 44 can be the two sub-networks of a generative adversarial network, with the first neural network 41 acting as the generator and the second neural network 44 acting as the critic. The output loss of the critic can be used to adjust or optimize the generator and the critic respectively.
The three-dimensional medical image 51 serves as the input image; for example, it is an ultrasound image, a computed tomography image, a panoramic radiography image, a magnetic resonance image, or the like. The ultrasound image is, for example, a whole-body ultrasound image or a local ultrasound image, and the local ultrasound image is, for example, an ultrasound image of the head, the neck, the head and neck, or another body part. The three-dimensional medical image 51 may be the medical image 11 generated by the medical imaging device 1 of FIG. 1, i.e., the medical image 11 itself is a three-dimensional image; alternatively, the three-dimensional medical image 51 is generated from the medical image 11, for example when the medical image 11 is a plurality of two-dimensional images that represent the target object at different cross-sections or coordinate planes, and the three-dimensional medical image 51 is created from these two-dimensional images.
The first neural network 41 takes the three-dimensional medical image 51 as input and processes it, and the processing result serves as the expected sagittal mask 52. For example, the first neural network 41 is designed as a filter that learns the feature points in the three-dimensional medical image 51 and their positions in three-dimensional space and generates a three-dimensional expected sagittal mask 52 containing plane position information. The dimensions and scale of the expected sagittal mask 52 may be the same as those of the three-dimensional medical image 51.
The first neural network 41 can take a cropped volume from the three-dimensional medical image 51 as input and output a three-dimensional mask as the expected sagittal mask 52; the three-dimensional mask is, for example, a 3D binary mask, and the input and output of the first neural network 41 have the same dimensions and scale. The position information of the sagittal plane is embedded in this three-dimensional mask: if a voxel lies on the sagittal plane, the corresponding mask value is 1; otherwise, if the voxel is excluded from the sagittal plane, the corresponding mask value is 0.
The expected result generation 42 produces the expected result 53 from the three-dimensional medical image 51 and the expected sagittal mask 52, for example using a combining operation. The ground-truth result generation 43 produces the ground-truth result 55 from the three-dimensional medical image 51 and the ground-truth sagittal mask 54, for example using a combining operation; the ground-truth sagittal mask 54 may have the same dimensions and scale as the three-dimensional medical image 51. For ease of processing, the three-dimensional medical image 51, the expected sagittal mask 52, and the ground-truth sagittal mask 54 may have the same dimensions and scale.
The combining operation is shown in FIG. 2B. The expected result generation 42 and the ground-truth result generation 43 can use the same combining operation, and the program code of the combining operation can be shared by both. In FIG. 2B, the combining operation includes a multiplication 46 and a concatenation 47. Before the data are fed into the second neural network 44, the three-dimensional mask and the three-dimensional medical image 51 first undergo the combining operation, which includes element-wise multiplication of the matrices, because the intensity plane must be obtained from the original three-dimensional medical image 51. The combining operation then concatenates the result of the multiplication 46 with the three-dimensional medical image 51, and the output of the concatenation 47 is two-channel data sent to the second neural network 44.
In the expected result generation 42, the three-dimensional mask is the expected sagittal mask 52, and the output of the concatenation 47 serves as the expected result 53. In the ground-truth result generation 43, the three-dimensional mask is the ground-truth sagittal mask 54, and the output of the concatenation 47 serves as the ground-truth result 55.
Referring again to FIG. 2A, the second neural network 44 takes the expected result 53 and the ground-truth result 55 as input and processes them, and the processing result is output to the loss function calculation 45. In the loss function calculation 45, the loss function data 56, 57 include the first loss function data 56 and the second loss function data 57; the first loss function data 56 are used to adjust the first neural network 41, and the second loss function data 57 are used to adjust the second neural network 44.
The filter weights of the first neural network 41 and the second neural network 44 are trained using the loss function data 56, 57. When a generative adversarial network is used, the loss function can be WGAN-GP or a modified version based on WGAN-GP, and the loss function data 56, 57 are generated using the following formulas:
L_ce = w·E[-y·log(x) - (1-y)·log(1-x)]
L_G = -(1-w)·E[C(x')] + L_ce
L_C = E[C(x')] - E[C(y')] + λ·E[(‖∇C(x̂')‖₂ - 1)²]
L_G: loss function for updating the generator
L_C: loss function for updating the critic
L_ce: cross entropy
x: expected sagittal mask
y: ground-truth sagittal mask
x̂: linear combination of x and y with a random weight, x̂ = αx + (1-α)y
α: random weight, α ∈ (0, 1)
x': expected result
y': ground-truth result
x̂': linear combination of x' and y' with a random weight, x̂' = αx' + (1-α)y'
C: critic
E: expectation
λ: weight of the gradient penalty
w: weight controlling the trade-off between the cross-entropy loss and the adversarial loss
In addition, as shown in FIG. 2C, which is a block diagram of three-dimensional medical image generation according to an embodiment, in the three-dimensional medical image generation 48 the three-dimensional medical image 51 of FIG. 2A can be created from a plurality of two-dimensional medical images 58. For example, the two-dimensional medical images 58 are two-dimensional ultrasound images, and the three-dimensional medical image 51 is a three-dimensional ultrasound image. The actual program code and data of the three-dimensional medical image generation 48 can also be stored in the storage element 22 of FIG. 1 and provided to the processing core 21 for execution and processing.
As shown in FIG. 3, which is a schematic diagram of the first neural network according to an embodiment, the first neural network 41 is, for example, a convolutional neural network that includes a plurality of convolutional layers 411, a flat layer 412, a reshaping layer 414, and a plurality of deconvolution layers 415. The first half of the first neural network 41 can be regarded as an encoder and the second half as a decoder; for example, the convolutional layers 411 and the flat layer 412 can be regarded as the encoder, and the reshaping layer 414 and the deconvolution layers 415 as the decoder. The final output of the deconvolution layers 415 serves as the expected sagittal mask 52.
There are, for example, four convolutional layers 411; each uses a kernel of the same size and a stride of the same size, for example a 3×3×3 kernel and a 1×1×1 stride. A kernel is also called a filter. The thickness of the feature map, i.e., the number of channels, increases gradually with the depth of the convolutional layers 411; for example, after the first convolutional layer 411a, the channel count doubles with each subsequent convolutional layer 411. In FIG. 3, the number of input channels of the first convolutional layer 411a is 1; after the output of the first convolutional layer 411a, through the first to fourth convolutional layers 411a to 411d, the number of channels increases from 4 to 8, 16, and 32. The data size changes from 80³ to 40³, 20³, 10³, and 5³ in sequence. In other embodiments, the convolutional layers 411 may use kernels of different sizes, or only some of them may use kernels of the same size; the same applies to strides. The channel count may also vary in other ways and is not limited to doubling.
The output of a convolutional layer 411 can be further processed before entering the next convolutional layer 411 or the flat layer 412, for example by a rectified linear unit (ReLU) layer or a pooling layer; the rectification layer uses, for example, a leaky rectification function (leaky ReLU), and the pooling layer is, for example, a max pooling layer. For example, the output of each convolutional layer 411 can be processed by a rectification layer and a max pooling layer before entering the next convolutional layer 411, and the last convolutional layer 411 outputs to the flat layer 412; each rectification layer uses a leaky rectification function, each pooling layer is a max pooling layer, and the max pooling layers use a 2×2×2 kernel with a 2×2×2 stride. In other embodiments, only some of the convolutional layers' outputs need to be further processed before entering the next convolutional layer 411 or the flat layer 412; not every convolutional layer's output must pass through a rectification or pooling layer.
The flat layer 412 may be followed by two fully connected layers 413 and then by the reshaping layer 414; the fully connected layers 413 may also be followed by a rectification layer, for example using a leaky rectification function. The reshaping layer 414 is followed by the deconvolution layers 415. The data sizes from the flat layer 412 through the reshaping layer 414 are 4000, 500, and 4000.
There are, for example, four deconvolution layers 415; each uses a kernel of the same size and a stride of the same size, for example a 3×3×3 kernel and a 2×2×2 stride. The channel count decreases gradually with the depth of the deconvolution layers 415, for example halving with each deconvolution layer 415 until the last layer. In FIG. 3, the number of input channels of the first deconvolution layer 415a is 32; after the output of the first deconvolution layer 415a, through the first to fourth deconvolution layers 415a to 415d, the channel count decreases from 32 to 16, 8, and 4, and the number of output channels of the last deconvolution layer 415d is 1. The data size changes from 5³ to 10³, 20³, 40³, and 80³ in sequence. In other embodiments, the deconvolution layers 415 may use kernels or strides of different sizes, or only some of them may use the same sizes; the channel count may also vary in other ways and is not limited to halving.
The output of a deconvolution layer 415 can be further processed before entering the next deconvolution layer 415, for example by a ReLU layer using a leaky rectification function (leaky ReLU). For example, except for the last layer, the output of each deconvolution layer 415 can be processed by a rectification layer before entering the next deconvolution layer 415, while the last deconvolution layer 415d is followed by a sigmoid layer; each rectification layer uses a leaky rectification function. In other embodiments, only some of the deconvolution layers' outputs need further processing before entering the next deconvolution layer 415; not every deconvolution layer's output must pass through a rectification layer.
As shown in FIG. 4, which is a schematic diagram of the second neural network according to an embodiment, the architecture of the second neural network is similar to the encoder portion of the first neural network; the second neural network 44 is, for example, a convolutional neural network that includes a plurality of convolutional layers 441 and a flat layer 442. The output of the flat layer 442 is provided to the loss function for calculation.
There are, for example, four convolutional layers 441; each uses a kernel of the same size and a stride of the same size, for example a 3×3×3 kernel and a 1×1×1 stride. The channel count increases gradually with the depth of the convolutional layers 441; for example, after the first convolutional layer 441a, it doubles with each subsequent convolutional layer 441. In FIG. 4, the number of input channels of the first convolutional layer 441a is 2; after the output of the first convolutional layer 441a, through the first to fourth convolutional layers 441a to 441d, the channel count increases from 4 to 8, 16, and 32. The data size changes from 80³ to 40³, 20³, 10³, and 5³ in sequence. In other embodiments, the convolutional layers 441 may use kernels or strides of different sizes, or only some of them may use the same sizes; the channel count may also vary in other ways and is not limited to doubling.
The output of a convolutional layer 441 can be further processed before entering the next convolutional layer 441 or the flat layer 442, for example by a rectification layer, a sigmoid layer, or a pooling layer; the rectification layer uses, for example, a leaky rectification function, and the pooling layer is, for example, a max pooling layer. For example, except for the last layer, the output of each convolutional layer 441a to 441c can be processed by a rectification layer and a max pooling layer before entering the next convolutional layer 441b to 441d, while the last convolutional layer 441d is processed by a sigmoid layer and a max pooling layer and then output to the flat layer 442; each rectification layer uses a leaky rectification function, each pooling layer is a max pooling layer, and the max pooling layers use a 2×2×2 kernel with a 2×2×2 stride. In other embodiments, only some of the convolutional layers' outputs need further processing before entering the next convolutional layer 441 or the flat layer 442; not every convolutional layer's output must pass through a rectification, sigmoid, or pooling layer. The final output of the second neural network 44 may be not a single value but a latent vector, representing the distribution of real or fake masks. In FIG. 4, the data size after the flat layer 442 is 4000.
As shown in FIG. 5, which is a block diagram of a method for obtaining medical sagittal images according to an embodiment, the first neural network 41, after being trained by the aforementioned method, can be used to obtain medical sagittal images; for implementations and variations of the first neural network 41 and the three-dimensional medical image 51, refer to the descriptions above. The actual program code and data of the first neural network 41 and the two-dimensional sagittal image generation 49 can be stored in the storage element 22 of FIG. 1 and provided to the processing core 21 for execution and processing; the three-dimensional medical image 51, the expected sagittal mask 52, and the two-dimensional sagittal image 59 can be stored in or loaded into the storage element 22 for processing by the processing core 21.
The first neural network 41 is applied to the three-dimensional medical image 51 to generate the expected sagittal mask 52. In the two-dimensional sagittal image generation 49, the two-dimensional sagittal image 59 is generated based on the expected sagittal mask 52 and the three-dimensional medical image 51. For example, the step of generating the two-dimensional sagittal image includes: generating sagittal plane description data according to the expected sagittal mask 52; and performing a coordinate transformation on the three-dimensional medical image 51 according to the sagittal plane description data to generate the two-dimensional sagittal image 59. The sagittal plane description data is, for example, a plane expression in three-dimensional space.
As shown in FIG. 6, which is a schematic diagram of image coordinate transformation according to an embodiment, the symbol I denotes the initial sagittal plane, whose plane expression is z = 0; the normal vector P of the initial sagittal plane I is (0, 0, 1). In the expected sagittal mask 52, a RANdom SAmple Consensus (RANSAC) algorithm is applied to the non-zero voxels, and a result plane E can then be estimated, with plane expression ax + by + cz + d = 0. The normal vector Q of the result plane E is (a, b, c), where a² + b² + c² = 1.
The symbol M denotes the transformation between corresponding pixels of the initial sagittal plane I and the result plane E. The coordinate position of each pixel (p, q) of the initial sagittal plane I in two-dimensional image coordinates, i.e., of the voxel (p, q, 0) of the initial sagittal plane I in three-dimensional image coordinates, is to be transformed to the coordinate position of a voxel (i, j, k) on the result plane E; the intensity value at the coordinate position of the voxel (i, j, k) is mapped to the coordinate position of the corresponding pixel (p, q) of the result plane E, and the result plane E then forms the final two-dimensional image, which can be recorded in a two-dimensional manner. The transformation M consists of a rotation matrix R and a translation matrix T and can be expressed as M = TR; the two-dimensional sagittal image 59 can be obtained by computing the rotation matrix R and the translation matrix T.
For the normal vectors P and Q, the rotation angle θ and the rotation axis u can be obtained from the following inner and outer product calculations:
θ = cos⁻¹(P·Q)
u = (P×Q) / ‖P×Q‖
According to Rodrigues' rotation formula, the rotation matrix R for a rotation about the axis u by the angle θ can be derived from the following, where the axis is written u = (u_x, u_y, u_z) and K is its cross-product matrix:
K = [[0, -u_z, u_y], [u_z, 0, -u_x], [-u_y, u_x, 0]]
R = I + (sin θ)K + (1 - cos θ)K²
‖K‖₂ = 1
The parameter d of the translation matrix T is the displacement (offset) from the original position along the unit normal vector: starting from the initial point (x, y, z), the new point (x', y', z') is reached by moving by the displacement d along the normal vector Q. The translation matrix T can be derived as follows (in homogeneous coordinates):
T = [[1, 0, 0, d·a], [0, 1, 0, d·b], [0, 0, 1, d·c], [0, 0, 0, 1]]
As shown in FIG. 7A, which is a schematic diagram of sagittal planes according to an embodiment, mark 61 denotes a sagittal plane, mark 62 the median plane, mark 63 the coronal plane, and mark 64 the horizontal (transverse) plane.
As shown in FIGS. 7B to 7D, which display ultrasound images of medical sagittal images, ultrasound scanning has the advantages of low cost, real-time operation, and non-invasiveness, and can be used for prenatal examination. In the first trimester, obstetricians measure fetal growth parameters on the mid-sagittal plane (Middle Sagittal Plane, MSP), such as the nuchal translucency thickness (NTT), the maxillary length, and the fronto-maxillary facial (FMF) angle, to assess whether the fetus has chromosomal abnormalities or developmental delay. Using the aforementioned acquisition method, the mid-sagittal plane of a fetal three-dimensional ultrasound image can be detected automatically and a two-dimensional sagittal image generated. The measurement of the nuchal translucency thickness is illustrated in FIG. 7B, that of the maxillary length in FIG. 7C, and that of the fronto-maxillary angle in FIG. 7D.
In summary, in the training method for a neural network that obtains medical sagittal images, the method for obtaining medical sagittal images, and the computer device of the present disclosure, a neural network is used to detect the sagittal plane, for example to detect the fetal mid-sagittal plane from ultrasound images. The neural network can be regarded as a filter that learns the feature points in medical images and their positions in three-dimensional space and generates a three-dimensional mask containing plane position information; after post-processing conversion, the mid-sagittal plane image is finally obtained.
The above description is merely illustrative and not restrictive. Any equivalent modification or change made without departing from the spirit and scope of the present invention shall be included in the scope of the appended claims.

Claims (15)

  1. A training method for a neural network that obtains medical sagittal images, comprising:
    applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask;
    obtaining an expected result based on the three-dimensional medical image and the expected sagittal mask;
    obtaining a ground-truth result based on the three-dimensional medical image and a ground-truth sagittal mask;
    applying a second neural network to the expected result and the ground-truth result;
    generating loss function data according to an output of the second neural network; and
    adjusting parameters of the first neural network or the second neural network according to the loss function data.
  2. The method of claim 1, wherein the three-dimensional medical image is an ultrasound image, a computed tomography image, a panoramic radiography image, or a magnetic resonance image.
  3. The method of claim 1, wherein the expected result is generated by a combining operation on the three-dimensional medical image and the expected sagittal mask, and the ground-truth result is generated by a combining operation on the three-dimensional medical image and the ground-truth sagittal mask.
  4. The method of claim 1, wherein the first neural network is a convolutional neural network comprising a plurality of convolutional layers, a flat layer, a reshaping layer, and a plurality of deconvolution layers.
  5. The method of claim 1, wherein the second neural network is a convolutional neural network comprising a plurality of convolutional layers and a flat layer.
  6. The method of claim 1, wherein the loss function data comprise:
    first loss function data for adjusting the first neural network; and
    second loss function data for adjusting the second neural network.
  7. The method of claim 1, wherein the first neural network and the second neural network form a generative adversarial network.
  8. The method of claim 1, further comprising:
    creating the three-dimensional medical image from a plurality of two-dimensional medical images.
  9. The method of claim 1, further comprising:
    generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
  10. A method for obtaining medical sagittal images, comprising:
    applying a first neural network to a three-dimensional medical image to generate an expected sagittal mask; and
    generating a two-dimensional sagittal image based on the expected sagittal mask and the three-dimensional medical image.
  11. The method of claim 10, wherein the step of generating the two-dimensional sagittal image comprises:
    generating sagittal plane description data according to the expected sagittal mask; and
    performing a coordinate transformation on the three-dimensional medical image according to the sagittal plane description data to generate the two-dimensional sagittal image.
  12. The method of claim 10, wherein the three-dimensional medical image is an ultrasound image, a computed tomography image, a panoramic radiography image, or a magnetic resonance image.
  13. The method of claim 10, wherein the first neural network is a convolutional neural network comprising a plurality of convolutional layers, a flat layer, a reshaping layer, and a plurality of deconvolution layers.
  14. The method of claim 10, further comprising:
    creating the three-dimensional medical image from a plurality of two-dimensional medical images.
  15. A computer device that performs the method of any one of claims 1 to 14.
PCT/CN2018/124565 2018-12-28 2018-12-28 Method for acquiring medical sagittal images, neural network training method, and computer device WO2020133124A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/124565 WO2020133124A1 (zh) 2018-12-28 2018-12-28 Method for acquiring medical sagittal images, neural network training method, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/124565 WO2020133124A1 (zh) 2018-12-28 2018-12-28 Method for acquiring medical sagittal images, neural network training method, and computer device

Publications (1)

Publication Number Publication Date
WO2020133124A1 true WO2020133124A1 (zh) 2020-07-02

Family

ID=71129384

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/124565 WO2020133124A1 (zh) Method for acquiring medical sagittal images, neural network training method, and computer device

Country Status (1)

Country Link
WO (1) WO2020133124A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104414680A (zh) * 2013-08-21 2015-03-18 Shenzhen Mindray Bio-Medical Electronics Co., Ltd. Three-dimensional ultrasound imaging method and system
CN104751178A (zh) * 2015-03-31 2015-07-01 University of Shanghai for Science and Technology Pulmonary nodule detection device and method based on shape template matching combined with a classifier
CN107862726A (zh) * 2016-09-20 2018-03-30 Siemens Healthcare GmbH Color two-dimensional cine medical imaging based on deep learning
US10032281B1 (en) * 2017-05-03 2018-07-24 Siemens Healthcare Gmbh Multi-scale deep reinforcement machine learning for N-dimensional segmentation in medical imaging
CN108765483A (zh) * 2018-06-04 2018-11-06 Northeastern University Method and system for determining the mid-sagittal plane from brain CT images


Similar Documents

Publication Publication Date Title
TWI697010B Method for acquiring medical sagittal images, neural network training method, and computer device
Sobhaninia et al. Fetal ultrasound image segmentation for measuring biometric parameters using multi-task deep learning
EP3355273B1 (en) Coarse orientation detection in image data
US10318839B2 (en) Method for automatic detection of anatomical landmarks in volumetric data
US20110262015A1 (en) Image processing apparatus, image processing method, and storage medium
JP2023511300A Method and system for automatically discovering anatomical structures in medical images
CN111368586B Ultrasound imaging method and system
JP2004033749A Semi-automatic segmentation algorithm for PET tumor images
KR102202398B1 Image processing apparatus and image processing method thereof
CN110728274A Computer-aided scanning method for medical equipment, medical equipment, and readable storage medium
JP2013542046A5 (zh)
CN110246580B Cephalometric lateral image analysis method and system based on neural network and random forest
EP3705047B1 (en) Artificial intelligence-based material decomposition in medical imaging
CN111462168B Motion parameter estimation method and motion artifact correction method
JP2016135252A Medical image processing apparatus and medical image diagnostic apparatus
WO2023044605A1 Three-dimensional reconstruction method and device for brain structures in extreme environments, and readable storage medium
CN112950648B Method and device for determining the mid-sagittal plane in magnetic resonance images
JP6995535B2 Image processing apparatus, image processing method, and program
CN117408908B Automatic preoperative and intraoperative CT image fusion method based on a deep neural network
JP6340315B2 Image processing method
CN103366348B Method and processing device for suppressing bone images in X-ray images
Liu et al. Automated classification and measurement of fetal ultrasound images with attention feature pyramid network
JP2006000127A Image processing method, apparatus, and program
CN109087357A Scan positioning method and device, computer equipment, and computer-readable storage medium
CN114066798A Deep learning-based brain tumor magnetic resonance image data synthesis method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18945305

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18945305

Country of ref document: EP

Kind code of ref document: A1