Automatic detection method and system for fetal head volume in ultrasonic image
Technical Field
The invention belongs to the technical field of prenatal ultrasonic examination, and particularly relates to an automatic detection method and system for fetal head volume in an ultrasonic image.
Background
Fetal intracranial structural abnormalities are among the most common congenital malformations, with an incidence of 1% to 3%, and they affect neurological function in utero and after birth to varying degrees. Detecting the state of fetal craniocerebral development during pregnancy therefore has important clinical significance. In addition to structural evaluation, the size of each intracranial structure is clinically important and can be used to assess whether the various craniocerebral structures (including the whole brain, the cerebellar hemispheres, the vermis, and the like) are hypoplastic. Currently, the most common indexes used in conventional fetal ultrasound to evaluate fetal craniocerebral development are the biparietal diameter, the head circumference and the transverse cerebellar diameter. Strictly speaking, however, volume values reflect organ growth more accurately than diameter values, as has also been demonstrated in the evaluation of other fetal organs.
The existing method for detecting fetal brain volume assumes that the fetal brain is a regular sphere and calculates the fetal head volume from two-dimensional diameter measurements. However, this method has several non-negligible drawbacks. First, because the speed and direction of manual scanning are unstable, and because the long acquisition time introduces uncontrollable factors such as motion artifacts caused by movement of the pregnant woman or the fetus, the resulting images have poor definition and the detection accuracy is low. Second, since the fetal brain is a three-dimensional structure, converting it into two-dimensional sections and labeling them takes the physician a long time, so the sonographer's workload is heavy, which limits the wide application of the method. Third, sonographers of different experience levels may reach different diagnostic conclusions with this method, leading to inconsistent detection results.
Disclosure of Invention
The invention aims to solve the technical problems of the existing fetal craniocerebral volume detection methods: poor image definition and accuracy, difficulty in wide application due to the heavy workload imposed on the sonographer, and inconsistent detection results caused by sonographers of different experience levels reaching different diagnoses.
To achieve the above object, according to one aspect of the present invention, there is provided an automatic detection method for fetal head volume in an ultrasound image, comprising the steps of:
(1) acquiring a data set;
(2) preprocessing the data set obtained in step (1) to obtain a preprocessed fetal craniocerebral three-dimensional ultrasonic data set;
(3) inputting the fetal craniocerebral three-dimensional ultrasonic data set preprocessed in step (2) into a trained 3D FCN network to obtain the fetal craniocerebral voxels of the three-dimensional ultrasonic data;
(4) calculating the fetal head volume by using the fetal craniocerebral voxels obtained in step (3).
Preferably, the data set includes fetal craniocerebral three-dimensional ultrasound data acquired from a three-dimensional ultrasound device, and fetal craniocerebral location information manually labeled by a sonographer for each fetal craniocerebral three-dimensional ultrasound data.
Preferably, the preprocessing of the data set obtained in step (1) in step (2) is by a median filtering method.
Preferably, the head volume V is calculated by the formula V = Vp × Uv, where Vp is the number of voxels obtained in step (3) and Uv is the volume of a unit voxel.
Preferably, the 3D FCN network is trained by:
A. acquiring a data set which comprises fetal craniocerebral three-dimensional ultrasonic data acquired from a three-dimensional ultrasonic device and fetal craniocerebral position information manually labeled by a sonographer for each fetal craniocerebral three-dimensional ultrasonic data;
B. denoising the data set obtained in step A by a median filtering method, normalizing the denoised data set, and randomly dividing the normalized data set into a training set, a verification set and a test set;
C. inputting the training set of the normalized data set obtained in step B into the 3D FCN network to obtain an inferred output of the fetal head volume, and inputting the inferred output into the loss function of the 3D FCN network to obtain a loss value;
D. optimizing the loss function of the 3D FCN network according to a stochastic gradient descent algorithm and the loss value obtained in step C, so as to update the 3D FCN network;
E. repeating step C and step D for the remaining data of the training set obtained in step B until the 3D FCN network converges, thereby obtaining the trained 3D FCN network.
Preferably, the loss function is L(x, y) = (x - y)², where x is the fetal head volume obtained from the fetal craniocerebral position information manually labeled by the sonographer, which is specifically equal to the number of manually labeled fetal craniocerebral voxels multiplied by the volume of a unit voxel, and y is the inferred output of the fetal head volume.
Preferably, the network structure of the 3D FCN network is as follows:
the first layer is an input layer, whose input is a voxel matrix of size 128 x 128 x 128 x 1;
the second layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 32 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 32;
the third layer is a pooling layer with a pooling window size of 2 x 2 x 2 and a stride of (2, 2, 2); this layer outputs a matrix of size 64 x 64 x 64 x 32;
the fourth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 64 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 64 x 64 x 64 x 64;
the fifth layer is a pooling layer with a pooling window size of 2 x 2 x 2 and a stride of (2, 2, 2); this layer outputs a matrix of size 32 x 32 x 32 x 64;
the sixth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the seventh layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the eighth layer is a pooling layer with a pooling window size of 2 x 2 x 2 and a stride of (2, 2, 2); this layer outputs a matrix of size 16 x 16 x 16 x 128;
the ninth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 256 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 16 x 16 x 16 x 256;
the tenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 256 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 16 x 16 x 16 x 256;
the eleventh layer is a deconvolution layer with a deconvolution kernel size of 4 x 4 x 4 and 128 deconvolution kernels; it performs a 2x upsampling operation and outputs a matrix of size 32 x 32 x 32 x 128;
the twelfth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the thirteenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the fourteenth layer is a deconvolution layer with a deconvolution kernel size of 4 x 4 x 4 and 64 deconvolution kernels; it performs a 2x upsampling operation and outputs a matrix of size 64 x 64 x 64 x 64;
the fifteenth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 64 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 64 x 64 x 64 x 64;
the sixteenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 64 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 64 x 64 x 64 x 64;
the seventeenth layer is a deconvolution layer with a deconvolution kernel size of 4 x 4 x 4 and 32 deconvolution kernels; it performs a 2x upsampling operation and outputs a matrix of size 128 x 128 x 128 x 32;
the eighteenth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 32 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 32;
the nineteenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 32 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 32;
the twentieth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 1 convolution kernel and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 1.
According to another aspect of the present invention, there is provided a system for automatic detection of fetal head volume in an ultrasound image, comprising:
a first module for obtaining a data set;
the second module is used for preprocessing the data set acquired by the first module to obtain a preprocessed fetal craniocerebral three-dimensional ultrasonic data set;
the third module is used for inputting the fetal craniocerebral three-dimensional ultrasonic data set preprocessed by the second module into the trained 3D FCN network to obtain the fetal craniocerebral voxels of the three-dimensional ultrasonic data;
and the fourth module is used for calculating the fetal head volume by using the fetal craniocerebral voxels obtained by the third module.
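As a rough illustration of how these four modules could be composed, the following sketch wires them into a single detection pipeline. It is only a minimal sketch under assumptions: the class and method names are hypothetical, the preprocessing here is reduced to min-max normalization, and the trained network is treated as an abstract callable returning a per-voxel probability map.

```python
import numpy as np

class FetalHeadVolumeSystem:
    """Hypothetical composition of the four modules described above."""

    def __init__(self, model, unit_voxel_volume_cm3: float):
        self.model = model                                   # trained 3D FCN (third module); any callable
        self.unit_voxel_volume_cm3 = unit_voxel_volume_cm3   # known volume of one voxel

    def acquire(self, source) -> np.ndarray:                 # first module: obtain the data
        return np.asarray(source, dtype=np.float32)

    def preprocess(self, volume: np.ndarray) -> np.ndarray:  # second module (simplified: normalization only)
        vmin, vmax = volume.min(), volume.max()
        return (volume - vmin) / (vmax - vmin) if vmax > vmin else volume

    def detect(self, source) -> float:                       # third and fourth modules combined
        volume = self.preprocess(self.acquire(source))
        probabilities = self.model(volume)                   # per-voxel probability map
        voxel_count = float(np.sum(probabilities > 0.5))     # Vp (0.5 threshold is an assumption)
        return voxel_count * self.unit_voxel_volume_cm3      # V = Vp x Uv
```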
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) because the invention uses a three-dimensional imaging technique, it can provide an independent image of any layer and eliminate the influence of overlapping tissues in front of and behind that layer, thereby solving the technical problem of low image definition and accuracy in existing fetal brain image acquisition;
(2) because the fetal head volume is detected intelligently and automatically through deep learning, the invention reduces the technical requirements on physicians and their workload, and thus solves the technical problem that existing fetal head volume measurement methods are difficult to apply widely owing to the high professional level they demand of the physician;
(3) because the fetal craniocerebral volume data sets used by the invention are screened by professional sonographers, and the data used for training follow a single, well-defined standard of volume evaluation, the invention solves the technical problem of inconsistent detection results caused by differences between the evaluations of different physicians in existing fetal head volume measurement methods.
Drawings
FIG. 1 is a flow chart of a method for automatic detection of fetal head volume in an ultrasound image of the present invention;
FIG. 2(a) and FIG. 2(b) show the data set acquired in step (1) of the automatic detection method of the present invention, wherein FIG. 2(a) is the fetal craniocerebral three-dimensional ultrasound data, and FIG. 2(b) is the fetal craniocerebral position information manually labeled by a sonographer.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention aims to provide an automatic detection method for fetal head volume in an ultrasound image, which automatically, intelligently and rapidly detects the fetal head volume from three-dimensional volume data by deep learning on a large amount of normal and abnormal fetal craniocerebral three-dimensional ultrasound image data acquired in early pregnancy.
The basic idea of the invention is to acquire the fetal head volume data by three-dimensional ultrasound, analyze the volume data with artificial-intelligence ultrasound, and automatically and rapidly calculate the volumes of a plurality of craniocerebral structures. In this way, the fetal brain volume can be detected accurately, overcoming the defect of existing fetal brain volume detection methods, in which false positive or false negative measurements occur when the fetal head is deformed under pressure.
As shown in fig. 1, the present invention provides a method for automatically detecting the volume of a fetal head in an ultrasound image, comprising the following steps:
(1) acquiring a data set;
specifically, the data set includes fetal craniocerebral three-dimensional ultrasound data (as shown in fig. 2 (a)) acquired from three-dimensional ultrasound equipment manufactured by mainstream manufacturers on the market (including maire, union photograph, siemens and the like), and fetal craniocerebral position information manually labeled by sonographers for each fetal craniocerebral three-dimensional ultrasound data (as shown in fig. 2 (b)).
(2) preprocessing the data set obtained in step (1) to obtain a preprocessed fetal craniocerebral three-dimensional ultrasonic data set.
Specifically, the method for preprocessing the data set obtained in step (1) in this step is to perform denoising processing by using a median filtering method, and then perform normalization processing on the denoised data set.
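A minimal sketch of this preprocessing step is given below, assuming the volume is available as a NumPy array. The 3 x 3 x 3 median-filter window and the min-max normalization to [0, 1] are illustrative choices, since the patent does not fix these parameters.

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocess_volume(volume: np.ndarray) -> np.ndarray:
    """Denoise a 3D ultrasound volume with a median filter, then normalize it.

    The 3x3x3 window and the scaling to [0, 1] are assumptions; the patent
    only states that median filtering and normalization are applied.
    """
    denoised = median_filter(volume, size=3)               # denoising by median filtering
    denoised = denoised.astype(np.float32)
    vmin, vmax = denoised.min(), denoised.max()
    if vmax > vmin:
        denoised = (denoised - vmin) / (vmax - vmin)       # normalization
    return denoised
```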
(3) inputting the fetal craniocerebral three-dimensional ultrasonic data set preprocessed in step (2) into a trained three-dimensional fully convolutional network (3D FCN network for short) to obtain the fetal craniocerebral voxels of the three-dimensional ultrasonic data.
(4) calculating the fetal head volume by using the fetal craniocerebral voxels obtained in step (3).
Specifically, the head volume V is calculated by the formula V = Vp × Uv, where Vp is the number of voxels obtained in step (3), and Uv is the volume of a unit voxel (this value is known, and typically each unit voxel corresponds to 2 cm³ to 3 cm³).
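As a worked illustration of this formula, the sketch below counts the voxels predicted as fetal craniocerebrum in the network output and multiplies the count by the unit-voxel volume. The 0.5 threshold and the variable names are assumptions, not specified by the patent.

```python
import numpy as np

def fetal_head_volume(prediction: np.ndarray, unit_voxel_volume_cm3: float) -> float:
    """Compute V = Vp * Uv from the 3D FCN output.

    `prediction` is assumed to be a 3D array of per-voxel probabilities;
    voxels above the (assumed) 0.5 threshold are counted as fetal craniocerebrum.
    """
    voxel_count = int(np.sum(prediction > 0.5))    # Vp: number of fetal-head voxels
    return voxel_count * unit_voxel_volume_cm3     # V = Vp x Uv
```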
Specifically, the 3D FCN network in the present invention is obtained by training through the following steps:
A. a data set is acquired, which comprises fetal craniocerebral three-dimensional ultrasonic data acquired from three-dimensional ultrasonic equipment made by mainstream manufacturers on the market (including Mindray, United Imaging, Siemens and the like), fetal craniocerebral position information manually labeled by a sonographer for each item of fetal craniocerebral three-dimensional ultrasonic data, and the voxels calculated after the sonographer labels the fetal craniocerebrum.
B. denoising the data set obtained in step A by a median filtering method, normalizing the denoised data set, and randomly dividing the normalized data set into a training set, a verification set and a test set;
Specifically, the preprocessed data set is randomly divided into 3 parts: 70% is used as the training set, 20% as the verification set (validation set) and 10% as the test set. In this example there are 800 cases in total, of which the training set comprises 560 cases, the verification set 160 cases and the test set 80 cases.
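A minimal sketch of this random 70/20/10 split is shown below; the fixed random seed and the simple index shuffle are assumptions made only for reproducibility, not something the patent prescribes.

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.2, seed=42):
    """Randomly split preprocessed samples into training/verification/test sets (70/20/10)."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = int(len(samples) * train_frac)
    n_val = int(len(samples) * val_frac)
    train = [samples[i] for i in indices[:n_train]]
    val = [samples[i] for i in indices[n_train:n_train + n_val]]
    test = [samples[i] for i in indices[n_train + n_val:]]
    return train, val, test
```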
For the 3D FCN network used in the present invention, the network structure is as follows:
the first layer is an input layer, whose input is a voxel matrix of size 128 x 128 x 128 x 1;
the second layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 32 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 32;
the third layer is a pooling layer with a pooling window size of 2 x 2 x 2 and a stride of (2, 2, 2); this layer outputs a matrix of size 64 x 64 x 64 x 32;
the fourth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 64 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 64 x 64 x 64 x 64;
the fifth layer is a pooling layer with a pooling window size of 2 x 2 x 2 and a stride of (2, 2, 2); this layer outputs a matrix of size 32 x 32 x 32 x 64;
the sixth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the seventh layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the eighth layer is a pooling layer with a pooling window size of 2 x 2 x 2 and a stride of (2, 2, 2); this layer outputs a matrix of size 16 x 16 x 16 x 128;
the ninth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 256 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 16 x 16 x 16 x 256;
the tenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 256 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 16 x 16 x 16 x 256;
the eleventh layer is a deconvolution layer with a deconvolution kernel size of 4 x 4 x 4 and 128 deconvolution kernels; it performs a 2x upsampling operation and outputs a matrix of size 32 x 32 x 32 x 128;
the twelfth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the thirteenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 128 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 32 x 32 x 32 x 128;
the fourteenth layer is a deconvolution layer with a deconvolution kernel size of 4 x 4 x 4 and 64 deconvolution kernels; it performs a 2x upsampling operation and outputs a matrix of size 64 x 64 x 64 x 64;
the fifteenth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 64 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 64 x 64 x 64 x 64;
the sixteenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 64 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 64 x 64 x 64 x 64;
the seventeenth layer is a deconvolution layer with a deconvolution kernel size of 4 x 4 x 4 and 32 deconvolution kernels; it performs a 2x upsampling operation and outputs a matrix of size 128 x 128 x 128 x 32;
the eighteenth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 32 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 32;
the nineteenth layer is a convolution layer with a convolution kernel size of 3 x 3 x 3, 32 convolution kernels and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 32;
the twentieth layer is a convolution layer with a convolution kernel size of 1 x 1 x 1, 1 convolution kernel and a stride of 1; this layer is padded using the SAME mode and outputs a matrix of size 128 x 128 x 128 x 1.
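The following PyTorch sketch reproduces the layer sequence listed above. It is an illustrative reconstruction, not the patent's own code: the ReLU activations, the use of max pooling for the pooling layers, the padding of the 4 x 4 x 4 transposed convolutions and the final sigmoid are all assumptions, since the patent does not specify them; only the kernel sizes, channel counts and spatial sizes (128³ → 16³ → 128³) follow the listed structure.

```python
import torch
import torch.nn as nn

class FetalHead3DFCN(nn.Module):
    """Encoder-decoder 3D FCN following the 20-layer structure listed above."""

    def __init__(self):
        super().__init__()

        def conv(in_c, out_c, k):
            # SAME padding for stride-1 convolutions: pad = (k - 1) // 2
            return nn.Sequential(
                nn.Conv3d(in_c, out_c, kernel_size=k, stride=1, padding=(k - 1) // 2),
                nn.ReLU(inplace=True),   # activation assumed; not stated in the patent
            )

        def deconv(in_c, out_c):
            # 4x4x4 transposed convolution with stride 2 doubles each spatial dimension
            return nn.ConvTranspose3d(in_c, out_c, kernel_size=4, stride=2, padding=1)

        self.net = nn.Sequential(
            conv(1, 32, 3),        # layer 2:  128^3 x 32
            nn.MaxPool3d(2),       # layer 3:   64^3 x 32
            conv(32, 64, 3),       # layer 4:   64^3 x 64
            nn.MaxPool3d(2),       # layer 5:   32^3 x 64
            conv(64, 128, 3),      # layer 6:   32^3 x 128
            conv(128, 128, 3),     # layer 7:   32^3 x 128
            nn.MaxPool3d(2),       # layer 8:   16^3 x 128
            conv(128, 256, 3),     # layer 9:   16^3 x 256
            conv(256, 256, 3),     # layer 10:  16^3 x 256
            deconv(256, 128),      # layer 11:  32^3 x 128
            conv(128, 128, 1),     # layer 12:  32^3 x 128
            conv(128, 128, 3),     # layer 13:  32^3 x 128
            deconv(128, 64),       # layer 14:  64^3 x 64
            conv(64, 64, 1),       # layer 15:  64^3 x 64
            conv(64, 64, 3),       # layer 16:  64^3 x 64
            deconv(64, 32),        # layer 17: 128^3 x 32
            conv(32, 32, 1),       # layer 18: 128^3 x 32
            conv(32, 32, 3),       # layer 19: 128^3 x 32
            nn.Conv3d(32, 1, kernel_size=1),  # layer 20: 128^3 x 1 output map
        )

    def forward(self, x):
        # x: (batch, 1, 128, 128, 128) voxel volume
        return torch.sigmoid(self.net(x))  # per-voxel probability (sigmoid assumed)
```

A forward pass on a preprocessed volume of shape (1, 1, 128, 128, 128) then yields a per-voxel map from which Vp can be counted as in the volume formula given earlier.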
C. The training set (560 cases in this example) of the normalized data set obtained in step B is input into the 3D FCN network to obtain an inferred output of the fetal head volume, and this inferred output is input into the loss function of the 3D FCN network to obtain a loss value.
Specifically, the loss function is L(x, y) = (x - y)², where x is the fetal head volume obtained from the fetal craniocerebral position information manually labeled by the sonographer (specifically, the number of manually labeled fetal craniocerebral voxels multiplied by the volume of a unit voxel), and y is the inferred output of the fetal head volume.
D. optimizing the loss function of the 3D FCN network according to a Stochastic Gradient Descent (SGD) algorithm and the loss value obtained in step C, so as to update the 3D FCN network;
E. repeating step C and step D for the remaining data of the training set obtained in step B until the 3D FCN network converges, thereby obtaining the trained 3D FCN network (an illustrative code sketch of steps C to E follows step G below);
F. verifying the trained 3D FCN network with the verification set of the data set obtained in step B;
G. testing the trained 3D FCN network with the test set of the data set obtained in step B.
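The sketch below, referenced from step E, illustrates steps C to E with the squared-error loss L(x, y) = (x - y)² and an SGD optimizer. The batch size, learning rate, epoch count, fixed unit-voxel volume and the use of a soft (probability-sum) voxel count are assumptions chosen only to make the example runnable; the patent itself loops until convergence rather than for a fixed number of epochs.

```python
import torch
from torch.utils.data import DataLoader

def train(model, train_set, unit_voxel_volume_cm3=2.5, epochs=50, lr=1e-3):
    """Train the 3D FCN with SGD and the loss L(x, y) = (x - y)^2.

    Each sample in `train_set` is assumed to be a (volume, labeled_mask) pair of
    float tensors of shape (1, 128, 128, 128), where `labeled_mask` encodes the
    sonographer's manual fetal craniocerebral labels.
    """
    loader = DataLoader(train_set, batch_size=1, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    for epoch in range(epochs):
        for volume, labeled_mask in loader:
            prediction = model(volume)                          # step C: per-voxel probabilities
            # Soft voxel count: thresholding is non-differentiable, so the sum of
            # probabilities is used here (an assumption, not stated in the patent).
            y = prediction.sum() * unit_voxel_volume_cm3        # inferred head volume
            x = labeled_mask.float().sum() * unit_voxel_volume_cm3  # labeled head volume
            loss = (x - y) ** 2                                 # L(x, y) = (x - y)^2
            optimizer.zero_grad()
            loss.backward()                                     # step D: SGD update
            optimizer.step()
```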
Test results
The three-dimensional ultrasonic images in the test set are input into the trained 3D FCN network, and the 3D FCN network automatically identifies the fetal head volume.
The method uses the Mean Square Error (MSE) to measure how closely the detected result matches the labeled fetal craniocerebral ultrasound data, and the Mean Absolute Percentage Error (MAPE) to measure the detection rate of the fetal head volume.
Specifically, the mean square error is calculated as MSE = (1/n) × Σ (y_i - y'_i)², where n is the number of samples in the data set, y_i is the actual fetal head volume of the i-th sample, and y'_i is the inferred fetal head volume of the i-th sample.
The mean absolute percentage error is calculated as MAPE = (1/n) × Σ (|x_i - x'_i| / x_i), where n is the number of samples in the data set, x_i is the actual fetal head volume of the i-th sample, and x'_i is the inferred fetal head volume of the i-th sample. The head volume detection rate is 1 - MAPE. The Mean Square Error (MSE), Mean Absolute Percentage Error (MAPE) and head volume detection rate of the trained model on the test set are shown in Table 1 below.
TABLE 1
As can be seen from Table 1, the head volume detection rate of the method of the present invention is high, while the Mean Square Error (MSE) and the Mean Absolute Percentage Error (MAPE) are low.
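As a small illustration of how these evaluation metrics can be computed, the sketch below applies the MSE, MAPE and detection-rate definitions given above to arrays of actual and inferred volumes; the function name and the use of NumPy arrays are assumptions.

```python
import numpy as np

def evaluate(actual_volumes: np.ndarray, inferred_volumes: np.ndarray):
    """Return (MSE, MAPE, detection rate) for fetal head volume estimates."""
    errors = actual_volumes - inferred_volumes
    mse = float(np.mean(errors ** 2))                        # MSE = (1/n) * sum((y_i - y'_i)^2)
    mape = float(np.mean(np.abs(errors) / actual_volumes))   # MAPE = (1/n) * sum(|x_i - x'_i| / x_i)
    detection_rate = 1.0 - mape                              # detection rate = 1 - MAPE
    return mse, mape, detection_rate
```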
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.