CN112349407A - Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning - Google Patents

Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning

Info

Publication number
CN112349407A
Authority
CN
China
Prior art keywords
channels
output
size
ultrasonic image
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010582572.0A
Other languages
Chinese (zh)
Inventor
许学志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shiyi Intelligent Technology Co ltd
Original Assignee
Shanghai Shiyi Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shiyi Intelligent Technology Co ltd filed Critical Shanghai Shiyi Intelligent Technology Co ltd
Priority to CN202010582572.0A
Publication of CN112349407A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10132 Ultrasound image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a deep learning-based method for the auxiliary diagnosis of lesions in superficial ultrasound images, in the field of auxiliary medical diagnosis. All ultrasonic feature groups of the ultrasound image are coded to form a feature coding dictionary Dict_F; an original superficial ultrasound image is acquired, enhanced, and adjusted to a standard superficial ultrasound image with width and height of 832 × 640; lesion regions in the obtained superficial ultrasound images are segmented with polygons to form lesion segmentation labels; a neural network is constructed and its parameters are trained with an image training set to obtain a neural network capable of lesion localization and feature analysis on superficial ultrasound images; the trained convolutional neural network then performs auxiliary diagnosis and analysis on superficial ultrasound images. The method is fast, efficient, and accurate, realizes automatic analysis of the features of superficial ultrasound lesions, and can effectively assist in diagnosing lesions in superficial ultrasound images of superficial organ diseases.

Description

Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning
Technical Field
The invention relates to the field of auxiliary medical diagnosis, and in particular to a deep learning-based method for the auxiliary diagnosis of lesions in superficial ultrasound images.
Background
Ultrasound medicine is widely used in the diagnosis of superficial abnormal tissues; early diagnosis of breast cancer and diagnosis of carotid artery plaque, in particular, are among the most common diagnostic procedures in hospitals today. The greatest difference between ultrasound medical vision and ordinary real-world vision (such as face detection) is that in ultrasound a lesion cannot be localized and detected directly from image characteristics such as color, shape, and size; instead, the doctor must mentally reconstruct the ultrasound grayscale image into an organ anatomy with spatial, three-dimensional characteristics before localizing and detecting the lesion. Ultrasound diagnosis therefore places high demands on a doctor's professional level, and diagnostic accuracy is low in primary hospitals that lack highly trained specialists.
Visual detection in ultrasound medical images can effectively help primary-level doctors improve diagnostic accuracy, but most current visual detection methods are based on the U-Net artificial neural network: they are highly sensitive to features such as the shape and periphery of a specific lesion target in an ultrasound medical image, but insensitive to the medical information reflected by the spatial structure of different targets in the image.
Disclosure of Invention
In view of the deficiencies of the prior art, the object of the invention is to provide a method that is fast, efficient, and accurate, realizes automatic analysis of the features of superficial ultrasound lesions, and can effectively assist in diagnosing lesions in superficial ultrasound images of superficial organ diseases.
In order to achieve the purpose, the invention provides the following technical scheme:
a deep learning-based superficial ultrasound image lesion auxiliary diagnosis method comprises the following steps:
step one: code all ultrasonic feature groups of the ultrasound image to form a feature coding dictionary Dict_F;
step two: acquire an original superficial ultrasound image, enhance it, and adjust it to a standard superficial ultrasound image with width and height of 832 × 640;
step three: perform polygon segmentation on the lesion regions in the superficial ultrasound images obtained in step two to form lesion segmentation labels;
step four: construct a neural network and train its parameters with an image training set to obtain a neural network capable of lesion localization and feature analysis on superficial ultrasound images;
step five: perform auxiliary diagnosis and analysis on superficial ultrasound images with the trained convolutional neural network.
The invention is further configured as follows: in step three, the ultrasonic feature groups of each segmentation label are described; during description, each ultrasonic feature is judged true or false, and if true the code corresponding to that feature is obtained; all obtained codes are arranged in order to form the feature label of the segmentation label.
The invention is further configured as follows: 90% of the standard superficial ultrasound images, together with their segmentation labels and feature labels, are used as training samples, and the remaining 10% as test samples.
The invention is further configured as follows: the neural network comprises a common backbone network, from which a detection subnetwork for lesion detection and an analysis subnetwork for lesion feature analysis branch off.
The invention is further configured as follows: the backbone network in step four is constructed by the following steps:
step 4.1.1: input a group of training samples of size 832 × 640, with 1, 2, 4, or 8 images per group;
step 4.1.2: divide the pixel value of each channel of the input image by 127.5 and subtract 1, so that each channel value of every pixel falls within the interval [-1, 1];
step 4.1.3: apply a convolution with kernel size 7 × 7, 16 channels, and stride 2 × 2 to the output of step 4.1.2 to extract features, obtaining a feature map with 16 channels and size 416 × 320;
step 4.1.4: perform a pooling operation with size 3 × 3 and stride 2 × 2 on the output of step 4.1.3; the output has size 208 × 160 and 16 channels;
step 4.1.5: perform feature extraction on the output of step 4.1.4 with a residual module Block with scale 1 × 1 and 64 output channels, repeated three times in succession;
step 4.1.6: perform a feature extraction operation on the output of step 4.1.5 with a residual module Block with scale 2 × 2 and 128 output channels, outputting a feature map of size 104 × 80 with 128 channels;
step 4.1.7: perform feature extraction on the output of step 4.1.6 with a residual module Block with scale 1 × 1 and 128 output channels, repeated three times in succession;
step 4.1.8: perform a feature extraction operation on the output of step 4.1.7 with a residual module Block with scale 2 × 2 and 256 output channels, outputting a feature map of size 52 × 40 with 256 channels;
step 4.1.9: perform feature extraction on the output of step 4.1.8 with a residual module Block with scale 1 × 1 and 256 output channels, repeated twenty-two times in succession, then apply a convolution with kernel size 3 × 3, 64 channels, and stride 1 × 1 to the result to extract features;
step 4.1.10: take the output of step 4.1.9, extract the feature map corresponding to each channel along the channel direction, and concatenate them along the width direction to form a feature map of size 3328 × 40 with 1 channel; then perform a patch-extraction operation with size 64 × 4 and stride 64 × 1 on this new feature map to form a patch set;
step 4.1.11: pass the patch set through a bidirectional long short-term memory (BiLSTM) model with 128 cells, then pass the result through a fully connected layer fc1 and a reshape to form a feature map of size 52 × 40 with 512 channels;
step 4.1.12: apply a convolution with kernel size 3 × 3, stride 2 × 2, and 512 channels to the output of step 4.1.11 to finally form a feature map of size 26 × 20 with 512 channels.
The invention is further configured as follows: the construction process of the detection subnetwork comprises the following steps:
step 4.2.1: perform an upsampling operation on the output of step 4.1.12 to form a feature map of size 52 × 40 with 256 channels, copy the output of step 4.1.8, and concatenate the copy after this feature map along the channel direction;
step 4.2.2: perform feature extraction on the output of step 4.2.1 with a residual module Block with scale 1 × 1 and 128 output channels, repeated twenty-three times in succession;
step 4.2.3: perform an upsampling operation on the output of step 4.2.2 to form a feature map of size 104 × 80 with 128 channels, copy the output of step 4.1.6, and concatenate the copy after it along the channel direction;
step 4.2.4: perform feature extraction on the output of step 4.2.3 with a residual module Block with scale 1 × 1 and 16 output channels, repeated four times in succession;
step 4.2.5: perform an upsampling operation on the output of step 4.2.4 to form a feature map of size 208 × 160 with 16 channels, copy the output of step 4.1.4, and concatenate the copy after it along the channel direction;
step 4.2.6: perform feature extraction on the output of step 4.2.5 with a residual module Block with scale 1 × 1 and 16 output channels, repeated three times in succession;
step 4.2.7: perform an upsampling operation on the output of step 4.2.6 to form a feature map of size 416 × 320 with 16 channels, copy the output of step 4.1.3, and concatenate the copy after it along the channel direction;
step 4.2.8: perform an upsampling operation on the output of step 4.2.7 to form a feature map of size 832 × 640 with 16 channels;
step 4.2.9: pass the feature map obtained in step 4.2.8 through a fully connected layer fc2 to form a feature map of size 832 × 640 with 2 channels, and apply a softmax operation to the output feature map along the channel direction to obtain the superficial ultrasound image lesion detection result.
The invention is further configured as follows: the construction process of the analysis subnetwork comprises the following steps:
step 4.3.1: take the circumscribed rectangle of the segmentation label corresponding to the image used to construct the backbone network in step 4.1.1, and transform the resulting rectangle; perform an upsampling operation on the feature map obtained in step 4.1.11, outputting a feature map of shape 104 × 80 with 256 channels; crop the upsampled feature map with the transformed circumscribed rectangle to obtain the result within the rectangle, and resize the cropped result to a feature map of size 26 × 13 with 256 channels;
step 4.3.2: pass the cropped feature map obtained in step 4.3.1 through a bidirectional long short-term memory (BiLSTM) model with 128 cells, then through a fully connected layer fc3 to form a feature map of size N × Length(Dict_F) with 1 channel;
step 4.3.3: split the feature map output in step 4.3.2 along the column direction into feature maps with column counts C1, C2, …, Cs;
step 4.3.4: perform a Softmax operation on each of the C1, C2, …, Cs feature maps along the column direction to obtain the superficial ultrasound feature analysis result.
The invention is further configured as follows: the transformation in step 4.3.1 divides all coordinate values representing the circumscribed rectangle by 8 to obtain a new circumscribed rectangle.
Compared with the prior art, the invention has the following beneficial effects:
Compared with traditional U-Net-based visual detection of ultrasound medical images, the method adds pixel-block-level spatial structure information, so that the spatial anatomical information of human tissue can be restored from the ultrasound grayscale image and lesions can be localized and detected more accurately.
Furthermore, the invention does more than localize and detect lesions: whereas traditional visual detection handles lesion identification and lesion feature analysis separately, the invention processes the two in parallel, reducing the intermediate information loss caused by separate processing, so that the ultrasonic features of a lesion can be diagnosed more accurately and the reasons why a target in the image is a lesion can be explained in terms of those ultrasonic features.
Drawings
FIG. 1 is a flow chart of a superficial ultrasound image lesion aided diagnosis method based on deep learning;
FIG. 2 is a schematic diagram of a neural network backbone network structure for superficial ultrasound image lesion detection and analysis;
FIG. 3 is a schematic diagram of a neural network structure for superficial ultrasound image lesion detection;
FIG. 4 is a schematic diagram of a neural network structure for lesion feature analysis of superficial ultrasound images;
FIG. 5 is a Block structure diagram in the neural network.
Detailed Description
Embodiments of the present invention are further described below with reference to FIGS. 1 to 5.
A deep learning-based superficial ultrasound image lesion auxiliary diagnosis method comprises the following steps:
step one: code all ultrasonic feature groups of the ultrasound image to form a feature coding dictionary Dict_F;
The features within an ultrasonic feature group are logically mutually exclusive; the feature groups comprise the "shape", "aspect ratio", "echo intensity", "boundary definition", "echo category", "posterior echo feature", "lateral shadowing", "intra-lesion calcification", "peri-lesion calcification", and "comet tail" groups.
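By way of illustration only, a minimal Python sketch of how such a dictionary might be built is given below; the concrete per-group feature values and codes are assumptions for illustration, since the disclosure fixes only the group names:

```python
# Sketch of building the feature coding dictionary Dict_F.
# The per-group values are illustrative assumptions; the disclosure
# names the groups but not their concrete values or codes.
FEATURE_GROUPS = {
    "shape": ["regular", "irregular"],
    "aspect_ratio": ["<1", ">=1"],
    "echo_intensity": ["hypoechoic", "isoechoic", "hyperechoic"],
    "boundary_definition": ["clear", "unclear"],
    # ... remaining groups ("echo category", "posterior echo feature",
    # "lateral shadowing", the calcification groups, "comet tail") ...
}

def build_dict_f(groups):
    """Assign one integer code per feature value, in group order."""
    dict_f, code = {}, 0
    for group, values in groups.items():
        for value in values:
            dict_f[(group, value)] = code
            code += 1
    return dict_f

dict_f = build_dict_f(FEATURE_GROUPS)
# Length(Dict_F) in the text corresponds to len(dict_f): the total
# number of feature values across all groups.
```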
Step two: acquiring an original superficial ultrasonic image, enhancing the original superficial ultrasonic image, and adjusting the original superficial ultrasonic image to a standard superficial ultrasonic image with the width and the height of 832 × 640;
step three: performing polygon segmentation on the focus part in the superficial ultrasonic image obtained in the step 2 to form a focus segmentation label;
step four, constructing a neural network, and training parameters of the neural network by using an image training set to obtain the neural network capable of performing focus location and characteristic analysis on the superficial ultrasonic image;
and fifthly, performing auxiliary diagnosis and analysis on the superficial ultrasonic image by using the convolutional neural network obtained by training.
In step three, the ultrasonic feature groups of each segmentation label are described; during description, each ultrasonic feature is judged true or false, and if true the code corresponding to that feature is obtained; all obtained codes are arranged in order to form the feature label of the segmentation label.
90% of the standard superficial ultrasound images, together with their segmentation labels and feature labels, are used as training samples, and the remaining 10% as test samples.
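A minimal sketch of this 90%/10% split, assuming the images and their segmentation and feature labels are kept together as triples (the fixed seed is an assumption made here for reproducibility):

```python
import random

def split_dataset(samples, train_ratio=0.9, seed=42):
    """Randomly split (image, seg_label, feat_label) triples into
    training and test sets."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)   # random partition
    n_train = int(len(samples) * train_ratio)
    return samples[:n_train], samples[n_train:]
```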
The neural network comprises a common backbone network from which a detection subnetwork for lesion detection and an analysis subnetwork for lesion characterization are separated.
The backbone network of the neural network adopts a residual network. During the auxiliary diagnosis stage, the output of the lesion detection subnetwork is used not only as the lesion detection result but also as input to the lesion feature analysis subnetwork.
As shown in FIG. 2, the backbone network in step four is constructed by the following steps:
step 4.1.1: input a group of training samples of size 832 × 640, with 1, 2, 4, or 8 images per group;
step 4.1.2: divide the pixel value of each channel of the input image by 127.5 and subtract 1, so that each channel value of every pixel falls within the interval [-1, 1];
step 4.1.3: apply a convolution with kernel size 7 × 7, 16 channels, and stride 2 × 2 to the output of step 4.1.2 to extract features, obtaining a feature map with 16 channels and size 416 × 320;
step 4.1.4: perform a pooling operation with size 3 × 3 and stride 2 × 2 on the output of step 4.1.3; the output has size 208 × 160 and 16 channels;
step 4.1.5: perform feature extraction on the output of step 4.1.4 with a residual module Block with scale 1 × 1 and 64 output channels, repeated three times in succession;
step 4.1.6: perform a feature extraction operation on the output of step 4.1.5 with a residual module Block with scale 2 × 2 and 128 output channels, outputting a feature map of size 104 × 80 with 128 channels;
step 4.1.7: perform feature extraction on the output of step 4.1.6 with a residual module Block with scale 1 × 1 and 128 output channels, repeated three times in succession;
step 4.1.8: perform a feature extraction operation on the output of step 4.1.7 with a residual module Block with scale 2 × 2 and 256 output channels, outputting a feature map of size 52 × 40 with 256 channels;
step 4.1.9: perform feature extraction on the output of step 4.1.8 with a residual module Block with scale 1 × 1 and 256 output channels, repeated twenty-two times in succession, then apply a convolution with kernel size 3 × 3, 64 channels, and stride 1 × 1 to the result to extract features;
step 4.1.10: take the output of step 4.1.9, extract the feature map corresponding to each channel along the channel direction, and concatenate them along the width direction to form a feature map of size 3328 × 40 with 1 channel; then perform a patch-extraction operation with size 64 × 4 and stride 64 × 1 on this new feature map to form a patch set (i.e., the feature map is divided into an ordered arrangement of patch blocks);
step 4.1.11: pass the patch set through a bidirectional long short-term memory (BiLSTM) model with 128 cells, then pass the result through a fully connected layer fc1 and a reshape to form a feature map of size 52 × 40 with 512 channels;
step 4.1.12: apply a convolution with kernel size 3 × 3, stride 2 × 2, and 512 channels to the output of step 4.1.11 to finally form a feature map of size 26 × 20 with 512 channels.
Pixel-block-level spatial structure information is added to the backbone network (steps 4.1.10-4.1.12), so it can express spatial anatomical structure information better than conventional methods.
The purpose of the patch extraction and of the bidirectional long short-term memory computation over the patches during backbone construction is to let the neural network better express the spatial anatomical structure information contained in the superficial ultrasound image, so that the invention can detect and analyze lesions in superficial ultrasound images more accurately.
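A minimal TensorFlow sketch of this pixel-block module (steps 4.1.10-4.1.12) is given below. It assumes an [N, height, width, channels] tensor layout with the step-4.1.9 output shaped [N, 40, 52, 64] (sizes in the text are written width × height), and it assumes the 64 × 4 patch size means 64 along the width and 4 along the height; both conventions are interpretations of the text rather than something it states:

```python
import tensorflow as tf

def backbone_spatial_module(x):
    """Sketch of steps 4.1.10-4.1.12 under the assumptions stated
    above. x: step-4.1.9 output, shape [N, 40, 52, 64]."""
    # Step 4.1.10: lay the 64 per-channel maps side by side along the
    # width, giving a single-channel feature map of size 3328 x 40.
    x = tf.transpose(x, [0, 1, 3, 2])              # [N, 40, 64, 52]
    x = tf.reshape(x, [-1, 40, 64 * 52, 1])        # [N, 40, 3328, 1]
    # Patch extraction, size 64 x 4 with stride 64 x 1 (width x height).
    patches = tf.image.extract_patches(
        x, sizes=[1, 4, 64, 1], strides=[1, 1, 64, 1],
        rates=[1, 1, 1, 1], padding="VALID")       # [N, 37, 52, 256]
    seq = tf.reshape(patches, [-1, 37 * 52, 256])  # ordered patch set
    # Step 4.1.11: BiLSTM with 128 cells, then fc1 and a reshape.
    h = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128))(seq)
    h = tf.keras.layers.Dense(52 * 40 * 512)(h)    # fc1, as described
    h = tf.reshape(h, [-1, 40, 52, 512])           # 52 x 40, 512 ch
    # Step 4.1.12: 3x3 conv, stride 2x2, 512 channels -> 26 x 20 x 512.
    return tf.keras.layers.Conv2D(512, 3, strides=2, padding="same")(h)
```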
As shown in FIG. 3, the construction process of the detection subnetwork comprises the following steps:
step 4.2.1: perform an upsampling operation on the output of step 4.1.12 to form a feature map of size 52 × 40 with 256 channels, copy the output of step 4.1.8, and concatenate the copy after this feature map along the channel direction;
step 4.2.2: perform feature extraction on the output of step 4.2.1 with a residual module Block with scale 1 × 1 and 128 output channels, repeated twenty-three times in succession;
step 4.2.3: perform an upsampling operation on the output of step 4.2.2 to form a feature map of size 104 × 80 with 128 channels, copy the output of step 4.1.6, and concatenate the copy after it along the channel direction;
step 4.2.4: perform feature extraction on the output of step 4.2.3 with a residual module Block with scale 1 × 1 and 16 output channels, repeated four times in succession;
step 4.2.5: perform an upsampling operation on the output of step 4.2.4 to form a feature map of size 208 × 160 with 16 channels, copy the output of step 4.1.4, and concatenate the copy after it along the channel direction;
step 4.2.6: perform feature extraction on the output of step 4.2.5 with a residual module Block with scale 1 × 1 and 16 output channels, repeated three times in succession;
step 4.2.7: perform an upsampling operation on the output of step 4.2.6 to form a feature map of size 416 × 320 with 16 channels, copy the output of step 4.1.3, and concatenate the copy after it along the channel direction;
step 4.2.8: perform an upsampling operation on the output of step 4.2.7 to form a feature map of size 832 × 640 with 16 channels;
step 4.2.9: pass the feature map obtained in step 4.2.8 through a fully connected layer fc2 to form a feature map of size 832 × 640 with 2 channels, and apply a softmax operation to the output feature map along the channel direction to obtain the superficial ultrasound image lesion detection result.
The detection subnetwork adopts multi-scale splicing (the concatenate operations at multiple scales in FIG. 3) so that the neural network can better detect superficial ultrasound lesions of different sizes.
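A minimal sketch of one such upsample-and-splice stage follows. Using a transposed convolution for the channel-reducing upsampling is an assumption; the text says only "upsampling":

```python
import tensorflow as tf

def upsample_and_concat(x, skip, channels):
    """Sketch of one detection-subnetwork stage (e.g. step 4.2.1):
    upsample x, then concatenate the copied backbone output after it
    along the channel axis."""
    up = tf.keras.layers.Conv2DTranspose(
        channels, kernel_size=2, strides=2, padding="same")(x)
    return tf.keras.layers.Concatenate(axis=-1)([up, skip])

# For example, in step 4.2.1 a 26 x 20 x 512 backbone output becomes
# 52 x 40 x 256, and the 52 x 40 x 256 map from step 4.1.8 is spliced
# on, giving a 52 x 40 x 512 input for step 4.2.2.
```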
As shown in FIG. 4, the construction process of the analysis subnetwork comprises the following steps:
step 4.3.1: take the circumscribed rectangle of the segmentation label corresponding to the image used to construct the backbone network in step 4.1.1, and transform the resulting rectangle; perform an upsampling operation on the feature map obtained in step 4.1.11, outputting a feature map of shape 104 × 80 with 256 channels; crop the upsampled feature map with the transformed circumscribed rectangle to obtain the result within the rectangle, and resize the cropped result to a feature map of size 26 × 13 with 256 channels;
step 4.3.2: pass the cropped feature map obtained in step 4.3.1 through a bidirectional long short-term memory (BiLSTM) model with 128 cells, then through a fully connected layer fc3 to form a feature map of size N × Length(Dict_F) with 1 channel, where N is the number of images in the batch of step 4.1.1 and Length(Dict_F) is the total number of ultrasound features;
step 4.3.3: split the feature map output in step 4.3.2 along the column direction into feature maps with column counts C1, C2, …, Cs, where Cs is the number of columns of the s-th group, whose actual values can be read from the dictionary Dict_F, and C1 + C2 + … + Cs = Length(Dict_F);
step 4.3.4: perform a Softmax operation on each of the C1, C2, …, Cs feature maps along the column direction to obtain the superficial ultrasound feature analysis result.
The method of transformation in step 4.3.1 is to divide all coordinate values representing the circumscribed rectangle by 8 to obtain a new circumscribed rectangle.
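A minimal sketch of this analysis head (steps 4.3.1-4.3.4) follows. It assumes the rectangle arrives already divided by 8 as integer (x0, y0, x1, y1) feature-map coordinates and that the crop is resized with bilinear interpolation; both details are assumptions not fixed by the text:

```python
import tensorflow as tf

def analysis_head(feat, rect, group_sizes):
    """Sketch of steps 4.3.1-4.3.4. feat: upsampled backbone feature
    map, shape [1, 80, 104, 256]; rect: transformed circumscribed
    rectangle (x0, y0, x1, y1); group_sizes: the column counts C1..Cs,
    with sum(group_sizes) == Length(Dict_F)."""
    x0, y0, x1, y1 = rect
    crop = feat[:, y0:y1, x0:x1, :]                # crop to rectangle
    crop = tf.image.resize(crop, [13, 26])         # 26 x 13 (w x h)
    seq = tf.reshape(crop, [-1, 13 * 26, 256])
    h = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128))(seq)
    logits = tf.keras.layers.Dense(sum(group_sizes))(h)   # fc3
    # Steps 4.3.3-4.3.4: split into the C1..Cs groups and softmax each,
    # so the mutually exclusive features within one group compete.
    groups = tf.split(logits, group_sizes, axis=-1)
    return [tf.nn.softmax(g, axis=-1) for g in groups]
```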
A residual module Block is used in constructing the backbone network so that increasing the network depth does not, at the least, reduce the image feature-extraction capability. Compared with a U-Net-based convolutional neural network, this way of constructing the backbone network is better suited to ultrasound images with complex features.
The structure of the residual module Block used in constructing the backbone network and the first subnetwork is shown in FIG. 5:
Its inputs comprise an image with M channels, a scale parameter s indicating by what factor the output is to be reduced, and the number of output channels N; if M equals N, the scale parameter is fixed to 1. First, a convolution with kernel size 1 × 1, stride s × s, and N/4 channels is applied to the input image; second, the convolution result is batch-normalized; third, a convolution with kernel size 3 × 3, stride 1 × 1, and N/4 channels is applied to the batch-normalized result; fourth, the convolution output is batch-normalized; fifth, a convolution with kernel size 1 × 1, stride 1 × 1, and N channels is applied to the batch-normalized result; finally, sixth, the convolution result of the fifth step is added to the input image of the first step to give the final output. This addition forms a shortcut branch: if the number of input channels M equals the number of output channels N, the input is added directly; otherwise, the input is first passed through a convolution with kernel size 1 × 1 and N channels and then added.
The neural network is implemented in the Python language and built with the TensorFlow framework. The model is trained by stochastic gradient descent, with a cross-entropy function as the optimization objective.
Training the neural network requires many iterations: after all samples in the training set have been iterated over, the training set is randomly shuffled again and iterative training continues until the accuracy measured on the test-set samples exceeds 95%.
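A minimal training-loop sketch in the same TensorFlow setting: stochastic gradient descent on a cross-entropy objective, with the training set reshuffled each epoch. The model, tensors, batch size, and learning rate are illustrative assumptions:

```python
import tensorflow as tf

def train(model, images, labels, epochs=100, batch_size=4, lr=1e-3):
    """Sketch: SGD + cross-entropy with per-epoch reshuffling."""
    opt = tf.keras.optimizers.SGD(learning_rate=lr)
    loss_fn = tf.keras.losses.CategoricalCrossentropy()
    n = images.shape[0]
    for epoch in range(epochs):
        ds = (tf.data.Dataset.from_tensor_slices((images, labels))
                .shuffle(n)             # reshuffle the training set
                .batch(batch_size))
        for x, y in ds:
            with tf.GradientTape() as tape:
                loss = loss_fn(y, model(x, training=True))
            grads = tape.gradient(loss, model.trainable_variables)
            opt.apply_gradients(zip(grads, model.trainable_variables))
        # Per the text, training stops once test-set accuracy exceeds
        # 95%; the test evaluation is omitted from this sketch.
```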
In step five, the trained neural network model is saved in TensorFlow's ".pb" format. In the superficial ultrasound image auxiliary diagnosis and analysis stage, the trained model is loaded from the C++ programming language using the libtensorflow library.
The method of performing auxiliary diagnosis and analysis on superficial ultrasound images with the trained convolutional neural network is:
step 5.1: the lesion detection subnetwork outputs the probability that each image region is a lesion; regions whose probability exceeds 97% are taken to obtain the lesion segmentation detection result;
step 5.2: take the circumscribed rectangle of the obtained lesion segmentation detection result, transform the rectangle coordinates (divide each coordinate by 8), and input it to the lesion analysis subnetwork to obtain the lesion feature analysis result.
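A minimal NumPy sketch of this post-processing, assuming the detection subnetwork's output has been reduced to a single-channel lesion-probability map:

```python
import numpy as np

def lesion_postprocess(prob_map, threshold=0.97):
    """Sketch of steps 5.1-5.2: threshold the lesion-probability map,
    take the circumscribed rectangle of the detected region, and
    divide its coordinates by 8 for the analysis subnetwork. Returns
    the mask and the transformed rectangle (None if nothing exceeds
    the threshold)."""
    mask = prob_map > threshold             # lesion segmentation result
    ys, xs = np.where(mask)
    if ys.size == 0:
        return mask, None
    x0, x1 = xs.min(), xs.max() + 1         # circumscribed rectangle
    y0, y1 = ys.min(), ys.max() + 1
    rect = tuple(v // 8 for v in (x0, y0, x1, y1))
    return mask, rect
```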
The above is only a preferred embodiment of the present invention and is not intended to limit it; those skilled in the art may make ordinary changes and substitutions within the technical scope of the present invention.

Claims (10)

1. A deep learning-based superficial ultrasound image lesion auxiliary diagnosis method, characterized by comprising the following steps:
step one: code all ultrasonic feature groups of the ultrasound image to form a feature coding dictionary Dict_F;
step two: acquire an original superficial ultrasound image, enhance it, and adjust it to a standard superficial ultrasound image with width and height of 832 × 640;
step three: perform polygon segmentation on the lesion regions in the superficial ultrasound images obtained in step two to form lesion segmentation labels;
step four: construct a neural network and train its parameters with an image training set to obtain a neural network capable of lesion localization and feature analysis on superficial ultrasound images;
step five: perform auxiliary diagnosis and analysis on superficial ultrasound images with the trained convolutional neural network.
2. The deep learning-based superficial ultrasound image lesion auxiliary diagnosis method of claim 1, characterized in that: in step three, the ultrasonic feature groups of each segmentation label are described; during description, each ultrasonic feature is judged true or false, and if true the code corresponding to that feature is obtained; all obtained codes are arranged in order to form the feature label of the segmentation label.
3. The deep learning-based superficial ultrasound image lesion auxiliary diagnosis method of claim 2, characterized in that: 90% of the standard superficial ultrasound images, together with their segmentation labels and feature labels, are used as training samples, and the remaining 10% as test samples.
4. The deep learning-based superficial ultrasound image lesion auxiliary diagnosis method of claim 2, characterized in that: the neural network comprises a common backbone network, from which a detection subnetwork for lesion detection and an analysis subnetwork for lesion feature analysis branch off.
5. The deep learning-based superficial ultrasonic image lesion auxiliary diagnosis method of claim 4, wherein:
the backbone network in step four is constructed by the following steps:
step 4.1.1: input a group of training samples of size 832 × 640, with 1, 2, 4, or 8 images per group;
step 4.1.2: divide the pixel value of each channel of the input image by 127.5 and subtract 1, so that each channel value of every pixel falls within the interval [-1, 1];
step 4.1.3: apply a convolution with kernel size 7 × 7, 16 channels, and stride 2 × 2 to the output of step 4.1.2 to extract features, obtaining a feature map with 16 channels and size 416 × 320;
step 4.1.4: perform a pooling operation with size 3 × 3 and stride 2 × 2 on the output of step 4.1.3; the output has size 208 × 160 and 16 channels;
step 4.1.5: perform feature extraction on the output of step 4.1.4 with a residual module Block with scale 1 × 1 and 64 output channels, repeated three times in succession;
step 4.1.6: perform a feature extraction operation on the output of step 4.1.5 with a residual module Block with scale 2 × 2 and 128 output channels, outputting a feature map of size 104 × 80 with 128 channels;
step 4.1.7: perform feature extraction on the output of step 4.1.6 with a residual module Block with scale 1 × 1 and 128 output channels, repeated three times in succession;
step 4.1.8: perform a feature extraction operation on the output of step 4.1.7 with a residual module Block with scale 2 × 2 and 256 output channels, outputting a feature map of size 52 × 40 with 256 channels;
step 4.1.9: perform feature extraction on the output of step 4.1.8 with a residual module Block with scale 1 × 1 and 256 output channels, repeated twenty-two times in succession, then apply a convolution with kernel size 3 × 3, 64 channels, and stride 1 × 1 to the result to extract features;
step 4.1.10: take the output of step 4.1.9, extract the feature map corresponding to each channel along the channel direction, and concatenate them along the width direction to form a feature map of size 3328 × 40 with 1 channel; then perform a patch-extraction operation with size 64 × 4 and stride 64 × 1 on this new feature map to form a patch set;
step 4.1.11: pass the patch set through a bidirectional long short-term memory (BiLSTM) model with 128 cells, then pass the result through a fully connected layer fc1 and a reshape to form a feature map of size 52 × 40 with 512 channels;
step 4.1.12: apply a convolution with kernel size 3 × 3, stride 2 × 2, and 512 channels to the output of step 4.1.11 to finally form a feature map of size 26 × 20 with 512 channels.
6. The deep learning-based superficial ultrasonic image lesion auxiliary diagnosis method of claim 4, wherein:
the construction process of the detection subnetwork comprises the following steps:
step 4.2.1: perform an upsampling operation on the output of step 4.1.12 to form a feature map of size 52 × 40 with 256 channels, copy the output of step 4.1.8, and concatenate the copy after this feature map along the channel direction;
step 4.2.2: perform feature extraction on the output of step 4.2.1 with a residual module Block with scale 1 × 1 and 128 output channels, repeated twenty-three times in succession;
step 4.2.3: perform an upsampling operation on the output of step 4.2.2 to form a feature map of size 104 × 80 with 128 channels, copy the output of step 4.1.6, and concatenate the copy after it along the channel direction;
step 4.2.4: perform feature extraction on the output of step 4.2.3 with a residual module Block with scale 1 × 1 and 16 output channels, repeated four times in succession;
step 4.2.5: perform an upsampling operation on the output of step 4.2.4 to form a feature map of size 208 × 160 with 16 channels, copy the output of step 4.1.4, and concatenate the copy after it along the channel direction;
step 4.2.6: perform feature extraction on the output of step 4.2.5 with a residual module Block with scale 1 × 1 and 16 output channels, repeated three times in succession;
step 4.2.7: perform an upsampling operation on the output of step 4.2.6 to form a feature map of size 416 × 320 with 16 channels, copy the output of step 4.1.3, and concatenate the copy after it along the channel direction;
step 4.2.8: perform an upsampling operation on the output of step 4.2.7 to form a feature map of size 832 × 640 with 16 channels;
step 4.2.9: pass the feature map obtained in step 4.2.8 through a fully connected layer fc2 to form a feature map of size 832 × 640 with 2 channels, and apply a softmax operation to the output feature map along the channel direction to obtain the superficial ultrasound image lesion detection result.
7. The deep learning-based superficial ultrasonic image lesion auxiliary diagnosis method of claim 4, wherein:
the construction process of the analysis subnetwork comprises the following steps:
step 4.3.1: take the circumscribed rectangle of the segmentation label corresponding to the image used to construct the backbone network in step 4.1.1, and transform the resulting rectangle; perform an upsampling operation on the feature map obtained in step 4.1.11, outputting a feature map of shape 104 × 80 with 256 channels; crop the upsampled feature map with the transformed circumscribed rectangle to obtain the result within the rectangle, and resize the cropped result to a feature map of size 26 × 13 with 256 channels;
step 4.3.2: pass the cropped feature map obtained in step 4.3.1 through a bidirectional long short-term memory (BiLSTM) model with 128 cells, then through a fully connected layer fc3 to form a feature map of size N × Length(Dict_F) with 1 channel;
step 4.3.3: split the feature map output in step 4.3.2 along the column direction into feature maps with column counts C1, C2, …, Cs;
step 4.3.4: perform a Softmax operation on each of the C1, C2, …, Cs feature maps along the column direction to obtain the superficial ultrasound feature analysis result.
8. The deep learning-based superficial ultrasonic image lesion auxiliary diagnosis method of claim 7, wherein: the method of transformation in step 4.3.1 is to divide all coordinate values representing the circumscribed rectangle by 8 to obtain a new circumscribed rectangle.
9. The deep learning-based superficial ultrasound image lesion auxiliary diagnosis method of claim 6, characterized in that: the detection subnetwork outputs the probability that each image region is a lesion, and regions whose probability exceeds 97% are taken to obtain the lesion segmentation detection result.
10. The deep learning-based superficial ultrasound image lesion auxiliary diagnosis method of claim 9, characterized in that: the circumscribed rectangle of the obtained lesion segmentation detection result is taken, the rectangle coordinates are transformed, and the result is input to the lesion analysis subnetwork to obtain the lesion feature analysis result.
CN202010582572.0A 2020-06-23 2020-06-23 Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning Pending CN112349407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010582572.0A CN112349407A (en) 2020-06-23 2020-06-23 Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010582572.0A CN112349407A (en) 2020-06-23 2020-06-23 Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning

Publications (1)

Publication Number Publication Date
CN112349407A (en) 2021-02-09

Family

ID=74357565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010582572.0A Pending CN112349407A (en) 2020-06-23 2020-06-23 Shallow ultrasonic image focus auxiliary diagnosis method based on deep learning

Country Status (1)

Country Link
CN (1) CN112349407A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257135A (en) * 2018-02-01 2018-07-06 浙江德尚韵兴图像科技有限公司 The assistant diagnosis system of medical image features is understood based on deep learning method
WO2018236195A1 (en) * 2017-06-23 2018-12-27 울산대학교 산학협력단 Method for processing ultrasonic image
CN110136828A (en) * 2019-05-16 2019-08-16 杭州健培科技有限公司 A method of medical image multitask auxiliary diagnosis is realized based on deep learning
CN110335231A (en) * 2019-04-01 2019-10-15 浙江工业大学 A kind of ultrasonic image chronic kidney disease auxiliary screening method of fusion textural characteristics and depth characteristic
CN111243730A (en) * 2020-01-17 2020-06-05 视隼智能科技(上海)有限公司 Mammary gland focus intelligent analysis method and system based on mammary gland ultrasonic image


Similar Documents

Publication Publication Date Title
CN106056595B (en) Based on the pernicious assistant diagnosis system of depth convolutional neural networks automatic identification Benign Thyroid Nodules
CN108764286B (en) Classification and identification method of feature points in blood vessel image based on transfer learning
CN107369160B (en) Choroid neogenesis blood vessel segmentation algorithm in OCT image
CN107909585B (en) Intravascular intima segmentation method of intravascular ultrasonic image
CN112529894B (en) Thyroid nodule diagnosis method based on deep learning network
CN107644420B (en) Blood vessel image segmentation method based on centerline extraction and nuclear magnetic resonance imaging system
CN108257135A (en) The assistant diagnosis system of medical image features is understood based on deep learning method
CN111784671A (en) Pathological image focus region detection method based on multi-scale deep learning
CN111553892B (en) Lung nodule segmentation calculation method, device and system based on deep learning
CN112381164B (en) Ultrasound image classification method and device based on multi-branch attention mechanism
CN111340130A (en) Urinary calculus detection and classification method based on deep learning and imaging omics
CN112001928B (en) Retina blood vessel segmentation method and system
CN109215040B (en) Breast tumor segmentation method based on multi-scale weighted learning
CN112508953B (en) Meningioma rapid segmentation qualitative method based on deep neural network
JP7427080B2 (en) Weakly supervised multitask learning for cell detection and segmentation
CN114693933A (en) Medical image segmentation device based on generation of confrontation network and multi-scale feature fusion
CN112348059A (en) Deep learning-based method and system for classifying multiple dyeing pathological images
CN110717518A (en) Persistent lung nodule identification method and device based on 3D convolutional neural network
CN113012163A (en) Retina blood vessel segmentation method, equipment and storage medium based on multi-scale attention network
WO2021183765A1 (en) Automated detection of tumors based on image processing
CN111047608A (en) Distance-AttU-Net-based end-to-end mammary ultrasound image segmentation method
CN112750137A (en) Liver tumor segmentation method and system based on deep learning
Aslam et al. Liver-tumor detection using CNN ResUNet
CN116758336A (en) Medical image intelligent analysis system based on artificial intelligence
de Araújo et al. Automated detection of segmental glomerulosclerosis in kidney histopathology

Legal Events

Date Code Title Description
PB01 Publication
DD01 Delivery of document by public notice
Addressee: Shanghai Shiyi Intelligent Technology Co., Ltd., the person in charge of patents
Document name: Notice of conformity
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20210209