CN113781439B - Ultrasound video lesion segmentation method and device - Google Patents

Ultrasound video lesion segmentation method and device

Info

Publication number
CN113781439B
CN113781439B (application No. CN202111065625.2A)
Authority
CN
China
Prior art keywords
image
feature map
focus
lesion
pyramid
Prior art date
Legal status
Active
Application number
CN202111065625.2A
Other languages
Chinese (zh)
Other versions
CN113781439A (en)
Inventor
马璐
王东
王立威
张文涛
王子腾
张佳琦
丁佳
胡阳
吕晨翀
Current Assignee
Guangxi Yizhun Intelligent Technology Co ltd
Zhejiang Yizhun Intelligent Technology Co ltd
Original Assignee
Guangxi Yizhun Intelligent Technology Co ltd
Beijing Yizhun Medical AI Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangxi Yizhun Intelligent Technology Co ltd, Beijing Yizhun Medical AI Co Ltd filed Critical Guangxi Yizhun Intelligent Technology Co ltd
Priority to CN202111065625.2A priority Critical patent/CN113781439B/en
Publication of CN113781439A publication Critical patent/CN113781439A/en
Application granted granted Critical
Publication of CN113781439B publication Critical patent/CN113781439B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/0012 Biomedical image inspection
    • G06F 18/24 Classification techniques
    • G06N 3/045 Combinations of networks
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08 Learning methods
    • G06T 7/11 Region-based segmentation
    • G06T 7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/10016 Video; image sequence
    • G06T 2207/10132 Ultrasound image
    • G06T 2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; pyramid transform
    • G06T 2207/30068 Mammography; breast
    • G06T 2207/30096 Tumor; lesion

Abstract

The application provides an ultrasound video lesion segmentation method and device. The method comprises: acquiring an image of a lesion from an ultrasound image; extracting features of the lesion image using a dilated (atrous) convolution network to obtain a feature map of the lesion image; obtaining context information of the feature map of the lesion image with a pyramid pooling module to obtain pyramid feature maps; and segmenting the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image.

Description

Ultrasound video lesion segmentation method and device
This application is a divisional application of application No. 202011333447.2, filed on November 25, 2020, entitled "Dynamic breast ultrasound video full-focus real-time detection and segmentation device and system based on artificial intelligence and image processing method".
Technical Field
This application relates to the field of medical imaging, and in particular to an ultrasound video lesion segmentation method and device.
Background
Breast cancer is a malignant tumor of the breast. Data published by the National Cancer Center show that breast cancer ranks first among malignant tumors in Chinese women and seriously threatens women's health. In 2010, Professor Hillman of the University of Virginia wrote in N Engl J Med that accurate early diagnosis can raise the 5-year survival rate of breast cancer patients from 25% to 99%.
Breast ultrasound is non-invasive, fast and highly repeatable, and can clearly display changes in the shape, internal structure and adjacent tissues of each soft-tissue layer of the breast and of tumors within it. Because it involves no radiation, it is suitable for breast examination of women of any age, especially pregnant and lactating women. It can serve as a supplementary examination for regions that X-rays reach with difficulty (such as the edge of the breast) and can better display the position, shape and structure of a tumor. For dense breasts, where a lump is hard to distinguish, ultrasound can clearly display the outline and form of a lesion through differences in sound-wave reflection at tissue interfaces.
However, ultrasound examination in China faces two major problems. First, sonographers are difficult to train: a physician must be trained before ultrasound images can be interpreted correctly, the learning period is long, the learning difficulty is high, and different operators interpret images differently. Second, China has a serious shortage of sonographers: according to statistical yearbooks, there is a gap of at least 100,000 registered ultrasound physicians. The imbalance between the high demand for breast ultrasound diagnosis and the actual supply has become one of the main problems to be solved urgently in clinical practice.
The comprehensive digitization of medical images and the development of computer technology offer hope of solving this problem at the technical level. Computer-aided detection/diagnosis (CAD) systems were developed first. CAD is an AI technique that extracts hand-crafted features from medical images, marks suspicious lesion locations, and judges the benignity or malignancy of lesion areas by comprehensively applying computer, mathematical, statistical, and image processing and analysis methods. Its principle is easy to understand, and results are computed from the input features, which can effectively improve training efficiency and accuracy while reducing computational complexity. However, traditional CAD has a single function and insufficient performance: the false positive rate of lesion detection is too high, performance quickly hits a bottleneck, and its clinical value has not been fully established.
In recent years, with the emergence and maturation of deep learning algorithms, the application of AI in medical imaging has gradually moved to a higher level, making it possible to break through the accuracy bottleneck of traditional CAD systems. Unlike traditional CAD, deep learning can perform image processing without relying on manually extracted features. Researchers have noted that features extracted by deep neural networks are sometimes more effective than human-designed features. This is also evidenced by the successful construction of many ultrasound CAD models with excellent diagnostic capability. For example, Liu, Shi et al. applied supervised deep learning to breast ultrasound images, applying an S-DPN network to two small breast ultrasound data sets and achieving a maximum classification accuracy of 92.4% after adding post-processing methods such as an SVM; Han S et al. used the deep convolutional network GoogLeNet CNN to classify 7,408 ultrasound images of 5,151 patients end to end, reaching a classification accuracy of 90% and exceeding human doctors. However, most existing work focuses on nodules in two-dimensional images, and in real clinical scenarios this has three shortcomings. First, auxiliary detection based on two-dimensional images offers very limited clinical help: doctors usually have to capture images manually and transmit them to a server for detection, while the image changes continuously during clinical ultrasound scanning; this mode interrupts the doctor's diagnostic process, increases the operational burden, and requires a new capture every time the image changes, so it cannot be applied clinically at all. Second, most studies only address auxiliary detection of nodules, leaving all other lesion types entirely to the doctor, so they cannot effectively improve doctors' confidence and efficiency. Third, lesion information in a single two-dimensional image is insufficient: in certain sections of an ultrasound image, fat or blood vessels often look the same as a lesion, and a comprehensive judgment must combine preceding and following frames, so accuracy based on two-dimensional images has a natural bottleneck and usually suffers from high false positives.
Why is there currently so little exploration of breast ultrasound video? First, video data are lacking. An ordinary ultrasound examination keeps only single two-dimensional images and does not store video, so breast ultrasound video data are hard to obtain; even when obtained, annotation is extremely difficult. AI learning depends on a large amount of high-quality labeled data: at 30 frames per second and roughly ten minutes per examination, each patient yields 10 × 60 × 30 = 18,000 images to label, and the labeling must be done by highly experienced senior sonographers whose workload is already very heavy, so large-scale annotation is extremely hard to complete. Without a large amount of high-quality data, video-based AI is impossible. Second, technically, moving from a model for two-dimensional images to a model for video is a leap. A two-dimensional model only needs to consider accuracy, so it can be made as complex as necessary, extracting spatial features in as many dimensions as possible to reach higher accuracy, generally at the cost of more computation time. A video-based model must consider not only accuracy but also real-time performance, so accuracy cannot be bought with a more complex model; it must also incorporate time-dimension information, which places extremely high demands on the model. At present there are no mature related models or algorithms to reference, and the model must be designed from scratch.
The present invention has been made in view of the above.
Disclosure of Invention
Most existing breast ultrasound detection and segmentation research addresses only tumors and is based on two-dimensional images, while clinical diagnosis generally requires a comprehensive judgment combining information from preceding and following frames; artificial intelligence based on two-dimensional images therefore has poor clinical usability, e.g. high false positives and no real-time detection. To effectively solve the poor clinical usability and seriously insufficient clinical help of the prior art, the present invention provides a device, system and detection method for AI-based detection and segmentation of all lesions in dynamic breast ultrasound video, so as to address clinical missed diagnoses caused by visual fatigue and insufficient visual sensitivity and to improve doctors' diagnostic efficiency.
To achieve the above object, a first aspect of the present invention provides an ultrasound video lesion segmentation method, comprising: acquiring an image of a lesion from an ultrasound image; extracting features of the lesion image using a dilated convolution network to obtain a feature map of the lesion image; obtaining context information of the feature map of the lesion image with a pyramid pooling module to obtain pyramid feature maps; and segmenting the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image.
In some embodiments, segmenting the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image comprises:
upsampling each pyramid feature map so that the upsampled feature map has the same size as the feature map of the lesion image;
concatenating the upsampled feature maps with the feature map of the lesion image to obtain a global feature map;
and processing the global feature map with a convolutional layer to obtain the lesion segmented from the ultrasound image.
In some embodiments, after segmenting the lesion image from the ultrasound image, the method further comprises: determining the major and minor diameters of the segmented lesion by a morphological method (a sketch follows).
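As an illustration only, the morphological measurement could be implemented as in the following sketch, which fits an ellipse to the lesion mask with OpenCV; the function name and the ellipse-fit choice are assumptions, since the patent does not specify the exact morphological operation.

```python
import cv2
import numpy as np

def lesion_axes(mask: np.ndarray) -> tuple:
    """Estimate major/minor diameters (in pixels) of a binary lesion mask.

    Hypothetical helper: fits an ellipse to the largest external contour
    and returns its axis lengths as (major, minor).
    """
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    # cv2.fitEllipse needs at least 5 contour points
    (_, _), (d1, d2), _ = cv2.fitEllipse(largest)
    return max(d1, d2), min(d1, d2)
```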
In some embodiments, the feature map of the lesion image is one eighth the size of the lesion image.
In some embodiments, the dimension of a feature in a pyramid feature map is one quarter of the dimension of the feature map of the lesion image.
A second aspect of the present invention provides an ultrasound video lesion segmentation apparatus, comprising:
a lesion segmentation module, configured to acquire an image of a lesion from an ultrasound image;
extract features of the lesion image using a dilated convolution network to obtain a feature map of the lesion image;
obtain context information of the feature map of the lesion image with a pyramid pooling module to obtain pyramid feature maps;
and segment the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image.
In some embodiments, the lesion segmentation module is configured to upsample each pyramid feature map so that the upsampled feature map has the same size as the feature map of the lesion image;
concatenate the upsampled feature maps with the feature map of the lesion image to obtain a global feature map;
and process the global feature map with a convolutional layer to obtain the lesion segmented from the ultrasound image.
In some embodiments, the lesion segmentation module is configured to determine the major and minor diameters of the segmented lesion by a morphological method.
In some embodiments, the feature map of the lesion image is one eighth the size of the lesion image.
In some embodiments, the dimension of a feature in a pyramid feature map is one quarter of the dimension of the feature map of the lesion image.
A third aspect of the present invention provides an electronic device comprising a processor and a memory, the memory storing one or more readable instructions which, when executed by the processor, implement the above method.
The fourth aspect of the present invention also provides a computer readable medium storing a computer program which, when executed by a processor, implements the above method.
The invention has the following outstanding technical effects:
1. The invention provides real-time auxiliary detection and segmentation results without changing the ultrasound machine or the doctor's existing diagnostic process;
2. the invention handles, in one stop, all lesion types relevant to breast ultrasound images, including nodules, hypoechoic areas, areas of structural disorder, lymph nodes, duct abnormalities (duct dilatation and intraductal foreign bodies), calcification, and so on;
3. the intelligent detection and segmentation system based on dynamic breast ultrasound video automatically detects lesions in real time while the patient is being scanned and automatically segments the detected lesions, reaching a computation rate of 50 inferences per second while maintaining high accuracy; this saves the time a doctor spends measuring lesions on the ultrasound machine, improves efficiency, and fully meets real-time requirements;
4. by adopting a Faster RCNN network that balances speed and precision, good accuracy is obtained while achieving real-time operation;
5. the overfitting problem caused by small medical data sets is alleviated by data augmentation;
6. after thoroughly observing and studying how doctors detect lesions, an LSTM module is introduced to extract time-dimension information, effectively using the information of preceding and following frames and greatly reducing detection false positives;
7. an attention mechanism is introduced to improve the detection rate and reduce false positives;
8. by controlling the distribution of the data set and preprocessing the breast ultrasound images, the method adapts to uneven ultrasound image quality caused by different machine models and parameter settings, with good robustness and stable performance.
In general, the invention achieves automatic real-time detection of all lesions in dynamic breast ultrasound video without changing the ultrasound machine or the existing diagnostic process, and intelligently segments and measures the detected lesions, improving efficiency and accuracy and effectively helping doctors reduce missed diagnoses.
Drawings
FIG. 1 illustrates the system for artificial-intelligence-based real-time detection and segmentation of all lesions in dynamic breast ultrasound video according to the present invention;
FIG. 2 illustrates the network structure of Faster RCNN;
FIG. 3 illustrates feature extraction using a Recursive Feature Pyramid (RFP) network;
FIG. 4 illustrates learning feature offsets using a Deformable Convolutional Network;
FIG. 5 illustrates extraction of time-dimension information using an LSTM network;
FIG. 6 illustrates bounding-box regression and lesion classification obtained using an attention mechanism;
FIG. 7 shows a flow chart of lesion segmentation within the bounding box of a lesion;
FIG. 8 shows the result of effective-area segmentation of an ultrasound image;
FIG. 9 shows the result of data normalization of an ultrasound image;
FIG. 10 shows the effect of flipping the ultrasound image left-right and up-down;
FIG. 11 is the FROC curve of the detection according to the present invention;
FIGS. 12-17 show the effect of detection and segmentation on ultrasound images.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
Definition of
Unless defined otherwise below, all technical and scientific terms used herein are intended to have the same meaning as commonly understood by one of ordinary skill in the art. Reference to the techniques used herein is intended to refer to those techniques commonly understood in the art, including those variations of or alternatives to those techniques that would be apparent to those skilled in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present invention.
As used herein, the terms "comprising," "including," "having," "containing," or "involving," and other variations thereof herein, are inclusive (or open-ended) and do not exclude additional unrecited elements or method steps.
Where an indefinite or definite article such as "a", "an" or "the" is used when referring to a singular noun, this includes a plural of that noun unless the context requires otherwise.
The terms "about" and "substantially" in the present invention denote an interval of accuracy that can be understood by a person skilled in the art, which still guarantees the technical effect of the feature in question. The term generally means ± 10%, preferably ± 5% of the indicated value.
Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The embodiment of the invention provides an artificial-intelligence-based system for real-time detection and segmentation of all lesions in dynamic breast ultrasound video. As shown in Fig. 1, the system comprises at least an ultrasound machine, an ultrasound machine display, an AI server and an AI display. The AI-based dynamic breast ultrasound video detection and segmentation device is deployed in the AI server. The ultrasound machine provides a video output port; the machine's video output line is connected to the AI server through this port, so the AI server can receive the ultrasound video signal in real time, analyze it in real time, and finally display the analysis results to the doctor in real time on the AI display.
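For illustration, the real-time path from the video output port to the AI display could look like the following sketch; the capture-device index and `analyze_frame` are hypothetical stand-ins for the signal source and the detection/segmentation pipeline described below.

```python
import cv2
import numpy as np

def analyze_frame(frame: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the AI server's detection/segmentation call."""
    return frame  # a real system would overlay detected boxes and masks here

def run_realtime_analysis(capture_device: int = 0) -> None:
    """Grab frames from the ultrasound machine's video output and analyze
    each one in real time (a sketch; the device index is an assumption)."""
    cap = cv2.VideoCapture(capture_device)
    try:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            result = analyze_frame(frame)
            cv2.imshow("AI display", result)      # real-time display to the doctor
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```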
The AI-based dynamic breast ultrasound video detection and segmentation device at least comprises: (1) the system comprises a system robustness design module, (2) a data preprocessing module, (3) a data amplification module, (4) a focus detection module and (5) a focus segmentation module.
(1) System robustness design module
Deep learning is based on big data. In the past, more attention was paid to model architecture design than to the data itself, but more and more research shows that the quality and quantity of data are crucial to the final performance of a model: a reasonably distributed, high-quality data set can greatly improve performance. To improve the accuracy of the method, improve its robustness to ultrasound images of different quality from different machine models and parameter settings, and improve the efficiency and accuracy of subsequent detection and segmentation, the data set is designed specifically as follows:
1) Data from different machine models and different parameter settings are collected at a ratio of 1:1.
2) Normal and abnormal data (including nodules, hypoechoic areas, areas of structural disorder, lymph nodes, duct abnormalities (duct dilatation and intraductal foreign bodies) and calcification) are collected at a ratio of 1:1.
3) The lesion types within the abnormal data (nodules, hypoechoic areas, areas of structural disorder, lymph nodes, duct abnormalities (duct dilatation and intraductal foreign bodies) and calcification) are collected at a ratio of 1:1.
The design objectives of this module are: first, to let the system automatically adapt to ultrasound images of different machine models, parameter settings and quality, while improving the efficiency and accuracy of subsequent detection and segmentation; second, to cover all lesion types relevant to the breast ultrasound images targeted by the product.
(2) Data preprocessing module
To improve computational efficiency, reduce computation time, accelerate model convergence, save training time and at the same time improve model precision, the invention designs a data preprocessing module, which mainly comprises:
2.1 Effective-area segmentation module
Besides the truly meaningful ultrasound image, an acquired ultrasound video frame also contains many parts that are meaningless for diagnosing the lesion; these parts increase the amount of computation and reduce efficiency. The invention therefore designs an effective-area segmentation module, as follows:
1) Set the effective-area range for each machine model;
2) read the video and crop each frame according to the corresponding effective-area range for use in subsequent processing and training (a minimal code sketch follows Fig. 8).
Fig. 8 is a diagram showing the result of effective region segmentation of an ultrasound image.
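A minimal sketch of the effective-area cropping, assuming a hypothetical per-model lookup table of crop windows (the coordinate values below are illustrative only):

```python
import numpy as np

# Hypothetical per-model effective-area table: (y0, y1, x0, x1) in pixels.
EFFECTIVE_AREA = {
    "model_A": (60, 540, 180, 820),
    "model_B": (80, 560, 200, 840),
}

def crop_effective_area(frame: np.ndarray, machine_model: str) -> np.ndarray:
    """Crop one video frame to the pre-configured effective scan area,
    discarding screen regions that are meaningless for diagnosis."""
    y0, y1, x0, x1 = EFFECTIVE_AREA[machine_model]
    return frame[y0:y1, x0:x1]
```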
2.2 Data normalization module
The invention adopts Max/Min normalization to map data to a specified range, reducing differences caused by different parameter settings, simplifying computation, accelerating model convergence and improving model precision. The steps are:
1) read the video and normalize each frame;
2) traverse the whole single-frame image and find the maximum and minimum gray values;
3) compute the normalized value of each pixel as x' = (x - min) / (max - min).
Fig. 9 shows the result of data normalization of an ultrasound image.
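A minimal sketch of the per-frame Max/Min normalization described above (the function name and the [0, 1] target range are assumptions):

```python
import numpy as np

def min_max_normalize(frame: np.ndarray) -> np.ndarray:
    """Map a grayscale frame to [0, 1]: x' = (x - min) / (max - min)."""
    g_min, g_max = float(frame.min()), float(frame.max())
    if g_max == g_min:          # constant frame: avoid division by zero
        return np.zeros_like(frame, dtype=np.float32)
    return (frame.astype(np.float32) - g_min) / (g_max - g_min)
```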
(3) Data amplification module
Due to the particularity of medical data, acquisition and labeling are difficult and extremely costly, so medical image data sets can hardly reach the millions or hundreds of thousands of images of natural-image data sets; even tens of thousands are very hard to obtain. In this situation, for deep learning based on big data, enlarging the training set by data augmentation can greatly alleviate the model overfitting caused by insufficient data. The specific method is:
1) Read the videos in sequence.
2) For the current video, randomly decide whether to flip it.
3) If the current video is to be flipped, randomly choose the flipping method (left-right flipping or up-down flipping); a minimal sketch follows Fig. 10.
Fig. 10 shows the effect of flipping the ultrasound image left-right and up-down.
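A minimal sketch of the video-level flipping augmentation; applying one flip to every frame of a video (rather than per frame) preserves temporal consistency, and the 0.5 probabilities are assumptions:

```python
import random
import numpy as np

def augment_video(frames: np.ndarray) -> np.ndarray:
    """Randomly flip a whole video of shape (T, H, W) left-right or up-down."""
    if random.random() < 0.5:            # step 2): flip this video at all?
        if random.random() < 0.5:        # step 3): choose the flipping method
            frames = frames[:, :, ::-1]  # left-right flip
        else:
            frames = frames[:, ::-1, :]  # up-down flip
    return np.ascontiguousarray(frames)
```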
(4) Lesion detection module
This stage mainly uses the patient's ultrasound video images to train the detection and segmentation models, and mainly comprises the following parts:
i. Platform
The detection method is based on the Detectron2 platform. Detectron2 is a target detection platform released by FAIR in 2019, implemented on PyTorch with the maskrcnn-benchmark project as its starting point. With a completely new modular design, Detectron2 is flexible and easily extensible; it provides fast training on single or multiple GPU servers and currently covers a large number of the most representative target detection, image segmentation and keypoint detection algorithms in the industry.
ii. Framework
To improve computation speed while achieving ideal precision, the invention adopts the two-stage detection framework Faster RCNN (shown in Fig. 2), which balances speed and precision well.
iii. Model improvement
Because ultrasound video differs both from common static medical images such as CT and MR and from ordinary natural video, currently published frameworks cannot achieve real-time, high-precision detection on it. The invention therefore makes several innovative changes to the published framework, so that the final model achieves real-time, high-precision detection on breast ultrasound video. The specific steps are as follows:
1) Extracting features using a Recursive Feature Pyramid (RFP) network. Several consecutive ultrasound images are input separately into the RFP network for feature extraction, generating feature maps. First, this increases the model's robustness to scale and improves its precision; second, on top of an FPN, the RFP adds extra feedback connections from the FPN layers back into the bottom-up backbone, increasing how often the network attends to the images and improving the detection rate. The specific operations are (a minimal sketch of the top-down merge follows this list):
a. for a single frame, first perform bottom-up feature convolution on the input image: the left side of the RFP part of Fig. 3 applies 3 x 3 convolution kernels to obtain feature maps;
b. the top-down network upsamples the higher-level features by a factor of 2 level by level, applies a 1 x 1 convolution to the bottom-up features of the same scale to reduce dimensionality, and adds the reduced bottom-up features element-wise to the top-down features of the same scale to obtain a new feature map;
c. the extra top-down feedback connections are added into the bottom-up network (the dashed part in Fig. 3).
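A minimal sketch of the top-down merge in step b, assuming the common 256 FPN output channels; the recursive feedback of step c, which routes the resulting maps back into the backbone, is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownMerge(nn.Module):
    """One FPN top-down merge: 2x-upsample the higher-level map, reduce the
    same-scale bottom-up map with a 1x1 convolution, add element-wise."""
    def __init__(self, lateral_channels: int, out_channels: int = 256):
        super().__init__()
        self.lateral = nn.Conv2d(lateral_channels, out_channels, kernel_size=1)

    def forward(self, top: torch.Tensor, bottom_up: torch.Tensor) -> torch.Tensor:
        up = F.interpolate(top, scale_factor=2, mode="nearest")  # step b: 2x upsample
        return up + self.lateral(bottom_up)                      # element-wise add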
2) Learning feature offsets using a Deformable Convolutional Network (DCN)
By learning offsets, the DCN breaks the limitation that a traditional convolution kernel can only sample a rectangular grid, and improves the CNN's ability to model the spatial structure of irregular objects, thereby improving detection precision. The specific procedure is as follows (see Fig. 4; a minimal sketch follows):
a. learn an offset for each position of each feature map: with a 3 x 3 kernel and offsets in both x and y directions, the offset map has 2 x 3 x 3 = 18 channels, i.e. an 18-channel offset map is obtained by convolution over the original feature map;
b. apply deformable convolution to the original feature map with these offsets to obtain a new feature map, according to the formula

y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)

where p_0 is the center point; R is the 3 x 3 convolution kernel sampling grid {(-1, -1), (-1, 0), ..., (0, 1), (1, 1)}; p_n ranges over the 9 kernel positions in R; and \Delta p_n is the learned offset.
After this stage, each feature map yields a new feature map.
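A minimal sketch of steps a and b using torchvision's deform_conv2d; the channel count and weight initialization are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableBlock(nn.Module):
    """3x3 deformable convolution: a plain conv predicts an 18-channel
    offset map (2 coordinates x 9 kernel positions), which deforms the
    sampling grid of the main convolution."""
    def __init__(self, channels: int):
        super().__init__()
        # step a: 2 * 3 * 3 = 18 offset channels, one (dy, dx) pair per position
        self.offset_conv = nn.Conv2d(channels, 18, kernel_size=3, padding=1)
        self.weight = nn.Parameter(torch.randn(channels, channels, 3, 3) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_conv(x)                             # step a
        return deform_conv2d(x, offsets, self.weight, padding=1)  # step b
```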
3) Extracting time dimension information using an LSTM network
After observing and studying a large amount of doctors' diagnostic logic, we found that judging whether something is a lesion from a single ultrasound image causes many false positives; doctors generally observe the preceding and following frames and integrate their information before deciding. An LSTM is therefore added to extract inter-frame information along the time dimension. Running the LSTM directly on the original images would be too slow for real-time operation, so the invention applies the LSTM network to the extracted feature maps, as follows (see Fig. 5):
The consecutive feature maps obtained in the previous stage are fed into the LSTM network to obtain new feature maps (a minimal sketch follows).
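A minimal sketch of running an LSTM over a sequence of feature maps; treating each spatial position as an independent sequence is one possible design, since the patent does not specify the exact LSTM variant (e.g. a ConvLSTM would be another choice).

```python
import torch
import torch.nn as nn

class TemporalFeatureLSTM(nn.Module):
    """Run an LSTM along the time axis of feature maps shaped (T, C, H, W),
    treating every spatial position as its own length-T sequence."""
    def __init__(self, channels: int):
        super().__init__()
        self.lstm = nn.LSTM(input_size=channels, hidden_size=channels,
                            batch_first=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        t, c, h, w = feats.shape
        seq = feats.permute(2, 3, 0, 1).reshape(h * w, t, c)  # (H*W, T, C)
        out, _ = self.lstm(seq)                               # temporal context
        return out.reshape(h, w, t, c).permute(2, 3, 0, 1)    # back to (T, C, H, W)
```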
4) Adding an attention mechanism to improve detection precision
Next, using the feature maps of the previous stage, the RPN of the Faster RCNN network generates proposals; ROI Pooling then produces proposal feature maps of uniform size; the proposal feature maps of several consecutive images are accumulated with weights to produce the final feature map, from which the lesion classification and bounding-box regression are obtained (see Fig. 6; a minimal sketch follows).
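A minimal sketch of the weighted accumulation of per-frame proposal feature maps; using softmax-normalized cosine similarity against the current frame as the attention weight is an assumption, since the patent does not spell out how the weights are computed.

```python
import torch
import torch.nn.functional as F

def fuse_proposal_features(roi_feats: torch.Tensor) -> torch.Tensor:
    """Fuse ROI-pooled feature maps from T consecutive frames, shaped
    (T, C, H, W), into one map by attention-weighted accumulation."""
    current = roi_feats[-1]                                # reference (current) frame
    sims = F.cosine_similarity(roi_feats.flatten(1),
                               current.flatten().unsqueeze(0), dim=1)  # (T,)
    weights = torch.softmax(sims, dim=0)                   # attention weights
    return (weights.view(-1, 1, 1, 1) * roi_feats).sum(0)  # weighted sum -> (C, H, W)
```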
Fig. 11 is the FROC curve of the detection according to the invention. The horizontal axis is the false positive rate, FPR = FP / (FP + TN); the vertical axis is the sensitivity, recall = TP / (TP + FN). (A true positive (TP) is a positive sample predicted as positive; a true negative (TN) is a negative sample predicted as negative; a false positive (FP) is a negative sample predicted as positive; a false negative (FN) is a positive sample predicted as negative.)
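For concreteness, the two axis quantities stated above are (a trivial sketch of the formulas):

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN), the horizontal axis of the FROC plot."""
    return fp / (fp + tn)

def recall(tp: int, fn: int) -> float:
    """Sensitivity (recall) = TP / (TP + FN), the vertical axis."""
    return tp / (tp + fn)
```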
(5) Lesion segmentation module
Lesion segmentation is performed using the lesion bounding boxes generated in the fourth stage, as follows (see Fig. 7; a sketch of the pyramid pooling step follows this list):
1) Crop the lesion from the original image according to the size of its bounding box;
2) extract a feature map from the cropped image using a dilated-convolution (atrous) ResNet, obtaining a feature map 1/8 the size of the original image;
3) obtain context information of the feature map with a pyramid pooling module of depth 4, whose pooling kernels cover the whole image, halves of the image, and smaller parts of it respectively; reduce the feature dimension to 1/4 of the original with 1 x 1 convolution layers; upsample the pyramid features directly to the same size as the input features and concatenate them with the input features, yielding the final global feature map;
4) generate the final segmentation map through one convolutional layer;
5) obtain the major and minor diameters of the lesion by a morphological method.
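A minimal sketch of the depth-4 pyramid pooling and feature fusion in step 3, assuming PSPNet-style bin sizes (1, 2, 3, 6), which is one common choice for "whole image, halves, and smaller parts":

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Depth-4 pyramid pooling: pool to several grid sizes, reduce each
    branch to 1/4 of the channels with a 1x1 conv, upsample back to the
    input size and concatenate with the input feature map."""
    def __init__(self, channels: int, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(channels, channels // 4, kernel_size=1))
            for b in bins)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        pyramids = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                  align_corners=False)
                    for stage in self.stages]
        return torch.cat([x] + pyramids, dim=1)   # global feature map
```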
Figs. 12-17 show the detection and segmentation results of the fourth and fifth stages. The invention achieves very good detection and segmentation results whether the lesion is large or small, benign or malignant.

Claims (12)

1. An ultrasound video lesion segmentation method, characterized in that the method comprises:
acquiring an image of a lesion from an ultrasound image;
extracting features of the lesion image using a dilated convolution network to obtain a feature map of the lesion image;
obtaining context information of the feature map of the lesion image with a pyramid pooling module to obtain pyramid feature maps;
and segmenting the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image.
2. The method of claim 1, wherein segmenting the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image comprises:
upsampling each pyramid feature map so that the upsampled feature map has the same size as the feature map of the lesion image;
concatenating the upsampled feature maps with the feature map of the lesion image to obtain a global feature map;
and processing the global feature map with a convolutional layer to obtain the lesion segmented from the ultrasound image.
3. The method of claim 1 or 2, wherein after segmenting the lesion image from the ultrasound image, the method further comprises:
determining the major and minor diameters of the segmented lesion by a morphological method.
4. The method of claim 1, wherein the size of the feature map of the lesion image is one eighth of the size of the lesion image.
5. The method of claim 1, wherein the dimension of a feature in a pyramid feature map is one quarter of the dimension of the feature map of the lesion image.
6. An ultrasound video lesion segmentation apparatus, characterized in that the apparatus comprises:
a lesion segmentation module, configured to acquire an image of a lesion from an ultrasound image;
extract features of the lesion image using a dilated convolution network to obtain a feature map of the lesion image;
obtain context information of the feature map of the lesion image with a pyramid pooling module to obtain pyramid feature maps;
and segment the lesion image from the ultrasound image based on the pyramid feature maps and the feature map of the lesion image.
7. The apparatus of claim 6, wherein the lesion segmentation module is configured to upsample each pyramid feature map so that the upsampled feature map has the same size as the feature map of the lesion image;
concatenate the upsampled feature maps with the feature map of the lesion image to obtain a global feature map;
and process the global feature map with a convolutional layer to obtain the lesion segmented from the ultrasound image.
8. The apparatus of claim 6 or 7, wherein the lesion segmentation module is configured to determine the major and minor diameters of the segmented lesion by a morphological method.
9. The apparatus of claim 6, wherein the feature map of the lesion image is one eighth the size of the lesion image.
10. The apparatus of claim 6, wherein the dimension of a feature in a pyramid feature map is one quarter of the dimension of the feature map of the lesion image.
11. An electronic device comprising a processor and a memory, the memory having stored thereon one or more readable instructions that, when executed by the processor, implement the method of any of claims 1-5.
12. A computer-readable medium, in which a computer program is stored which, when being executed by a processor, carries out the method of any one of claims 1-5.
CN202111065625.2A 2020-11-25 2020-11-25 Ultrasonic video focus segmentation method and device Active CN113781439B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111065625.2A CN113781439B (en) 2020-11-25 2020-11-25 Ultrasonic video focus segmentation method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011333447.2A CN112446862B (en) 2020-11-25 2020-11-25 Dynamic breast ultrasound video full-focus real-time detection and segmentation device and system based on artificial intelligence and image processing method
CN202111065625.2A CN113781439B (en) 2020-11-25 2020-11-25 Ultrasonic video focus segmentation method and device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202011333447.2A Division CN112446862B (en) 2020-11-25 2020-11-25 Dynamic breast ultrasound video full-focus real-time detection and segmentation device and system based on artificial intelligence and image processing method

Publications (2)

Publication Number Publication Date
CN113781439A (en) 2021-12-10
CN113781439B (en) 2022-07-29

Family

ID=74738761

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202111065625.2A Active CN113781439B (en) 2020-11-25 2020-11-25 Ultrasonic video focus segmentation method and device
CN202011333447.2A Active CN112446862B (en) 2020-11-25 2020-11-25 Dynamic breast ultrasound video full-focus real-time detection and segmentation device and system based on artificial intelligence and image processing method
CN202111065766.4A Active CN113781440B (en) 2020-11-25 2020-11-25 Ultrasonic video focus detection method and device

Family Applications After (2)

Application Number Title Priority Date Filing Date
CN202011333447.2A Active CN112446862B (en) 2020-11-25 2020-11-25 Dynamic breast ultrasound video full-focus real-time detection and segmentation device and system based on artificial intelligence and image processing method
CN202111065766.4A Active CN113781440B (en) 2020-11-25 2020-11-25 Ultrasonic video focus detection method and device

Country Status (1)

Country Link
CN (3) CN113781439B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239951B (en) * 2021-03-26 2024-01-30 无锡祥生医疗科技股份有限公司 Classification method, device and storage medium for ultrasonic breast lesions
CN113344855A (en) * 2021-05-10 2021-09-03 深圳瀚维智能医疗科技有限公司 Method, device, equipment and medium for reducing false positive rate of breast ultrasonic lesion detection
CN113344028A (en) * 2021-05-10 2021-09-03 深圳瀚维智能医疗科技有限公司 Breast ultrasound sequence image classification method and device
CN113902670B (en) * 2021-08-31 2022-07-29 北京医准智能科技有限公司 Ultrasonic video segmentation method and device based on weak supervised learning
CN114091507B (en) * 2021-09-02 2022-07-29 北京医准智能科技有限公司 Ultrasonic focus region detection method, device, electronic equipment and storage medium
CN114155193B (en) * 2021-10-27 2022-07-26 北京医准智能科技有限公司 Blood vessel segmentation method and device based on feature enhancement
CN116309585B (en) * 2023-05-22 2023-08-22 山东大学 Method and system for identifying breast ultrasound image target area based on multitask learning

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7274810B2 (en) * 2000-04-11 2007-09-25 Cornell Research Foundation, Inc. System and method for three-dimensional image rendering and analysis
JP6964234B2 (en) * 2016-11-09 2021-11-10 パナソニックIpマネジメント株式会社 Information processing methods, information processing devices and programs
CN106846306A (en) * 2017-01-13 2017-06-13 重庆邮电大学 A kind of ultrasonoscopy automatic describing method and system
US10140710B2 (en) * 2017-03-09 2018-11-27 Kevin Augustus Kreeger Automatic key frame detection
CN107451615A (en) * 2017-08-01 2017-12-08 广东工业大学 Thyroid papillary carcinoma Ultrasound Image Recognition Method and system based on Faster RCNN
US10223610B1 (en) * 2017-10-15 2019-03-05 International Business Machines Corporation System and method for detection and classification of findings in images
CN108364006B (en) * 2018-01-17 2022-03-08 超凡影像科技股份有限公司 Medical image classification device based on multi-mode deep learning and construction method thereof
CN108399419B (en) * 2018-01-25 2021-02-19 华南理工大学 Method for recognizing Chinese text in natural scene image based on two-dimensional recursive network
CN108665456B (en) * 2018-05-15 2022-01-28 广州尚医网信息技术有限公司 Method and system for real-time marking of breast ultrasound lesion region based on artificial intelligence
CN109191442B (en) * 2018-08-28 2021-04-13 深圳大学 Ultrasonic image evaluation and screening method and device
CN109830303A (en) * 2019-02-01 2019-05-31 上海众恒信息产业股份有限公司 Clinical data mining analysis and aid decision-making method based on internet integration medical platform
CN110047068A (en) * 2019-04-19 2019-07-23 山东大学 MRI brain tumor dividing method and system based on pyramid scene analysis network
CN110288597B (en) * 2019-07-01 2021-04-02 哈尔滨工业大学 Attention mechanism-based wireless capsule endoscope video saliency detection method
CN110490863B (en) * 2019-08-22 2022-02-08 北京红云智胜科技有限公司 System for detecting whether coronary angiography has complete occlusion lesion or not based on deep learning
CN110674845B (en) * 2019-08-28 2022-05-31 电子科技大学 Dish identification method combining multi-receptive-field attention and characteristic recalibration
CN110674866B (en) * 2019-09-23 2021-05-07 兰州理工大学 Method for detecting X-ray breast lesion images by using transfer learning characteristic pyramid network
CN110705457B (en) * 2019-09-29 2024-01-19 核工业北京地质研究院 Remote sensing image building change detection method
CN111145170B (en) * 2019-12-31 2022-04-22 电子科技大学 Medical image segmentation method based on deep learning
CN111210443B (en) * 2020-01-03 2022-09-13 吉林大学 Deformable convolution mixing task cascading semantic segmentation method based on embedding balance
CN111227864B (en) * 2020-01-12 2023-06-09 刘涛 Device for detecting focus by using ultrasonic image and computer vision
CN111462049B (en) * 2020-03-09 2022-05-17 西南交通大学 Automatic lesion area form labeling method in mammary gland ultrasonic radiography video
CN111539930B (en) * 2020-04-21 2022-06-21 浙江德尚韵兴医疗科技有限公司 Dynamic ultrasonic breast nodule real-time segmentation and identification method based on deep learning
CN111695592A (en) * 2020-04-27 2020-09-22 平安科技(深圳)有限公司 Image identification method and device based on deformable convolution and computer equipment
CN111667459B (en) * 2020-04-30 2023-08-29 杭州深睿博联科技有限公司 Medical sign detection method, system, terminal and storage medium based on 3D variable convolution and time sequence feature fusion
CN111784701A (en) * 2020-06-10 2020-10-16 深圳市人民医院 Ultrasonic image segmentation method and system combining boundary feature enhancement and multi-scale information
CN111915573A (en) * 2020-07-14 2020-11-10 武汉楚精灵医疗科技有限公司 Digestive endoscopy focus tracking method based on time sequence feature learning
AU2020101581A4 (en) * 2020-07-31 2020-09-17 Ampavathi, Anusha MS Lymph node metastases detection from ct images using deep learning
CN111709950B (en) * 2020-08-20 2020-11-06 成都金盘电子科大多媒体技术有限公司 Mammary gland molybdenum target AI auxiliary screening method
CN112132833B (en) * 2020-08-25 2024-03-26 沈阳工业大学 Dermatological image focus segmentation method based on deep convolutional neural network
CN112489060B (en) * 2020-12-07 2022-05-10 北京医准智能科技有限公司 System and method for pneumonia focus segmentation

Also Published As

Publication number Publication date
CN112446862A (en) 2021-03-05
CN113781440A (en) 2021-12-10
CN113781439A (en) 2021-12-10
CN113781440B (en) 2022-07-29
CN112446862B (en) 2021-08-10

Similar Documents

Publication Publication Date Title
CN113781439B (en) Ultrasonic video focus segmentation method and device
US11101033B2 (en) Medical image aided diagnosis method and system combining image recognition and report editing
US10614573B2 (en) Method for automatically recognizing liver tumor types in ultrasound images
Li et al. Dilated-inception net: multi-scale feature aggregation for cardiac right ventricle segmentation
CN111227864B (en) Device for detecting focus by using ultrasonic image and computer vision
CN108257135A (en) The assistant diagnosis system of medical image features is understood based on deep learning method
CN111214255B (en) Medical ultrasonic image computer-aided method
CN105913432B (en) Aorta extracting method and device based on CT sequence images
TWI750583B (en) Medical image dividing method, device, and system, and image dividing method
CN112086197B (en) Breast nodule detection method and system based on ultrasonic medicine
CN111429474A (en) Mammary gland DCE-MRI image focus segmentation model establishment and segmentation method based on mixed convolution
CN111429457B (en) Intelligent evaluation method, device, equipment and medium for brightness of local area of image
CN110648333B (en) Real-time segmentation system of mammary gland ultrasonic video image based on middle-intelligence theory
Tang Heart image digital model building and feature extraction analysis based on deep learning
Mi et al. Detecting carotid intima-media from small-sample ultrasound images
Mamatha Detection of Brain Tumor in MR images using hybrid Fuzzy C-mean clustering with graph cut segmentation technique
CN116309662B (en) Image processing method, device, equipment and storage medium
CN117036302B (en) Method and system for determining calcification degree of aortic valve
Simangunsong et al. Pattern Recognition in Medical Images Through Innovative Edge Detection with Robert's Method
Ahmed et al. An Automatic Cardiac Computed Tomography (Ct) Images Sequence segmentation Technique
Xie Design and Development of Medical Image Processing Experiment System Based on IDL Language.
Yasrab et al. Automating the Human Action of First-Trimester Biometry Measurement from Real-World Freehand Ultrasound
Filist Two-Dimensional Walsh Spectral Transform in Problems of Automated Analysis of Ultrasound Images
Pradeeba et al. Lung Region Segmentation using Image Data Analysis
Wu et al. B-ultrasound guided venipuncture vascular recognition system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: Room 3011, 2nd Floor, Building A, No. 1092 Jiangnan Road, Nanmingshan Street, Liandu District, Lishui City, Zhejiang Province, 323000

Patentee after: Zhejiang Yizhun Intelligent Technology Co.,Ltd.

Patentee after: Guangxi Yizhun Intelligent Technology Co.,Ltd.

Address before: No. 1202-1203, 12 / F, block a, Zhizhen building, No. 7, Zhichun Road, Haidian District, Beijing 100083

Patentee before: Beijing Yizhun Intelligent Technology Co.,Ltd.

Patentee before: Guangxi Yizhun Intelligent Technology Co.,Ltd.