CN117274603A - Liver focus image description method based on semantic segmentation network


Info

Publication number
CN117274603A
CN117274603A
Authority
CN
China
Prior art keywords
focus
image
liver
edge
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311316607.6A
Other languages
Chinese (zh)
Inventor
陆小浩
宋珂
王雪静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Feiyinuo Technology Co., Ltd.
Original Assignee
Feiyinuo Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Feiyinuo Technology Co., Ltd.
Priority to CN202311316607.6A
Publication of CN117274603A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/34 Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V 10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V 10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V 10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 ICT specially adapted for the handling or processing of medical images
    • G16H 30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/03 Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Epidemiology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a liver focus image description method based on a semantic segmentation network. The method builds a Unet semantic segmentation network on the lightweight network GhostNet to construct a liver focus segmentation model; acquires a historical liver focus ultrasound image data set and trains the segmentation model on it to obtain an optimal segmentation model; and describes the features of the focus image based on the segmentation result. By constructing a focus segmentation model to segment the liver focus and describing it from the segmentation result, the invention improves the accuracy and reliability of liver disease diagnosis and reduces misjudgment and subjective bias. By combining the advantages of deep learning and traditional image processing, the strengths of both are fully utilized, improving the performance and stability of the algorithm. The method realizes automatic processing and analysis of ultrasound images, provides accurate disease localization and type judgment in the focus description, supports doctors in choosing better treatment plans, and improves both the treatment outcome and the patient's experience of care.

Description

Liver focus image description method based on semantic segmentation network
Technical Field
The invention relates to the technical field of image segmentation, in particular to a liver focus image description method based on a semantic segmentation network.
Background
Early diagnosis of liver disease is important for the treatment and rehabilitation of patients. The traditional liver image diagnosis method mainly depends on experience and manual analysis of doctors, and has the problems of subjectivity, long time consumption and easy influence by the level of operators. In recent years, with the development of deep learning, intelligent recognition algorithms based on images have made remarkable progress in the field of medical imaging.
Liver lesion segmentation and description are important medical image analysis techniques based on ultrasound images and play a key role in liver disease diagnosis. Ultrasound imaging is a non-invasive, non-radiative, low-cost imaging technique that has become one of the common means of examining liver disease.
Ultrasound imaging generates real-time images of the liver using the propagation and reflection of high-frequency sound waves in human tissue. It provides direct observation of the internal structure of the liver and of abnormal lesions, with good safety and repeatability. However, the characteristics of liver ultrasound images also pose challenges for lesion segmentation and description. Ultrasound images often suffer from low contrast, noise interference and blurring, which makes it difficult to distinguish liver lesions from the surrounding tissue. In addition, different types of liver lesions appear differently in an ultrasound image, with large variations in tumor shape, boundary ambiguity and internal echo characteristics, which further increases the difficulty of accurate segmentation and description. Clinicians therefore need a high level of judgment in analyzing the lesion area and must stay highly focused in order to measure and describe the lesion accurately and quickly. By assisting this work with deep learning, image processing and related techniques, and automatically filling in information such as focus size and description, doctors can devote their time and energy to more valuable diagnosis and treatment, while the patient's experience of the examination is also improved.
Disclosure of Invention
The invention aims to provide a liver focus image description method based on a semantic segmentation network, which solves the prior-art problems of inaccurate focus analysis and long diagnosis time caused by subjective factors, which in turn lead to a poor patient experience.
In order to achieve one of the above objects, an embodiment of the present invention provides a method for describing a liver focus image based on a semantic segmentation network, the method comprising:
building a Unet semantic segmentation network based on a lightweight network GhostNet, and building a liver focus segmentation model;
acquiring a historical liver focus ultrasonic image data set, inputting the liver focus segmentation model for training, and obtaining an optimal segmentation model;
based on the segmentation result, features of the lesion image are described.
As a further improvement of an embodiment of the present invention, the method further includes: the Unet semantic segmentation network consists of a symmetric downsampling-path encoder and an upsampling-path decoder;
constructing convolution units in an encoder and a decoder in the Unet semantic segmentation network based on a Ghost module of the lightweight network GhostNet;
the Ghost module is used to extract more image features, including,
defining the input information $X \in \mathbb{R}^{c \times h \times w}$, where $\mathbb{R}$ is the real number domain, $c$ is the number of input channels, and $h$ and $w$ are the height and width respectively;
passing the input information through a convolution kernel $f'$ with $m$ filters of kernel size $k \times k$ to obtain $m$ output feature maps $Y'$, expressed as:
$$Y' = X * f'$$
obtaining $n$ output feature maps $Y$ from the $m$ feature maps $Y'$ by linear transformations, where $m \times s = n$ and $n \geq m$, and $Y$ is the output feature of the Ghost module;
the formula of the linear transformation is expressed as:
$$y_{ij} = \Phi_{i,j}(y'_i), \quad i = 1,\ldots,m,\ j = 1,\ldots,s$$
where $y'_i$ is the $i$-th feature map in $Y'$ and $\Phi_{i,j}(y'_i)$ denotes the $j$-th linear transformation applied to the $i$-th feature map to generate the feature map $y_{ij}$.
As a further improvement of an embodiment of the present invention, the method further includes: constructing a Ghost bottleneck unit based on the Ghost module;
the Ghost bottleneck unit is formed by connecting a first Ghost module, an attention module and a second Ghost module in series, with a skip link from the input to the output;
the first Ghost module is used for expanding the number of channels, the attention module is used for strengthening the feature extraction capacity of the Ghost module, and the second Ghost module is used for reducing the number of channels;
applying batch normalization and a Hard-swish activation function after the first Ghost module, and applying batch normalization after the second Ghost module;
the activation function is expressed as:
$$\text{Hard-swish}(x) = x \cdot \frac{\min(\max(x + 3,\ 0),\ 6)}{6}$$
as a further improvement of an embodiment of the present invention, the method further includes: respectively inputting focus images into the liver focus segmentation model in different scales, extracting feature images by encoders in corresponding scales, and sequentially passing through feature image decoders in corresponding scale encoders to obtain a final output result;
the dimensions of the encoder and the decoder include 1, 1/2,1/4,1/8 and 1/16;
the encoder consists of 2 Ghost bott leneck units, wherein the total number of the encoder is 5, and each time the feature map is downsampled by 2 times;
the decoder consists of 2 Ghost bott leneck units, 5 stages are total, and each time the feature map is sampled 2 times by one stage.
As a further improvement of an embodiment of the present invention, the method further includes: adding a cavity space convolution pooling layer ASPP into an input layer of the liver focus segmentation model;
the feature map encoded at 1/16 scale passes through the hole space convolution pooling layer ASPP before entering the decoder, and the feature maps at the other scales are concatenated along the channel dimension;
the hole space convolution pooling layer ASPP comprises 4 parallel hole convolutions and 1 global average pooling;
the 4 parallel hole convolutions include a 1×1 convolution, a 3×3 convolution with dilation rate 3, a 3×3 convolution with dilation rate 6, and a 3×3 convolution with dilation rate 9;
the global average pooling branch is followed by a 1×1 convolution and upsampling;
the feature maps obtained from the four parallel hole convolutions and the upsampling are concatenated, and the final output is obtained through a 1×1 convolution;
the final output includes receptive fields of different levels.
As a further improvement of an embodiment of the present invention, the method further includes: training the liver lesion segmentation model, including,
the method comprises the steps of using a Dice-Loss function and a cross entropy Loss function for a result with an output scale of 1, and using a cross entropy Loss function for output results with other scales;
the Dice-Loss function is expressed as:
wherein, X represents the pixel label of the real segmented image, Y represents the pixel class of the model predictive segmented image, x|and Y| respectively represent the number of elements in X, Y, x|n Y| represents the number of intersection elements between X and Y;
the cross entropy loss function is expressed as:
wherein M represents the number of categories, y ic Representing a sign function, taking 1 when the true class of sample i is equal to c, otherwise taking 0, p ic Representing the prediction probability that the observation sample i belongs to the category c;
the scale of the output image corresponding to the weight of the loss function is 1, 0.8, 0.6, 0.4 and 0.2 respectively;
and weighting and fusing the Loss functions of different weights to obtain a total Loss function, and obtaining an optimal segmentation model by reducing the total Loss function.
As a further improvement of an embodiment of the present invention, the method further includes: based on the segmentation result, the size of the lesion is described, including,
obtaining focus edge contour points from the segmentation result and calculating the straight-line equation of the focus main direction, where the two intersections of this line with the focus contour are the endpoints of the focus long axis;
setting a short-axis search range on the segment between the long-axis endpoints to obtain N candidate point coordinates, computing for each candidate point the straight line perpendicular to the long axis, and intersecting each line with the focus contour to obtain N groups of candidate short-axis endpoints; calculating the pixel length of each candidate short axis, the minimum of which is taken as the focus short axis;
calculating the angles of the long axis and the short axis with the horizontal, the axis closer to horizontal being taken as the transverse axis and the other as the vertical axis; converting the pixel lengths with the imported physical length of a single pixel of the focus image gives the actual transverse and vertical sizes of the focus.
As a further improvement of an embodiment of the present invention, the method further includes: the definition of the boundary of a lesion is described based on the lesion contour, including,
performing outward expansion operation on the focus outline to obtain a focus region of interest, performing graying operation, and processing to obtain a target region of interest image and a corresponding focus region mask image;
respectively carrying out corrosion and expansion operation on the mask image to obtain an area inside the edge of the mask image, an area outside the edge of the mask image and an edge area of the mask image;
obtaining an edge image of the image of interest by using an edge algorithm, and obtaining an edge information map within a certain range of a focus contour peripheral area by combining the edge area of the mask image;
outputting the focus edge contour points on an all-zero map and counting the non-zero pixels as the total number of focus edge points; counting the non-zero values of the edge information map and dividing by this total to obtain the edge ratio;
combining the image of interest with the area inside the mask edge to compute the mean gray value in a one-ring range inside the focus edge contour; combining the image of interest with the area outside the mask edge to compute the mean gray value in a one-ring range outside the focus edge contour; taking the absolute difference of the two means as the edge difference;
and judging whether the boundary of the focus is clear or not according to the edge ratio and the edge difference.
As a further improvement of an embodiment of the present invention, the method further includes: the shape of the lesion is described based on the lesion profile, including,
calculating convex hull characteristics of the focus contour to obtain a convex hull of the contour and concave points of the contour coordinates farthest from the corresponding convex hull in each concave;
screening the concave points, calculating the average distance and the maximum distance between the concave points and the convex hull, and judging whether the focus shape is regular.
As a further improvement of an embodiment of the present invention, the method further includes: based on the segmentation result, the echo type and the echo texture of the focus are described, including,
the echo types include strong echo, high echo, low echo and anechoic;
calculating the ratio of the mean gray value inside the focus contour to the mean gray value of the focus bounding rectangle with the focus region removed, and judging the echo type based on this ratio;
judging the echo texture based on the gray-scale characteristics inside the focus;
calculating the gray mean and variance of the focus image, determining a gray threshold of the focus image, binarizing the focus image with this threshold to obtain the relatively bright region inside the focus, and obtaining the proportion of the bright region in the focus image by counting its pixel number against the number of pixels inside the focus;
the entropy of the focus image is expressed as:
$$H = -\sum_{i=0}^{255} p_i \log_2 p_i$$
where $i$ is a pixel gray value, $0 \leq i \leq 255$, and $p_i$ is the probability that gray value $i$ appears in the whole image;
judging whether the echo texture is uniform by combining the proportion of the bright region in the focus image with the variance and entropy of the focus image.
Compared with the prior art, the liver focus image description method based on a semantic segmentation network provided by the invention segments the liver focus with the constructed focus segmentation model and describes it from the segmentation result, improving the accuracy and reliability of liver disease diagnosis and reducing misjudgment and subjective bias; by combining the advantages of deep learning and traditional image processing, the strengths of both are fully utilized, improving the performance and stability of the algorithm; automatic processing and analysis of the ultrasound image are realized, reducing the burden on doctors and improving working efficiency; and the focus description provides accurate disease localization and type judgment, supports doctors in choosing better treatment plans, and improves both the treatment effect and the patient's experience of care.
Drawings
Fig. 1 is an overall flowchart of a liver focus image description method based on a semantic segmentation network according to the present invention.
Fig. 2 is a schematic diagram of a Ghost module replacement convolution unit of the liver focus image description method based on the semantic segmentation network.
Fig. 3 is a Ghost bottleneck unit schematic diagram of a liver focus image description method based on a semantic segmentation network according to the present invention.
Fig. 4 is a schematic view of a lesion segmentation model of the liver lesion image description method based on the semantic segmentation network according to the present invention.
Fig. 5 is an ASPP schematic diagram of a cavity space convolution pooling layer of the liver focus image description method based on the semantic segmentation network.
Fig. 6 is a schematic diagram of a lesion ultrasonic image and a description result of the liver lesion image description method based on the semantic segmentation network.
Fig. 7 is a schematic diagram of a size description of a liver focus image description method based on a semantic segmentation network according to the present invention.
Fig. 8 is a focus target area interested image and a mask image of the liver focus image description method based on the semantic segmentation network.
Fig. 9 is a mask image erosion and dilation image of a liver lesion image description method based on a semantic segmentation network according to the present invention.
Fig. 10 is an image of an area inside the edge, an area outside the edge, and an edge of a mask image of the semantic segmentation network-based liver lesion image description method according to the present invention.
Fig. 11 is a schematic view of total number of focus edges and a schematic view of focus edges of the liver focus image description method based on semantic segmentation network according to the present invention.
Fig. 12 is a schematic view illustrating a lesion shape of a liver lesion image description method based on a semantic segmentation network according to the present invention.
Fig. 13 is an echo texture description schematic diagram of the liver focus image description method based on the semantic segmentation network according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments shown in the drawings. These embodiments are not intended to limit the invention and structural, methodological, or functional modifications of these embodiments that may be made by one of ordinary skill in the art are included within the scope of the invention.
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the present invention and are not to be construed as limiting the present invention.
In a first embodiment of the present invention, the present invention provides a method for describing a liver focus image based on a semantic segmentation network, as shown in fig. 1, the method includes:
s1: building a Unet semantic segmentation network based on a lightweight network GhostNet, and building a liver focus segmentation model;
s2: acquiring a historical liver focus ultrasonic image data set, inputting the liver focus segmentation model for training, and obtaining an optimal segmentation model;
s3: based on the segmentation result, features of the lesion image are described.
In one embodiment of the invention, the convolution units in the encoder and decoder of the Unet semantic segmentation network are built from the Ghost module of the lightweight network GhostNet, specifically,
the Unet is a deep learning network architecture for semantic segmentation tasks, consisting of an encoder and a decoder. The encoder section is similar to a common convolutional neural network in that the spatial dimensions of the image are progressively reduced through a series of convolution and pooling operations to extract advanced features of the image. The decoder gradually restores the space size of the image through up-sampling and convolution operation, and fuses the features extracted by the encoder with the features of the decoder to finally generate a pixel-level semantic segmentation result. The GhostNet realizes efficient image classification and target detection in a resource-constrained environment through the combination of a Ghost module and other lightweight technologies.
In particular, the Ghost module may be used to extract more image features, as shown in fig. 2, which may be implemented in a manner that includes,
defining the input information $X \in \mathbb{R}^{c \times h \times w}$, where $\mathbb{R}$ is the real number domain, $c$ is the number of input channels, and $h$ and $w$ are the height and width respectively;
passing the input information through a convolution kernel $f'$ with $m$ filters of kernel size $k \times k$ to obtain $m$ output feature maps $Y'$, expressed as:
$$Y' = X * f'$$
obtaining $n$ output feature maps $Y$ from the $m$ feature maps $Y'$ by linear transformations, where $m \times s = n$ and $n \geq m$, and $Y$ is the output feature of the Ghost module;
the formula of the linear transformation is expressed as:
$$y_{ij} = \Phi_{i,j}(y'_i), \quad i = 1,\ldots,m,\ j = 1,\ldots,s$$
where $y'_i$ is the $i$-th feature map in $Y'$ and $\Phi_{i,j}(y'_i)$ denotes the $j$-th linear transformation applied to the $i$-th feature map to generate the feature map $y_{ij}$.
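For illustration, the following is a minimal PyTorch sketch of a Ghost module consistent with the formulas above. Using a depthwise convolution as the cheap linear transform $\Phi$ and the default ratio s = 2 are assumptions in line with the published GhostNet design, not details fixed by this description.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Ghost module: a standard convolution produces m intrinsic feature maps
    Y' = X * f'; cheap per-map linear transforms Phi_{i,j} (here depthwise
    convolutions, an assumption following GhostNet) expand them to n = m * s
    output maps."""
    def __init__(self, in_ch, out_ch, kernel_size=1, s=2, dw_size=3):
        super().__init__()
        assert out_ch % s == 0, "out_ch must be divisible by the ratio s"
        m = out_ch // s                      # m intrinsic feature maps
        self.primary = nn.Conv2d(in_ch, m, kernel_size,
                                 padding=kernel_size // 2, bias=False)
        # One grouped depthwise conv realizes the extra linear transforms.
        self.cheap = nn.Conv2d(m, out_ch - m, dw_size, padding=dw_size // 2,
                               groups=m, bias=False)

    def forward(self, x):
        y_prime = self.primary(x)            # Y' = X * f'
        y_cheap = self.cheap(y_prime)        # Phi_{i,j}(y'_i)
        return torch.cat([y_prime, y_cheap], dim=1)  # n = m * s maps
```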
In one embodiment of the present invention, a Ghost bottleneck unit is constructed based on the Ghost module, as shown in fig. 3, in particular,
the Ghost bottleneck unit is formed by connecting a first Ghost module, an attention module and a second Ghost module in series, with a skip link from the input to the output;
the first Ghost module is used for expanding the number of channels, the attention module is used for strengthening the feature extraction capacity of the Ghost module, and the second Ghost module is used for reducing the number of channels;
batch normalization and a Hard-swish activation function are applied after the first Ghost module, and batch normalization is applied after the second Ghost module;
the activation function is expressed as:
$$\text{Hard-swish}(x) = x \cdot \frac{\min(\max(x + 3,\ 0),\ 6)}{6}$$
It should be noted that batch normalization (Batch Normalization) normalizes the input of each network layer to zero mean and unit variance, which helps accelerate the convergence of the model and improves its stability and robustness. Combining the BN layer with the Hard-swish activation function can alleviate the vanishing-gradient problem and speed up training convergence, since the input distribution to Hard-swish becomes more stable; Hard-swish retains the nonlinearity of ReLU while having a smoother curve, improving the performance and generalization ability of the model.
An attention mechanism module is added to the Ghost bottleneck unit to strengthen the feature extraction capability of the encoder. The attention module may adopt typical structures such as SE (Squeeze-and-Excitation Networks) or SK (Selective Kernel Networks); in practice, adding the attention mechanism makes the segmented focus contour more complete and better fitted.
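The following PyTorch sketch assembles the Ghost bottleneck described above, reusing the GhostModule sketch. The SE block follows the typical Squeeze-and-Excitation structure that the text names as one option; the 1×1 projection on the skip path when channel counts differ is an assumption, since the text only states that a skip link exists.

```python
class SEBlock(nn.Module):
    """Squeeze-and-Excitation attention; one typical choice for the
    attention module named in the text."""
    def __init__(self, ch, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1),
            nn.Hardsigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)


class GhostBottleneck(nn.Module):
    """First Ghost module expands channels (BN + Hard-swish after it),
    SE attention strengthens the features, second Ghost module reduces
    channels (BN after it), plus a skip link from input to output."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.ghost1 = GhostModule(in_ch, mid_ch)   # expand channels
        self.bn1 = nn.BatchNorm2d(mid_ch)
        self.act = nn.Hardswish()                  # x * ReLU6(x + 3) / 6
        self.se = SEBlock(mid_ch)                  # attention module
        self.ghost2 = GhostModule(mid_ch, out_ch)  # reduce channels
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Assumed 1x1 projection so the skip link matches channel counts.
        self.skip = (nn.Identity() if in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1, bias=False))

    def forward(self, x):
        y = self.act(self.bn1(self.ghost1(x)))
        y = self.se(y)
        y = self.bn2(self.ghost2(y))
        return y + self.skip(x)                    # skip link input -> output
```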
In one specific embodiment of the invention, focus images are input into the liver focus segmentation model at different scales; feature maps are extracted by the encoder at the corresponding scale and then passed through the decoder stages in sequence to obtain the final output result, specifically,
as shown in fig. 4, the scales of the encoder and decoder include 1, 1/2, 1/4, 1/8 and 1/16; the encoder consists of 5 stages of 2 Ghost bottleneck units each, with each stage downsampling the feature map by a factor of 2; the decoder consists of 5 stages of 2 Ghost bottleneck units each, with each stage upsampling the feature map by a factor of 2.
In a specific embodiment of the invention, a hole space convolution pooling layer ASPP is added to the liver focus segmentation model: the feature map encoded at 1/16 scale passes through the ASPP layer before entering the decoder, and the feature maps at the other scales are concatenated along the channel dimension;
specifically, as shown in fig. 5, the hole space convolution pooling layer ASPP includes 4 parallel hole convolutions and 1 global average pooling; the 4 parallel hole convolutions include a 1×1 convolution, a 3×3 convolution with dilation rate 3, a 3×3 convolution with dilation rate 6, and a 3×3 convolution with dilation rate 9; the global average pooling branch is followed by a 1×1 convolution and upsampling; the feature maps obtained from the four parallel hole convolutions and the upsampling are concatenated, and the final output is obtained through a 1×1 convolution; the final output includes receptive fields of different levels.
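A minimal PyTorch sketch of the described ASPP layer follows; the channel counts and the BN/ReLU placement inside each branch are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """ASPP as described: a 1x1 convolution, three 3x3 dilated convolutions
    (dilation rates 3, 6, 9) and a global-average-pooling branch, whose
    outputs are concatenated and fused by a final 1x1 convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        def branch(k, d):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, padding=(k // 2) * d,
                          dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True))
        self.branches = nn.ModuleList(
            [branch(1, 1), branch(3, 3), branch(3, 6), branch(3, 9)])
        self.gap = nn.Sequential(                  # global average pooling
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.ReLU(inplace=True))
        self.project = nn.Conv2d(5 * out_ch, out_ch, 1, bias=False)

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = F.interpolate(self.gap(x), size=(h, w),
                               mode='bilinear', align_corners=False)
        feats = [b(x) for b in self.branches] + [pooled]
        # Concatenation mixes receptive fields of different levels.
        return self.project(torch.cat(feats, dim=1))
```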
In one embodiment of the present invention, a liver lesion segmentation model is trained, specifically,
specifically, firstly, acquiring historical data of liver focus, acquiring a data set containing as many liver focus (cyst, hemangioma, liver cancer, strong echo focus and the like) as possible, and preprocessing the data such as cleaning, data amplification, data enhancement and the like; the data collection needs to collect representative medical institutions from a plurality of families, different regions and different levels, so that the diversity of the data is ensured, and the data distribution of focus maps of various diseases is relatively uniform and proper. Further, the collected data is cleaned, and bad data such as long-term old and poor quality are deleted; on the other hand, the data is desensitized, and relevant information about patients and hospitals in the image is deleted. Further, an ultrasonic image containing a focus is subjected to online cutting out according to random pixel or focus size proportion and a certain size in a certain range according to a focus region of interest, and data to be trained are obtained; in addition, these data are subjected to online data enhancement to obtain training data for more conditions, including, but not limited to, vertical inversion at a probability level, random rotation, clipping, translation, pixel value perturbation, filtering, noise addition, and the like. The region of interest of the lesion may be obtained by a pre-positioned object detection model or manually framed.
Focus images meeting the training requirements are input into the liver focus segmentation model for training. The segmentation model produces outputs in a multi-head, deeply supervised manner: different loss functions are set for the different segmentation heads according to their positions, and the losses are finally fused by weighted summation as the final optimization target.
As shown in fig. 4, in the network structure of the segmentation model, Down-Sample ×2 (×4, ×8, ×16) denotes 2-, 4-, 8- and 16-fold downsampling, Up-Sample ×2 denotes 2-fold upsampling, Ghost bottleneck × 2 denotes using 2 Ghost bottleneck units, Concate denotes channel-dimension concatenation of feature maps, and ASPP denotes the hole space convolution pooling layer.
Specifically, both a Dice-Loss function and a cross-entropy loss function are used for the output at scale 1, and a cross-entropy loss function is used for the outputs at the other scales;
the Dice-Loss function is expressed as:
$$L_{\text{Dice}} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$$
where $X$ denotes the pixel labels of the real segmented image, $Y$ denotes the pixel classes of the model-predicted segmented image, $|X|$ and $|Y|$ denote the numbers of elements in $X$ and $Y$ respectively, and $|X \cap Y|$ denotes the number of elements in the intersection of $X$ and $Y$;
the cross-entropy loss function is expressed as:
$$L_{\text{CE}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic}\,\log p_{ic}$$
where $M$ denotes the number of categories, $y_{ic}$ is a sign function that takes 1 when the true class of sample $i$ equals $c$ and 0 otherwise, and $p_{ic}$ denotes the predicted probability that sample $i$ belongs to category $c$. Cross-Entropy-Loss evaluates the difference between the probability distribution obtained by the current training and the true distribution, i.e. the distance between the actual output probabilities and the expected ones; the smaller the cross entropy, the closer the two distributions.
The loss weights corresponding to the output scales 1, 1/2, 1/4, 1/8 and 1/16 are 1, 0.8, 0.6, 0.4 and 0.2 respectively;
the differently weighted losses are fused by weighted summation into a total loss function, and the optimal segmentation model is obtained by minimizing the total loss function.
In one embodiment of the present invention, shown in fig. 6, an ultrasound image of a liver lesion is provided, and the lesion size is described based on the segmentation results, as shown in fig. 7, in particular,
obtaining focus edge contour points based on a segmentation result, and calculating a linear equation of a focus main direction, wherein two intersection points of the linear equation of the main direction and the focus contour are focus contour long axis endpoints;
specifically, a straight line equation in which the principal direction of the focus is located is calculated by using a principal component analysis method of PCA, wherein the PCA is an unsupervised algorithm, and the direction of a sample can be well found.
Setting a short axis searching range in a line segment between the long axis endpoints to obtain N candidate point coordinates, respectively calculating a linear equation which is vertical to the long axis and calculated by the candidate point coordinates, and intersecting the linear variance with the focus contour to obtain N groups of candidate short axis endpoints; calculating the pixel length of the short axis to be selected, wherein the minimum distance value is the focus short axis;
specifically, the short axis searching range set by the invention starts from 1/3 of the short axis line segment to 2/3 of the short axis line segment;
preferably, in the actual selection area of the short axis points, in order to obtain the short axis more practically, the obtained straight line equation is not only perpendicular to the long axis, but also obtains straight line equations with angles of 81 degrees, 84 degrees, 87 degrees, 93 degrees, 96 degrees and 99 degrees clockwise to the long axis, and when a plurality of groups of minimum short axes with equal distances exist, the short axis closest to the center point of the long axis and most perpendicular to the long axis is selected.
Further, calculating an included angle between the long axis and the short axis and a horizontal straight line, wherein the included angle is a horizontal axis and the included angle is a vertical axis; and obtaining the horizontal axis pixel length distance by importing the actual length of a single pixel of the focus image, and calculating to obtain the actual horizontal axis and vertical axis size of the focus.
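A NumPy/OpenCV sketch of the measurement above follows. Approximating the long-axis endpoints by extreme projections onto the principal direction and scanning 30 perpendicular chords are simplifications of the described line-contour intersection; the helper name lesion_axes is hypothetical.

```python
import cv2
import numpy as np

def lesion_axes(mask):
    """Measure long/short axes of a binary lesion mask (uint8, 0/255).
    PCA gives the main direction; extreme projections approximate the
    long-axis endpoints; perpendicular chords through the middle third of
    the long axis are scanned and, following the text, the shortest one
    is kept as the short axis. A sketch, not the exact patent procedure."""
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    pts = max(cnts, key=cv2.contourArea).reshape(-1, 2).astype(np.float64)
    center = pts.mean(axis=0)
    # Principal direction = eigenvector with the largest eigenvalue.
    evals, evecs = np.linalg.eigh(np.cov((pts - center).T))
    d = evecs[:, np.argmax(evals)]
    proj = (pts - center) @ d
    a, b = pts[np.argmax(proj)], pts[np.argmin(proj)]  # long-axis endpoints
    n = np.array([-d[1], d[0]])                        # unit normal to long axis
    best = None
    for t in np.linspace(1 / 3, 2 / 3, 30):            # search range 1/3..2/3
        p = a + t * (b - a)
        on = []                                        # mask pixels on the chord
        for s in np.arange(-300.0, 300.0):
            q = p + s * n
            x, y = int(round(q[0])), int(round(q[1]))
            if 0 <= y < mask.shape[0] and 0 <= x < mask.shape[1] and mask[y, x] > 0:
                on.append(q)
        if len(on) >= 2:
            chord = np.linalg.norm(on[-1] - on[0])     # assumes one segment
            if best is None or chord < best:
                best = chord                           # keep the minimum chord
    long_px = float(np.linalg.norm(a - b))
    return long_px, best   # multiply by mm-per-pixel for physical sizes
```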
In one embodiment of the present invention, the definition of the boundary of a lesion is described based on the contour of the lesion, specifically,
as shown in fig. 8, performing an expansion operation on the outline of the focus to obtain a focus region of interest, performing a graying operation, and processing to obtain a target region interest Image roi_image and a corresponding focus region Mask Image mask_image;
as shown in fig. 9, the Mask image is subjected to etching and expansion operations, respectively, to obtain an etching image mask_Erode and an expansion image mask_Dilate. As shown In fig. 10, subtracting the etching Image mask_Erode from the Mask Image mask_image to obtain a mask_in region In the Edge of the Mask Image, subtracting the Mask Image mask_image from the expansion Image mask_Dilate to obtain a mask_out region outside the Edge of the Mask Image, and adding the mask_in region In the Edge of the Mask Image to the mask_out region outside the Edge of the Mask Image to obtain an Edge region mask_edge of the Mask Image;
obtaining an Edge Image mask_canny by using an Edge algorithm for the interested Image ROI_image, and obtaining an Edge information map mask_canny_edge within a certain range of a focus contour peripheral region by combining an Edge region mask_edge of the Mask Image;
preferably, the edge algorithm used in the present invention is a Canny edge algorithm, wherein the Canny algorithm needs a low threshold of 30 and a high threshold of 180.
As shown in fig. 11, the Edge contour points of the focus are output on an all-zero graph, and the non-zero value Definit ionEdgeNumber of the Edge information graph mask_canny_edge and the non-zero number is counted as the total number Edge number of the focus, so as to calculate an Edge ratio Defini t ionEdgeRat io;
combining the interested Image ROI_image and the region mask_in In the Mask Image edge, and calculating to obtain an Image gray average Inneravg In a circle of range In the focus edge outline; combining the interested Image ROI_image and the region mask_Out outside the Mask Image edge, and calculating to obtain an Image gray average value OutAvg in a circle of range outside the focus edge outline; taking a difference between the two average values and taking an absolute value of the result, and calculating an edge difference Definit ionEdgeDiff;
|InnerAvg-OuterAvg|=Definit ionEdgeDiff
and judging whether the boundary of the focus is clear or not according to the edge ratio and the edge difference. Specifically, when Definit ionEdgeDiff is more than or equal to 10, the edges are clear; when Definit ionEdgeDiff is less than 10 and Definit ionEdgeRat io is more than 0.19, the edges are clear; the edges of other cases are unclear.
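An OpenCV sketch of this boundary-clarity judgment under the thresholds given above follows; the 5×5 structuring element that sets the band width and the 0/255 mask encoding are assumptions.

```python
import cv2
import numpy as np

def boundary_clear(roi_gray, mask):
    """Erode/dilate the mask to get the bands just inside and outside the
    contour, run Canny(30, 180) on the ROI, then decide clarity from the
    edge ratio and the inner/outer gray difference."""
    kernel = np.ones((5, 5), np.uint8)                        # assumed band width
    mask_in = cv2.subtract(mask, cv2.erode(mask, kernel))     # Mask_In
    mask_out = cv2.subtract(cv2.dilate(mask, kernel), mask)   # Mask_Out
    mask_edge = cv2.add(mask_in, mask_out)                    # Mask_Edge

    canny = cv2.Canny(roi_gray, 30, 180)                      # Mask_Canny
    canny_edge = cv2.bitwise_and(canny, mask_edge)            # Mask_Canny_Edge

    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    edge_number = sum(len(c) for c in cnts)                   # EdgeNumber
    edge_ratio = np.count_nonzero(canny_edge) / max(edge_number, 1)

    inner_avg = float(roi_gray[mask_in > 0].mean())           # InnerAvg
    outer_avg = float(roi_gray[mask_out > 0].mean())          # OuterAvg
    edge_diff = abs(inner_avg - outer_avg)                    # DefinitionEdgeDiff

    # Clear if edge_diff >= 10, or edge_diff < 10 with edge_ratio > 0.19.
    return edge_diff >= 10 or edge_ratio > 0.19
```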
In one embodiment of the present invention, the shape of a lesion is described based on the contour of the lesion, specifically,
calculating convex hull characteristics of the focus contour to obtain a convex hull of the contour and concave points of the contour coordinates farthest from the corresponding convex hull in each concave;
preferably, as shown in fig. 12, a convex hull of the contour is obtained by calling a convex hull detection algorithm in an open source image processing library, such as a yellow contour, and a contour coordinate point farthest from the corresponding convex hull in each concave is obtained, such as a red point.
Screening the concave points, calculating the average distance and the maximum distance between the concave points and the convex hull, and judging whether the focus shape is regular.
Further, a distance threshold of 5 pixels is set to screen the concave points, giving the number ConcaveNum of qualifying points, and the average distance ConcaveAvgDepth and the maximum distance ConcaveMaxDepth from the concave points to the convex hull are calculated. To eliminate the influence of focus size on the result, ConcaveAvgDepth and ConcaveMaxDepth are each divided by the obtained focus short axis, giving the algorithm adaptive capability. A logical judgment on the computed values then yields the focus shape description. Specifically, when ConcaveAvgDepth > 0.1 or ConcaveMaxDepth > 0.22 is satisfied, the shape is described as irregular; otherwise it is regular.
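A sketch of this shape judgment using OpenCV's convexity-defect API, which directly yields the farthest contour point and its depth per concavity; the helper name and the early returns for defect-free contours are assumptions.

```python
import cv2
import numpy as np

def shape_is_regular(contour, short_axis_px):
    """Depths beyond the 5-pixel screen are normalized by the short axis
    and compared with the 0.1 / 0.22 thresholds from the text."""
    hull = cv2.convexHull(contour, returnPoints=False)
    defects = cv2.convexityDefects(contour, hull)
    if defects is None:
        return True                          # no concavity at all
    depths = defects[:, 0, 3] / 256.0        # fixed-point depths -> pixels
    depths = depths[depths > 5]              # 5-pixel screening threshold
    if depths.size == 0:
        return True
    avg_d = depths.mean() / short_axis_px    # ConcaveAvgDepth, size-normalized
    max_d = depths.max() / short_axis_px     # ConcaveMaxDepth, size-normalized
    return not (avg_d > 0.1 or max_d > 0.22) # irregular if either is exceeded
```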
In one embodiment of the present invention, the echo type and the echo texture of the lesion are described based on the segmentation result, specifically,
echo types include strong echo, high echo, low echo, and anechoic echo;
calculating the mean gray value $M_1$ inside the focus contour and the mean gray value $M_2$ of the focus bounding rectangle with the focus region removed, and judging the echo type from their ratio $R = M_1 / M_2$;
specifically, on the one hand, $R > 1.1$ indicates high echo or strong echo: if $R < 1.5$ the focus is high echo, and if $R > 2$ with $M_1 < 80$ it is also high echo; otherwise it is strong echo. On the other hand, $R \leq 1.1$ indicates low echo or anechoic: if $R > 0.7$ the focus is low echo, otherwise it is anechoic.
As shown in fig. 13, the echo texture is judged from the gray-scale characteristics inside the focus:
the gray mean Mean and variance Std of the focus image are calculated and used to determine the gray threshold Threshold of the focus image; binarizing the focus image with this threshold yields the relatively bright region inside the focus, and counting its pixel number LightSpotsNum against the pixel number LesionPixelNum inside the focus gives the proportion of the bright region in the focus image;
it should be noted that the gray mean reflects the average gray level of the focus image, while the variance reflects how much the gray values vary in the image;
the threshold is calculated as:
$$\text{Threshold} = \text{Mean} + K \times \text{Std}$$
where K is an empirical weight, taken as 1.8 in the invention.
The entropy of the focus image is expressed as:
$$H = -\sum_{i=0}^{255} p_i \log_2 p_i$$
where $i$ is a pixel gray value, $0 \leq i \leq 255$, and $p_i$ is the probability that gray value $i$ appears in the whole image, i.e. the count of pixels with gray value $i$ divided by the total number of pixels.
It should be noted that the entropy is computed within the focus region; the entropy of a digital image refers to the information contained in the aggregate characteristics of its gray distribution and reflects the complexity of the texture in the image: the larger the entropy, the more complex the texture, and the smaller the entropy, the smoother the texture.
Further, whether the echo texture is uniform is judged by combining the proportion LightSpotsRatio of the bright region in the focus image with the variance Std and the entropy H of the focus image. Specifically, when H > 5 and Std > 25, the echo texture is uneven; otherwise, if LightSpotsRatio > 0.035 the echo texture is further judged to be uneven; in other cases a uniform echo texture is returned.
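A NumPy sketch of the texture judgment combining the bright-region ratio, variance and entropy with the thresholds from the text; computing the statistics only over pixels inside the mask is an assumption.

```python
import numpy as np

def texture_is_uniform(roi_gray, mask, k=1.8):
    """Threshold = Mean + K*Std binarizes the lesion into a bright region;
    its ratio plus Std and entropy H feed the uniformity rule."""
    vals = roi_gray[mask > 0].astype(np.float64)      # lesion pixels only
    mean, std = vals.mean(), vals.std()
    threshold = mean + k * std                        # Threshold = Mean + K*Std
    light_ratio = float((vals > threshold).sum()) / vals.size  # LightSpotsRatio
    hist = np.bincount(vals.astype(np.uint8), minlength=256).astype(np.float64)
    p = hist[hist > 0] / vals.size                    # p_i over gray values
    entropy = float(-(p * np.log2(p)).sum())          # H = -sum p_i log2 p_i
    if entropy > 5 and std > 25:
        return False                                  # uneven texture
    if light_ratio > 0.035:
        return False                                  # uneven texture
    return True                                       # uniform texture
```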
In summary, according to the liver focus image description method based on the semantic segmentation network, a focus segmentation model is constructed to segment liver focuses, and description is performed based on segmentation results, so that accuracy and reliability of liver disease diagnosis are improved, and misjudgment and subjective deviation are reduced; by combining the advantages of deep learning and traditional image processing, the characteristics of the deep learning and the traditional image processing are fully utilized, and the performance and stability of the algorithm are improved; the automatic processing and analysis of the ultrasonic image are realized, the burden of doctors is reduced, and the working efficiency is improved; the result of focus description provides accurate disease location and type judgment, provides a better treatment scheme for doctors, and improves the treatment effect and the feeling of treatment of patients.
It should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution; the specification is written this way merely for clarity, and those skilled in the art should treat it as a whole, since the technical solutions of the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described modules may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or 2 or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in hardware plus software functional modules.
The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional modules described above are stored in a storage medium and include instructions for causing a computer system (which may be a personal computer, a server, or a network system, etc.) or processor (processor) to perform some of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. A liver focus image description method based on semantic segmentation network is characterized by comprising the following steps of,
building a Unet semantic segmentation network based on a lightweight network GhostNet, and building a liver focus segmentation model;
acquiring a historical liver focus ultrasonic image data set, inputting the liver focus segmentation model for training, and obtaining an optimal segmentation model;
based on the segmentation result, features of the lesion image are described.
2. The liver focus image description method based on semantic segmentation network according to claim 1, wherein the Unet semantic segmentation network consists of a symmetric downsampling-path encoder and an upsampling-path decoder;
constructing convolution units in an encoder and a decoder in the Unet semantic segmentation network based on a Ghost module of the lightweight network GhostNet;
the Ghost module is used to extract more image features, including,
defining the input information $X \in \mathbb{R}^{c \times h \times w}$, where $\mathbb{R}$ is the real number domain, $c$ is the number of input channels, and $h$ and $w$ are the height and width respectively;
passing the input information through a convolution kernel $f'$ with $m$ filters of kernel size $k \times k$ to obtain $m$ output feature maps $Y'$, expressed as:
$$Y' = X * f'$$
obtaining $n$ output feature maps $Y$ from the $m$ feature maps $Y'$ by linear transformations, where $m \times s = n$ and $n \geq m$, and $Y$ is the output feature of the Ghost module;
the formula of the linear transformation is expressed as:
$$y_{ij} = \Phi_{i,j}(y'_i), \quad i = 1,\ldots,m,\ j = 1,\ldots,s$$
where $y'_i$ is the $i$-th feature map in $Y'$ and $\Phi_{i,j}(y'_i)$ denotes the $j$-th linear transformation applied to the $i$-th feature map to generate the feature map $y_{ij}$.
3. The method for describing liver focus image based on semantic segmentation network according to claim 2, further comprising,
constructing a Ghost bottleneck unit based on the Ghost module;
the Ghost bottleneck unit is formed by connecting a first Ghost module, an attention module and a second Ghost module in series, with a skip link from the input to the output;
the first Ghost module is used for expanding the number of channels, the attention module is used for strengthening the feature extraction capacity of the Ghost module, and the second Ghost module is used for reducing the number of channels;
applying batch normalization and a Hard-swish activation function after the first Ghost module, and applying batch normalization after the second Ghost module;
the activation function is expressed as:
$$\text{Hard-swish}(x) = x \cdot \frac{\min(\max(x + 3,\ 0),\ 6)}{6}$$
4. The method for describing liver focus image based on semantic segmentation network according to claim 1, wherein focus images are input into the liver focus segmentation model at different scales, feature maps are extracted by the encoder at the corresponding scale and then passed through the decoder stages in sequence to obtain the final output result;
the dimensions of the encoder and the decoder include 1, 1/2,1/4,1/8 and 1/16;
the encoder consists of 2 Ghost bottleneck units, 5 stages are total, and each feature map is downsampled by 2 times through one stage;
the decoder consists of 2 Ghost bottleneck units, 5 stages are total, and each time the feature map is up-sampled by 2 times.
5. The semantic segmentation network-based liver focus image description method according to claim 4, wherein a cavity space convolution pooling layer ASPP is added to an input layer of the liver focus segmentation model;
the feature map encoded at 1/16 scale passes through the hole space convolution pooling layer ASPP before entering the decoder, and the feature maps at the other scales are concatenated along the channel dimension;
the hole space convolution pooling layer ASPP comprises 4 parallel hole convolutions and 1 global average pooling;
the 4 parallel hole convolutions include a 1×1 convolution, a 3×3 convolution with dilation rate 3, a 3×3 convolution with dilation rate 6, and a 3×3 convolution with dilation rate 9;
the global average pooling branch is followed by a 1×1 convolution and upsampling;
the feature maps obtained from the four parallel hole convolutions and the upsampling are concatenated, and the final output is obtained through a 1×1 convolution;
the final output includes receptive fields of different levels.
6. The method of claim 5, wherein training the liver lesion segmentation model comprises,
the method comprises the steps of using a Dice-Loss function and a cross entropy Loss function for a result with an output scale of 1, and using a cross entropy Loss function for output results with other scales;
the Dice-Loss function is expressed as:
wherein, X represents the pixel label of the real segmented image, Y represents the pixel class of the model predictive segmented image, I and I respectively represent the number of elements in X, Y, and I and Y represent the number of intersection elements between X and Y;
the cross entropy loss function is expressed as:
wherein M represents the number of categories, y ic Representing a sign function, taking 1 when the true class of sample i is equal to c, otherwise taking 0, p ic Representing the prediction probability that the observation sample i belongs to the category c;
the scale of the output image corresponding to the weight of the loss function is 1, 0.8, 0.6, 0.4 and 0.2 respectively;
and weighting and fusing the Loss functions of different weights to obtain a total Loss function, and obtaining an optimal segmentation model by reducing the total Loss function.
7. The liver focus image description method based on a semantic segmentation network according to claim 1, wherein describing the size of the focus based on the segmentation result comprises,
obtaining the focus edge contour points from the segmentation result and calculating the straight-line equation of the focus main direction, the two intersection points of this main-direction line with the focus contour being the endpoints of the focus long axis;
setting a short-axis search range on the line segment between the long-axis endpoints to obtain N candidate point coordinates; for each candidate point, calculating the equation of the straight line through it perpendicular to the long axis and intersecting that line with the focus contour to obtain N groups of candidate short-axis endpoints; calculating the pixel length of each candidate short axis, the candidate with the minimum length being the focus short axis;
calculating the included angles between the long axis, the short axis and the horizontal line, the axis with the smaller included angle being taken as the horizontal axis and the other as the vertical axis; and converting the pixel lengths of the horizontal and vertical axes with the actual physical length of a single pixel of the focus image to obtain the actual horizontal-axis and vertical-axis sizes of the focus.
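A sketch of this measurement in NumPy/OpenCV follows; estimating the main direction by PCA of the contour points, the candidate-point range of 0.3–0.7 along the long axis, and the 1.5-pixel line tolerance are assumptions not fixed by the claim:

```python
import cv2
import numpy as np

def lesion_axes(mask, mm_per_pixel):
    """Long/short axis of a binary lesion mask (uint8, 255 inside),
    returned in physical units."""
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_NONE)
    pts = max(cnts, key=cv2.contourArea).reshape(-1, 2).astype(np.float32)

    mean = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - mean, full_matrices=False)
    d = vt[0]                       # unit main-direction vector (PCA)
    n = np.array([-d[1], d[0]])     # perpendicular direction

    # Long axis: extreme projections of contour points onto d.
    proj = (pts - mean) @ d
    p1, p2 = pts[proj.argmin()], pts[proj.argmax()]
    long_px = float(np.linalg.norm(p2 - p1))

    # Short axis: shortest perpendicular chord among candidate points.
    best = None
    for t in np.linspace(0.3, 0.7, 9):          # assumed search range
        c = p1 + t * (p2 - p1)
        near = np.abs((pts - c) @ d) < 1.5      # points near the line
        s = (pts - c) @ n
        if near.any() and s[near].min() < 0 < s[near].max():
            width = s[near].max() - s[near].min()
            best = width if best is None else min(best, width)
    short_px = float(best) if best is not None else 0.0

    return long_px * mm_per_pixel, short_px * mm_per_pixel
```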
8. The liver focus image description method based on a semantic segmentation network according to claim 7, wherein describing the boundary definition of the focus based on the focus contour comprises,
dilating the focus contour outward to obtain a focus region of interest, converting it to grayscale, and processing it to obtain the region-of-interest image and the corresponding focus region mask image;
applying erosion and dilation operations to the mask image to obtain the region inside the mask edge, the region outside the mask edge and the mask edge region itself;
obtaining an edge image of the region-of-interest image with an edge-detection algorithm, and combining it with the mask edge region to obtain an edge information map within a given range around the focus contour;
drawing the focus edge contour points on an all-zero image and counting the non-zero pixels as the total number of focus edge points, then calculating the edge ratio from the non-zero values of the edge information map;
combining the region-of-interest image with the region inside the mask edge to calculate the mean gray value in a band just inside the focus edge contour; combining the region-of-interest image with the region outside the mask edge to calculate the mean gray value in a band just outside the focus edge contour; taking the absolute value of the difference between the two means as the edge difference;
and judging whether the boundary of the focus is clear according to the edge ratio and the edge difference.
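A hedged OpenCV sketch of the two measures follows; the choice of the Canny detector, its thresholds, and the 5×5 structuring element are assumptions, since the claim names no specific edge algorithm or kernel:

```python
import cv2
import numpy as np

def boundary_clarity(gray_roi, mask, canny_lo=50, canny_hi=150):
    """Edge ratio and edge difference for a grayscale ROI `gray_roi`
    and lesion mask `mask` (uint8, 255 inside)."""
    kernel = np.ones((5, 5), np.uint8)
    inner = cv2.erode(mask, kernel)           # region inside the edge
    outer = cv2.dilate(mask, kernel)          # region out past the edge
    edge_band = cv2.subtract(outer, inner)    # band straddling contour

    # Edge ratio: detected edge pixels near the contour / contour length.
    edges = cv2.Canny(gray_roi, canny_lo, canny_hi)
    edges_near = cv2.bitwise_and(edges, edge_band)
    contour = cv2.subtract(mask, cv2.erode(mask, np.ones((3, 3), np.uint8)))
    edge_ratio = (np.count_nonzero(edges_near) /
                  max(np.count_nonzero(contour), 1))

    # Edge difference: |mean gray just inside - mean gray just outside|.
    inside_band = cv2.subtract(mask, inner)
    outside_band = cv2.subtract(outer, mask)
    mean_in = cv2.mean(gray_roi, mask=inside_band)[0]
    mean_out = cv2.mean(gray_roi, mask=outside_band)[0]
    edge_diff = abs(mean_in - mean_out)

    return edge_ratio, edge_diff
```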
9. The liver focus image description method based on a semantic segmentation network according to claim 7, wherein describing the shape of the focus based on the focus contour comprises,
calculating the convex hull of the focus contour and, for each concavity, the concave point, namely the contour point farthest from the corresponding convex hull edge;
screening the concave points, calculating the average and maximum distances from the concave points to the convex hull, and judging from these whether the focus shape is regular.
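This maps directly onto OpenCV's convexity defects; a sketch follows, in which the screening depth of 1 pixel and the regularity thresholds are illustrative assumptions:

```python
import cv2
import numpy as np

def shape_regularity(mask, avg_thresh=3.0, max_thresh=8.0):
    """Judge shape regularity of a binary lesion mask (uint8, 255
    inside) from the depths of its convexity defects."""
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_NONE)
    cnt = max(cnts, key=cv2.contourArea)

    hull_idx = cv2.convexHull(cnt, returnPoints=False)
    defects = cv2.convexityDefects(cnt, hull_idx)
    if defects is None:              # no concavities at all: regular
        return True

    # Each defect row is (start, end, farthest point, depth*256);
    # the farthest point is the concave point of that concavity.
    depths = defects[:, 0, 3] / 256.0
    depths = depths[depths > 1.0]    # screen out shallow noise (assumed)
    if depths.size == 0:
        return True
    return depths.mean() < avg_thresh and depths.max() < max_thresh
```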
10. The liver focus image description method based on a semantic segmentation network according to claim 1, wherein describing the echo type and echo texture of the focus based on the segmentation result comprises,
the echo types include strong echo, high echo, low echo and anechoic;
calculating the ratio of the mean gray value inside the focus contour region to the mean gray value of the focus circumscribed rectangle region with the focus region removed, and judging the echo type from this ratio;
judging the echo texture from the gray-level characteristics inside the focus;
calculating the gray mean and variance of the focus image and from them determining a gray threshold, binarizing the focus image with this threshold to obtain the relatively bright regions inside the focus, and counting the bright-region pixels against the total pixels inside the focus to obtain the proportion of the bright region in the focus image;
the entropy of the focus image is expressed as:

H = −Σ_{i=0..255} p_i · log(p_i)

wherein i is the gray value of a pixel, 0 ≤ i ≤ 255, and p_i is the probability that gray value i appears in the whole image;
and judging whether the echo texture is uniform by combining the proportion of the bright region in the focus image with the variance and the entropy of the focus image.
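A sketch of these echo measures follows; the threshold rule mean + k·std and the base-2 logarithm for the entropy are assumptions, since the claim fixes neither:

```python
import cv2
import numpy as np

def echo_description(gray, mask, k=0.5):
    """Echo-type ratio, bright-region proportion, variance and entropy
    for a grayscale ultrasound image `gray` with lesion mask `mask`
    (uint8, 255 inside)."""
    inside = mask > 0
    x, y, w, h = cv2.boundingRect(mask)
    rect = np.zeros_like(mask)
    rect[y:y + h, x:x + w] = 255
    surround = (rect > 0) & ~inside   # circumscribed rect minus lesion

    # Echo-type ratio: lesion brightness vs. immediate surroundings.
    ratio = gray[inside].mean() / max(gray[surround].mean(), 1e-6)

    # Bright-region proportion inside the lesion.
    vals = gray[inside].astype(np.float64)
    thresh = vals.mean() + k * vals.std()   # assumed threshold rule
    bright_frac = float((vals > thresh).mean())

    # Entropy H = -sum p_i log p_i over the lesion's gray histogram.
    hist = np.bincount(vals.astype(np.uint8), minlength=256) / vals.size
    p = hist[hist > 0]
    entropy = float(-(p * np.log2(p)).sum())

    return ratio, bright_frac, float(vals.var()), entropy
```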
CN202311316607.6A 2023-10-12 2023-10-12 Liver focus image description method based on semantic segmentation network Pending CN117274603A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311316607.6A CN117274603A (en) 2023-10-12 2023-10-12 Liver focus image description method based on semantic segmentation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311316607.6A CN117274603A (en) 2023-10-12 2023-10-12 Liver focus image description method based on semantic segmentation network

Publications (1)

Publication Number Publication Date
CN117274603A true CN117274603A (en) 2023-12-22

Family

ID=89215870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311316607.6A Pending CN117274603A (en) 2023-10-12 2023-10-12 Liver focus image description method based on semantic segmentation network

Country Status (1)

Country Link
CN (1) CN117274603A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118039088A (en) * 2024-04-15 2024-05-14 达州爱迦飞诗特科技有限公司 Artificial intelligence auxiliary diagnosis data processing system
CN118039088B (en) * 2024-04-15 2024-06-07 达州爱迦飞诗特科技有限公司 Artificial intelligence auxiliary diagnosis data processing system

Similar Documents

Publication Publication Date Title
CN111798462B (en) Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image
CN109461495B (en) Medical image recognition method, model training method and server
Zhang et al. Intelligent scanning: Automated standard plane selection and biometric measurement of early gestational sac in routine ultrasound examination
CN110889853A (en) Tumor segmentation method based on residual error-attention deep neural network
RU2654199C1 (en) Segmentation of human tissues in computer image
Tang et al. Cmu-net: a strong convmixer-based medical ultrasound image segmentation network
CN116097302A (en) Connected machine learning model with joint training for lesion detection
Skeika et al. Convolutional neural network to detect and measure fetal skull circumference in ultrasound imaging
Liu et al. A fully automatic segmentation algorithm for CT lung images based on random forest
CN117274603A (en) Liver focus image description method based on semantic segmentation network
Wu et al. Ultrasound image segmentation method for thyroid nodules using ASPP fusion features
CN114092450A (en) Real-time image segmentation method, system and device based on gastroscopy video
CN111297399A (en) Fetal heart positioning and fetal heart rate extraction method based on ultrasonic video
CN113269799A (en) Cervical cell segmentation method based on deep learning
Jiang et al. Segmentation of prostate ultrasound images: the state of the art and the future directions of segmentation algorithms
Singla et al. Speckle and shadows: ultrasound-specific physics-based data augmentation for kidney segmentation
Tyagi et al. An amalgamation of vision transformer with convolutional neural network for automatic lung tumor segmentation
CN113192067A (en) Intelligent prediction method, device, equipment and medium based on image detection
CN113808105B (en) Focus detection method based on ultrasonic scanning
CN113706684A (en) Three-dimensional blood vessel image reconstruction method, system, medical device and storage medium
Zakariah et al. Image boundary, corner, and edge detection: past, present, and future
Samudrala et al. Semantic Segmentation in Medical Image Based on Hybrid Dlinknet and Unet
Singla et al. Speckle and shadows: ultrasound-specific physics-based data augmentation applied to kidney segmentation
Frannita et al. Thyroid nodule classification based on characteristic of margin using geometric and statistical features
Mani Deep learning models for semantic multi-modal medical image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination