CN112614108A - Method and device for detecting nodules in thyroid ultrasound image based on deep learning

Info

Publication number: CN112614108A
Application number: CN202011548496.8A
Authority: CN
Inventors: 罗渝昆; 谢芳; 林科汝; 陈东浩; 张艳; 田晓琦; 张颖; 王筱斐; 叶丹; 任改霞; 李发根; 欧阳勇春
Original assignee: Beijing Hongyun Zhisheng Technology Co ltd; Beijing University of Posts and Telecommunications; First Medical Center of PLA General Hospital
Current assignee: Beijing Hongyun Zhisheng Technology Co ltd; Beijing University of Posts and Telecommunications; First Medical Center of PLA General Hospital
Priority date: 2020-12-24
Filing date: 2020-12-24
Publication date: 2021-04-06
Anticipated expiration: 2040-12-24
Also published as: CN112614108B

Abstract

The disclosure provides a method for detecting nodules in a thyroid ultrasound image based on deep learning, which comprises: preprocessing the thyroid ultrasound image; extracting features from the preprocessed thyroid ultrasound image to obtain feature images; inputting the obtained feature images into corresponding classification and regression structures respectively to obtain specific position information of the thyroid nodule region in each feature image; calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtaining the total loss of the model to be trained by weighted summation; and training and testing the model to be trained. The method requires no anchor boxes, detects the nodule region in the thyroid ultrasound image efficiently, avoids the computation and resource waste associated with anchor boxes, increases the training speed, and enhances the generalization performance of experimental results. The present disclosure also provides an apparatus for detecting nodules in a thyroid ultrasound image based on deep learning.

Description

Method and device for detecting nodules in thyroid ultrasound image based on deep learning

Technical Field

The disclosure relates to the technical field of computer biology, in particular to a method and a device for detecting nodules in a thyroid ultrasound image based on deep learning.

Background

At present, many technologies use deep learning to assist diagnosis from medical images. For the auxiliary diagnosis of thyroid nodules in thyroid ultrasound images, the commonly used deep learning methods fall into two types: two-stage detectors based on a region proposal network, and single-stage detectors that do not use a region proposal network. Both types generally require the following steps: first, a feature map of the ultrasound image is obtained by extracting features with a convolutional neural network; second, anchor points (anchors) and anchor boxes of a certain number and size are set on the feature map; finally, an anchor box position regression algorithm and a category classification algorithm are applied to complete nodule detection in the thyroid ultrasound image.

However, the prior art solutions have the following two disadvantages: (1) Anchor boxes with three sizes (64, 128 and 256) and three aspect ratios (1:2, 1:1 and 2:1) must be set before an experiment. In practice, however, the size of an input thyroid ultrasound image is not fixed and the sizes of the nodule regions in the image differ, so hyperparameters such as the size, aspect ratio and number of the anchor boxes greatly influence the experimental result, which poses a severe challenge in actual auxiliary diagnosis. (2) To achieve a high recall rate, a large number of dense anchor boxes are usually placed in one image, whereas the number of nodules in each thyroid ultrasound image is very small, often a single digit. The large number of anchor boxes causes a serious imbalance between positive and negative samples during classification in the training phase, and increases the computation and memory consumed when calculating the intersection over union (IoU) in the training and testing phases.

In general, the prior art suffers from poor generalization performance of experimental results, a slow model training process, wasted computing resources, and similar problems.

Disclosure of Invention

In order to solve the technical problems in the prior art, embodiments of the disclosure provide a method and a device for detecting nodules in a thyroid ultrasound image based on deep learning, in which no anchor box needs to be set and an algorithm for accelerating box regression is adopted, so that model training is accelerated to the maximum extent and the generalization performance of experimental results is enhanced.

In a first aspect, an embodiment of the present disclosure provides a method for detecting a nodule in a thyroid ultrasound image based on deep learning, the method including: performing a preprocessing operation on the thyroid ultrasound image; performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image; inputting the obtained feature images into corresponding classification and regression structures respectively to obtain specific position information of a thyroid nodule region in each feature image; calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtaining the total loss of the model to be trained through weighted summation; and training and testing the model to be trained.

In one embodiment, the preprocessing operation performed on the thyroid ultrasound image includes: sequentially performing scaling, padding and normalization operations on the thyroid ultrasound image.

In one embodiment, performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image includes: inputting the preprocessed thyroid ultrasound image into a feature extraction network to obtain multi-scale feature maps, wherein the feature extraction network is composed of a residual feature extraction network and a multi-scale feature fusion network.

In one embodiment, the classification and regression structure is comprised of a classification network and a regression network.

In one embodiment, inputting the acquired feature images into corresponding classification and regression structures respectively to acquire specific position information of the thyroid nodule region in each feature image comprises: extracting deep classification features from the feature image through a plurality of convolution modules, each convolution module comprising a convolution layer, a batch normalization layer and a nonlinear activation function; passing the deep classification features through a convolution layer with a larger receptive field to obtain a tensor of size W × H × 1; and thereby predicting, for each feature point in the feature image, the classification confidence that it belongs to a nodule region, wherein W × H is the size of the feature image.

In one embodiment, the method further comprises: passing the classification features through a convolution layer in an offset branch to extract offset features of the image and obtain, for each prediction frame in the feature image, a predicted offset value to the center point of the target frame.

In one embodiment, the calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and the obtaining the total loss of the model to be trained through the weighted summation calculation comprises: calculating the generated classification loss of the feature images input into the corresponding classification and regression structures through a first preset formula; wherein the first preset formula is as follows:

L_cls＝-a(1-pred)²log(pred)

wherein pred represents the probability that the sample belongs to the correct class, and a is 0.75 when the correct sample is a positive sample, and a is 0.25 when the correct sample is a negative sample;

calculating the generated center point distance regression loss by using a second preset formula; wherein the second preset formula is as follows:

wherein IoU represents the intersection over union of the target frame and the prediction frame, c and c^gt respectively represent the center points of the prediction frame and the target frame, ρ represents the Euclidean distance between the center points of the two rectangular frames, and d is the diagonal length of the smallest rectangle that can contain both the prediction frame and the target frame;

calculating the offset loss through a third preset formula; wherein the third preset formula is as follows:

wherein l^*, t^*, r^*, b^* respectively represent the distances from the center point of the prediction frame to the target frame coordinates (x_min, y_min, x_max, y_max);

finally, weighting and summing the classification loss, the center point distance regression loss and the offset loss through a fourth preset formula to obtain the total loss of the model to be trained; wherein the fourth preset formula is:

L＝λ₁L_cls+λ₂L_reg+λ₃L_ctr

wherein λ₁＝1, λ₂＝2.5, λ₃＝1.
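The first and fourth preset formulas are short enough to sketch directly in Python. The following is a minimal illustration under stated assumptions, not the disclosed implementation: the function names `classification_loss` and `total_loss` are hypothetical, `pred` is taken to be the predicted probability of the correct class for a single feature point, and the logarithm is assumed to be the natural logarithm.

```python
import math


def classification_loss(pred, is_positive):
    """First preset formula: L_cls = -a * (1 - pred)^2 * log(pred)."""
    # Values of a as stated in the text: 0.75 for positive, 0.25 for negative samples.
    a = 0.75 if is_positive else 0.25
    return -a * (1.0 - pred) ** 2 * math.log(pred)


def total_loss(l_cls, l_reg, l_ctr, lambdas=(1.0, 2.5, 1.0)):
    """Fourth preset formula: weighted sum of the three loss terms."""
    lam1, lam2, lam3 = lambdas
    return lam1 * l_cls + lam2 * l_reg + lam3 * l_ctr
```

The factor (1 - pred)² shrinks the contribution of feature points that are already classified confidently, so a prediction of 0.9 for the correct class contributes far less than a prediction of 0.5.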

In a second aspect, the disclosed embodiments provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method described above.

In a third aspect, the disclosed embodiments provide a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method described above when executing the program.

In a fourth aspect, an embodiment of the present disclosure provides an apparatus for detecting a nodule in a thyroid ultrasound image based on deep learning, the apparatus including: the preprocessing module is used for preprocessing the thyroid ultrasound image; the characteristic extraction module is used for extracting the characteristics of the preprocessed thyroid ultrasound image so as to obtain a characteristic image; the acquisition module is used for respectively inputting the acquired feature images into corresponding classification and regression structures so as to acquire specific position information of the thyroid nodule region in each feature image; the calculation module is used for calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtaining the total loss of the model to be trained through weighted summation calculation; and the training and testing module is used for training and testing the model to be trained.

The method and the device for detecting the nodule in the thyroid ultrasound image based on the deep learning provided by the invention have the advantages that the thyroid ultrasound image is preprocessed; performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image; inputting the obtained characteristic images into corresponding classification and regression structures respectively to obtain specific position information of a thyroid nodule region in each characteristic image; calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and calculating and obtaining the total loss of the model to be trained through weighted summation; and training and testing the model to be trained. The method has the advantages that the anchor box does not need to be arranged, the nodule area in the thyroid ultrasound image is efficiently detected, calculation related to the anchor box can be avoided, meanwhile, resource waste is avoided, the training speed is increased, and the generalization performance of experimental results is enhanced; meanwhile, a central point distance regression algorithm for accelerating the regression of the position of the prediction frame is adopted, so that the experimental test index is improved, and the model training speed is accelerated to the maximum extent; and adding an offset prediction branch, representing the distance between the prediction frame and the central point of the target frame in the test stage, helping the model to screen a correct detection frame, and improving the test index.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments are briefly introduced as follows:

Fig. 1 is a flowchart illustrating steps of a method for detecting nodules in a thyroid ultrasound image based on deep learning according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a feature extraction network in the method for detecting nodules in a thyroid ultrasound image based on deep learning according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a classification and regression structure in a method for detecting nodules in a thyroid ultrasound image based on deep learning according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an apparatus for detecting nodules in a thyroid ultrasound image based on deep learning according to an embodiment of the present invention;

Fig. 5 is a hardware block diagram of an apparatus for detecting nodules in a thyroid ultrasound image based on deep learning according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of a computer-readable storage medium in one embodiment of the invention.

Detailed Description

The present application will now be described in further detail with reference to the accompanying drawings and examples.

In the following description, the terms "first" and "second" are used for descriptive purposes only and are not intended to indicate or imply relative importance. The following description provides embodiments of the disclosure, which may be combined or substituted for various embodiments, and this application is therefore intended to cover all possible combinations of the same and/or different embodiments described. Thus, if one embodiment includes feature A, B, C and another embodiment includes feature B, D, then this application should also be considered to include an embodiment that includes one or more of all other possible combinations of A, B, C, D, even though this embodiment may not be explicitly recited in text below.

In order to make the objects, technical solutions and advantages of the present invention more clearly understood, the following describes in detail a specific embodiment of the method and apparatus for detecting a nodule in a thyroid ultrasound image based on deep learning according to the present invention by way of example and with reference to the accompanying drawings. Specifically, the method is used for detecting the nodule area of the thyroid ultrasound image based on an anchor-free method in deep learning. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

As shown in fig. 1, a schematic flowchart of a method for detecting a nodule in a thyroid ultrasound image based on deep learning in an embodiment specifically includes the following steps:

Step 11: perform a preprocessing operation on the thyroid ultrasound image.

It should be noted that the preprocessing operation performed on the thyroid ultrasound image includes sequentially performing scaling, padding and normalization operations on the thyroid ultrasound image.

Specifically, the scaling operation is as follows: if the short side of the picture is smaller than 800 pixels (hereinafter abbreviated as px), the short side is multiplied by the corresponding scale factor to enlarge it to 800 px, and the long side is multiplied by the same factor; if the long side is larger than 1333 px, the long side is multiplied by the corresponding scale factor to reduce it to 1333 px, and the short side is multiplied by the same factor. That is, the short side of the picture should be no shorter than 800 px while the long side is no longer than 1333 px. The padding operation is as follows: if the height and width of the scaled picture are not exactly divisible by 32, pixels with a value of 0 are added until they are. The normalization operation is as follows: the mean μ_c of the channel is subtracted from each pixel value and the result is divided by the variance σ_c, as in the formula

x_n'＝(x_n-μ_c)/σ_c

wherein x_n represents the pixel value, x_n' represents the normalized pixel value, μ_c∈[μ_R, μ_G, μ_B] represents the means of the three RGB channels, and σ_c∈[σ_R, σ_G, σ_B] represents the variances of the three RGB channels.
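The three preprocessing operations can be sketched with NumPy as follows. This is an illustrative sketch only, and several details are assumptions that the text does not state: all function names are hypothetical, the 1333 px cap is given precedence when both size conditions cannot be met at once, padding is added on the bottom and right edges, and nearest-neighbour resizing is used purely to keep the example dependency-free. The divisor `sigma` is passed in as given, whatever per-channel statistic is chosen for it.

```python
import numpy as np


def scaled_size(h, w, short_min=800, long_max=1333):
    """Return (height, width) after applying the scaling rule."""
    scale = 1.0
    if min(h, w) < short_min:           # enlarge so the short side reaches 800 px
        scale = short_min / min(h, w)
    if max(h, w) * scale > long_max:    # assumption: the 1333 px cap takes precedence
        scale = long_max / max(h, w)
    return int(round(h * scale)), int(round(w * scale))


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of an (H, W, 3) array (stand-in for a real resize)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]


def pad_to_multiple(img, multiple=32):
    """Zero-pad bottom and right edges until height and width divide by 32."""
    h, w = img.shape[:2]
    return np.pad(img, ((0, -h % multiple), (0, -w % multiple), (0, 0)))


def normalize(img, mean, sigma):
    """Per-channel (x_n - mu_c) / sigma_c."""
    mean = np.asarray(mean, dtype=np.float32)
    sigma = np.asarray(sigma, dtype=np.float32)
    return (img.astype(np.float32) - mean) / sigma


def preprocess(img, mean, sigma):
    """Scaling, then padding, then normalization, in the order given in the text."""
    new_h, new_w = scaled_size(*img.shape[:2])
    return normalize(pad_to_multiple(resize_nearest(img, new_h, new_w)), mean, sigma)
```

Because padding precedes normalization in the stated order, the zero-valued padding pixels end up at -μ_c/σ_c rather than 0 in this sketch.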

Step 12: perform feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image.

Specifically, the step of performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image includes: inputting the preprocessed thyroid ultrasound image into a feature extraction network for feature extraction so as to obtain a multi-scale feature map; the feature extraction network is composed of a residual feature extraction network and a multi-scale feature fusion network, as shown in fig. 2.

The residual feature extraction network is a convolutional neural network built from residual blocks, and each residual block consists of convolution layers, batch normalization layers and nonlinear activation functions joined by residual connections. One convolutional feature map is obtained after each residual block of the residual feature extraction network, namely Conv1, Conv2, Conv3, Conv4 and Conv5, giving five feature maps {C1, C2, C3, C4, C5} over the whole network. C1 and C2, whose semantic information is insufficient, are discarded, and the remaining levels C3 to C5 are used as input feature maps of the multi-scale feature fusion network; they correspond to downsampling factors (strides) of {8, 16, 32} relative to the input picture.

The multi-scale feature fusion network fuses the input feature maps of adjacent levels into new features. Each level of {C3, C4, C5} is first passed through a convolution layer to obtain the shallow features of that level. If a deeper feature map exists at the next level, the shallow features are added, element by element at the same spatial positions, to the 2× upsampled result of the next level, fusing shallow and deep features to obtain the multi-scale feature maps {P3, P4, P5}. The feature map P5 is then passed through two successive convolution layers to obtain the feature maps P6 and P7 in turn. The number of channels is 256.
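The top-down fusion that produces {P3, P4, P5} can be illustrated with a small NumPy sketch. It is a simplified sketch under assumptions that the text does not fix: the per-level convolution is taken to be a 1×1 convolution, upsampling is nearest-neighbour, the weights are supplied by the caller, and the function names are hypothetical. The two further convolutions that produce P6 and P7 are omitted because their kernel size and stride are not stated.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution of a (C_in, H, W) map with a (C_out, C_in) weight matrix."""
    return np.einsum("oc,chw->ohw", weight, x)


def upsample2x(x):
    """2x nearest-neighbour upsampling of a (C, H, W) map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def fuse(c3, c4, c5, w3, w4, w5):
    """Fuse C3..C5 into P3..P5 by adding each level to the upsampled deeper level."""
    p5 = conv1x1(c5, w5)                   # deepest level: no deeper map to add
    p4 = conv1x1(c4, w4) + upsample2x(p5)  # shallow features + 2x upsampled P5
    p3 = conv1x1(c3, w3) + upsample2x(p4)  # shallow features + 2x upsampled P4
    return p3, p4, p5
```

With weight matrices of 256 rows, every output map has 256 channels, matching the channel count given above, while each keeps the spatial size of its input level.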

Step 13: input the obtained feature images into the corresponding classification and regression structures respectively to obtain specific position information of the thyroid nodule region in each feature image. The classification and regression structure is composed of a classification network and a regression network, as shown in Fig. 3.

Specifically, inputting the obtained feature images into corresponding classification and regression structures respectively to obtain specific position information of the thyroid nodule region in each feature image includes: extracting deep classification features from the feature image through a plurality of convolution modules, each convolution module comprising a convolution layer, a batch normalization layer and a nonlinear activation function; passing the deep classification features through a convolution layer with a larger receptive field to obtain a tensor of size W × H × 1; and thereby predicting, for each feature point in the feature image, the classification confidence that it belongs to a nodule region, wherein W × H is the size of the feature image. In addition, the method for detecting a nodule in a thyroid ultrasound image based on deep learning of the present disclosure further includes: passing the classification features through a convolution layer in an offset branch to extract offset features of the image and obtain, for each prediction frame in the feature image, a predicted offset value to the center point of the target frame.

In addition, in the regression network, the feature map is passed through a plurality of convolution modules to extract deep regression features, where each convolution module has the structure described above. The deep regression features are passed through a convolution layer with a larger receptive field to obtain a tensor of size W × H × 4, which gives the 4-dimensional tensor (l, r, t, b) predicted at each feature point in the feature map. Taking the exponential of each component gives (e^l, e^r, e^t, e^b), which respectively represent the perpendicular distances from the feature point, taken as a center point with coordinates (x, y), to the predicted frame coordinates (x_min, y_min, x_max, y_max),

i.e. e^l＝|x-x_min|, e^r＝|x_max-x|, e^t＝|y-y_min|, e^b＝|y_max-y|.
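The four relations above can be inverted to recover a predicted frame from the regression output at one feature point. The sketch below is illustrative only: `decode_box` is a hypothetical name, and it assumes that (x, y) and the distances are expressed in the same coordinate system, since the text does not say whether any stride scaling is applied.

```python
import math


def decode_box(x, y, l, r, t, b):
    """Turn raw predictions (l, r, t, b) at point (x, y) into (x_min, y_min, x_max, y_max)."""
    # The exponential maps any real-valued output to a strictly positive distance.
    dl, dr, dt, db = (math.exp(v) for v in (l, r, t, b))
    return (x - dl, y - dt, x + dr, y + db)
```

Because each distance is strictly positive, the feature point always lies strictly inside the frame it predicts.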

For features of different scales, the range of bounding box regression is defined as follows:

L3 ∈ [0, 64], L4 ∈ [64, 128], L5 ∈ [128, 256], L6 ∈ [256, 512], L7 ∈ [512, ∞), wherein {L3, L4, L5, L6, L7} respectively correspond to the maximum of the predicted 4-dimensional tensor (l, r, t, b) in feature maps {P3, P4, P5, P6, P7}.
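The regression ranges can be read as a rule that assigns a box to one feature level according to its largest side distance. The following sketch is one possible reading, with assumptions labeled: `assign_level` and `RANGES` are hypothetical names, the four arguments are taken to be distances in pixels, and because adjacent ranges share their endpoints, a boundary value such as 64 is assigned to the finer level here.

```python
import math

# Regression range of each feature level, as listed in the text.
RANGES = {
    "P3": (0, 64),
    "P4": (64, 128),
    "P5": (128, 256),
    "P6": (256, 512),
    "P7": (512, math.inf),
}


def assign_level(l, r, t, b):
    """Return the feature level whose range contains max(l, r, t, b)."""
    largest = max(l, r, t, b)
    for name, (low, high) in RANGES.items():
        # Assumption: ties at a shared endpoint go to the first (finer) level.
        if low <= largest <= high:
            return name
    return None  # only reachable for negative distances
```

Under this reading, small nodules are handled by the high-resolution map P3 and large nodules by the coarser maps.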

It should be noted that the positive and negative sample selection method adopted in the present invention is based on distance and size, that is, feature points in the target frame are considered as positive samples (nodule regions) during network training, and the rest are negative samples (background regions).

Step 14: calculate the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtain the total loss of the model to be trained through weighted summation.

Specifically, calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtaining the total loss of the model to be trained through weighted summation, includes: calculating the classification loss generated by the feature images input into the corresponding classification and regression structures through a first preset formula; wherein the first preset formula is as follows:

L_cls＝-a(1-pred)²log(pred)

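The center point distance regression loss and the offset loss are calculated through the second and third preset formulas respectively, and the three losses are weighted and summed through the fourth preset formula to obtain the total loss of the model to be trained:

L＝λ₁L_cls+λ₂L_reg+λ₃L_ctr

wherein λ₁＝1, λ₂＝2.5, λ₃＝1.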
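The second and third preset formulas are not reproduced in this text, so the sketch below rests on two explicit assumptions rather than on the disclosed formulas. First, the regression loss is assumed to take the distance-IoU form 1 - IoU + ρ²/d², which is consistent with the quantities IoU, ρ and d defined for the second formula. Second, the offset target is assumed to be a centerness-style value built from l*, t*, r*, b*; how that target is turned into a loss is not stated and is left out. Function names are hypothetical, and L_reg and L_ctr are taken to denote these two terms in the order the losses are listed.

```python
import math


def center_distance_loss(pred, target):
    """Assumed form 1 - IoU + rho^2 / d^2 for boxes (x_min, y_min, x_max, y_max)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Intersection over union (both boxes are assumed to have positive area).
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union
    # rho^2: squared Euclidean distance between the two center points.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # d^2: squared diagonal of the smallest rectangle enclosing both boxes.
    d2 = (max(px2, tx2) - min(px1, tx1)) ** 2 + (max(py2, ty2) - min(py1, ty1)) ** 2
    return 1.0 - iou + rho2 / d2


def offset_target(l, t, r, b):
    """Assumed centerness-style target: 1 at the box center, falling toward the edges."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
```

Unlike a plain 1 - IoU term, the ρ²/d² term still varies with the distance between centers when the two boxes do not overlap at all, which is one way a center point distance term can speed up box regression.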

Step 15: train and test the model to be trained.

Specifically, in the training stage, the hyperparameters of deep learning training, such as the learning rate, optimizer, number of iterations and batch size, need to be set, and the network is then trained. In the testing stage, after the test set is read into the network, a series of prediction boxes predicted by the network is obtained; the value of the offset branch is multiplied by the corresponding classification confidence to give a final score, the prediction boxes are sorted by this score in descending order, and a non-maximum suppression (NMS) operation is then performed to obtain the final prediction result of the network.
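The test-stage scoring and suppression can be sketched in plain Python as follows. This is a minimal sketch of greedy NMS under assumptions: the names `iou` and `select_boxes` are hypothetical, boxes are (x_min, y_min, x_max, y_max) tuples, and the IoU threshold of 0.5 is an illustrative value that the text does not specify.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_boxes(boxes, cls_conf, offset_val, iou_thresh=0.5):
    """Score = classification confidence x offset-branch value, then greedy NMS."""
    scores = [c * o for c, o in zip(cls_conf, offset_val)]
    # Visit boxes from the highest final score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return [(boxes[i], scores[i]) for i in keep]
```

Multiplying by the offset-branch value lets a box with a slightly lower classification confidence outrank an overlapping box whose offset value is low, which is how the offset branch can help the model screen detection frames.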

The method for detecting the nodule in the thyroid ultrasound image based on the deep learning, which is related by the disclosure, is used for preprocessing the thyroid ultrasound image; performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image; inputting the obtained characteristic images into corresponding classification and regression structures respectively to obtain specific position information of a thyroid nodule region in each characteristic image; calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and calculating and obtaining the total loss of the model to be trained through weighted summation; and training and testing the model to be trained. The method has the advantages that the anchor box does not need to be arranged, the nodule area in the thyroid ultrasound image is efficiently detected, calculation related to the anchor box can be avoided, meanwhile, resource waste is avoided, the training speed is increased, and the generalization performance of experimental results is enhanced; meanwhile, a central point distance regression algorithm for accelerating the regression of the position of the prediction frame is adopted, so that the experimental test index is improved, and the model training speed is accelerated to the maximum extent; and adding an offset prediction branch, representing the distance between the prediction frame and the central point of the target frame in the test stage, helping the model to screen a correct detection frame, and improving the test index.

Based on the same inventive concept, the device for detecting the nodule in the thyroid ultrasound image based on deep learning is also provided. Because the principle of the device for solving the problems is similar to the method for detecting the nodules in the thyroid ultrasound image based on the deep learning, the implementation of the device can be realized according to the specific steps of the method, and repeated parts are not repeated.

Fig. 4 is a schematic structural diagram of an apparatus for detecting a nodule in a thyroid ultrasound image based on deep learning in an embodiment. The apparatus 10 for detecting a nodule in a thyroid ultrasound image based on deep learning includes: a preprocessing module 100, a feature extraction module 200, an acquisition module 300, a calculation module 400, and a training and testing module 500.

The preprocessing module 100 is configured to perform preprocessing operation on the thyroid ultrasound image; the feature extraction module 200 is configured to perform feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image; the obtaining module 300 is configured to input the obtained feature images into corresponding classification and regression structures, respectively, so as to obtain specific location information for a thyroid nodule region in each of the feature images; the calculation module 400 is configured to calculate a classification loss, a center point distance regression loss, and an offset loss generated by the feature image input into the corresponding classification and regression structure, and obtain a total loss of the model to be trained through weighted summation calculation; the training and testing module 500 is used for training and testing the model to be trained.

According to the device for detecting the nodules in the thyroid ultrasound image based on the deep learning, firstly, the thyroid ultrasound image is preprocessed through a preprocessing module; then, performing feature extraction on the preprocessed thyroid ultrasound image through a feature extraction module to obtain a feature image; respectively inputting the acquired feature images into corresponding classification and regression structures through an acquisition module so as to acquire specific position information of the thyroid nodule region in each feature image; calculating the classification loss, the central point distance regression loss and the offset loss generated by the feature images input into the corresponding classification and regression structures through a calculation module, and calculating and obtaining the total loss of the model to be trained through weighted summation; and finally, training and testing the model to be trained through a training and testing module. The device provides a method for efficiently detecting the nodule area in the thyroid ultrasound image without setting an anchor box, can avoid the calculation related to the anchor box, avoids the resource waste, accelerates the training speed and enhances the generalization performance of the experimental result; meanwhile, a central point distance regression algorithm for accelerating the regression of the position of the prediction frame is adopted, so that the experimental test index is improved, and the model training speed is accelerated to the maximum extent; and adding an offset prediction branch, representing the distance between the prediction frame and the central point of the target frame in the test stage, helping the model to screen a correct detection frame, and improving the test index.

Fig. 5 is a hardware block diagram illustrating an apparatus for detecting nodules in a thyroid ultrasound image based on deep learning according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus 50 for detecting a nodule in a thyroid ultrasound image based on deep learning according to an embodiment of the present disclosure includes a memory 501 and a processor 502. The various components of the apparatus 50 for detecting nodules in thyroid ultrasound images based on deep learning are interconnected by a bus system and/or other form of connection mechanism (not shown).

The memory 501 is used to store non-transitory computer-readable instructions. In particular, the memory 501 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM), cache memory, and the like. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like.

The processor 502 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the apparatus 50 for detecting nodules in thyroid ultrasound images based on deep learning to perform desired functions. In an embodiment of the present disclosure, the processor 502 is configured to execute the computer readable instructions stored in the memory 501, so that the apparatus 50 for detecting a nodule in a thyroid ultrasound image based on deep learning performs the above method for detecting a nodule in a thyroid ultrasound image based on deep learning. The apparatus for detecting a nodule in a thyroid ultrasound image based on deep learning is the same as the above-described embodiment of the method for detecting a nodule in a thyroid ultrasound image based on deep learning, and a repeated description thereof will be omitted herein.

Fig. 6 is a schematic diagram illustrating a computer-readable storage medium according to an embodiment of the present disclosure. As shown in Fig. 6, a computer-readable storage medium 600 according to embodiments of the present disclosure has non-transitory computer-readable instructions 601 stored thereon. The non-transitory computer-readable instructions 601, when executed by a processor, perform the method for detecting nodules in a thyroid ultrasound image based on deep learning according to the embodiments of the present disclosure described above.

In summary, the method and device for detecting nodules in a thyroid ultrasound image based on deep learning and the computer-readable storage medium according to the embodiments of the present disclosure can efficiently detect the nodule region in a thyroid ultrasound image without setting an anchor box. This avoids the calculation related to the anchor box and the associated waste of resources, increases the training speed, and enhances the generalization performance of the experimental results. In addition, a center point distance regression algorithm is adopted to accelerate the regression of the position of the prediction frame, which improves the experimental test indexes and speeds up model training as far as possible. Finally, an offset prediction branch is added; in the test stage it represents the distance from the prediction frame to the center point of the target frame, which helps the model screen out the correct detection frame and improves the test indexes.

The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. It is noted, however, that the advantages, effects, and the like mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the specific details disclosed above are provided for the purpose of illustration and ease of understanding only, and the disclosure is not limited to being implemented with those specific details.

The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, or configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

Also, as used herein, "or" as used in a list of items beginning with "at least one of" indicates a disjunctive list, such that, for example, a list of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.

It is also noted that in the systems and methods of the present disclosure, components or steps may be decomposed and/or re-combined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.

Various changes, substitutions and alterations to the techniques described herein may be made without departing from the techniques of the teachings as defined by the appended claims. Moreover, the scope of the claims of the present disclosure is not limited to the particular aspects of the process, machine, manufacture, composition of matter, means, methods and acts described above. Processes, machines, manufacture, compositions of matter, means, methods, or acts, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or acts.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims

1. A method for detecting nodules in a thyroid ultrasound image based on deep learning, the method comprising:

performing a preprocessing operation on the thyroid ultrasound image;

performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image;

inputting the obtained feature images into corresponding classification and regression structures respectively to obtain specific position information of a thyroid nodule region in each feature image;

calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and calculating and obtaining the total loss of the model to be trained through weighted summation;

and training and testing the model to be trained.

2. The method for detecting a nodule in a thyroid ultrasound image based on deep learning of claim 1, wherein the preprocessing of the thyroid ultrasound image comprises: sequentially performing scaling, filling and normalization operations on the thyroid ultrasound image.
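As a non-limiting illustration, the scaling, filling and normalization sequence of claim 2 might be sketched as follows. The claim does not specify the target size, the interpolation method, the fill value, or the normalization constants; the function name `preprocess`, the nearest-neighbor resampling, the zero fill on the right and bottom edges, and the division by 255 are all assumptions made for this sketch only.

```python
def preprocess(image, target=512, pad_value=0):
    """Scale, fill and normalize a grayscale image given as a 2D list (0..255).

    Hypothetical helper: every constant and the choice of nearest-neighbor
    resampling are assumptions, not values taken from the disclosure.
    """
    h, w = len(image), len(image[0])
    # Scaling: shrink or enlarge so that the longer side equals `target`,
    # preserving the aspect ratio.
    scale = target / max(h, w)
    new_h = min(target, max(1, round(h * scale)))
    new_w = min(target, max(1, round(w * scale)))
    scaled = [
        [image[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]
    # Filling: pad the right and bottom edges up to target x target.
    padded = [row + [pad_value] * (target - new_w) for row in scaled]
    padded += [[pad_value] * target for _ in range(target - new_h)]
    # Normalization: map pixel values from 0..255 to 0..1.
    return [[p / 255.0 for p in row] for row in padded]
```

Filling after scaling keeps the aspect ratio of the nodule region unchanged, which is one common reason for performing the three operations in this order.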

3. The method for detecting a nodule in a thyroid ultrasound image based on deep learning of claim 1, wherein the step of performing feature extraction on the preprocessed thyroid ultrasound image to obtain a feature image comprises: inputting the preprocessed thyroid ultrasound image into a feature extraction network for feature extraction so as to obtain a multi-scale feature map; wherein the feature extraction network is composed of a residual feature extraction network and a multi-scale feature fusion network.

4. The method for detecting nodules in thyroid ultrasound images based on deep learning of claim 1, wherein the classification and regression structure is composed of a classification network and a regression network.

5. The method for detecting the nodule in the thyroid ultrasound image based on the deep learning of claim 1, wherein the step of inputting the obtained feature images into corresponding classification and regression structures respectively to obtain specific position information of the thyroid nodule region in each feature image comprises:

extracting deep classification features of the images from the feature images through a plurality of convolution modules; each convolution module comprises a convolution layer, a batch normalization layer and a nonlinear activation function;

passing the deep classification features through a convolution layer with a larger receptive field to obtain a tensor with the size of W × H × 1, the tensor giving, for each feature point in the feature image, the classification confidence that the feature point belongs to a nodule region, wherein W × H is the size of the feature image.

6. The method for detecting nodules in a thyroid ultrasound image based on deep learning of claim 5, further comprising: performing a convolution layer operation on the classification features through an offset branch to extract the offset features of the image, and obtaining a predicted offset value from each prediction frame in the feature image to the center point of the target frame.

7. The method for detecting a nodule in a thyroid ultrasound image based on deep learning of claim 1, wherein the step of calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtaining the total loss of the model to be trained through weighted summation, comprises:

calculating, through a first preset formula, the classification loss generated by the feature images input into the corresponding classification and regression structures; wherein the first preset formula is as follows:

L_cls = -a(1 - pred)² log(pred)

L = λ₁L_cls + λ₂L_reg + λ₃L_ctr

wherein, in the formula, λ₁ = 1, λ₂ = 2.5, and λ₃ = 1.
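As a non-limiting illustration, the two formulas recited in claim 7 can be evaluated numerically as sketched below. The function names, the value a = 0.25, and the small constant `eps` guarding the logarithm are assumptions for this sketch. The claim text does not give formulas for L_reg and L_ctr (presumably the center point distance regression loss and the offset loss, respectively), so they are passed in as precomputed numbers.

```python
import math


def classification_loss(pred, a=0.25, eps=1e-12):
    """L_cls = -a * (1 - pred)^2 * log(pred) for a predicted confidence pred.

    a = 0.25 and eps are assumed values; the claim does not state them.
    """
    return -a * (1.0 - pred) ** 2 * math.log(max(pred, eps))


def total_loss(l_cls, l_reg, l_ctr, lambdas=(1.0, 2.5, 1.0)):
    """L = lambda1 * L_cls + lambda2 * L_reg + lambda3 * L_ctr.

    The default weights (1, 2.5, 1) are the values recited in claim 7.
    """
    lam1, lam2, lam3 = lambdas
    return lam1 * l_cls + lam2 * l_reg + lam3 * l_ctr
```

The factor (1 - pred)² makes the classification loss shrink rapidly as the predicted confidence approaches 1, so confidently classified feature points contribute little to the total loss.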

8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1-7 are implemented when the program is executed by the processor.

10. A device for detecting a nodule in a thyroid ultrasound image based on deep learning, the device comprising:

the preprocessing module is used for preprocessing the thyroid ultrasound image;

the feature extraction module is used for performing feature extraction on the preprocessed thyroid ultrasound image so as to obtain a feature image;

the acquisition module is used for respectively inputting the acquired feature images into corresponding classification and regression structures so as to acquire specific position information of the thyroid nodule region in each feature image;

the calculation module is used for calculating the classification loss, the center point distance regression loss and the offset loss generated by the feature image input into the corresponding classification and regression structure, and obtaining the total loss of the model to be trained through weighted summation calculation;

and the training and testing module is used for training and testing the model to be trained.