CN111539947A - Image detection method, training method of related model, related device and equipment

Image detection method, training method of related model, related device and equipment

Info

Publication number: CN111539947A
Application number: CN202010362766.XA
Authority: CN (China)
Prior art keywords: detection model, image, organ, medical image, region
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN111539947B (en)
Inventors: 黄锐, 胡志强, 张少霆, 李鸿升
Assignee: Shanghai Sensetime Intelligent Technology Co Ltd

Events

  • Application CN202010362766.XA filed by Shanghai Sensetime Intelligent Technology Co Ltd
  • Publication of CN111539947A
  • Priority to JP2021576932A (JP2022538137A)
  • Priority to PCT/CN2020/140325 (WO2021218215A1)
  • Priority to KR1020217043241A (KR20220016213A)
  • Priority to TW110109420A (TW202145249A)
  • Application granted; publication of CN111539947B

Classifications

  • G06T7/0012: Image analysis; inspection of images; biomedical image inspection
  • G06T5/20: Image enhancement or restoration using local operators
  • G06T2207/10081: Image acquisition modality; tomographic images; computed x-ray tomography [CT]
  • G06T2207/10088: Image acquisition modality; tomographic images; magnetic resonance imaging [MRI]
  • G06T2207/20081: Special algorithmic details; training, learning
  • G06T2207/20084: Special algorithmic details; artificial neural networks [ANN]


Abstract

The application discloses an image detection method, a training method for a related model, and related devices and equipment. The training method for the image detection model includes the following steps: acquiring a sample medical image in which an actual region of at least one unlabeled organ is pseudo-labeled; detecting the sample medical image with an original detection model to obtain a first detection result including a first prediction region of the unlabeled organ; detecting the sample medical image with an image detection model to obtain a second detection result including a second prediction region of the unlabeled organ, where the network parameters of the image detection model are determined based on the network parameters of the original detection model; and adjusting the network parameters of the original detection model using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region. This scheme can improve detection accuracy in multi-organ detection.

Description

Image detection method, training method of related model, related device and equipment
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an image detection method, a training method for a related model, and related devices and apparatuses.
Background
Medical images such as CT (Computed Tomography) and MRI (Magnetic Resonance Imaging) images are clinically significant. Multi-organ detection, which determines the region corresponding to each organ on a medical image such as a CT or MRI image, has wide clinical application, for example in computer-aided diagnosis and radiotherapy planning. A trained image detection model suitable for multi-organ detection therefore has high application value.
Currently, model training relies on large labeled datasets. However, in the field of medical imaging, obtaining large quantities of high-quality multi-organ labels is very time-consuming and laborious, and often only experienced radiologists are able to label the data. Constrained by this, conventional image detection models often suffer from low accuracy in multi-organ detection. In view of the above, how to improve detection accuracy in multi-organ detection is an urgent problem to be solved.
Disclosure of Invention
The application provides an image detection method, a training method of a related model, a related device and equipment.
The first aspect of the present application provides a training method for an image detection model, including: acquiring a sample medical image, wherein an actual region of at least one unlabeled organ is pseudo-labeled in the sample medical image; detecting the sample medical image with an original detection model to obtain a first detection result, wherein the first detection result includes a first prediction region of the unlabeled organ; detecting the sample medical image with the image detection model to obtain a second detection result, wherein the second detection result includes a second prediction region of the unlabeled organ, and the network parameters of the image detection model are determined based on the network parameters of the original detection model; and adjusting the network parameters of the original detection model using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region.
Therefore, because the actual region of at least one unlabeled organ is pseudo-labeled in the acquired sample medical image, the multiple organs do not need to be truly annotated in the sample medical image. The original detection model detects the sample medical image to obtain a first detection result containing a first prediction region of the unlabeled organ, and the image detection model detects the sample medical image to obtain a second detection result containing a second prediction region of the unlabeled organ; the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region are then used to adjust the network parameters of the original detection model. Because the network parameters of the image detection model are determined based on the network parameters of the original detection model, the image detection model can supervise the training of the original detection model and constrain the error that the pseudo-labeled actual regions would otherwise accumulate in the network parameters over multiple trainings. This improves the accuracy of the image detection model, which in turn supervises the training of the original detection model more accurately, so the network parameters of the original detection model are adjusted accurately during training, and detection accuracy in multi-organ detection is improved.
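As a non-limiting illustration of this training scheme, the following minimal sketch shows one training iteration in a PyTorch setting; the stand-in network, loss forms, learning rate, and tensor shapes are illustrative assumptions, not the implementation of the present application (the derivation of the image detection model's parameters from the original detection model's parameters is sketched further below):

```python
import copy
import torch
import torch.nn as nn

# Stand-in networks: one output channel per organ.
original_model = nn.Conv2d(1, 4, 3, padding=1)   # "original detection model"
image_model = copy.deepcopy(original_model)      # "image detection model"
for p in image_model.parameters():
    p.requires_grad_(False)                      # not trained by gradients; its parameters
                                                 # are derived from the original model

optimizer = torch.optim.SGD(original_model.parameters(), lr=1e-3)

def train_step(image, pseudo_actual_region):
    # image: (B, 1, H, W); pseudo_actual_region: (B, 4, H, W) in [0, 1]
    first_result = original_model(image)          # first detection result
    with torch.no_grad():
        second_result = image_model(image)        # second detection result
    # difference to the pseudo-labeled actual region
    loss_actual = nn.functional.binary_cross_entropy_with_logits(
        first_result, pseudo_actual_region)
    # difference to the image detection model's prediction
    loss_consistency = nn.functional.mse_loss(first_result, second_result)
    loss = loss_actual + loss_consistency
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

train_step(torch.randn(2, 1, 64, 64), torch.rand(2, 4, 64, 64))
```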
Wherein the original detection model includes a first original detection model and a second original detection model, and the image detection model includes a first image detection model corresponding to the first original detection model and a second image detection model corresponding to the second original detection model. Detecting the sample medical image with the original detection model to obtain the first detection result includes: performing the step of detecting the sample medical image to obtain the first detection result with the first original detection model and with the second original detection model respectively. Detecting the sample medical image with the image detection model to obtain the second detection result includes: performing the step of detecting the sample medical image to obtain the second detection result with the first image detection model and with the second image detection model respectively. Adjusting the network parameters of the original detection model using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region includes: adjusting the network parameters of the first original detection model using the difference between the first prediction region of the first original detection model and the actual region, and the difference between that first prediction region and the second prediction region of the second image detection model; and adjusting the network parameters of the second original detection model using the difference between the first prediction region of the second original detection model and the actual region, and the difference between that first prediction region and the second prediction region of the first image detection model.
Therefore, with the original detection model comprising a first and a second original detection model, and the image detection model comprising a corresponding first and second image detection model, the first image detection model (corresponding to the first original detection model) supervises the training of the second original detection model, and the second image detection model (corresponding to the second original detection model) supervises the training of the first original detection model. This crossed supervision further constrains the error that the pseudo-labeled actual regions accumulate in the network parameters over multiple trainings, improving the accuracy of the image detection model.
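The crossed supervision can be sketched as follows, under the same illustrative assumptions as above (two "original" models A and B, two corresponding "image" models, per-organ logit maps; the loss forms are placeholders):

```python
import torch
import torch.nn.functional as F

def cross_supervised_losses(first_a, first_b, second_a, second_b, actual):
    """first_a / first_b: logits of the first / second original detection model;
    second_a / second_b: logits of their corresponding image detection models;
    actual: pseudo-labeled actual regions."""
    # First original model: supervised by the actual region and by the
    # SECOND image detection model's prediction.
    loss_a = (F.binary_cross_entropy_with_logits(first_a, actual)
              + F.mse_loss(first_a, second_b.detach()))
    # Second original model: supervised by the actual region and by the
    # FIRST image detection model's prediction.
    loss_b = (F.binary_cross_entropy_with_logits(first_b, actual)
              + F.mse_loss(first_b, second_a.detach()))
    return loss_a, loss_b
```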
Wherein adjusting the network parameters of the original detection model using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region includes: determining a first loss value of the original detection model from the difference between the first prediction region and the actual region; determining a second loss value of the original detection model from the difference between the first prediction region and the second prediction region; and adjusting the network parameters of the original detection model using the first loss value and the second loss value.
Therefore, the loss of the original detection model is measured along two dimensions: the difference between the first prediction region and the pseudo-labeled actual region, and the difference between the first prediction region and the second prediction region predicted by the corresponding image detection model. This improves the accuracy of the loss calculation, and hence the accuracy of the network parameters of the original detection model and of the image detection model.
Wherein determining the first loss value of the original detection model using the difference between the first prediction region and the actual region includes: processing the first prediction region and the actual region with a focal loss function to obtain a focal first loss value; and/or processing the first prediction region and the actual region with a set-similarity (Dice) loss function to obtain a set-similarity first loss value. And/or, determining the second loss value of the original detection model using the difference between the first prediction region and the second prediction region includes: processing the first prediction region and the second prediction region with a consistency loss function to obtain the second loss value. And/or, adjusting the network parameters of the original detection model using the first loss value and the second loss value includes: weighting the first loss value and the second loss value to obtain a weighted loss value; and adjusting the network parameters of the original detection model with the weighted loss value.
Therefore, processing the first prediction region and the actual region with the focal loss function to obtain the focal first loss value improves the model's attention to difficult samples; processing them with the set-similarity loss function to obtain the set-similarity first loss value helps the model fit the pseudo-labeled actual region; processing the first and second prediction regions with the consistency loss function to obtain the second loss value improves the consistency between the original detection model and the image detection model; and weighting the first and second loss values into a weighted loss value with which the network parameters are adjusted balances the importance of each loss during training. Each of these improves the accuracy of the network parameters and hence of the image detection model.
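The three loss terms named above can be sketched as follows, using standard focal and Dice formulations; the hyperparameters (gamma, alpha, smoothing, weights) are illustrative assumptions, as the text does not fix them:

```python
import torch
import torch.nn.functional as F

def focal_loss(pred_logits, target, gamma=2.0, alpha=0.25):
    # Focal loss: down-weights easy examples so training focuses on hard ones.
    p = torch.sigmoid(pred_logits)
    ce = F.binary_cross_entropy_with_logits(pred_logits, target, reduction="none")
    p_t = p * target + (1 - p) * (1 - target)
    alpha_t = alpha * target + (1 - alpha) * (1 - target)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

def dice_loss(pred_logits, target, eps=1.0):
    # Set-similarity (Dice) loss: fits the prediction to the (pseudo-)labeled region.
    p = torch.sigmoid(pred_logits)
    inter = (p * target).sum()
    return 1 - (2 * inter + eps) / (p.sum() + target.sum() + eps)

def consistency_loss(first_logits, second_logits):
    # Consistency between the original and image detection models; the text
    # mentions a cross-entropy form, so a soft cross entropy is one option.
    second_prob = torch.sigmoid(second_logits).detach()
    return F.binary_cross_entropy_with_logits(first_logits, second_prob)

def total_loss(first_logits, second_logits, target, w1=0.5, w2=0.5):
    # Weighted combination of the first and second loss values.
    first = focal_loss(first_logits, target) + dice_loss(first_logits, target)
    second = consistency_loss(first_logits, second_logits)
    return w1 * first + w2 * second
```

The weights w1 and w2 correspond to the weighting of the first and second loss values described above and are a design choice.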
Wherein the sample medical image further includes an actual region of a labeled organ, the first detection result further includes a first prediction region of the labeled organ, and the second detection result further includes a second prediction region of the labeled organ. Determining the first loss value of the original detection model using the difference between the first prediction region and the actual region includes: determining the first loss value using the differences between the first prediction regions and the actual regions of both the unlabeled organ and the labeled organ. Determining the second loss value of the original detection model using the difference between the first prediction region and the second prediction region includes: determining the second loss value using only the difference between the first prediction region of the unlabeled organ and the corresponding second prediction region.
Therefore, with the actual region of the labeled organ also present in the sample medical image, the differences between the first prediction regions and the actual regions of all organs are considered when determining the first loss value, while only the difference between the first prediction region of the unlabeled organ and the corresponding second prediction region is considered when determining the second loss value. This improves the robustness of the consistency constraint between the original detection model and the image detection model, and hence the accuracy of the image detection model.
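Restricting the consistency term to the unlabeled organs can be sketched as a channel mask; the channel-per-organ layout below is an assumption for illustration only:

```python
import torch

def masked_consistency(first_logits, second_logits, unlabeled_mask):
    """first_logits: original model output (B, C, H, W); second_logits: image
    model output; unlabeled_mask: (C,) bool, True for the organ channels that
    are NOT truly labeled."""
    s = first_logits[:, unlabeled_mask]
    t = torch.sigmoid(second_logits[:, unlabeled_mask]).detach()
    return torch.nn.functional.binary_cross_entropy_with_logits(s, t)

# Example: channels [kidney, spleen, pancreas, liver], liver truly labeled:
unlabeled = torch.tensor([True, True, True, False])
```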
Wherein, after adjusting the network parameters of the original detection model using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region, the method further includes: updating the network parameters of the image detection model using the network parameters adjusted during the current training and several previous trainings.
Therefore, updating the network parameters of the image detection model with the network parameters of the original detection model adjusted during the current training and several previous trainings further constrains the error that the pseudo-labeled actual regions accumulate in the network parameters over multiple trainings, improving the accuracy of the image detection model.
Wherein updating the network parameters of the image detection model using the network parameters adjusted during the current training and several previous trainings includes: computing the average of each network parameter as adjusted by the original detection model during the current training and several previous trainings; and updating each network parameter of the image detection model to the average of the corresponding network parameter of the original detection model.
Therefore, computing the averages of the network parameters adjusted by the original detection model during the current training and several previous trainings, and updating the network parameters of the image detection model to the averages of the corresponding network parameters, quickly constrains the error accumulated over multiple trainings and improves the accuracy of the image detection model.
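One way to realize this update is to keep the last n adjusted parameter snapshots of the original detection model and set each parameter of the image detection model to their element-wise mean; the deque bookkeeping below is an illustrative choice, not prescribed by the text:

```python
from collections import deque
import copy
import torch

n = 5                       # number of trainings to average over (e.g. 5, 10, 15)
history = deque(maxlen=n)   # snapshots of the adjusted network parameters

def update_image_model(original_model, image_model):
    # Record the parameters just adjusted in the current training, then set
    # each parameter of the image detection model to the average of the
    # corresponding parameter over the last n trainings.
    history.append(copy.deepcopy(original_model.state_dict()))
    averaged = {name: torch.stack([snap[name].float() for snap in history]).mean(dim=0)
                for name in history[0]}
    image_model.load_state_dict(averaged)
```

With n = 1 this reduces to copying the adjusted parameters directly; a larger n smooths out the error that any single training contributes.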
Wherein acquiring the sample medical image includes: acquiring a medical image to be pseudo-labeled, in which at least one organ is unlabeled; detecting the medical image to be pseudo-labeled with the single-organ detection model corresponding to each unlabeled organ to obtain the organ prediction region of each unlabeled organ; pseudo-labeling the organ prediction region of each unlabeled organ as the actual region of that organ; and taking the pseudo-labeled medical image as the sample medical image.
Therefore, a medical image to be pseudo-labeled containing at least one unlabeled organ is acquired and detected with the single-organ detection model corresponding to each unlabeled organ to obtain the organ prediction region of each unlabeled organ; the organ prediction regions are pseudo-labeled as the actual regions of the unlabeled organs, and the pseudo-labeled medical image is taken as the sample medical image. The single-organ detection models thus replace the manual annotation of multiple organs, which helps reduce the labor cost of training an image detection model for multi-organ detection and improves training efficiency.
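This pseudo-labeling step can be sketched as follows, assuming each single-organ detection model outputs a single-channel logit map for its organ; the model loading and the 0.5 threshold are illustrative assumptions:

```python
import torch

@torch.no_grad()
def pseudo_label(image, single_organ_models):
    """image: (1, 1, H, W) medical image to be pseudo-labeled;
    single_organ_models: dict mapping organ name (e.g. "kidney") to a model
    whose output is a single-channel logit map for that organ."""
    actual_regions = {}
    for organ, model in single_organ_models.items():
        model.eval()
        prob = torch.sigmoid(model(image))            # organ prediction region
        actual_regions[organ] = (prob > 0.5).float()  # taken as pseudo actual region
    return actual_regions
```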
Wherein the medical image to be pseudo-labeled includes at least one labeled organ; before detecting the medical image to be pseudo-labeled with the single-organ detection model corresponding to each unlabeled organ, the method further includes: training the single-organ detection model corresponding to the labeled organ with the medical image to be pseudo-labeled.
Therefore, since the medical image to be pseudo-labeled includes at least one labeled organ, using it to train the single-organ detection model corresponding to that labeled organ improves the accuracy of the single-organ detection model, which in turn improves the accuracy of the subsequent pseudo-labeling and of the subsequently trained image detection model.
Wherein acquiring the medical image to be pseudo-labeled includes: acquiring a three-dimensional medical image and preprocessing it; and cropping the preprocessed three-dimensional medical image to obtain at least one two-dimensional medical image to be pseudo-labeled.
Therefore, acquiring and preprocessing the three-dimensional medical image and then cropping it into at least one two-dimensional medical image to be pseudo-labeled helps obtain medical images suitable for model training, which improves the accuracy of subsequent image detection model training.
Wherein the preprocessing of the three-dimensional medical image comprises at least one of: adjusting the voxel resolution of the three-dimensional medical image to a preset resolution; normalizing the voxel value of the three-dimensional medical image to be within a preset range by utilizing a preset window value; gaussian noise is added to at least a portion of the voxels of the three-dimensional medical image.
Therefore, adjusting the voxel resolution of the three-dimensional medical image to a preset resolution facilitates subsequent model prediction; normalizing the voxel values into a preset range with a preset window helps the model extract accurate features; and adding Gaussian noise to at least part of the voxels enables data augmentation, improving data diversity and the accuracy of subsequent model training.
A second aspect of the present application provides an image detection method, including: acquiring a medical image to be detected, wherein the medical image to be detected includes a plurality of organs; and detecting the medical image to be detected with an image detection model to obtain prediction regions of the plurality of organs, wherein the image detection model is trained with the training method of the image detection model in the first aspect.
Therefore, detecting the medical image to be detected with the image detection model trained as in the first aspect obtains the prediction regions of the plurality of organs, improving detection accuracy in multi-organ detection.
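Inference with the trained image detection model then reduces to a single forward pass; the following minimal sketch assumes the same channel-per-organ layout as above, and the organ list and threshold are illustrative:

```python
import torch

@torch.no_grad()
def detect(image_detection_model, image,
           organ_names=("kidney", "spleen", "liver", "pancreas")):
    """image: (1, 1, H, W) medical image to be detected; one output channel
    per organ is assumed."""
    image_detection_model.eval()
    probs = torch.sigmoid(image_detection_model(image))      # (1, C, H, W)
    return {name: probs[0, i] > 0.5 for i, name in enumerate(organ_names)}
```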
A third aspect of the present application provides a training device for an image detection model, including an image acquisition module, a first detection module, a second detection module, and a parameter adjustment module. The image acquisition module is configured to acquire a sample medical image in which an actual region of at least one unlabeled organ is pseudo-labeled; the first detection module is configured to detect the sample medical image with the original detection model to obtain a first detection result, wherein the first detection result includes a first prediction region of the unlabeled organ; the second detection module is configured to detect the sample medical image with the image detection model to obtain a second detection result, wherein the network parameters of the image detection model are determined based on the network parameters of the original detection model, and the second detection result includes a second prediction region of the unlabeled organ; and the parameter adjustment module is configured to adjust the network parameters of the original detection model using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region.
A fourth aspect of the present application provides an image detection apparatus, including an image acquisition module and an image detection module. The image acquisition module is configured to acquire a medical image to be detected, wherein the medical image to be detected includes a plurality of organs; the image detection module is configured to detect the medical image to be detected with the image detection model to obtain prediction regions of the plurality of organs, wherein the image detection model is trained with the training device of the image detection model in the third aspect.
A fifth aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory to implement the method for training an image detection model in the first aspect or to implement the method for image detection in the second aspect.
A sixth aspect of the present application provides a computer-readable storage medium, on which program instructions are stored, which program instructions, when executed by a processor, implement the method for training an image detection model in the above first aspect, or implement the method for image detection in the above second aspect.
In the above scheme, a sample medical image is acquired in which an actual region of at least one unlabeled organ is pseudo-labeled, so the multiple organs do not need to be truly annotated in the sample medical image. The original detection model detects the sample medical image to obtain a first detection result containing a first prediction region of the unlabeled organ, and the image detection model detects the sample medical image to obtain a second detection result containing a second prediction region of the unlabeled organ; the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region are then used to adjust the network parameters of the original detection model. Because the network parameters of the image detection model are determined based on the network parameters of the original detection model, the image detection model can supervise the training of the original detection model and constrain the error that the pseudo-labeled actual regions would otherwise accumulate in the network parameters over multiple trainings. This improves the accuracy of the image detection model, which in turn supervises the training of the original detection model more accurately, so the network parameters of the original detection model are adjusted accurately during training, and detection accuracy in multi-organ detection is improved.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a training method for an image detection model according to the present application;
FIG. 2 is a flowchart illustrating an embodiment of step S11 in FIG. 1;
FIG. 3 is a schematic flowchart of another embodiment of a training method for an image detection model according to the present application;
FIG. 4 is a schematic diagram of an embodiment of a training process for an image detection model;
FIG. 5 is a schematic flowchart of an embodiment of an image detection method of the present application;
FIG. 6 is a block diagram of an embodiment of a training device for an image detection model according to the present application;
FIG. 7 is a block diagram of an embodiment of an image detection apparatus according to the present application;
FIG. 8 is a block diagram of an embodiment of an electronic device of the present application;
FIG. 9 is a block diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating an embodiment of a training method for an image detection model according to the present application. Specifically, the method may include the steps of:
step S11: a sample medical image is acquired, wherein the sample medical image pseudo-labels an actual region of at least one unlabeled organ.
The sample medical image may be, but is not limited to, a CT image or an MR image. In a specific implementation scenario, the sample medical image may be obtained by scanning the abdomen, the chest, the skull, and so on, as set according to the actual application, which is not limited herein. For example, when the abdomen is scanned, the organs in the sample medical image may include the kidney, spleen, liver, pancreas, etc.; when the chest is scanned, the organs may include the heart, lung lobes, thyroid gland, etc.; when the skull is scanned, the organs may include the brainstem, cerebellum, diencephalon, telencephalon, etc.
In one implementation scenario, the actual region of an unlabeled organ may be detected with the single-organ detection model corresponding to that organ. For example, if the sample medical image is obtained by abdominal scanning and the unlabeled organs include at least one of the kidney, spleen, liver, and pancreas, the sample medical image is detected with the single-organ detection model corresponding to the kidney to obtain the organ prediction region of the kidney, with the model corresponding to the spleen to obtain the organ prediction region of the spleen, with the model corresponding to the liver to obtain the organ prediction region of the liver, and with the model corresponding to the pancreas to obtain the organ prediction region of the pancreas. The organ prediction regions of the kidney, spleen, liver, and pancreas are then pseudo-labeled in the sample medical image, yielding the pseudo-labeled actual regions of these unlabeled organs. In the embodiments of the present application, pseudo-labeling refers to taking the organ prediction region of an unlabeled organ, as detected by a single-organ detection model, as its actual region. When the unlabeled organs are other organs, the analogy can be repeated; examples are not given here one by one. In a specific implementation scenario, the single-organ detection model of an unlabeled organ is trained with a single-organ dataset in which the actual region of that organ is labeled; for example, the single-organ detection model corresponding to the kidney is trained with a kidney dataset in which the actual region of the kidney is labeled, the single-organ detection model corresponding to the spleen is trained with a spleen dataset in which the actual region of the spleen is labeled, and so on.
Step S12: and detecting the sample medical image by using the original detection model to obtain a first detection result, wherein the first detection result comprises a first prediction region of the unmarked organ.
The original detection model may specifically be any one of Mask R-CNN (Mask Region-based Convolutional Neural Network), FCN (Fully Convolutional Network), and PSP-Net (Pyramid Scene Parsing Network), or may be SegNet, U-Net, or the like, and may be set according to the actual situation, which is not limited herein.
The original detection model is used for detecting the sample medical image, and a first detection result of a first prediction region containing an unmarked organ can be obtained. For example, the sample medical image is an image obtained by scanning an abdomen, and the unlabeled organs include a kidney, a spleen, and a pancreas, so that the first prediction region of the kidney, the first prediction region of the spleen, and the first prediction region of the pancreas can be obtained by detecting the sample medical image by using the original detection model, and other scenes can be analogized, which is not illustrated here.
Step S13: and detecting the sample medical image by using the image detection model to obtain a second detection result, wherein the second detection result comprises a second prediction region of the unmarked organ.
The image detection model may have the same network structure as its corresponding original detection model. For example, when the original detection model is Mask R-CNN, the corresponding image detection model may also be Mask R-CNN; when the original detection model is FCN, the corresponding image detection model may also be FCN; when the original detection model is PSP-Net, the corresponding image detection model may also be PSP-Net; and so on for other networks, which are not illustrated here one by one.
The network parameters of the image detection model may be determined based on the network parameters of the original detection model; for example, they may be obtained from the network parameters of the original detection model as adjusted over several training iterations. At the k-th training, the network parameters of the image detection model may be obtained from the network parameters of the original detection model adjusted during the (k-n)-th to (k-1)-th trainings; at the (k+1)-th training, they may be obtained from the network parameters adjusted during the (k+1-n)-th to k-th trainings; and so on. The number n of trainings may be set according to the actual situation, such as 5, 10, or 15, and is not limited herein.
And detecting the sample medical image by using the image detection model to obtain a second detection result of a second prediction region containing the unmarked organ. Still taking the sample medical image as an example of an image obtained by scanning the abdomen, the unlabeled organs include the kidney, the spleen and the pancreas, so that the second prediction region of the kidney, the second prediction region of the spleen and the second prediction region of the pancreas can be obtained by detecting the sample medical image by using the image detection model, and other scenes can be analogized, which is not illustrated herein.
In an implementation scenario, the steps S12 and S13 may be executed in a sequential order, for example, step S12 is executed first, and then step S13 is executed; alternatively, step S13 is executed first, and then step S12 is executed. In another implementation scenario, the step S12 and the step S13 may also be executed simultaneously, and may be specifically set according to an actual application, which is not limited herein.
Step S14: and adjusting the network parameters of the original detection model by using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region.
Specifically, a first loss value of the original detection model may be determined from the difference between the first prediction region and the actual region. For example, to improve the model's attention to difficult samples, the first prediction region and the actual region may be processed with a focal loss function to obtain a focal first loss value; alternatively, to fit the model to the pseudo-labeled actual region, the first prediction region and the actual region may be processed with a set-similarity (Dice) loss function to obtain a set-similarity first loss value.
Specifically, the difference between the first prediction region and the second prediction region may also be used to determine a second loss value of the original detection model. For example, in order to improve the consistency of the predictions of the original detection model and the image detection model, the consistency loss function may be used to process the first prediction region and the second prediction region to obtain the second loss value, and in a specific implementation scenario, the consistency loss function may be a cross entropy loss function, and may be specifically set according to an actual application situation, which is not limited herein.
Specifically, the network parameters of the original detection model can be adjusted using the first loss value and the second loss value. For example, to balance the importance of each loss value during training, the first loss value and the second loss value may be weighted to obtain a weighted loss value, and the network parameters of the original detection model adjusted with the weighted loss value. The weights of the first and second loss values may be set according to the actual situation, e.g., both 0.5, or 0.6 for the first loss value and 0.4 for the second, which is not limited herein. In addition, when the first loss value includes a focal first loss value and a set-similarity first loss value, the focal first loss value, the set-similarity first loss value, and the second loss value may be weighted to obtain the weighted loss value, with which the network parameters of the original detection model are adjusted. In a specific implementation scenario, the network parameters may be adjusted with the weighted loss value by stochastic gradient descent (SGD), batch gradient descent (BGD), mini-batch gradient descent (MBGD), or the like. Batch gradient descent updates the parameters with all samples at each iteration; stochastic gradient descent updates the parameters with one sample at each iteration; mini-batch gradient descent updates the parameters with a batch of samples at each iteration, which is not described in detail here.
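The three update schemes differ only in how many samples contribute to each parameter update, as the following toy sketch illustrates (the dataset and model are stand-ins):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.randn(100, 8), torch.randn(100, 1))
model = torch.nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

def run_epoch(batch_size):
    # batch_size = len(data): batch gradient descent (all samples per update)
    # batch_size = 1:         stochastic gradient descent (one sample per update)
    # 1 < batch_size < len:   mini-batch gradient descent
    for x, y in DataLoader(data, batch_size=batch_size, shuffle=True):
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(x), y).backward()
        opt.step()

run_epoch(batch_size=100)  # BGD
run_epoch(batch_size=1)    # SGD
run_epoch(batch_size=16)   # MBGD
```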
In one implementation scenario, the sample medical image may further include an actual region of a labeled organ, the first detection result may further include a first prediction region of the labeled organ, and the second detection result may further include a second prediction region of the labeled organ. Still taking a sample medical image obtained by scanning the abdomen as an example, with the unlabeled organs including the kidney, spleen, and pancreas and the labeled organ including the liver: detecting the sample medical image with the original detection model yields first prediction regions for the unlabeled kidney, spleen, and pancreas and for the labeled liver, and detecting it with the corresponding image detection model yields second prediction regions for the same organs. A first loss value of the original detection model can then be determined using the differences between the first prediction regions and the actual regions of both the unlabeled and labeled organs, and a second loss value using only the differences between the first prediction regions and the corresponding second prediction regions of the unlabeled organs. Concretely, the first loss value may be determined from the differences between the first prediction regions of the kidney, spleen, and pancreas and their pseudo-labeled actual regions, together with the difference between the first prediction region of the liver and its truly labeled actual region; the first loss value may specifically include at least one of a focal first loss value and a set-similarity first loss value, as described in the foregoing steps, which is not repeated here. The second loss value may be determined from the differences between the first and second prediction regions of the kidney, the spleen, and the pancreas.
Therefore, when the first loss value of the original detection model is determined, the difference between the first prediction region and the actual region is comprehensively considered, and when the second loss value of the original detection model is determined, only the difference between the first prediction region of the unmarked organ and the corresponding second prediction region is considered, so that the robustness of the consistency constraint of the original detection model and the image detection model can be improved, and the accuracy of the image detection model can be further improved.
In another implementation scenario, after the network parameters of the original detection model are adjusted, the network parameters of the image detection model can be updated with the network parameters adjusted during the current training and several previous trainings, which further constrains the error that the pseudo-labeled actual regions accumulate in the network parameters over multiple trainings and improves the accuracy of the image detection model. Alternatively, the network parameters of the image detection model may be left unchanged after an adjustment and updated only after a preset number of trainings (e.g., 2 or 3), using the network parameters adjusted during the current training and the previous trainings, which is not limited herein. For example, at the k-th training the network parameters of the image detection model may not be updated, while at the (k+i)-th training they may be updated with the network parameters of the original detection model from the (k+i-n)-th to (k+i)-th trainings, where i may be set to an integer not less than 1 according to the actual situation, such as 1, 2, or 3, which is not limited herein.
In a specific implementation scenario, when updating the network parameters of the image detection model, the averages of the network parameters adjusted by the original detection model during the current training and several previous trainings may be computed, and each network parameter of the image detection model updated to the average of the corresponding network parameter of the original detection model. In this embodiment, each average corresponds to one and the same network parameter; specifically, it may be the average of a given weight (or bias) of a given neuron over its values adjusted in multiple trainings, so that the average of each neuron's weight (or bias) over those trainings is computed and used to update the corresponding weight (or bias) of that neuron in the image detection model. For example, at the k-th training, the averages of the network parameters adjusted during the current training and the preceding n-1 trainings may be computed, where n may be set according to the actual application, e.g., 5, 10, or 15, and is not limited herein. Thus, at the (k+1)-th training, the network parameters of the image detection model are the averages of the network parameters adjusted during the (k-n+1)-th to k-th trainings, which quickly constrains the error accumulated over multiple trainings and improves the accuracy of the image detection model.
In yet another implementation scenario, a preset training-end condition may be set; if it is not met, step S12 and the subsequent steps may be executed again to continue adjusting the network parameters of the original detection model. In a specific implementation scenario, the preset training-end condition may include either of the following: the current number of trainings reaches a preset threshold (e.g., 500 or 1000), or the loss value of the original detection model is smaller than a preset loss threshold, which is not limited herein. In another specific implementation scenario, after training ends, the medical image to be detected can be detected with the image detection model to directly obtain the regions corresponding to the multiple organs, omitting the separate detections with multiple single-organ detection models and thereby reducing the detection computation.
In the above scheme, a sample medical image is acquired in which an actual region of at least one unlabeled organ is pseudo-labeled, so the multiple organs do not need to be truly annotated in the sample medical image. The original detection model detects the sample medical image to obtain a first detection result containing a first prediction region of the unlabeled organ, and the image detection model detects the sample medical image to obtain a second detection result containing a second prediction region of the unlabeled organ; the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region are then used to adjust the network parameters of the original detection model. Because the network parameters of the image detection model are determined based on the network parameters of the original detection model, the image detection model can supervise the training of the original detection model and constrain the error that the pseudo-labeled actual regions would otherwise accumulate in the network parameters over multiple trainings. This improves the accuracy of the image detection model, which in turn supervises the training of the original detection model more accurately, so the network parameters of the original detection model are adjusted accurately during training, and detection accuracy in multi-organ detection is improved.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating an embodiment of step S11 in fig. 1. Specifically, fig. 2 is a schematic flow chart of an embodiment of acquiring a sample medical image, which specifically includes the following steps:
step S111: acquiring a medical image to be subjected to pseudo-labeling, wherein at least one unmarked organ exists in the medical image to be subjected to pseudo-labeling.
The medical image to be pseudo-labeled may be obtained by scanning the abdomen, in which case its unlabeled organs may include, for example, at least one of the kidney, spleen, liver, and pancreas; the medical image to be pseudo-labeled may also be obtained by scanning other parts, such as the chest or the skull. Reference may be made to the relevant steps in the foregoing embodiments, which is not limited herein.
In an implementation scenario, the acquired original medical image may be a three-dimensional medical image, such as a three-dimensional CT or MR image, which is not limited herein. The three-dimensional medical image may be preprocessed, and the preprocessed three-dimensional medical image cropped to obtain at least one medical image to be pseudo-labeled. The cropping may specifically be a center crop of the preprocessed three-dimensional medical image, which is not limited herein. For example, two-dimensional medical images to be pseudo-labeled may be obtained by cutting the three-dimensional medical image along parallel planes, advancing in the dimension perpendicular to those planes. The specific size of the medical image to be pseudo-labeled may be set according to the actual situation, e.g., 352 × 352, and is not limited herein.
In one specific implementation scenario, the preprocessing may include adjusting the voxel resolution of the three-dimensional medical image to a preset resolution. A voxel is the minimum unit into which the three-dimensional medical image is divided in three-dimensional space. The preset resolution may be 1 × 3 mm, or another resolution set according to the actual situation, e.g., 1 × 4 mm or 2 × 3 mm, which is not limited herein. Adjusting the voxel resolution to a preset resolution facilitates subsequent model prediction.
In another specific implementation scenario, the preprocessing may further include normalizing the voxel values of the three-dimensional medical image into a preset range using a preset window. The voxel value differs with the type of three-dimensional medical image; for a three-dimensional CT image, it is the HU (Hounsfield Unit) value. The preset window may be set according to the body part in the image; still taking a three-dimensional CT image as an example, for abdominal CT the preset window may be set to -125 to 275, and other parts may be set according to the actual situation, which is not illustrated here one by one. The preset range may be set according to the actual application, e.g., 0 to 1. With a window of -125 to 275 and a range of 0 to 1, voxels with values less than or equal to -125 are uniformly reset to 0, voxels with values greater than or equal to 275 are uniformly reset to 1, and voxels with values between -125 and 275 are mapped into 0 to 1. This enhances the contrast between different organs in the image and helps the model extract accurate features.
In yet another specific implementation scenario, the preprocessing may further include adding Gaussian noise to at least part of the voxels of the three-dimensional medical image. The portion may be set according to the actual application, e.g., 1/3, 1/2, or all of the voxels, which is not limited herein. By adding Gaussian noise to at least part of the voxels, two-dimensional medical images to be pseudo-labeled can be cropped from both the noisy three-dimensional medical image and the original one without Gaussian noise, which enables data augmentation, improves data diversity, and improves the accuracy of subsequent model training.
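The three preprocessing operations can be sketched as follows; the abdominal window of -125 to 275 follows the example above, while the target shape, noise fraction, and noise scale are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def preprocess(volume, target_shape, window=(-125.0, 275.0), noise_std=0.01):
    """volume: (D, H, W) float tensor of HU values."""
    # 1) resample to a preset resolution, expressed here as a target voxel grid
    v = F.interpolate(volume[None, None], size=target_shape,
                      mode="trilinear", align_corners=False)[0, 0]
    # 2) window-normalize HU values into [0, 1]: clip, then linearly rescale
    lo, hi = window
    v = ((v - lo) / (hi - lo)).clamp(0.0, 1.0)
    # 3) add Gaussian noise to part of the voxels for data augmentation
    mask = torch.rand_like(v) < 0.5
    v = v + mask * torch.randn_like(v) * noise_std
    return v

vol = preprocess(torch.randn(64, 128, 128) * 200, target_shape=(64, 352, 352))
```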
Step S112: detect the medical image to be pseudo-labeled with the single-organ detection model corresponding to each unlabeled organ, respectively, to obtain the organ prediction region of each unlabeled organ.
In an implementation scenario, the single-organ detection model corresponding to each unlabeled organ may be obtained by training using a single-organ dataset labeled with the unlabeled organ, for example, the single-organ detection model corresponding to the kidney may be obtained by training using a single-organ dataset labeled with the kidney, the single-organ detection model corresponding to the spleen may be obtained by training using a single-organ dataset labeled with the spleen, and the like for other organs, which are not illustrated herein.
In another implementation scenario, the medical image to be pseudo-labeled may further include at least one labeled organ. In that case, the medical image to be pseudo-labeled may itself be used to train the single-organ detection model corresponding to its labeled organ. For example, if the medical image to be pseudo-labeled includes a labeled liver, it may be used to train the single-organ detection model corresponding to the liver, and so on.
In addition, the single-organ detection model may specifically be any one of Mask R-CNN (Mask Region-based Convolutional Neural Network), FCN (Fully Convolutional Network), PSP-Net (Pyramid Scene Parsing Network), SegNet, U-Net, and the like, and may be set according to the actual situation, which is not limited herein.
The organ prediction region of each unlabeled organ can be obtained by detecting the medical image to be pseudo-labeled with the single-organ detection model corresponding to that organ. Take a medical image to be pseudo-labeled obtained by scanning the abdomen, whose unlabeled organs include the kidney, the spleen, and the pancreas, as an example: detecting the image with the single-organ detection model corresponding to the kidney yields the organ prediction region of the kidney, detecting it with the model corresponding to the spleen yields the organ prediction region of the spleen, and detecting it with the model corresponding to the pancreas yields the organ prediction region of the pancreas. The detection steps for the individual unlabeled organs may be executed simultaneously, with the organ prediction regions of all unlabeled organs then pseudo-labeled together in the medical image, which can improve pseudo-labeling efficiency; for example, the kidney, spleen, and pancreas models may be run at the same time, and the resulting prediction regions pseudo-labeled uniformly afterwards. Alternatively, the detection steps may be executed sequentially, in which case each organ prediction region can be pseudo-labeled as it is obtained rather than in a final unified step; for example, the kidney, spleen, and pancreas models may be run one after another, and the finally obtained pseudo-labeled medical image still contains the prediction regions of the kidney, the spleen, and the pancreas. The choice may be made according to the actual situation and is not limited herein; a sketch of this per-organ pseudo-labeling is given below.
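A minimal sketch of this pseudo-labeling step follows; it assumes each single-organ detection model exposes a predict method returning a per-pixel foreground probability map, and the class indices and the predict interface are illustrative assumptions. Since the per-organ detections are independent, the loop body could equally be dispatched to parallel workers, matching the simultaneous variant described above.

```python
import numpy as np

def pseudo_label(image, organ_models):
    # organ_models maps a class index to its single-organ detection model,
    # e.g. {1: kidney_net, 2: spleen_net, 3: pancreas_net}; 0 is background.
    pseudo = np.zeros(image.shape[:2], dtype=np.int64)
    for class_id, model in organ_models.items():
        # Organ prediction region of this unlabeled organ (binary mask).
        mask = model.predict(image) > 0.5
        # Pseudo-label the prediction region as the organ's "actual region".
        pseudo[mask] = class_id
    return pseudo
```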
Step S113: pseudo-label the organ prediction region of the unlabeled organ as the actual region of the unlabeled organ, and take the pseudo-labeled medical image as a sample medical image.
After the organ prediction region of each unlabeled organ is obtained, the organ prediction region can be pseudo-labeled as the actual region of that organ, and the pseudo-labeled medical image is used as the sample medical image.
Different from the foregoing embodiment, a medical image to be pseudo-labeled having at least one unlabeled organ is acquired, the single-organ detection model corresponding to each unlabeled organ is used to detect it and obtain the organ prediction region of each unlabeled organ, the organ prediction region of the unlabeled organ is pseudo-labeled as its actual region, and the pseudo-labeled image is taken as the sample medical image. Using single-organ detection models in this way avoids the workload of manually labeling multiple organs, which reduces the labor cost of training an image detection model for multi-organ detection and improves training efficiency.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating a training method of an image detection model according to another embodiment of the present application. Specifically, the method may include the steps of:
step S31: a sample medical image is acquired, wherein the sample medical image pseudo-labels an actual region of at least one unlabeled organ.
Reference may be made in particular to the relevant steps in the preceding embodiments.
Step S32: the step of detecting the sample medical image to obtain a first detection result is performed by using the first original detection model and the second original detection model, respectively.
The original detection models may specifically comprise a first original detection model and a second original detection model. Each of the first original detection model and the second original detection model may specifically be any one of Mask R-CNN (Mask Region-based Convolutional Neural Network), FCN (Fully Convolutional Network), PSP-Net (Pyramid Scene Parsing Network), SegNet, U-Net, and the like, and may be set according to the actual situation, which is not limited herein.
The step of detecting the sample medical image to obtain the first detection result is performed by using the first original detection model and the second original detection model, which may refer to the relevant steps in the foregoing embodiments specifically, and details are not repeated herein. In one implementation scenario, the first detection result detected by the first original detection model may include a first prediction region of an unlabeled organ, or the first detection result detected by the first original detection model may further include a first prediction region of an unlabeled organ and a first prediction region of a labeled organ. In another implementation scenario, the first detection result detected by the second original detection model may include the first prediction region of the unlabeled organ, or the first detection result detected by the second original detection model may further include the first prediction region of the unlabeled organ and the first prediction region of the labeled organ.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating an embodiment of a training process of an image detection model. For ease of description, the first original detection model is denoted net1 and the second original detection model is denoted net2. As shown in fig. 4, the first original detection model net1 detects the sample medical image to obtain the first detection result corresponding to net1, and the second original detection model net2 detects the sample medical image to obtain the first detection result corresponding to net2.
Step S33: perform the step of detecting the sample medical image to obtain a second detection result by using the first image detection model and the second image detection model, respectively.
The image detection model may specifically include a first image detection model corresponding to the first original detection model and a second image detection model corresponding to the second original detection model, and specific network structures and network parameters of the first image detection model and the second image detection model may refer to relevant steps in the foregoing embodiments, which are not described herein again.
The step of detecting the sample medical image to obtain the second detection result is performed by using the first image detection model and the second image detection model, which may refer to the relevant steps in the foregoing embodiments specifically, and details are not repeated here. In an implementation scenario, the second detection result detected by the first image detection model may include a second prediction region of an unlabeled organ, or may further include a second prediction region of an unlabeled organ and a second prediction region of a labeled organ. In another implementation scenario, the second detection result detected by the second image detection model may include a second prediction region of an unlabeled organ, or may further include a second prediction region of an unlabeled organ and a second prediction region of a labeled organ.
Referring to fig. 4, for convenience of description, the first image detection model corresponding to the first original detection model net1 is denoted EMA net1, and the second image detection model corresponding to the second original detection model net2 is denoted EMA net2. As shown in fig. 4, the first image detection model EMA net1 detects the sample medical image to obtain the second detection result corresponding to EMA net1, and the second image detection model EMA net2 detects the sample medical image to obtain the second detection result corresponding to EMA net2.
In an implementation scenario, the steps S32 and S33 may be executed in a sequential order, for example, step S32 is executed first, and then step S33 is executed, or step S33 is executed first, and then step S32 is executed. In another implementation scenario, the step S32 and the step S33 may also be executed simultaneously, and may be specifically set according to an actual application, which is not limited herein.
Step S34: adjust the network parameters of the first original detection model by using the difference between the first prediction region of the first original detection model and the actual region, and the difference between that first prediction region and the second prediction region of the second image detection model.
Specifically, a first loss value of the first original detection model may be determined using the difference between the first prediction region of the first original detection model and the pseudo-labeled actual region, and a second loss value of the first original detection model may be determined using the difference between the first prediction region of the first original detection model and the second prediction region of the second image detection model, so that the network parameters of the first original detection model are adjusted using the first loss value and the second loss value. The specific calculation of the first and second loss values may refer to the related steps in the foregoing embodiments and is not repeated here. In a specific implementation scenario, when calculating the second loss value, only the first prediction region and the second prediction region of the unlabeled organ may be taken into account, which can improve the robustness of the consistency constraint between the first original detection model and the second image detection model, and further improve the accuracy of the image detection model.
Step S35: adjust the network parameters of the second original detection model by using the difference between the first prediction region of the second original detection model and the actual region, and the difference between that first prediction region and the second prediction region of the first image detection model.
Specifically, a first loss value of the second original detection model may be determined using the difference between the first prediction region of the second original detection model and the pseudo-labeled actual region, and a second loss value of the second original detection model may be determined using the difference between the first prediction region of the second original detection model and the second prediction region of the first image detection model, so that the network parameters of the second original detection model are adjusted using the first loss value and the second loss value. The specific calculation of the first and second loss values may refer to the related steps in the foregoing embodiments and is not repeated here. In a specific implementation scenario, when calculating the second loss value, only the first prediction region and the second prediction region of the unlabeled organ may be taken into account, which can improve the robustness of the consistency constraint between the second original detection model and the first image detection model, and further improve the accuracy of the image detection model.
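Steps S32 to S35 can be condensed into a single training step, roughly as in the following PyTorch sketch; it assumes the detection models are segmentation networks producing per-class logit maps, uses plain cross-entropy as a stand-in for the first loss and mean-squared error as the consistency second loss, and the weight alpha is illustrative (the concrete loss functions are discussed in the loss-function embodiments below).

```python
import torch
import torch.nn.functional as F

def cross_supervised_step(images, pseudo_labels, net1, net2, ema_net1, ema_net2,
                          opt1, opt2, unlabeled_classes, alpha=0.1):
    p1, p2 = net1(images), net2(images)              # first detection results
    with torch.no_grad():                            # teachers are not backpropagated
        q1, q2 = ema_net1(images), ema_net2(images)  # second detection results

    # First loss values: difference between each first prediction region
    # and the pseudo-labeled actual region.
    sup1 = F.cross_entropy(p1, pseudo_labels)
    sup2 = F.cross_entropy(p2, pseudo_labels)

    # Second loss values: each original model is kept consistent with the
    # *other* model's image detection model, on unlabeled-organ channels only.
    con1 = F.mse_loss(p1.softmax(1)[:, unlabeled_classes],
                      q2.softmax(1)[:, unlabeled_classes])
    con2 = F.mse_loss(p2.softmax(1)[:, unlabeled_classes],
                      q1.softmax(1)[:, unlabeled_classes])

    opt1.zero_grad(); (sup1 + alpha * con1).backward(); opt1.step()
    opt2.zero_grad(); (sup2 + alpha * con2).backward(); opt2.step()
```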
In an implementation scenario, the steps S34 and S35 may be executed in a sequential order, for example, step S34 first and then step S35, or step S35 first and then step S34. In another implementation scenario, step S34 and step S35 may also be executed simultaneously, which may be set according to the practical application and is not limited herein.
Step S36: update the network parameters of the first image detection model by using the network parameters of the first original detection model adjusted in the current training and a plurality of previous trainings.
Specifically, the average value of the network parameters adjusted by the first original detection model in the current training and the previous training for several times may be counted, and the network parameters of the first image detection model may be updated to the average value of the network parameters of the corresponding first original detection model. Specifically, reference may be made to the relevant steps in the foregoing embodiments, which are not described herein again.
Referring to fig. 4, the average value of the network parameters adjusted by the first original detection model net1 in the current training and a plurality of previous trainings can be counted, and the network parameters of the first image detection model EMA net1 can be updated to this average value.
Step S37: update the network parameters of the second image detection model by using the network parameters of the second original detection model adjusted in the current training and a plurality of previous trainings.
Specifically, the average value of the network parameters adjusted by the second original detection model in the current training and the previous training for several times may be counted, and the network parameters of the second image detection model may be updated to the average value of the network parameters of the corresponding second original detection model. Specifically, reference may be made to the relevant steps in the foregoing embodiments, which are not described herein again.
Referring to fig. 4, the average value of the network parameters adjusted by the second original detection model net2 in the current training and a plurality of previous trainings can be counted, and the network parameters of the second image detection model EMA net2 can be updated to this average value.
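The following sketch shows one way this parameter averaging could be implemented; the text describes a plain average over the current and several previous adjustments, while the label "EMA net" in fig. 4 suggests an exponential moving average, so the window size k and the averaging scheme are assumptions. After each optimizer step, update() would be called once for (net1, EMA net1) and once for (net2, EMA net2).

```python
import torch
from collections import deque

class AveragedTeacher:
    def __init__(self, student, teacher, k=10):
        self.student, self.teacher = student, teacher
        self.history = deque(maxlen=k)  # parameters from the last k adjustments

    @torch.no_grad()
    def update(self):
        # Record the student's parameters as adjusted in the current training.
        self.history.append({n: p.clone()
                             for n, p in self.student.state_dict().items()})
        # Set each teacher parameter to the mean over the recorded trainings.
        for name, p in self.teacher.state_dict().items():
            p.copy_(torch.stack([h[name].float()
                                 for h in self.history]).mean(0))
```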
In an implementation scenario, the steps S36 and S37 may be executed in a sequential order, for example, step S36 is executed first, and then step S37 is executed, or step S37 is executed first, and then step S36 is executed. In another implementation scenario, the step S36 and the step S37 may also be executed simultaneously, and may be specifically set according to an actual application, which is not limited herein.
In an implementation scenario, after the network parameters of the first image detection model and the second image detection model are updated, if a preset training end condition is not yet satisfied, step S32 and the subsequent steps may be executed again to continue adjusting the network parameters of the first and second original detection models and updating the network parameters of the corresponding image detection models. In a specific implementation scenario, the preset training end condition may include any one of the following: the current number of trainings reaches a preset threshold (e.g., 500 or 1000), or the loss values of the first original detection model and the second original detection model are smaller than a preset loss threshold, which is not limited herein. In another specific implementation scenario, after training ends, either of the first image detection model and the second image detection model can be used as the network model for subsequent image detection, so that the regions corresponding to multiple organs in a medical image to be detected can be obtained directly; the operation of detecting the medical image separately with multiple single-organ detection models can thus be omitted, reducing the detection computation.
Different from the foregoing embodiment, the original detection model is set to include a first original detection model and a second original detection model, and the image detection model is set to include a first image detection model corresponding to the first original detection model and a second image detection model corresponding to the second original detection model. The step of detecting the sample medical image to obtain the first detection result is performed with the first and second original detection models respectively, and the step of detecting the sample medical image to obtain the second detection result is performed with the first and second image detection models respectively. The network parameters of the first original detection model are then adjusted using the differences between the first prediction region of the first original detection model and, respectively, the actual region and the second prediction region of the second image detection model; the network parameters of the second original detection model are adjusted using the differences between the first prediction region of the second original detection model and, respectively, the actual region and the second prediction region of the first image detection model. In this way, the first image detection model corresponding to the first original detection model supervises the training of the second original detection model, and the second image detection model corresponding to the second original detection model supervises the training of the first original detection model, which further restrains the error accumulated in the network parameters by the pseudo-labeled actual region over many training iterations and improves the accuracy of the image detection model.
Referring to fig. 5, fig. 5 is a schematic flowchart illustrating an embodiment of an image detection method according to the present application. Specifically, the method may include the steps of:
step S51: acquiring a medical image to be detected, wherein the medical image to be detected comprises a plurality of organs.
The medical image to be detected may include a CT image, an MR image, and is not limited herein. In a specific implementation scenario, the medical image to be detected may be obtained by scanning the abdomen, the chest, the skull, and the like, and may be specifically set according to the actual application condition, which is not limited herein. For example, the abdomen is scanned and the organs in the medical image to be examined may include: kidney, spleen, liver, pancreas, etc.; alternatively, the chest is scanned and the organs in the medical image to be examined may include: heart, lung lobes, thyroid gland, etc.; alternatively, the skull is scanned and the organs in the medical image to be detected may include: brainstem, cerebellum, diencephalon, telencephalon, etc.
Step S52: detect the medical image to be detected by using the image detection model to obtain the prediction regions of the plurality of organs.
The image detection model is obtained by training the steps in any one of the above embodiments of the training method for an image detection model, and reference may be made to the relevant steps in the above embodiments, which are not described herein again. By using the image detection model to detect the medical image to be detected, the prediction regions of a plurality of organs can be directly obtained, and the operation of respectively detecting the medical image to be detected by using a plurality of single-organ detections can be omitted, so that the detection calculation amount can be reduced.
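As an illustration of this single-pass multi-organ detection, here is a hedged PyTorch sketch; it assumes the trained image detection model is a multi-class segmentation network whose class 0 is background, and the function name, the class_names argument, and the tensor shapes are illustrative assumptions.

```python
import torch

@torch.no_grad()
def detect_organs(image, image_detection_model, class_names):
    # One forward pass yields prediction regions for all organs at once; no
    # per-organ single-organ detection models are needed at inference time.
    logits = image_detection_model(image.unsqueeze(0))   # (1, C, H, W)
    labels = logits.argmax(dim=1)[0]                     # per-pixel organ index
    return {name: (labels == idx)                        # binary prediction region
            for idx, name in enumerate(class_names) if idx > 0}
```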
According to the above scheme, an image detection model trained by the steps of any of the above embodiments of the training method is used to detect the medical image to be detected and obtain the prediction regions of a plurality of organs, so that the detection accuracy during multi-organ detection can be improved.
Referring to fig. 6, fig. 6 is a schematic diagram of a framework of an embodiment of an image detection model training apparatus 60 of the present application. The training device 60 of the image detection model comprises an image acquisition module 61, a first detection module 62, a second detection module 63 and a parameter adjustment module 64, wherein the image acquisition module 61 is used for acquiring a sample medical image, and the sample medical image is used for pseudo-marking an actual region of at least one unmarked organ; the first detection module 62 is configured to detect the sample medical image by using the original detection model to obtain a first detection result, where the first detection result includes a first prediction region of an unlabeled organ; the second detection module 63 is configured to detect the sample medical image by using the image detection model to obtain a second detection result, where the second detection result includes a second prediction region of the unlabeled organ, and the network parameters of the image detection model are determined by using the network parameters of the original detection model; the parameter adjusting module 64 is configured to adjust the network parameters of the original detection model by using the difference between the first prediction region and the actual region, and the difference between the first prediction region and the second prediction region.
In the above scheme, a sample medical image is acquired in which the actual region of at least one unlabeled organ is pseudo-labeled, so that multiple organs do not need to be truly labeled in the sample medical image. The original detection model detects the sample medical image to obtain a first detection result containing a first prediction region of the unlabeled organ, and the image detection model detects the sample medical image to obtain a second detection result containing a second prediction region of the unlabeled organ. The network parameters of the original detection model are then adjusted using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region. Since the network parameters of the image detection model are determined by those of the original detection model, the image detection model can supervise the training of the original detection model and restrain the error accumulated in the network parameters by the pseudo-labeled actual region over many training iterations, improving the accuracy of the image detection model; an accurate image detection model in turn supervises the training of the original detection model accurately, so the network parameters of the original detection model are adjusted accurately during training, and the detection accuracy of the image detection model during multi-organ detection is improved.
In some embodiments, the original detection models comprise a first original detection model and a second original detection model, and the image detection models comprise a first image detection model corresponding to the first original detection model and a second image detection model corresponding to the second original detection model. The first detection module 62 is specifically configured to perform, with the first original detection model and the second original detection model respectively, the step of detecting the sample medical image to obtain the first detection result; the second detection module 63 is specifically configured to perform, with the first image detection model and the second image detection model respectively, the step of detecting the sample medical image to obtain the second detection result; the parameter adjustment module 64 is specifically configured to adjust the network parameters of the first original detection model using the differences between the first prediction region of the first original detection model and, respectively, the actual region and the second prediction region of the second image detection model, and is further specifically configured to adjust the network parameters of the second original detection model using the differences between the first prediction region of the second original detection model and, respectively, the actual region and the second prediction region of the first image detection model.
Different from the foregoing embodiment, the original detection model is set to include a first original detection model and a second original detection model, and the image detection model is set to include a first image detection model corresponding to the first original detection model and a second image detection model corresponding to the second original detection model. The step of detecting the sample medical image to obtain the first detection result is performed with the first and second original detection models respectively, and the step of detecting the sample medical image to obtain the second detection result is performed with the first and second image detection models respectively. The network parameters of the first original detection model are then adjusted using the differences between the first prediction region of the first original detection model and, respectively, the actual region and the second prediction region of the second image detection model; the network parameters of the second original detection model are adjusted using the differences between the first prediction region of the second original detection model and, respectively, the actual region and the second prediction region of the first image detection model. In this way, the first image detection model corresponding to the first original detection model supervises the training of the second original detection model, and the second image detection model corresponding to the second original detection model supervises the training of the first original detection model, which further restrains the error accumulated in the network parameters by the pseudo-labeled actual region over many training iterations and improves the accuracy of the image detection model.
In some embodiments, the parameter adjustment module 64 includes a first loss determination sub-module for determining a first loss value of the original detection model using a difference between the first prediction region and the actual region, the parameter adjustment module 64 includes a second loss determination sub-module for determining a second loss value of the original detection model using a difference between the first prediction region and the second prediction region, and the parameter adjustment module 64 includes a parameter adjustment sub-module for adjusting a network parameter of the original detection model using the first loss value and the second loss value.
Different from the foregoing embodiment, a first loss value of the original detection model is determined from the difference between the first prediction region and the actual region, a second loss value is determined from the difference between the first prediction region and the second prediction region, and the network parameters of the original detection model are adjusted using both loss values. The loss of the original detection model is thus measured along two dimensions, namely the differences between the first prediction region predicted by the original detection model and, respectively, the pseudo-labeled actual region and the second prediction region predicted by the corresponding image detection model, which improves the accuracy of the loss calculation.
In some embodiments, the first loss determination submodule comprises a focal loss determination unit configured to process the first prediction region and the actual region using a focal loss function to obtain a focal first loss value, and a set similarity loss determination unit configured to process the first prediction region and the actual region using a set similarity loss function to obtain a set similarity first loss value. The second loss determination submodule is specifically configured to process the first prediction region and the second prediction region using a consistency loss function to obtain the second loss value. The parameter adjusting submodule comprises a weighting processing unit configured to weight the first loss value and the second loss value to obtain a weighted loss value, and a parameter adjusting unit configured to adjust the network parameters of the original detection model using the weighted loss value.
Different from the foregoing embodiment, processing the first prediction region and the actual region with a focal loss function to obtain a focal first loss value increases the model's attention to hard samples, which can improve the accuracy of the image detection model; processing the first prediction region and the actual region with a set similarity loss function to obtain a set similarity first loss value helps the model fit the pseudo-labeled actual region; processing the first prediction region and the second prediction region with a consistency loss function to obtain a second loss value improves the consistency between the original model and the image detection model; and weighting the first loss values and the second loss value to obtain a weighted loss value, then adjusting the network parameters of the original detection model with it, balances the importance of each loss during training, which improves the accuracy of the network parameters and thus of the image detection model. A sketch of these losses is given below.
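The patent text names the loss functions but gives no formulas or weights; the following is a minimal PyTorch sketch under the assumptions that the focal loss takes its standard form, that the set similarity loss is a Dice loss, that the consistency loss is a mean-squared error between softened predictions, and that the weights w are illustrative.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    # Focal first loss: down-weights easy voxels so hard samples get more attention.
    ce = F.cross_entropy(logits, target, reduction="none")
    pt = torch.exp(-ce)                          # probability of the true class
    return ((1.0 - pt) ** gamma * ce).mean()

def set_similarity_loss(logits, target, eps=1e-6):
    # Set similarity first loss (a Dice loss is assumed here): overlap between
    # the predicted regions and the pseudo-labeled actual regions.
    probs = logits.softmax(1)
    onehot = F.one_hot(target, probs.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    union = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def weighted_loss(student_logits, teacher_logits, target, w=(1.0, 1.0, 0.1)):
    # Consistency second loss between the original model and its paired image
    # detection model, combined with the weighted first losses.
    consistency = F.mse_loss(student_logits.softmax(1), teacher_logits.softmax(1))
    return (w[0] * focal_loss(student_logits, target)
            + w[1] * set_similarity_loss(student_logits, target)
            + w[2] * consistency)
```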
In some embodiments, the sample medical image further includes an actual region of the labeled organ, the first detection result further includes a first predicted region of the labeled organ, and the second detection result further includes a second predicted region of the labeled organ. The first loss determination submodule is specifically configured to determine a first loss value of the original detection model using a difference between a first prediction region and an actual region of the unlabeled organ and the labeled organ, and the second loss determination submodule is specifically configured to determine a second loss value of the original detection model using a difference between the first prediction region and a corresponding second prediction region of the unlabeled organ.
Different from the foregoing embodiment, the sample medical image also contains the actual region of a labeled organ, the first detection result further includes a first prediction region of the labeled organ, and the second detection result further includes a second prediction region of the labeled organ. When determining the first loss value of the original detection model, the differences between the first prediction regions and the actual regions of both the unlabeled organ and the labeled organ are considered comprehensively, whereas when determining the second loss value, only the difference between the first prediction region of the unlabeled organ and the corresponding second prediction region is considered. This improves the robustness of the consistency constraint between the original detection model and the image detection model, and thus the accuracy of the image detection model.
In some embodiments, the training apparatus 60 of the image detection model further includes a parameter updating module, configured to update the network parameters of the image detection model by using the network parameters of the original detection model adjusted in the current training and a plurality of previous trainings.
Different from the foregoing embodiment, updating the network parameters of the image detection model with the network parameters of the original detection model adjusted in the current training and a plurality of previous trainings further restrains the error accumulated in the network parameters by the pseudo-labeled actual region over many training iterations, thereby improving the accuracy of the image detection model.
In some embodiments, the parameter updating module includes a statistics submodule configured to count an average value of the network parameters adjusted by the original detection model in the current training and a plurality of previous training, and the parameter updating module includes an updating submodule configured to update the network parameters of the image detection model to the average value of the network parameters of the corresponding original detection model.
Different from the embodiment, by counting the average values of the network parameters adjusted by the original detection model in the training and the previous training for a plurality of times and updating the network parameters of the image detection model to the average values of the network parameters of the corresponding original detection model, the method can help to quickly restrict the accumulated errors generated in the training process for a plurality of times and improve the accuracy of the image detection model.
In some embodiments, the image obtaining module 61 includes an image obtaining sub-module configured to obtain a medical image to be pseudo-labeled, where the medical image to be pseudo-labeled has at least one unlabeled organ; a single-organ detecting sub-module configured to detect the medical image to be pseudo-labeled with the single-organ detection model corresponding to each unlabeled organ, respectively, to obtain the organ prediction region of each unlabeled organ; and a pseudo-labeling sub-module configured to pseudo-label the organ prediction region of the unlabeled organ as the actual region of the unlabeled organ and use the pseudo-labeled medical image as the sample medical image.
Different from the foregoing embodiment, a medical image to be pseudo-labeled having at least one unlabeled organ is acquired, the single-organ detection model corresponding to each unlabeled organ is used to detect it and obtain the organ prediction region of each unlabeled organ, the organ prediction region of the unlabeled organ is pseudo-labeled as its actual region, and the pseudo-labeled image is taken as the sample medical image. Using single-organ detection models in this way avoids the workload of manually labeling multiple organs, which reduces the labor cost of training an image detection model for multi-organ detection and improves training efficiency.
In some embodiments, the medical image to be pseudo-labeled includes at least one labeled organ, and the image obtaining module 61 further includes a single organ training submodule, configured to train a single organ detection model corresponding to the labeled organ in the medical image to be pseudo-labeled, by using the medical image to be pseudo-labeled.
Different from the embodiment, the medical image to be pseudo-labeled comprises at least one labeled organ, and the medical image to be pseudo-labeled is used for training the single organ detection model corresponding to the labeled organ in the medical image to be pseudo-labeled, so that the accuracy of the single organ detection model can be improved, the accuracy of subsequent pseudo-labeling can be improved, and the accuracy of the detection model of the subsequent training image can be improved.
In some embodiments, the image obtaining sub-module includes a three-dimensional image obtaining unit configured to obtain a three-dimensional medical image, the image obtaining sub-module includes a preprocessing unit configured to preprocess the three-dimensional medical image, and the image obtaining sub-module includes an image clipping unit configured to clip the preprocessed three-dimensional medical image to obtain at least one two-dimensional medical image to be pseudo-labeled.
Different from the embodiment, the three-dimensional medical image is obtained and preprocessed, so that the preprocessed three-dimensional medical image is cut to obtain at least one two-dimensional medical image to be pseudo-labeled, and the method is favorable for obtaining the medical image meeting model training and improving the accuracy of subsequent image detection model training.
In some embodiments, the preprocessing unit is specifically configured to perform at least one of: adjusting the voxel resolution of the three-dimensional medical image to a preset resolution; normalizing the voxel value of the three-dimensional medical image to be within a preset range by utilizing a preset window value; gaussian noise is added to at least a portion of the voxels of the three-dimensional medical image.
Different from the foregoing embodiment, adjusting the voxel resolution of the three-dimensional medical image to a preset resolution facilitates subsequent model prediction processing; normalizing the voxel values of the three-dimensional medical image into a preset range with a preset window value helps the model extract accurate features; and adding Gaussian noise to at least part of the voxels of the three-dimensional medical image facilitates data augmentation, improves data diversity, and improves the accuracy of subsequent model training.
Referring to fig. 7, fig. 7 is a schematic diagram of an embodiment of an image detection apparatus 70 according to the present application. The image detection apparatus 70 comprises an image acquisition module 71 and an image detection module 72. The image acquisition module 71 is configured to acquire a medical image to be detected, where the medical image to be detected includes a plurality of organs; the image detection module 72 is configured to detect the medical image to be detected by using an image detection model to obtain the prediction regions of the plurality of organs; the image detection model is trained by the training apparatus of any of the above embodiments of the training apparatus of the image detection model.
According to the above scheme, an image detection model trained by the training apparatus of any of the above embodiments is used to detect the medical image to be detected and obtain the prediction regions of a plurality of organs, so that the detection accuracy during multi-organ detection can be improved.
Referring to fig. 8, fig. 8 is a schematic block diagram of an embodiment of an electronic device 80 according to the present application. The electronic device 80 comprises a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps of any of the above-described embodiments of the image detection model training method, or to implement the steps of any of the above-described embodiments of the image detection method. In one particular implementation scenario, the electronic device 80 may include, but is not limited to: a microcomputer, a server, and the electronic device 80 may also include a mobile device such as a notebook computer, a tablet computer, and the like, which is not limited herein.
Specifically, the processor 82 is configured to control itself and the memory 81 to implement the steps of any of the above embodiments of the training method of the image detection model, or to implement the steps of any of the above embodiments of the image detection method. The processor 82 may also be referred to as a CPU (Central Processing Unit). The processor 82 may be an integrated circuit chip having signal processing capabilities. The processor 82 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 82 may be implemented jointly by a plurality of integrated circuit chips.
According to the scheme, the detection accuracy can be improved during multi-organ detection.
Referring to fig. 9, fig. 9 is a block diagram illustrating an embodiment of a computer-readable storage medium 90 according to the present application. The computer readable storage medium 90 stores program instructions 901 capable of being executed by a processor, where the program instructions 901 are used to implement the steps of any of the above-described embodiments of the training method for an image detection model, or to implement the steps of any of the above-described embodiments of the image detection method.
According to the scheme, the detection accuracy can be improved during multi-organ detection.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely one type of logical division, and an actual implementation may have another division, for example, a unit or a component may be combined or integrated with another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on network elements. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (16)

1. A training method of an image detection model is characterized by comprising the following steps:
obtaining a sample medical image, wherein the sample medical image pseudo-labels an actual region of at least one unlabeled organ;
detecting the sample medical image by using an original detection model to obtain a first detection result, wherein the first detection result comprises a first prediction region of the unlabeled organ; and,
detecting the sample medical image by using an image detection model to obtain a second detection result, wherein the second detection result comprises a second prediction region of the unlabeled organ; network parameters of the image detection model are determined based on network parameters of the original detection model;
and adjusting the network parameters of the original detection model by using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region.
2. The training method according to claim 1, wherein the original detection model includes a first original detection model and a second original detection model, and the image detection model includes a first image detection model corresponding to the first original detection model and a second image detection model corresponding to the second original detection model;
the detecting the sample medical image by using the original detection model to obtain a first detection result comprises:
performing the step of detecting the sample medical image to obtain a first detection result by using the first original detection model and the second original detection model respectively;
the detecting the sample medical image by using the image detection model to obtain a second detection result comprises:
the step of detecting the sample medical image to obtain a second detection result is executed by respectively utilizing the first image detection model and the second image detection model;
the adjusting the network parameters of the original detection model by using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region respectively comprises:
adjusting network parameters of the first original detection model by using differences between a first prediction region of the first original detection model and the actual region and a second prediction region of the second image detection model respectively; and,
and adjusting the network parameters of the second original detection model by using the difference between the first prediction region of the second original detection model and the actual region and the difference between the first prediction region of the first original detection model and the second prediction region of the first image detection model.
3. The training method according to claim 1 or 2, wherein the adjusting the network parameters of the original detection model by using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region comprises:
determining a first loss value of the original detection model using a difference between the first prediction region and the actual region; and,
determining a second loss value of the original detection model using a difference between the first prediction region and the second prediction region;
and adjusting the network parameters of the original detection model by using the first loss value and the second loss value.
4. The training method of claim 3, wherein the determining a first loss value of the original detection model using the difference between the first prediction region and the actual region comprises:
processing the first prediction region and the actual region by using a focal loss function to obtain a focal first loss value; and/or,
processing the first prediction region and the actual region by using a set similarity loss function to obtain a set similarity first loss value;
and/or, the determining a second loss value of the original detection model using a difference between the first prediction region and the second prediction region comprises:
processing the first prediction region and the second prediction region by using a consistency loss function to obtain a second loss value;
and/or, the adjusting the network parameter of the original detection model by using the first loss value and the second loss value comprises:
weighting the first loss value and the second loss value to obtain a weighted loss value;
and adjusting the network parameters of the original detection model by using the weighted loss value.
5. The training method according to claim 3 or 4, wherein the sample medical image further includes an actual region of a labeled organ, the first detection result further includes a first predicted region of the labeled organ, and the second detection result further includes a second predicted region of the labeled organ;
the determining a first loss value of the original detection model using a difference between the first prediction region and the actual region includes:
determining a first loss value of the original detection model by using differences between the first prediction regions and the actual regions of the unlabeled organ and the labeled organ;
the determining a second loss value of the original detection model using a difference between the first prediction region and the second prediction region includes:
and determining a second loss value of the original detection model by using the difference between the first prediction region of the unlabeled organ and the corresponding second prediction region.
6. The training method as claimed in any one of claims 1 to 5, wherein after the network parameters of the original detection model are adjusted by using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region, the method further comprises:
updating the network parameters of the image detection model by using the network parameters of the original detection model adjusted in the current training and a plurality of previous trainings.
7. The training method according to claim 6, wherein the updating the network parameters of the image detection model by using the network parameters adjusted in the current training and a plurality of previous trainings comprises:
counting the average value of the network parameters adjusted by the original detection model in the current training and a plurality of previous trainings;
and updating the network parameters of the image detection model into the average values of the network parameters of the corresponding original detection model.
8. Training method according to any of claims 1 to 7, wherein said acquiring sample medical images comprises:
acquiring a medical image to be pseudo-labeled, wherein the medical image to be pseudo-labeled has at least one unlabeled organ;
detecting the medical image to be pseudo-labeled by using a single-organ detection model corresponding to each unlabeled organ, respectively, to obtain an organ prediction region of each unlabeled organ;
and pseudo-labeling the organ prediction region of the unlabeled organ as the actual region of the unlabeled organ, and taking the pseudo-labeled medical image to be pseudo-labeled as the sample medical image.
9. The training method according to claim 8, wherein the medical image to be pseudo-labeled comprises at least one labeled organ; before the single-organ detection model corresponding to each unlabeled organ is used to detect the medical image to be pseudo-labeled, the method further comprises:
and training a single organ detection model corresponding to the labeled organ in the medical image to be pseudo-labeled by using the medical image to be pseudo-labeled.
10. The training method according to claim 8, wherein the acquiring the medical image to be pseudo-labeled comprises:
acquiring a three-dimensional medical image, and preprocessing the three-dimensional medical image;
and cutting the preprocessed three-dimensional medical image to obtain at least one two-dimensional medical image to be pseudo-labeled.
11. Training method according to claim 10, wherein the pre-processing of the three-dimensional medical image comprises at least one of:
adjusting the voxel resolution of the three-dimensional medical image to a preset resolution;
normalizing the voxel value of the three-dimensional medical image to be within a preset range by utilizing a preset window value;
gaussian noise is added to at least a portion of the voxels of the three-dimensional medical image.
12. An image detection method, comprising:
acquiring a medical image to be detected, wherein the medical image to be detected comprises a plurality of organs;
detecting the medical image to be detected by using an image detection model to obtain the prediction regions of the plurality of organs;
wherein the image detection model is obtained by training by using the training method of the image detection model according to any one of claims 1 to 11.
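At inference time (claim 12), only the trained image detection model runs; a minimal usage sketch, assuming the model outputs per-pixel organ logits from which labels are taken by argmax:

    import torch

    @torch.no_grad()
    def detect(image_model, image):
        # image: (1, 1, H, W); returns the predicted per-pixel organ labels (H, W).
        image_model.eval()
        logits = image_model(image)      # (1, C, H, W), one channel per organ class
        return logits.argmax(dim=1)[0]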
13. An apparatus for training an image detection model, comprising:
an image acquisition module, configured to acquire a sample medical image, wherein an actual region of at least one unlabeled organ in the sample medical image is pseudo-labeled;
a first detection module, configured to detect the sample medical image by using an original detection model to obtain a first detection result, wherein the first detection result includes a first prediction region of the unlabeled organ;
a second detection module, configured to detect the sample medical image by using an image detection model to obtain a second detection result, where the second detection result includes a second prediction region of the unlabeled organ, and a network parameter of the image detection model is determined based on a network parameter of the original detection model;
and a parameter adjustment module, configured to adjust the network parameters of the original detection model by using the difference between the first prediction region and the actual region and the difference between the first prediction region and the second prediction region.
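Tying the modules of claim 13 together, a hedged sketch of one parameter-adjustment step: the first loss compares the first prediction region with the pseudo-labeled actual region, the second loss compares the first and second prediction regions, and only the original detection model receives gradients. The cross-entropy and MSE loss forms, the weight lam, and the function name adjust_parameters are assumptions:

    import torch
    import torch.nn.functional as F

    def adjust_parameters(original_model, image_model, optimizer,
                          sample_image, pseudo_mask, lam=1.0):
        # First detection result from the original detection model (with gradients).
        first_pred = original_model(sample_image)          # (N, C, H, W) logits
        # Second detection result from the image detection model (fixed target).
        with torch.no_grad():
            second_pred = image_model(sample_image)
        # First loss: difference between first prediction region and actual region.
        loss1 = F.cross_entropy(first_pred, pseudo_mask)   # pseudo_mask: (N, H, W) class ids
        # Second loss: difference between first and second prediction regions.
        loss2 = F.mse_loss(first_pred.softmax(dim=1), second_pred.softmax(dim=1))
        loss = loss1 + lam * loss2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return float(loss)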
14. An image detection apparatus, characterized by comprising:
an image acquisition module, configured to acquire a medical image to be detected, wherein the medical image to be detected comprises a plurality of organs;
an image detection module, configured to detect the medical image to be detected by using an image detection model to obtain prediction regions of the plurality of organs;
wherein the image detection model is trained by the training apparatus of the image detection model according to claim 13.
15. An electronic device comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the method for training an image detection model according to any one of claims 1 to 11 or to implement the method for image detection according to claim 12.
16. A computer-readable storage medium, on which program instructions are stored, which program instructions, when executed by a processor, implement the method of training an image detection model according to any one of claims 1 to 11, or implement the method of image detection according to claim 12.
CN202010362766.XA 2020-04-30 2020-04-30 Image detection method, related model training method, related device and equipment Active CN111539947B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202010362766.XA CN111539947B (en) 2020-04-30 2020-04-30 Image detection method, related model training method, related device and equipment
JP2021576932A JP2022538137A (en) 2020-04-30 2020-12-28 Image detection method, related model training method, and related devices and equipment
PCT/CN2020/140325 WO2021218215A1 (en) 2020-04-30 2020-12-28 Image detection method and relevant model training method, relevant apparatuses, and device
KR1020217043241A KR20220016213A (en) 2020-04-30 2020-12-28 Image detection method and related model training method and related apparatus and apparatus
TW110109420A TW202145249A (en) 2020-04-30 2021-03-16 Image detection method and training method of related model, electronic device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010362766.XA CN111539947B (en) 2020-04-30 2020-04-30 Image detection method, related model training method, related device and equipment

Publications (2)

Publication Number Publication Date
CN111539947A true CN111539947A (en) 2020-08-14
CN111539947B CN111539947B (en) 2024-03-29

Family

ID=71967825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010362766.XA Active CN111539947B (en) 2020-04-30 2020-04-30 Image detection method, related model training method, related device and equipment

Country Status (5)

Country Link
JP (1) JP2022538137A (en)
KR (1) KR20220016213A (en)
CN (1) CN111539947B (en)
TW (1) TW202145249A (en)
WO (1) WO2021218215A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114391828B (en) * 2022-03-01 2023-06-06 郑州大学 Active psychological care intervention system for cerebral apoplexy patient
CN117041531B (en) * 2023-09-04 2024-03-15 无锡维凯科技有限公司 Mobile phone camera focusing detection method and system based on image quality evaluation

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8170330B2 (en) * 2007-10-30 2012-05-01 Siemens Aktiengesellschaft Machine learning for tissue labeling segmentation
EP3474192A1 (en) * 2017-10-19 2019-04-24 Koninklijke Philips N.V. Classifying data
CN111492382B (en) * 2017-11-20 2024-05-07 皇家飞利浦有限公司 Training a first neural network model and a second neural network model
CN109166107A (en) * 2018-04-28 2019-01-08 北京市商汤科技开发有限公司 A kind of medical image cutting method and device, electronic equipment and storage medium
CN109523526B (en) * 2018-11-08 2021-10-22 腾讯科技(深圳)有限公司 Tissue nodule detection and model training method, device, equipment and system thereof
CN110148142B (en) * 2019-05-27 2023-04-18 腾讯科技(深圳)有限公司 Training method, device and equipment of image segmentation model and storage medium
JP2021039748A (en) * 2019-08-30 2021-03-11 キヤノン株式会社 Information processor, information processing method, information processing system, and program
CN111539947B (en) * 2020-04-30 2024-03-29 上海商汤智能科技有限公司 Image detection method, related model training method, related device and equipment

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018033154A1 (en) * 2016-08-19 2018-02-22 北京市商汤科技开发有限公司 Gesture control method, device, and electronic apparatus
WO2018121690A1 (en) * 2016-12-29 2018-07-05 北京市商汤科技开发有限公司 Object attribute detection method and device, neural network training method and device, and regional detection method and device
JP2019054742A (en) * 2017-09-20 2019-04-11 株式会社Screenホールディングス Living cell detection method, program, and recording medium
JP2019101485A (en) * 2017-11-28 2019-06-24 キヤノン株式会社 Information processing method, information processing device, information processing system and program
WO2019232830A1 (en) * 2018-06-06 2019-12-12 平安科技(深圳)有限公司 Method and device for detecting foreign object debris at airport, computer apparatus, and storage medium
CN109658419A (en) * 2018-11-15 2019-04-19 浙江大学 The dividing method of organella in a kind of medical image
CN110097557A (en) * 2019-01-31 2019-08-06 卫宁健康科技集团股份有限公司 Automatic medical image segmentation method and system based on 3D-UNet
CN110188829A (en) * 2019-05-31 2019-08-30 北京市商汤科技开发有限公司 The training method of neural network, the method for target identification and Related product
CN111028206A (en) * 2019-11-21 2020-04-17 万达信息股份有限公司 Prostate cancer automatic detection and classification system based on deep learning
CN111062390A (en) * 2019-12-18 2020-04-24 北京推想科技有限公司 Region-of-interest labeling method, device, equipment and storage medium
CN110969245A (en) * 2020-02-28 2020-04-07 北京深睿博联科技有限责任公司 Target detection model training method and device for medical image

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
XIANGRONG ZHOU et al.: "Automatic Segmentation of Multiple Organs on 3D CT Images by Using Deep Learning Approaches", SPRINGER, 7 February 2020 (2020-02-07), pages 135-147 *
REN, XIANG; ZHANG, PENG; FAN, MING; LI, LIHUA: "Research on Prediction of Breast Cancer Molecular Subtyping Based on Convolutional Neural Networks", Journal of Hangzhou Dianzi University (Natural Sciences), no. 05, 15 September 2018 (2018-09-15), pages 70-75 *
CHANG, PEI; XIA, YONG; LI, YUJING; WU, TAO: "CNN-Based SAR Vehicle Target Detection", Radar Science and Technology, no. 02, 15 April 2019 (2019-04-15), pages 106-110 *
DENG, JINCHENG et al.: "Application of Deep Convolutional Neural Networks in Image Segmentation for Radiotherapy Planning", pages 7-13 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021218215A1 (en) * 2020-04-30 2021-11-04 上海商汤智能科技有限公司 Image detection method and relevant model training method, relevant apparatuses, and device
CN112132206A (en) * 2020-09-18 2020-12-25 青岛商汤科技有限公司 Image recognition method, training method of related model, related device and equipment
CN112307934A (en) * 2020-10-27 2021-02-02 深圳市商汤科技有限公司 Image detection method, and training method, device, equipment and medium of related model
JP2023502814A (en) * 2020-10-30 2023-01-26 上▲海▼商▲湯▼智能科技有限公司 Image detection model training method and its related apparatus, equipment, and storage medium
WO2022088581A1 (en) * 2020-10-30 2022-05-05 上海商汤智能科技有限公司 Training method for image detection model, related apparatus, device, and storage medium
CN112200802A (en) * 2020-10-30 2021-01-08 上海商汤智能科技有限公司 Training method of image detection model, related device, equipment and storage medium
JP7326499B2 (en) 2020-10-30 2023-08-15 上▲海▼商▲湯▼智能科技有限公司 Image detection model training method and its related apparatus, equipment, and storage medium
CN112669293A (en) * 2020-12-31 2021-04-16 上海商汤智能科技有限公司 Image detection method, training method of detection model, related device and equipment
CN112749801A (en) * 2021-01-22 2021-05-04 上海商汤智能科技有限公司 Neural network training and image processing method and device
CN112785573A (en) * 2021-01-22 2021-05-11 上海商汤智能科技有限公司 Image processing method and related device and equipment
WO2023092959A1 (en) * 2021-11-23 2023-06-01 上海商汤智能科技有限公司 Image segmentation method, training method for model thereof, and related apparatus and electronic device
WO2023138190A1 (en) * 2022-01-24 2023-07-27 上海商汤智能科技有限公司 Training method for target detection model, and corresponding detection method therefor
CN114155365A (en) * 2022-02-07 2022-03-08 北京航空航天大学杭州创新研究院 Model training method, image processing method and related device

Also Published As

Publication number Publication date
TW202145249A (en) 2021-12-01
WO2021218215A1 (en) 2021-11-04
CN111539947B (en) 2024-03-29
JP2022538137A (en) 2022-08-31
KR20220016213A (en) 2022-02-08

Similar Documents

Publication Publication Date Title
CN111539947B (en) Image detection method, related model training method, related device and equipment
CN109584254B (en) Heart left ventricle segmentation method based on deep full convolution neural network
Kalpathy-Cramer et al. Radiomics of lung nodules: a multi-institutional study of robustness and agreement of quantitative imaging features
US8270695B2 (en) Diagnostic image processing with automatic self image quality validation
US9135695B2 (en) Method for creating attenuation correction maps for PET image reconstruction
Zhang et al. Automated quality assessment of cardiac MR images using convolutional neural networks
CN100566655C (en) Be used to handle image to determine the method for picture characteristics or analysis candidate
KR20200032651A (en) Apparatus for three dimension image reconstruction and method thereof
KR20230059799A (en) A Connected Machine Learning Model Using Collaborative Training for Lesion Detection
US9811904B2 (en) Method and system for determining a phenotype of a neoplasm in a human or animal body
CN109978888B (en) Image segmentation method, device and computer readable storage medium
CN109949280B (en) Image processing method, image processing apparatus, device storage medium, and growth evaluation system
Mehta et al. Propagating uncertainty across cascaded medical imaging tasks for improved deep learning inference
Upadhyay et al. Uncertainty-aware gan with adaptive loss for robust mri image enhancement
CN110074809A (en) The hepatic vein pressure gradient classification method and computer equipment of CT image
CN114119516A (en) Virus focus segmentation method based on transfer learning and cascade adaptive hole convolution
Yerukalareddy et al. Brain tumor classification based on mr images using GAN as a pre-trained model
Si et al. Artificial neural network based lesion segmentation of brain MRI
CN111724371A (en) Data processing method and device and electronic equipment
Materka et al. On the effect of image brightness and contrast nonuniformity on statistical texture parameters
CN115601535A (en) Chest radiograph abnormal recognition domain self-adaption method and system combining Wasserstein distance and difference measurement
Mansour et al. Kidney segmentations using cnn models
EP3989165A1 (en) Detecting anatomical abnormalities by segmentation results with and without shape priors
Deepa et al. Identification and classification of brain tumor through mixture model based on magnetic resonance imaging segmentation and artificial neural network
Bayram et al. An efficient algorithm for automatic tumor detection in contrast enhanced breast mri by using artificial neural network (neubrea)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40032489

Country of ref document: HK

GR01 Patent grant