CN115690100A

CN115690100A - Semi-supervised signal point detection model training method, signal point detection method and device

Info

Publication number: CN115690100A
Application number: CN202211688086.2A
Authority: CN
Inventors: 王华嘉; 吕行; 邝英兰; 林淋; 周燕玲; 叶莘
Original assignee: Zhuhai Hengqin Shengao Yunzhi Technology Co ltd
Current assignee: Zhuhai Hengqin Shengao Yunzhi Technology Co ltd
Priority date: 2022-12-28
Filing date: 2022-12-28
Publication date: 2023-02-03
Anticipated expiration: 2042-12-28
Also published as: CN115690100B

Abstract

The invention provides a semi-supervised signal point detection model training method, a signal point detection method and a signal point detection device. The training method comprises: performing signal point segmentation on training cell images based on a signal point detection model to obtain a prediction result of each training cell image; if the current round of training is in the early training stage, calculating a segmentation loss using a first loss function based on the prediction result of each training cell image and its segmentation label, wherein the first loss function determines the difference between the distribution of the prediction result and the distribution of the segmentation label with the segmentation label as the reference; if the current round of training is in the late training stage, calculating the segmentation loss using a second loss function based on the prediction result of each training cell image and its segmentation label, wherein the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function determines the difference between the distribution of the prediction result and the distribution of the segmentation label with the prediction result as the reference; and adjusting the parameters of the signal point detection model based on the segmentation loss. The invention improves the signal point detection accuracy of the model.

Description

Semi-supervised signal point detection model training method, signal point detection method and device

Technical Field

The invention relates to the technical field of image segmentation, in particular to a semi-supervised signal point detection model training method, a signal point detection method and a signal point detection device.

Background

Detecting the fluorescent signal points of cells is a critical step in identifying circulating abnormal cells: only when the fluorescent signal points in a cell image are detected accurately and completely can they be used for the downstream identification of circulating abnormal cells. However, the conventional approach requires manual identification of the signal points in each channel image of a cell, which is time-consuming and labor-intensive. Object detection models based on deep learning have therefore attracted attention because they can identify signal points in cells much faster.

Unfortunately, training a signal point detection model based on deep learning requires a large amount of labeled signal point data, and because every signal point in every channel must be labeled at the pixel level, the labeling cost is very high. Recently proposed semi-supervised approaches can train a deep learning model using a limited amount of labeled data together with a large amount of unlabeled data, which greatly reduces the labeling cost. However, because existing semi-supervised approaches guide model learning with the pseudo-labeled labels of the unlabeled data, the training effect is highly sensitive to the accuracy of those pseudo-labeled labels. As a result, the training effect of the deep learning model cannot be guaranteed, and the segmentation accuracy required in signal point detection is difficult to achieve.

Disclosure of Invention

The invention provides a semi-supervised signal point detection model training method, a signal point detection method and a signal point detection device, which are used for solving the defect of poor signal point detection accuracy in the prior art.

The invention provides a semi-supervised signal point detection model training method, which comprises the following steps:

performing signal point segmentation on training cell images in a training set of the current round of training based on a signal point detection model to obtain a prediction result of each training cell image;

if the current round of training is in the early training stage, calculating a segmentation loss using a first loss function based on the prediction result of each training cell image and its segmentation label; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation label with the segmentation label as the reference;

if the current round of training is in the late training stage, calculating the segmentation loss using a second loss function based on the prediction result of each training cell image and its segmentation label; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation label with the prediction result as the reference;

and performing parameter adjustment on the signal point detection model based on the segmentation loss.

According to the semi-supervised signal point detection model training method provided by the invention, based on the prediction result of each training cell image and the segmentation label thereof, the segmentation loss is calculated by using a second loss function, and the method specifically comprises the following steps:

based on the prediction result of each training cell image and the segmentation label corresponding to the training cell image, calculating the sample loss of each training cell image in the current training round by using the second loss function;

if the sample loss of any training cell image in the current round of training is greater than a preset loss threshold, determining that the training cell image is a noise sample, and removing it from the training set of the current round of training;

and determining the segmentation loss based on the sample loss, in the current round of training, of each training cell image in the training set of the current round of training.

According to the semi-supervised signal point detection model training method provided by the invention, the preset loss threshold is determined based on the following steps:

calculating a test sample loss of each training cell image in the training set using the second loss function, based on the prediction result of each training cell image in the training set in the last round of training of the early training stage and the segmentation label corresponding to the training cell image;

determining the test sample losses of the labeled samples and the test sample losses of the unlabeled samples based on the test sample loss of each training cell image in the training set; wherein a labeled sample is a training cell image whose segmentation label is a labeled label, and an unlabeled sample is a training cell image whose segmentation label is a pseudo-labeled label;

determining the preset loss threshold based on the test sample losses of the labeled samples and the test sample losses of the unlabeled samples; wherein the proportion of labeled samples whose test sample loss is lower than the preset loss threshold is higher than a first proportion threshold, the proportion of unlabeled samples whose test sample loss is higher than the preset loss threshold is lower than a second proportion threshold, and the first proportion threshold is greater than the second proportion threshold.
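By way of non-limiting illustration, the two proportion conditions above can be checked with a simple search over candidate thresholds. The sketch below is only an assumption about one possible realization: the function name `choose_loss_threshold`, the default proportion values, the use of the observed losses as candidates, and the choice of the smallest qualifying candidate are all hypothetical and are not specified by the method.

```python
def choose_loss_threshold(labeled_losses, unlabeled_losses,
                          first_ratio=0.95, second_ratio=0.10):
    """Return the smallest observed loss value that satisfies both conditions.

    Condition 1: share of labeled samples with loss below the threshold
                 is higher than first_ratio.
    Condition 2: share of unlabeled samples with loss above the threshold
                 is lower than second_ratio.
    Returns None when no candidate satisfies both conditions.
    """
    if first_ratio <= second_ratio:
        raise ValueError("first_ratio must be greater than second_ratio")
    # Assumption: candidate thresholds are the observed loss values themselves.
    candidates = sorted(set(labeled_losses) | set(unlabeled_losses))
    for t in candidates:
        labeled_below = sum(l < t for l in labeled_losses) / len(labeled_losses)
        unlabeled_above = sum(u > t for u in unlabeled_losses) / len(unlabeled_losses)
        if labeled_below > first_ratio and unlabeled_above < second_ratio:
            return t
    return None
```

Under these assumptions, a threshold is accepted only if almost all labeled samples fall below it while only a small share of unlabeled samples, presumed to be the noisy ones, lies above it.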

According to the semi-supervised signal point detection model training method provided by the invention, the training stage of any round of training is determined based on the following steps:

if the round of training preceding any given round of training is in the early training stage, determining the training stage of the given round based on the segmentation loss of the preceding round and a preset convergence threshold; wherein, if the segmentation loss of the preceding round is greater than the preset convergence threshold, the given round is determined to be in the early training stage, and if the segmentation loss of the preceding round is less than or equal to the preset convergence threshold, the given round is determined to be in the late training stage;

and if the round of training preceding the given round is in the late training stage, determining that the given round is in the late training stage.
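By way of non-limiting illustration, the stage determination above behaves like a one-way switch: once the late training stage is reached, it is never left. In the sketch below the names `next_stage`, `EARLY` and `LATE`, the example losses, and the assumption that the very first round starts in the early stage are hypothetical.

```python
EARLY, LATE = "early", "late"

def next_stage(prev_stage, prev_seg_loss, convergence_threshold):
    """Training stage of a round, given the stage and loss of the preceding round."""
    if prev_stage == LATE:
        return LATE  # the late stage is never left
    # Preceding round was early: compare its loss with the convergence threshold.
    return EARLY if prev_seg_loss > convergence_threshold else LATE

# Usage: track the stage of each round for an example sequence of round losses.
stage = EARLY  # assumption: the first round is in the early stage
history = []
for seg_loss in [0.9, 0.5, 0.18, 0.4]:
    history.append(stage)
    stage = next_stage(stage, seg_loss, convergence_threshold=0.2)
```

Note that the loss rising again to 0.4 in the last example round does not return training to the early stage.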

According to the semi-supervised signal point detection model training method provided by the invention, the first loss function is used for calculating, for each pixel, a first product of the label value, the prediction probability difference value and the logarithm of the prediction probability, calculating a second product of the label difference value, the prediction probability and the logarithm of the prediction probability difference value, and calculating the average over all pixels of the sum of the first product and the second product; wherein the prediction probability difference value of any pixel is the difference between 1 and the prediction probability of that pixel, and the label difference value of any pixel is the difference between 1 and the label value of that pixel;

the noise robust loss function is used for calculating a third product between the prediction probability of each pixel and the logarithm of the label value of the corresponding pixel, calculating a fourth product between the prediction probability difference value of each pixel and the logarithm of the label difference value, and calculating an average value of the sum of the third product and the fourth product of each pixel.
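By way of non-limiting illustration, the four products described above can be written out directly, with p denoting the prediction probability of a pixel and y its label value. Two points in the sketch are assumptions not stated in the text: the averaged sum is negated so that each loss is non-negative, and the argument of each logarithm is clamped by a small constant `EPS` so that a hard label of 0 or 1 does not produce log(0). The function names are hypothetical.

```python
import math

EPS = 1e-7  # assumed clamp that keeps every logarithm finite

def _clamp(v):
    return min(max(v, EPS), 1.0 - EPS)

def first_loss(probs, labels):
    """Negated mean over pixels of y*(1-p)*log(p) + (1-y)*p*log(1-p)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = _clamp(p)
        first_product = y * (1.0 - p) * math.log(p)
        second_product = (1.0 - y) * p * math.log(1.0 - p)
        total += first_product + second_product
    return -total / len(probs)

def noise_robust_loss(probs, labels):
    """Negated mean over pixels of p*log(y) + (1-p)*log(1-y), with y clamped."""
    total = 0.0
    for p, y in zip(probs, labels):
        y = _clamp(y)
        third_product = p * math.log(y)
        fourth_product = (1.0 - p) * math.log(1.0 - y)
        total += third_product + fourth_product
    return -total / len(probs)
```

In the first loss function the logarithm is taken of the prediction, so the label acts as the reference; in the noise robust loss function the roles are swapped, so the prediction acts as the reference.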

According to the semi-supervised signal point detection model training method provided by the invention, if the current round of training is in the later training stage, the segmentation loss is calculated by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof, and the method specifically comprises the following steps:

calculating the sample loss of each labeled sample in the current round of training using the first loss function, based on the prediction result of the labeled sample, whose segmentation label is a labeled label, and its segmentation label;

calculating the sample loss of each unlabeled sample in the current round of training using the second loss function, based on the prediction result of the unlabeled sample, whose segmentation label is a pseudo-labeled label, and its segmentation label;

determining the segmentation loss based on the sample loss of the labeled samples in the current round of training and the sample loss of the unlabeled samples in the current round of training.
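By way of non-limiting illustration, the per-sample choice of loss function in the late training stage can be sketched as follows. The function name `late_stage_segmentation_loss` is hypothetical, the two loss functions are passed in as callables, and averaging the sample losses into the segmentation loss is an assumption; the text only states that the segmentation loss is determined based on both groups of sample losses.

```python
def late_stage_segmentation_loss(samples, first_loss, second_loss):
    """Mean sample loss over one round in the late training stage.

    samples: iterable of (prediction, label, is_labeled) tuples.
    first_loss, second_loss: callables taking (prediction, label).
    """
    losses = []
    for prediction, label, is_labeled in samples:
        # Labeled labels are trusted, so the first loss alone is used for them;
        # pseudo-labeled labels may be noisy, so the second loss is used.
        loss_fn = first_loss if is_labeled else second_loss
        losses.append(loss_fn(prediction, label))
    return sum(losses) / len(losses)

# Usage with toy stand-in losses on scalar "predictions" (illustration only).
toy_first = lambda p, y: abs(p - y)
toy_second = lambda p, y: 2.0 * abs(p - y)
round_loss = late_stage_segmentation_loss(
    [(0.8, 1, True), (0.6, 1, False)], toy_first, toy_second)
```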

According to the semi-supervised signal point detection model training method provided by the invention, the segmentation loss is calculated based on the following steps:

calculating the sample loss of each labeled sample in the current round of training based on the prediction result of the labeled sample, whose segmentation label is a labeled label, and its segmentation label;

calculating the sample loss of each unlabeled sample in the current round of training based on the prediction result of the unlabeled sample, whose segmentation label is a pseudo-labeled label, and its segmentation label;

determining a loss weight of the labeled samples and a loss weight of the unlabeled samples; wherein the loss weight of the labeled samples and the loss weight of the unlabeled samples are determined based on the training stage of the current round of training and the number of rounds already trained;

and determining the segmentation loss based on the sample loss and the loss weight of the labeled samples in the current round of training, and the sample loss and the loss weight of the unlabeled samples in the current round of training.
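By way of non-limiting illustration, the weighted combination above can be sketched as follows. The text does not specify how the weights depend on the training stage and the number of trained rounds, so the schedule in `loss_weights` (a linear ramp of the unlabeled weight during the early stage and a constant weight afterwards) is purely a hypothetical example, as are all names and default values.

```python
def loss_weights(stage, rounds_done, ramp_rounds=10, max_unlabeled_weight=1.0):
    """Hypothetical schedule returning (labeled_weight, unlabeled_weight)."""
    if stage == "early":
        # Assumption: trust in pseudo-labeled labels grows linearly with rounds.
        unlabeled_weight = max_unlabeled_weight * min(1.0, rounds_done / ramp_rounds)
    else:
        unlabeled_weight = max_unlabeled_weight
    return 1.0, unlabeled_weight

def weighted_segmentation_loss(labeled_loss, unlabeled_loss, stage, rounds_done):
    """Weighted sum of the labeled and unlabeled sample losses."""
    w_labeled, w_unlabeled = loss_weights(stage, rounds_done)
    return w_labeled * labeled_loss + w_unlabeled * unlabeled_loss
```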

The invention also provides a signal point detection method, which comprises the following steps:

determining a cell image to be detected;

inputting the cell image to be detected into a signal point detection model to obtain a signal point detection result of the cell image to be detected output by the signal point detection model;

the signal point detection model is obtained by training based on any one of the semi-supervised signal point detection model training methods.

The invention also provides a semi-supervised signal point detection model training device, which comprises:

the forward propagation unit is used for carrying out signal point segmentation on training cell images in a training set of the current round of training based on a signal point detection model to obtain a prediction result of each training cell image;

the first loss calculation unit is used for calculating the segmentation loss by using a first loss function based on the prediction result of each training cell image and the segmentation label thereof if the current round of training is the training earlier stage; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation label based on the segmentation label;

the second loss calculation unit is used for calculating the segmentation loss by utilizing a second loss function based on the prediction result of each training cell image and the segmentation label thereof if the current round of training is the training later stage; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the prediction result;

and the back propagation unit is used for carrying out parameter adjustment on the signal point detection model based on the segmentation loss.

The present invention also provides a signal point detecting device, including:

the image determining unit is used for determining the cell image to be detected;

the signal point detection unit is used for inputting the cell image to be detected into a signal point detection model to obtain a signal point detection result of the cell image to be detected, which is output by the signal point detection model;

The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the semi-supervised signal point detection model training method or the signal point detection method.

The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the semi-supervised signal point detection model training method or the signal point detection method as described in any of the above.

The present invention also provides a computer program product comprising a computer program, which when executed by a processor implements the semi-supervised signal point detection model training method or the signal point detection method as described in any of the above.

According to the semi-supervised signal point detection model training method, the signal point detection method and the device provided by the invention, different loss functions are used in the early training stage and the late training stage to control the learning direction of the signal point detection model. Specifically, in the early training stage, the first loss function is used to calculate the segmentation loss of the current round, and the parameters of the signal point detection model are adjusted according to that loss, so that the prediction result output by the signal point detection model for a training cell image approaches the corresponding segmentation label, which improves the detection accuracy of the model. In the late training stage, a noise robust loss function is combined with the first loss function, so that, on the one hand, the prediction result output by the signal point detection model for a training cell image approaches the corresponding segmentation label and, on the other hand, the segmentation label of the training cell image approaches the prediction result output by the signal point detection model for that image. This removes the influence of noise in the pseudo-labeled labels of the unlabeled samples, prevents the model from fitting that noise, and further improves the detection accuracy of the model.

Drawings

In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a schematic flow chart of a semi-supervised signal point detection model training method provided by the present invention;

FIG. 2 is a flow chart of a segmentation loss calculation method according to the present invention;

FIG. 3 is a gradient map of the function when Y_xy = 1, provided by the present invention;

FIG. 4 is a gradient map of the function when Y_xy = 0, provided by the present invention;

FIG. 5 is a second flowchart of the segmentation loss calculation method according to the present invention;

FIG. 6 is a schematic flow chart of a signal point detection method provided by the present invention;

FIG. 7 is a schematic structural diagram of a semi-supervised signal point detection model training device provided by the present invention;

FIG. 8 is a schematic structural diagram of a signal point detection device provided in the present invention;

FIG. 9 is a schematic structural diagram of an electronic device provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

FIG. 1 is a schematic flow chart of the semi-supervised signal point detection model training method provided by the present invention. As shown in FIG. 1, the method includes:

step 110, performing signal point segmentation on training cell images in a training set of the current round of training based on a signal point detection model to obtain a prediction result of each training cell image;

step 120, if the current round of training is the training early stage, calculating the segmentation loss by using a first loss function based on the prediction result of each training cell image and the segmentation label thereof; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the segmentation labels;

step 130, if the current round of training is in the later training stage, calculating the segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the prediction result;

step 140, adjusting parameters of the signal point detection model based on the segmentation loss.

Specifically, the signal point detection model may adopt an existing object detection model such as YOLO or RCNN. Before the signal point detection model is trained, a teacher model may be trained on labeled samples, where the structure of the teacher model may be the same as that of the signal point detection model, and a labeled sample is a cell image carrying a labeled label obtained by manually labeling its signal point pixels. The trained teacher model has a preliminary signal point detection capability. Because training the signal point detection model requires a large amount of labeled data and the existing labeled samples are insufficient, the trained teacher model can be used to perform signal point detection on a large number of unlabeled samples (namely cell images whose signal point pixels have not been manually labeled), determine the signal point pixels in the unlabeled samples, and thereby obtain the pseudo-labeled labels of the unlabeled samples. The signal point detection model is then trained iteratively over multiple rounds based on the labeled samples and their labeled labels together with the large number of unlabeled samples and their pseudo-labeled labels (the labeled samples and the unlabeled samples are collectively referred to below as training cell images).
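By way of non-limiting illustration, the construction of the training set from labeled samples and teacher-generated pseudo-labeled labels can be sketched as follows. The sketch assumes that the teacher model is available as a callable returning a grid of per-pixel signal point probabilities and that pixels are binarized at a probability of 0.5; the threshold and the names `make_pseudo_label` and `build_training_set` are hypothetical.

```python
def make_pseudo_label(teacher_probs, threshold=0.5):
    """Binarize a 2-D grid of teacher probabilities into a pseudo-labeled mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in teacher_probs]

def build_training_set(labeled, unlabeled_images, teacher, threshold=0.5):
    """Merge labeled samples with pseudo-labeled unlabeled samples.

    labeled: list of (image, mask) pairs with manually labeled masks.
    unlabeled_images: list of images without manual labels.
    teacher: callable mapping an image to a grid of signal point probabilities.
    Returns a list of (image, mask, is_labeled) tuples.
    """
    samples = [(image, mask, True) for image, mask in labeled]
    for image in unlabeled_images:
        pseudo_mask = make_pseudo_label(teacher(image), threshold)
        samples.append((image, pseudo_mask, False))
    return samples
```

Keeping the `is_labeled` flag with each sample allows later steps to treat labeled labels and pseudo-labeled labels differently.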

In each round of training, the training cell images in the training set of the current round can be input into the signal point detection model for signal point segmentation to obtain the prediction result of each training cell image. Here, the prediction result of a training cell image includes the prediction probability of each pixel in the image output by the signal point detection model, where the prediction probability includes the probability that the pixel is a signal point pixel and the probability that it is a non-signal point pixel. In addition, because the pseudo-labeled labels of the unlabeled samples contain a considerable amount of high-confidence noise, the training cell images can be subjected to image enhancement before being input into the signal point detection model, so as to prevent the model from overfitting the noise in the pseudo-labeled labels. The image enhancement applied to unlabeled samples differs from that applied to labeled samples. For labeled samples, the enhancement methods used are letterbox and random _ periodic active; for unlabeled samples, the enhancement methods used are random _ periodic active and the more aggressive mosaic enhancement, where two of the pictures used in mosaic enhancement come from labeled samples and the other two come from unlabeled samples.

Because the pseudo-labeled labels of the unlabeled samples used during model training contain a lot of noise (namely wrong labels), different training measures can be adopted for the early and late stages of model training, so as to improve the training effect of the signal point detection model and avoid the loss of detection accuracy caused by overfitting the noise. Analysis shows that, under the conventional training mode using loss functions such as cross entropy and focal loss, the signal point detection model tends to fit the correct samples (i.e., training cell images with correct segmentation labels) in the early stage of training and tends to fit the noise samples (i.e., training cell images with wrong segmentation labels) in the late stage of training. The reason is that, in the early stage of training, the detection accuracy of the model is low and the prediction probability of each pixel in a training cell image is low, so the absolute value of the gradient is large and the model parameters change substantially at each update; since the training set contains many correct samples, the parameters are updated in the direction of the correct samples. Once the model has been fitted to a certain degree and reaches the late training stage, its prediction accuracy and prediction probability on correct samples are high, so the gradients corresponding to correct samples are small, whereas its prediction accuracy and prediction probability on noise samples are low, so the gradients corresponding to noise samples are much larger than those of correct samples, and the model parameters are then updated in the direction of the noise samples.
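By way of non-limiting illustration, the gradient imbalance described above can be checked numerically for the standard case of a sigmoid output trained with binary cross entropy, where the gradient of the loss with respect to the logit equals the prediction probability minus the label. The logit value 4.0, standing for a confident late-stage prediction, is an arbitrary assumption, as are the function names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ce_grad_wrt_logit(z, y):
    """Gradient of binary cross entropy with respect to the logit z: sigmoid(z) - y."""
    return sigmoid(z) - y

# Late in training the model confidently predicts "signal point" (logit 4.0).
clean_grad = ce_grad_wrt_logit(4.0, 1)  # the label agrees with the prediction
noisy_grad = ce_grad_wrt_logit(4.0, 0)  # the label is wrong, i.e. a noise pixel
print(abs(clean_grad), abs(noisy_grad))
```

The pixel with the wrong label produces a gradient whose magnitude is many times that of the correctly labeled pixel, which is why a conventional loss pulls the parameters toward noise samples once the correct samples have been fitted.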

Therefore, different loss functions can be used in the early training stage and the late training stage to control the learning direction of the signal point detection model. Specifically, when the current round of training starts, the training stage of the current round is first confirmed, i.e., whether the current round is in the early training stage or the late training stage. If the current round of training is in the early training stage, the segmentation loss is calculated using the first loss function based on the prediction result of each training cell image and its segmentation label. The segmentation labels include labeled labels and pseudo-labeled labels: the segmentation label of a labeled sample is a labeled label, and the segmentation label of an unlabeled sample is a pseudo-labeled label. The first loss function determines the difference between the distribution of the prediction result and the distribution of the segmentation label with the segmentation label as the reference. Guiding model learning with the segmentation loss calculated by the first loss function makes the prediction results that the signal point detection model outputs for the training cell images gradually approach the corresponding segmentation labels. Because the signal point detection model tends to fit the correct samples in the early training stage, the first loss function can be used in this stage to calculate the segmentation loss of the current round, and the parameters of the signal point detection model are adjusted according to that loss, so that the prediction result output for each training cell image approaches its segmentation label and the detection accuracy of the model improves.

And if the current round of training is in the later training stage, calculating the segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof. The second loss function comprises a first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels by taking the prediction result as a reference. By combining the first loss function and the noise robust loss function, the prediction result of the training cell image output by the signal point detection model in the future can be gradually close to the segmentation label of the corresponding training cell image, and meanwhile, the segmentation label of the training cell image is also gradually close to the prediction result of the training cell image output by the signal point detection model in the future.

Here, because the signal point detection model tends to fit the noise samples in the late training stage, if the segmentation loss of the current round were calculated directly with the first loss function, as in the early training stage, and used to adjust the parameters of the signal point detection model, the model would be disturbed by the noise samples and its learning direction would be wrong. By contrast, in the late stage of model training, the prediction result output by the signal point detection model for a training cell image already reflects, to a certain extent, the real distribution of the signal points in that image; compared with the labels of the noise samples among the unlabeled samples, the prediction result is in fact closer to the real situation and more accurate. Therefore, a noise robust loss function is combined with the first loss function, so that, on the one hand, the prediction result output by the signal point detection model for a training cell image approaches the corresponding segmentation label and, on the other hand, the segmentation label of the training cell image approaches the prediction result output by the signal point detection model for that image. This removes the influence of the noise samples among the unlabeled samples, prevents the model from fitting the noise in the pseudo-labeled labels of the unlabeled samples, and further improves the detection accuracy of the model.

It should be noted that the noise samples among the training cell images can be screened out by means of the noise robust loss function and removed, so that the segmentation labels of the remaining training cell images are close to the prediction results output by the signal point detection model for those images, and the prediction results of the remaining training cell images are likewise close to their segmentation labels. The screened-out noise samples do not participate in the subsequent training of the signal point detection model.

The signal point detection model is subjected to parameter adjustment through the segmentation loss obtained in the mode in each round of training process, and a model with high signal point detection precision can be obtained.

According to the method provided by the embodiment of the invention, different loss functions are used in the early training stage and the late training stage to control the learning direction of the signal point detection model. Specifically, in the early training stage, the first loss function is used to calculate the segmentation loss of the current round, and the parameters of the signal point detection model are adjusted according to that loss, so that the prediction result output by the signal point detection model for a training cell image approaches the corresponding segmentation label, which improves the detection accuracy of the model. In the late training stage, a noise robust loss function is combined with the first loss function, so that, on the one hand, the prediction result output by the signal point detection model for a training cell image approaches the corresponding segmentation label and, on the other hand, the segmentation label of the training cell image approaches the prediction result output by the signal point detection model for that image. This removes the influence of noise in the pseudo-labeled labels of the unlabeled samples, prevents the model from fitting that noise, and further improves the detection accuracy of the model.

Based on the foregoing embodiment, as shown in fig. 2, the calculating a segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof specifically includes:

step 131, calculating sample loss of each training cell image in the current training round by using the second loss function based on the prediction result of each training cell image and the segmentation label corresponding to the training cell image;

step 132, if the sample loss of any training cell image in the current round of training is greater than a preset loss threshold, determining that the training cell image is a noise sample, and removing it from the training set of the current round of training;

step 133, determining the segmentation loss based on the sample loss, in the current round of training, of each training cell image remaining in the training set of the current round of training.

Specifically, based on the prediction result of each training cell image and the segmentation label corresponding to the training cell image, the second loss function is used to calculate the sample loss of each training cell image in the current training round. The first sample loss of each training cell image in the current round of training can be calculated based on the first loss function, the second sample loss of each training cell image in the current round of training can be calculated based on the noise robust loss function, and the first sample loss and the second sample loss of any training cell image are summed or weighted and summed to obtain the sample loss of the training cell image. When the first sample loss and the second sample loss of any training cell image are subjected to weighted summation, the respective weights corresponding to the first sample loss and the second sample loss can be preset.

If the sample loss of any training cell image in the current round of training is greater than the preset loss threshold, this indicates that the first sample loss and the second sample loss of that image are both large, which means that the segmentation label of the image differs greatly from its prediction result. The image can therefore be regarded as a noise sample whose segmentation label is erroneous. The training cell images determined to be noise samples can then be removed from the training set of the current round of training, so that they do not participate in the segmentation loss calculation of that round. To further improve the training effect of the model, a training cell image that is determined to be a noise sample in multiple rounds of training may be deleted from the training set so that it does not participate in subsequent rounds of training. In addition, if a large number of training cell images are determined to be noise samples, the segmentation labels of the images determined to be noise samples in multiple rounds of training can be modified, so that the erroneous labels are corrected before the subsequent training is performed.
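Steps 131 to 133 can be sketched as follows. This is only an illustration under stated assumptions: the per-image first and second sample losses are taken as already computed, the segmentation loss is taken as the mean of the remaining sample losses (the text only says it is determined "based on" them), and the names `filter_noise_samples`, `w_first` and `w_robust` are hypothetical.

```python
def filter_noise_samples(first_losses, robust_losses, loss_threshold,
                         w_first=1.0, w_robust=1.0):
    """Screen noise samples by sample loss and aggregate the rest.

    Returns (kept_indices, noise_indices, segmentation_loss).
    """
    kept, noise, kept_losses = [], [], []
    for idx, (l1, l2) in enumerate(zip(first_losses, robust_losses)):
        # Step 131: sample loss = (weighted) sum of the two per-sample losses.
        sample_loss = w_first * l1 + w_robust * l2
        # Step 132: a sample whose loss exceeds the threshold is a noise sample.
        if sample_loss > loss_threshold:
            noise.append(idx)
        else:
            kept.append(idx)
            kept_losses.append(sample_loss)
    # Step 133 (assumption: mean over the remaining samples).
    seg_loss = sum(kept_losses) / len(kept_losses) if kept_losses else 0.0
    return kept, noise, seg_loss
```

Images whose indices are returned in `noise_indices` could additionally be dropped from later rounds or sent for label correction, as the paragraph above describes.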

In any of the above embodiments, the preset loss threshold is determined based on the following steps:

Specifically, at the beginning of each round of training, the training stage of the current round may be determined. If the current round is in the late training stage and the previous round was in the early training stage, that is, the previous round is the transition point at which the early training stage ends, the prediction result obtained for each training cell image in the training set during the previous round, together with the segmentation label of that image, can be obtained. A test sample loss is then calculated for each training cell image in the training set based on the second loss function. The training set comprises labeled samples and unlabeled samples. The segmentation labels of the labeled samples are manually annotated, so their accuracy can be guaranteed. The segmentation labels of the unlabeled samples are pseudo-labels predicted by a teacher model, so their accuracy cannot be guaranteed and they very probably contain a small amount of noise. The training cell images in the training set can therefore be divided into two groups, the labeled samples and the unlabeled samples.

The preset loss threshold may be determined based on the test sample losses of the labeled samples and the test sample losses of the unlabeled samples. The determined preset loss threshold should satisfy the following condition: the proportion of labeled samples whose test sample loss is lower than the preset loss threshold is higher than a first proportion threshold, and the proportion of unlabeled samples whose test sample loss is higher than the preset loss threshold is lower than a second proportion threshold, where the first proportion threshold is a relatively large value and the second proportion threshold is a relatively small value. Here, the number of labeled samples whose test sample loss is lower than the preset loss threshold may be counted, and the ratio of that number to the total number of labeled samples is taken as the proportion of labeled samples whose test sample loss is lower than the preset loss threshold. Likewise, the number of unlabeled samples whose test sample loss is higher than the preset loss threshold may be counted, and the ratio of that number to the total number of unlabeled samples is taken as the proportion of unlabeled samples whose test sample loss is higher than the preset loss threshold. With a preset loss threshold determined in this way, most labeled samples have a test sample loss below the threshold and only a small proportion of unlabeled samples have a test sample loss above it, so the threshold can be used to screen a small number of noise samples out of the training set, in particular out of the unlabeled samples.
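The two proportion conditions can be checked mechanically. The sketch below is one possible way to pick a threshold and is not taken from the text: it scans the observed loss values as candidates and returns the smallest one that satisfies both conditions. The function name `choose_loss_threshold` and the default ratios 0.95 and 0.10 are assumptions for illustration only.

```python
def choose_loss_threshold(labeled_losses, unlabeled_losses,
                          first_ratio=0.95, second_ratio=0.10):
    """Return the smallest candidate threshold meeting both conditions, or None."""
    # Assumption: candidate thresholds are the observed test sample losses.
    candidates = sorted(set(labeled_losses) | set(unlabeled_losses))
    for t in candidates:
        # Proportion of labeled samples whose loss is lower than the threshold.
        frac_labeled_below = (
            sum(1 for l in labeled_losses if l < t) / len(labeled_losses))
        # Proportion of unlabeled samples whose loss is higher than the threshold.
        frac_unlabeled_above = (
            sum(1 for l in unlabeled_losses if l > t) / len(unlabeled_losses))
        if frac_labeled_below > first_ratio and frac_unlabeled_above < second_ratio:
            return t
    return None
```

Returning `None` when no candidate qualifies leaves it to the caller to relax the ratios; the text does not say what should happen in that case.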

Based on any of the above embodiments, the training phase of any round of training is determined based on the following steps:

if the previous round of training of any round of training is in the late training stage, determining that the round of training is in the late training stage.

Specifically, the training stage of any round of training may be the early training stage or the late training stage. To determine the training stage of a round, the training stage of the previous round may be obtained first. If the previous round was in the late training stage, the current round can be directly determined to be in the late training stage. If the previous round was in the early training stage, the determination can be made from the training result of the previous round: if the signal point detection model tends to converge after the previous round is finished, the current round is determined to enter the late training stage. Specifically, the segmentation loss of the previous round and a preset convergence threshold may be obtained. If the segmentation loss of the previous round is greater than the preset convergence threshold, the model is determined not to have started converging, and the current round remains in the early training stage; if it is less than or equal to the preset convergence threshold, the model is determined to have started converging, and the current round is determined to be in the late training stage. The convergence threshold may be set in advance.
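The stage rule above amounts to a one-way switch, which can be sketched as follows. The names `next_training_stage` and `stages_for_history` are hypothetical, and treating the very first round as the early training stage is an assumption, since the text does not say how the first round is classified.

```python
EARLY, LATE = "early", "late"

def next_training_stage(prev_stage, prev_seg_loss, convergence_threshold):
    """Stage of the current round, given the previous round's stage and loss."""
    # Once the late stage has been reached, training stays in it.
    if prev_stage == LATE:
        return LATE
    # Previous round was early: switch when the loss reaches the threshold.
    if prev_seg_loss <= convergence_threshold:
        return LATE
    return EARLY

def stages_for_history(seg_losses, convergence_threshold):
    """Stages of rounds 1..len(seg_losses)+1 from per-round segmentation losses."""
    stages = [EARLY]  # assumption: the first round is in the early stage
    for prev_loss in seg_losses:
        stages.append(
            next_training_stage(stages[-1], prev_loss, convergence_threshold))
    return stages
```

Note that a later rise in the segmentation loss does not return training to the early stage, because the previous round's stage is checked first.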

Based on any of the above embodiments, the first loss function is configured to calculate a first product between the label value, the prediction probability difference value, and the logarithm of the prediction probability for each pixel, calculate a second product between the label difference value, the prediction probability, and the logarithm of the prediction probability difference value for each pixel, and calculate an average value of a sum of the first product and the second product for each pixel; wherein, the prediction probability difference value of any pixel is the difference value between the prediction probability of the corresponding pixel and 1; the label difference value of any pixel is the difference value between the label value of the corresponding pixel and 1;

Specifically, the first loss function may calculate a first product between the label value, the prediction probability difference value, and the logarithm of the prediction probability for each pixel, calculate a second product between the label difference value, the prediction probability, and the logarithm of the prediction probability difference value for each pixel, and calculate an average of the sum of the first product and the second product for each pixel using the following formulas:

where x and y are the horizontal and vertical coordinates of any pixel in the training cell image, N is the number of pixels in the training cell image, and the remaining symbols in the formula denote, in order, the label value of that pixel, the prediction probability of that pixel, two hyper-parameters (each of which may be set to 1), the prediction probability difference of that pixel, the label difference of that pixel, the first product of that pixel, and the second product of that pixel.
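The verbal definition of the first loss function can be turned into a small sketch. The following pure-Python function is only an illustration under stated assumptions: the difference values are taken as 1 - p and 1 - label, a leading minus sign makes the loss non-negative, the two hyper-parameters are applied as multiplicative weights `alpha` and `beta`, and probabilities are clipped by a small `eps` to avoid log(0). None of these choices, nor the names `first_loss`, `alpha`, `beta` and `eps`, is given in the text.

```python
import math

def first_loss(labels, probs, alpha=1.0, beta=1.0, eps=1e-7):
    """Average over pixels of the first and second products (sign assumed)."""
    total, n = 0.0, 0
    for row_y, row_p in zip(labels, probs):
        for y, p in zip(row_y, row_p):
            p = min(max(p, eps), 1.0 - eps)  # assumption: clip to avoid log(0)
            # First product: label value * probability difference * log(probability).
            first_product = y * (1.0 - p) * math.log(p)
            # Second product: label difference * probability * log(probability difference).
            second_product = (1.0 - y) * p * math.log(1.0 - p)
            # Assumption: hyper-parameters act as weights on the two products.
            total += alpha * first_product + beta * second_product
            n += 1
    return -total / n  # assumption: leading minus sign
```

With both weights equal to 1, a pixel labeled 1 and predicted with probability 0.5 contributes 0.5 * log 2 to the sum, and a confidently correct pixel contributes almost nothing.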

The noise robust loss function may calculate a third product between the prediction probability of each pixel and the logarithm of the label value of the corresponding pixel, calculate a fourth product between the prediction probability difference of each pixel and the logarithm of the label difference, and calculate an average of the sum of the third product and the fourth product of each pixel using the following formulas:

where the symbols in the formula denote, in order, the third product of any pixel (x, y) and the fourth product of that pixel.
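A corresponding sketch of the noise robust loss function is given below. It assumes binary label values, a leading minus sign, and a label difference of 1 - label; the logarithm of a zero label value is replaced by a constant, which the description sets to a = -0.5. The names `safe_log`, `noise_robust_loss` and `log_zero` are hypothetical.

```python
import math

def safe_log(value, log_zero=-0.5):
    """log(value), with log(0) replaced by the constant a (here -0.5)."""
    return log_zero if value <= 0.0 else math.log(value)

def noise_robust_loss(labels, probs, log_zero=-0.5):
    """Average over pixels of the third and fourth products (sign assumed)."""
    total, n = 0.0, 0
    for row_y, row_p in zip(labels, probs):
        for y, p in zip(row_y, row_p):
            # Third product: probability * log(label value).
            third_product = p * safe_log(y, log_zero)
            # Fourth product: probability difference * log(label difference).
            fourth_product = (1.0 - p) * safe_log(1.0 - y, log_zero)
            total += third_product + fourth_product
            n += 1
    return -total / n  # assumption: leading minus sign
```

Because the roles of label and prediction are swapped relative to the first loss function, this term is linear in the prediction probability for a fixed binary label.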

It can be seen that the second loss function can be expressed as the combination of the first loss function and the noise robust loss function given above.

To facilitate the derivation showing that the second loss function can be used to screen out noise samples, the common coefficient 1/N may be taken as 1 and both hyper-parameters may be set to 1, which simplifies the formula.

When the label value of a pixel is 1, the simplified formula depends only on the prediction probability of that pixel, where log 0 = a and a may take the value -0.5.

Differentiating this expression with respect to the prediction probability gives the gradient of the function. The corresponding function gradient map is shown in fig. 3, where the abscissa is the prediction probability and the ordinate is the gradient of the function. It can be seen that, in the function gradient map, the gradient changes gently in the regions of medium and high probability, where the absolute values of the gradient are close to one another, and changes markedly in the region of low probability. Therefore, when the sample loss of a training cell image is calculated, among all pixels whose segmentation label is 1, the pixels with a lower prediction probability (that is, the pixels whose segmentation labels are wrong) contribute more.

When the label value of a pixel is 0, the simplified formula again depends only on the prediction probability of that pixel.

Differentiating this expression with respect to the prediction probability gives the gradient of the function. The corresponding function gradient map is shown in fig. 4, where the abscissa is the prediction probability and the ordinate is the gradient of the function. It can be seen that, in the function gradient map, the gradient changes gently in the regions of low and medium probability, where the absolute values of the gradient are close to one another, and changes markedly in the region of high probability. Therefore, when the sample loss of a training cell image is calculated, among all pixels whose segmentation label is 0, the pixels with a higher prediction probability (that is, the pixels whose segmentation labels are wrong) contribute more.
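The two cases can be worked through explicitly. The block below is a reading of the verbal definitions rather than a reproduction of the original formulas: it assumes the notation p for the prediction probability and y for the label value of a single pixel, a leading minus sign on the loss, the coefficient 1/N and both hyper-parameters set to 1, and log 0 = a.

```latex
% Assumed notation: p = prediction probability, y = label value, a = value used for log 0.
\begin{aligned}
y = 1:\quad & \ell(p) = -\bigl[(1-p)\log p + a\,(1-p)\bigr], \\
            & \frac{d\ell}{dp} = \log p - \frac{1-p}{p} + a ; \\
y = 0:\quad & \ell(p) = -\bigl[p\,\log(1-p) + a\,p\bigr], \\
            & \frac{d\ell}{dp} = -\log(1-p) + \frac{p}{1-p} - a .
\end{aligned}
```

Under these assumptions, for y = 1 the term (1 - p)/p dominates as p tends to 0, so the gradient grows without bound in magnitude at low probability and tends to a near p = 1. The y = 0 case mirrors this behaviour near p = 1, which agrees with the qualitative description of the two gradient maps.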

In summary, for any training cell image, the pixels with wrong segmentation labels account for a large share of the sample loss. If the sample loss of a training cell image is large, the image can therefore be considered to contain many pixels with wrong segmentation labels, so it can be determined to be a noise sample and removed from the training set of the current round. It then does not take part in adjusting the model parameters in the current round, which avoids biasing the learning direction of the model.

Based on any of the embodiments, as shown in fig. 5, if the current round of training is the training later stage, calculating the segmentation loss by using the second loss function based on the prediction result of each training cell image and the segmentation label thereof, specifically includes:

step 510, for each labeled sample, whose segmentation label is a manually annotated label, calculating the sample loss of the labeled sample in the current round of training by using the first loss function based on the prediction result of the labeled sample and its segmentation label;

step 520, for each unlabeled sample, whose segmentation label is a pseudo-label, calculating the sample loss of the unlabeled sample in the current round of training by using the second loss function based on the prediction result of the unlabeled sample and its segmentation label;

step 530, determining the segmentation loss based on the sample losses of the labeled samples in the current round of training and the sample losses of the unlabeled samples in the current round of training.

Specifically, because the labels of the labeled samples are manually annotated, their accuracy can be guaranteed. For the labeled samples, the sample loss in the current round of training can therefore be calculated by using the first loss function based on the prediction result of each labeled sample and its segmentation label. The unlabeled samples, whose segmentation labels are pseudo-labels, include a small number of noise samples, so the sample loss of each unlabeled sample in the current round of training can be calculated by using the second loss function based on the prediction result of the unlabeled sample and its segmentation label. The segmentation loss of the current round of training is then determined based on the sample losses of the labeled samples and the sample losses of the unlabeled samples in the current round of training.
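Steps 510 and 520 can be sketched as a simple dispatch on whether a sample is labeled. The sketch assumes that each sample already carries its first sample loss and its noise robust sample loss, and that the second loss function is their weighted sum; the dictionary keys and the name `split_sample_losses` are hypothetical.

```python
def split_sample_losses(samples, w_first=1.0, w_robust=1.0):
    """Return (labeled_losses, unlabeled_losses) for the current round.

    Each sample is a dict with the hypothetical keys
    'is_labeled', 'first_loss' and 'robust_loss'.
    """
    labeled_losses, unlabeled_losses = [], []
    for s in samples:
        if s["is_labeled"]:
            # Step 510: manually annotated labels, first loss function only.
            labeled_losses.append(s["first_loss"])
        else:
            # Step 520: pseudo-labels, second loss function
            # (first loss plus noise robust loss, weights assumed).
            unlabeled_losses.append(
                w_first * s["first_loss"] + w_robust * s["robust_loss"])
    return labeled_losses, unlabeled_losses
```

The two returned lists are the inputs to step 530, which combines them into the segmentation loss of the round.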

Based on any of the above embodiments, the segmentation loss is calculated based on the following steps:

determining the segmentation loss based on the sample loss and the loss weight of the labeled samples in the current round of training, and the sample loss and the loss weight of the unlabeled samples in the current round of training.

Specifically, different loss weights may be set for the labeled samples and the unlabeled samples, so that the segmentation loss of the current round of training is determined as a weighted sum of the sample losses of the labeled samples and the sample losses of the unlabeled samples in the current round. Models in different training stages tend to fit different objects: in the early training stage the model tends to fit correct samples, whereas in the late training stage it tends to fit noise samples, and most labeled samples are correct samples. The loss weights of the labeled and unlabeled samples can therefore be set according to the training stage of the current round. For example, in the early stage of model training, the loss weight of the labeled samples may be larger and that of the unlabeled samples smaller; conversely, in the late stage of model training, the loss weight of the labeled samples may be smaller and that of the unlabeled samples larger. In addition, as the number of training rounds increases, the detection capability of the signal point detection model improves and its ability to screen noise samples is also enhanced, so in the late stage of model training the loss weight of the unlabeled samples can be fine-tuned and gradually increased according to the number of rounds trained so far.
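A stage-dependent weighting of this kind might look like the sketch below. All numeric values (0.7 and 0.3 in the early stage, a late-stage unlabeled weight starting at 0.6 and growing by 0.05 per round up to 0.9) are placeholders chosen for illustration, as are the names `loss_weights` and `weighted_segmentation_loss`; averaging the sample losses within each group is likewise an assumption.

```python
def loss_weights(stage, late_rounds_done=0, step=0.05, max_unlabeled=0.9):
    """Return (labeled_weight, unlabeled_weight) for the current round."""
    if stage == "early":
        # Early stage: labeled samples weigh more (placeholder values).
        return 0.7, 0.3
    # Late stage: the unlabeled weight is larger and grows slowly with the
    # number of late-stage rounds already trained, up to a cap.
    w_unlabeled = min(0.6 + step * late_rounds_done, max_unlabeled)
    return 1.0 - w_unlabeled, w_unlabeled

def weighted_segmentation_loss(labeled_losses, unlabeled_losses,
                               w_labeled, w_unlabeled):
    """Weighted sum of the mean labeled and mean unlabeled sample losses."""
    def mean(values):
        return sum(values) / len(values) if values else 0.0
    return w_labeled * mean(labeled_losses) + w_unlabeled * mean(unlabeled_losses)
```

For example, `loss_weights("late", late_rounds_done=2)` shifts weight toward the unlabeled samples compared with the early stage, in line with the schedule described above.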

Based on any of the above embodiments, fig. 6 is a schematic flow chart of a signal point detection method provided by the present invention, as shown in fig. 6, the method includes:

step 610, determining a cell image to be detected;

step 620, inputting the cell image to be detected into a signal point detection model to obtain a signal point detection result of the cell image to be detected output by the signal point detection model;

the signal point detection model is obtained by training based on the semi-supervised signal point detection model training method provided by any one of the above embodiments.

The semi-supervised signal point detection model training device provided by the present invention is described below, and the semi-supervised signal point detection model training device described below and the semi-supervised signal point detection model training method described above may be referred to in correspondence with each other.

Based on any of the above embodiments, fig. 7 is a schematic structural diagram of a semi-supervised signal point detection model training apparatus provided by the present invention, as shown in fig. 7, the apparatus includes: a forward propagation unit 710, a first loss calculation unit 720, a second loss calculation unit 730, and a backward propagation unit 740.

The forward propagation unit 710 is configured to perform signal point segmentation on training cell images in a training set of a current round of training based on a signal point detection model to obtain a prediction result of each training cell image;

the first loss calculating unit 720 is configured to calculate, if the current round of training is in the early training stage, a segmentation loss by using a first loss function based on the prediction result of each training cell image and the segmentation label thereof; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the segmentation labels;

the second loss calculating unit 730 is configured to calculate, if the current round of training is in the late training stage, a segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the prediction result;

the back propagation unit 740 is configured to perform parameter adjustment on the signal point detection model based on the segmentation loss.

According to the device provided by the embodiment of the invention, different loss functions are used in the early training stage and the late training stage to control the learning direction of the signal point detection model. Specifically, in the early training stage, the first loss function is used to calculate the segmentation loss of the current round, and the parameters of the signal point detection model are adjusted according to that loss, so that the prediction result output by the model for each training cell image approaches the corresponding segmentation label, which improves the detection precision of the model. In the late training stage, the noise robust loss function is combined with the first loss function, so that, on the one hand, the prediction result output by the model for each training cell image approaches the corresponding segmentation label and, on the other hand, the segmentation label of each training cell image approaches the prediction result output by the model for that image. This removes the influence of noise in the pseudo-labels of the unlabeled samples, prevents the model from fitting that noise, and further improves the detection precision of the model.

Based on any of the embodiments, the calculating, by using the second loss function, the segmentation loss based on the prediction result of each training cell image and the segmentation label thereof specifically includes:

calculating the loss of the test sample of each training cell image in the training set based on the second loss function based on the prediction result of each training cell image in the training set in the last round of training process before training and the segmentation label of the corresponding training cell image;

Based on any of the embodiments, if the current round of training is the training later stage, calculating the segmentation loss by using the second loss function based on the prediction result of each training cell image and the segmentation label thereof, specifically including:

Based on any of the above embodiments, the determining the segmentation loss based on the sample loss of the labeled sample in the current round of training and the sample loss of the unlabeled sample in the current round of training specifically includes:

The signal point detecting device provided by the present invention is described below, and the signal point detecting device described below and the signal point detecting method described above may be referred to in correspondence with each other.

Based on any of the above embodiments, fig. 8 is a schematic structural diagram of a signal point detection apparatus provided by the present invention, as shown in fig. 8, the apparatus includes: an image determination unit 810 and a signal point detection unit 820.

The image determining unit 810 is configured to determine an image of a cell to be detected;

the signal point detection unit 820 is configured to input the cell image to be detected to a signal point detection model, and obtain a signal point detection result of the cell image to be detected output by the signal point detection model;

the signal point detection model is obtained by training based on the semi-supervised signal point detection model training method provided by any one of the above embodiments.

Fig. 9 is a schematic structural diagram of an electronic device provided in the present invention, and as shown in fig. 9, the electronic device may include: a processor 910, a memory 920, a communications interface 930, and a communication bus 940, wherein the processor 910, the memory 920, and the communications interface 930 communicate with each other via the communication bus 940. Processor 910 may invoke logic instructions in memory 920 to perform a semi-supervised signal point detection model training method, the method comprising: performing signal point segmentation on training cell images in a training set of the current round of training based on a signal point detection model to obtain a prediction result of each training cell image; if the current round of training is in the early stage of training, calculating segmentation loss by using a first loss function based on the prediction result of each training cell image and the segmentation label thereof; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the segmentation labels; if the current round of training is in the later training stage, calculating the segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the prediction result; and performing parameter adjustment on the signal point detection model based on the segmentation loss.

Processor 910 may also invoke logic instructions in memory 920 to perform a signal point detection method comprising: determining a cell image to be detected; inputting the cell image to be detected into a signal point detection model to obtain a signal point detection result of the cell image to be detected output by the signal point detection model; the signal point detection model is obtained by training based on the semi-supervised signal point detection model training method provided by any one of the above embodiments.

In addition, the logic instructions in the memory 920 may be implemented in the form of software functional units and stored in a computer-readable storage medium when they are sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and various other media capable of storing program code.

In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions, which when executed by a computer, enable the computer to perform the semi-supervised signal point detection model training method provided by the above methods, the method comprising: performing signal point segmentation on training cell images in a training set of the current round of training based on a signal point detection model to obtain a prediction result of each training cell image; if the current round of training is in the early stage of training, calculating segmentation loss by using a first loss function based on the prediction result of each training cell image and the segmentation label thereof; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the segmentation labels; if the current round of training is in the later training stage, calculating the segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the prediction result; and performing parameter adjustment on the signal point detection model based on the segmentation loss.

The computer can also execute the signal point detection method provided by the methods, and the method comprises the following steps: determining a cell image to be detected; inputting the cell image to be detected into a signal point detection model to obtain a signal point detection result of the cell image to be detected output by the signal point detection model; the signal point detection model is obtained by training based on the semi-supervised signal point detection model training method provided by any one of the above embodiments.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the semi-supervised signal point detection model training method provided above, the method comprising: performing signal point segmentation on training cell images in a training set of the current round of training based on a signal point detection model to obtain a prediction result of each training cell image; if the current round of training is in the early stage of training, calculating segmentation loss by using a first loss function based on the prediction result of each training cell image and the segmentation label thereof; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction result and the distribution of the segmentation labels based on the segmentation labels; if the current round of training is in the later training stage, calculating the segmentation loss by using a second loss function based on the prediction result of each training cell image and the segmentation label thereof; the second loss function comprises the first loss function and a noise robust loss function, and the noise robust loss function is used for determining the difference between the distribution of the predicted result and the distribution of the segmentation label based on the predicted result; and performing parameter adjustment on the signal point detection model based on the segmentation loss.

The computer program, when executed by a processor, is further embodied to perform the signal point detection methods provided above, the method comprising: determining a cell image to be detected; inputting the cell image to be detected into a signal point detection model to obtain a signal point detection result of the cell image to be detected output by the signal point detection model; the signal point detection model is obtained by training based on the semi-supervised signal point detection model training method provided by any one of the above embodiments.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A semi-supervised signal point detection model training method, characterized by comprising the following steps:

if the current round of training is in the early stage of training, calculating a segmentation loss by using a first loss function based on the prediction result of each training cell image and its segmentation label; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction results and the distribution of the segmentation labels based on the segmentation labels;

2. The semi-supervised signal point detection model training method of claim 1, wherein the calculating the segmentation loss using the second loss function based on the prediction result of each training cell image and the segmentation label thereof specifically comprises:

3. The semi-supervised signal point detection model training method of claim 2, wherein the preset loss threshold is determined based on the following steps:

4. The semi-supervised signal point detection model training method of claim 3, wherein the training phase of any round of training is determined based on the following steps:

if the training stage of the round preceding the given round of training is the early stage of training, determining the training stage of the given round based on the segmentation loss of the preceding round and a preset convergence threshold: if the segmentation loss of the preceding round is greater than the preset convergence threshold, determining that the given round is in the early stage of training, and if the segmentation loss of the preceding round is less than or equal to the preset convergence threshold, determining that the given round is in the late stage of training;
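The threshold rule of this step can be sketched as below. This is a minimal illustration, not the claimed implementation. The function and constant names are hypothetical. Keeping a round in the late stage once the preceding round was already in the late stage is an assumption, since the text shown here only states the case decided by the convergence threshold.

```python
EARLY = "early"  # hypothetical stage identifiers
LATE = "late"


def training_stage(prev_stage, prev_loss, convergence_threshold):
    """Decide the stage of a round from the preceding round.

    prev_stage:            stage of the preceding round (EARLY or LATE).
    prev_loss:             segmentation loss of the preceding round.
    convergence_threshold: the preset convergence threshold.
    """
    if prev_stage == LATE:
        # Assumption: once training has reached the late stage it stays there.
        return LATE
    # Stated rule: loss above the threshold keeps the round in the early
    # stage; loss at or below the threshold moves it to the late stage.
    return EARLY if prev_loss > convergence_threshold else LATE
```

For instance, with a threshold of 0.5, a preceding early-stage round with loss 0.8 yields another early-stage round, while a loss of exactly 0.5 switches to the late stage.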

5. The semi-supervised signal point detection model training method of claim 1, wherein the first loss function is used for calculating, for each pixel, a first product of the label value, the prediction probability difference value and the logarithm of the prediction probability; calculating, for each pixel, a second product of the label difference value, the prediction probability and the logarithm of the prediction probability difference value; and calculating the average, over all pixels, of the sum of the first product and the second product; wherein the prediction probability difference value of any pixel is the difference between the prediction probability of the corresponding pixel and 1; the label difference value of any pixel is the difference between the label value of the corresponding pixel and 1;
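The per-pixel computation described in this claim can be sketched as follows. The sketch reads the "difference between the prediction probability and 1" as 1 - p and the label difference as 1 - y, so that both logarithms are defined. The leading minus sign (which makes the loss non-negative), the `eps` clamp and the name `first_loss` are assumptions and are not stated in the claim.

```python
import math


def first_loss(labels, probs, eps=1e-7):
    """Negated mean over pixels of y*(1-p)*log(p) + (1-y)*p*log(1-p).

    labels: flat list of per-pixel label values (0 or 1).
    probs:  flat list of per-pixel predicted probabilities.
    The sign convention and the eps clamp are assumptions.
    """
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equally long")
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        first_product = y * (1.0 - p) * math.log(p)
        second_product = (1.0 - y) * p * math.log(1.0 - p)
        total += first_product + second_product
    return -total / len(labels)
```

Under this reading, `first_loss([1, 0], [0.9, 0.1])` is smaller than `first_loss([1, 0], [0.1, 0.9])`, because the factors (1 - p) and p shrink the contribution of pixels that are already predicted well.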

6. The semi-supervised signal point detection model training method of claim 1, wherein, if the current round of training is in the late stage of training, calculating the segmentation loss by using a second loss function based on the prediction result of each training cell image and its segmentation label specifically comprises:

calculating the sample loss of each labeled sample in the current round of training by using the first loss function, based on the prediction result and the segmentation label of each labeled sample, a labeled sample being a sample whose segmentation label is a labeled label;

7. The semi-supervised signal point detection model training method of claim 1, wherein the segmentation loss is calculated based on the following steps:

8. A signal point detection method, comprising:

determining a cell image to be detected;

wherein the signal point detection model is trained based on the semi-supervised signal point detection model training method according to any one of claims 1 to 7.

9. A semi-supervised signal point detection model training device, characterized by comprising:

a first loss calculation unit, used for calculating a segmentation loss by using a first loss function based on the prediction result of each training cell image and its segmentation label if the current round of training is in the early stage of training; the segmentation labels comprise labeled labels and pseudo-labeled labels; the first loss function is used for determining the difference between the distribution of the prediction results and the distribution of the segmentation labels based on the segmentation labels;

10. A signal point detection device, comprising: