CN114639102A - Cell segmentation method and device based on key point and size regression - Google Patents

Info

Publication number
CN114639102A
CN114639102A (application CN202210506262.XA)
Authority
CN
China
Prior art keywords
cell
point information
key point
information
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210506262.XA
Other languages
Chinese (zh)
Other versions
CN114639102B (en)
Inventor
吕行
王华嘉
邝英兰
范献军
蓝兴杰
黄仁斌
叶莘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Hengqin Shengao Yunzhi Technology Co ltd
Original Assignee
Zhuhai Hengqin Shengao Yunzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Hengqin Shengao Yunzhi Technology Co ltd filed Critical Zhuhai Hengqin Shengao Yunzhi Technology Co ltd
Priority to CN202210506262.XA priority Critical patent/CN114639102B/en
Publication of CN114639102A publication Critical patent/CN114639102A/en
Application granted granted Critical
Publication of CN114639102B publication Critical patent/CN114639102B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a cell segmentation method and device based on key point and size regression. The method comprises: performing feature extraction on a cell image to be segmented to obtain a corresponding feature map; performing key point detection and size regression on the feature map to obtain key point information and the cell size information associated with it, and determining cell detection frames based on the key point information and the associated cell size information, wherein each piece of key point information comprises the position information of a key point of a cell detection frame; and performing cell segmentation on the feature map based on the cell detection frames to obtain the cell segmentation result of the cell image to be segmented. The invention improves segmentation performance for irregular cells in dense scenes.

Description

Cell segmentation method and device based on key point and size regression
Technical Field
The invention relates to the field of image segmentation, in particular to a cell segmentation method and device based on key point and size regression.
Background
In medical cell image analysis, cell detection and image segmentation are critical steps and a basic precondition for downstream tasks such as cell identification. Real cell images, however, are diverse and complex: an image may contain cell masses formed by many densely packed cells, and segmenting such dense cells is considerably harder and prone to errors. A cell image segmentation method that can cope with dense-cell scenes is therefore needed.
At present, cell image segmentation mostly relies on traditional anchor-box-based instance segmentation models such as Mask R-CNN and PointRend. These models exhaustively enumerate potential target positions in advance, generate anchor boxes at those positions, and predict target bounding boxes from the anchors. However, when cells aggregate heavily and many cells overlap or press against one another, the pre-defined anchors may fail to cover some cells in a cell mass, so those cells are missing from the segmentation result. Some existing anchor-free instance segmentation models avoid the performance loss caused by anchors, but their detection-frame predictions are still inaccurate, which makes later selection of cell detection frames difficult and degrades the subsequent segmentation within each frame. A more accurate method for predicting target detection frames is therefore needed for cell segmentation in dense scenes.
Disclosure of Invention
The invention provides a cell segmentation method and device based on key point and size regression, which overcome the poor cell-detection-frame prediction accuracy of the prior art.
The invention provides a cell segmentation method based on key point and size regression, which comprises the following steps:
extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information;
searching, within the search range of each piece of central point information among the key point information, for the corner point information corresponding to that central point information, taking the key point information other than the central points as search objects; wherein the search range of any central point information is determined based on the cell size information associated with that central point information;
generating a cell detection frame based on the central point information and the corner point information corresponding to the central point information;
and performing cell segmentation on the feature map based on the cell detection frames to obtain the cell segmentation result of the cell image to be segmented.
According to the cell segmentation method based on the key point and size regression provided by the invention, in the search range of each piece of central point information in the key point information, the key point information except the central point is taken as a search object, and the corner point information corresponding to each piece of central point information is searched, specifically comprising the following steps:
determining an initial search box for any piece of central point information based on that central point information and its associated cell size information; the initial search box is a rectangular box centered on the central point, with a size adapted to the associated cell size information;
determining the search range of the central point information by taking each corner point of its initial search box as a search center and a preset threshold as the search radius;
and searching, within the search range of the central point information and according to the type of each piece of key point information, for the corner point information corresponding to that central point information.
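The corner search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: key points are assumed to arrive as `(x, y, kind)` tuples with hypothetical type labels `"tl"`/`"br"`, and the nearest same-type candidate within the preset radius of each expected corner of the initial search box is kept.

```python
import math

def search_corners(center, cell_size, keypoints, radius):
    """Given a predicted center point and its regressed cell size, find the
    top-left / bottom-right corner keypoints belonging to the same cell.

    center    : (x, y) of the center keypoint
    cell_size : (w, h) regressed for this center
    keypoints : list of (x, y, kind) with kind in {"tl", "br"}
    radius    : search radius around each expected corner (preset threshold)
    """
    cx, cy = center
    w, h = cell_size
    # Corners of the initial search box: a rectangle centred on the center
    # point, sized to the associated cell size information.
    expected = {"tl": (cx - w / 2, cy - h / 2),
                "br": (cx + w / 2, cy + h / 2)}
    found = {}
    for kind, (ex, ey) in expected.items():
        best, best_d = None, radius
        for (x, y, k) in keypoints:
            if k != kind:
                continue                    # only same-type corners qualify
            d = math.hypot(x - ex, y - ey)
            if d <= best_d:                 # keep the candidate closest to
                best, best_d = (x, y), d    # the expected corner position
        found[kind] = best                  # None if nothing lies in range
    return found
```

Because the radius bounds the search, corner points of neighbouring cells' detection frames are excluded even in dense scenes.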
According to the cell segmentation method based on the key point and size regression provided by the invention, the feature extraction is carried out on the cell image to be segmented to obtain the feature map corresponding to the cell image to be segmented, the key point detection is carried out on the basis of the feature map to obtain the key point information and the cell size information related to the key point information, and the method specifically comprises the following steps:
performing multi-scale feature extraction on a cell image to be segmented to obtain feature maps under all scales;
superimposing the up-sampled feature map output by the up-sampling module of the previous scale onto the feature map of the previous scale, and inputting the result to the up-sampling module of the current scale to obtain the up-sampled feature map of the current scale; wherein the input of the first up-sampling module is the feature map at the highest-order scale;
and performing key point detection and cell size regression based on the up-sampling characteristic diagram output by the up-sampling module of each scale to obtain key point information and cell size information related to the key point information under each scale.
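The cascaded up-sampling path above can be sketched with a toy model in which each value stands in for a feature map (here just its spatial resolution). `upsample` and `overlay` are caller-supplied stand-ins for the real modules — an assumption, since the patent does not fix the fusion operator:

```python
def cascade_upsample(features, upsample, overlay):
    """features: per-scale feature maps ordered coarsest (highest-order
    scale) to finest. The first up-sampling module takes the coarsest map;
    each later module takes the previous module's output overlaid with the
    feature map of the matching scale. Returns every module's output, which
    then feeds keypoint detection and size regression at each scale."""
    outs = []
    x = upsample(features[0])            # first module: highest-order scale
    outs.append(x)
    for f in features[1:]:
        x = upsample(overlay(x, f))      # overlay, then next up-sampling module
        outs.append(x)
    return outs
```

With doubling resolutions `[8, 16, 32]`, each overlay pairs maps of equal size, mirroring the superposition step in the text.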
According to the cell segmentation method based on key point and size regression provided by the invention, the cell detection frame is generated based on each piece of central point information and the corner point information corresponding to each piece of central point information, and the method specifically comprises the following steps:
generating candidate detection frames under each scale based on each piece of central point information under each scale and the corner point information corresponding to each piece of central point information;
fusing the candidate detection frames at each scale by weighted non-maximum suppression to obtain the cell detection frames at each scale; wherein candidate detection frames at higher-order scales and candidate detection frames with higher confidence are given larger weights.
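One plausible reading of this weighted fusion — hedged, since the patent does not spell out the formula — is a weighted average over a cluster of overlapping candidates, each contributing in proportion to its confidence times a per-scale weight (grouping candidates by overlap is omitted here):

```python
def weighted_fuse(boxes):
    """boxes: overlapping candidates as (x1, y1, x2, y2, confidence,
    scale_weight) tuples; scale_weight is a hypothetical per-scale factor
    that is larger for higher-order scales. Returns one fused box whose
    coordinates are the weight-averaged candidate coordinates."""
    total = sum(c * s for (_, _, _, _, c, s) in boxes)
    fused = [sum(b[i] * b[4] * b[5] for b in boxes) / total for i in range(4)]
    return tuple(fused)
```

Unlike plain non-maximum suppression, which discards all but the top-scoring box, this keeps information from every scale's candidate.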
According to the cell segmentation method based on the key point and size regression provided by the invention, the key point detection is carried out on the up-sampling feature map output by the up-sampling module based on each scale, and the method specifically comprises the following steps:
using the key point heat map acquisition branch of the key point prediction module to perform key point detection on the up-sampled feature map output by the up-sampling module of each scale, obtaining a key point heat map at each scale;
using the offset prediction branch of the key point prediction module to determine the key point offsets at each scale from the up-sampled feature map output by the up-sampling module of that scale; wherein the key point offset at any scale represents the coordinate offset of each key point at that scale when mapped from the key point heat map back to the cell image to be segmented;
and determining the key point information under each scale based on the key point heat map and the key point offset under each scale.
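Combining the heat map with the offsets can be sketched as below — a minimal decode under assumptions: the heat map is a 2-D grid of scores, offsets are keyed by hypothetical `(row, col)` grid cells, and a simple threshold stands in for whatever peak selection the module actually uses:

```python
def decode_keypoints(heatmap, offsets, stride, thresh):
    """heatmap : 2-D list of activation scores on the down-sampled grid
    offsets   : dict (row, col) -> (dx, dy) sub-pixel offsets from the
                offset prediction branch
    stride    : down-sampling factor between heat map and input image
    Returns image-space keypoints (x, y, score) for scores >= thresh."""
    points = []
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score >= thresh:
                dx, dy = offsets.get((r, c), (0.0, 0.0))
                # Map grid cell back to the cell image, corrected by the
                # predicted coordinate offset.
                points.append(((c + dx) * stride, (r + dy) * stride, score))
    return points
```

The offset term recovers the sub-pixel position lost when the image was down-sampled to heat-map resolution.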
According to the cell segmentation method based on the key point and size regression, provided by the invention, the loss function of the key point prediction module during training comprises key point heat map loss and offset loss;
wherein the keypoint heat map loss characterizes the difference between the activation result of the sample keypoint heat map and the sample keypoints determined from the annotation of the sample cell image; the sample keypoint heat map comprises the keypoints of the sample cell image predicted by the keypoint heat map acquisition branch;
the offset loss characterizes an error of a sample keypoint offset predicted by the offset prediction branch.
According to the cell segmentation method based on the key point and size regression provided by the invention, the cell segmentation is performed on the feature map based on the cell detection frame to obtain the cell segmentation result of the cell image to be segmented, and the method specifically comprises the following steps:
cropping the feature maps at each scale based on any cell detection frame to obtain cropped features at each scale;
up-sampling the current cropped-and-fused feature and fusing it with the cropped feature at the corresponding scale to obtain the next cropped-and-fused feature; wherein the first cropped-and-fused feature is the cropped feature at the highest-order scale;
performing cell mask prediction on the last cropped-and-fused feature to obtain the cell mask prediction result corresponding to the cell detection frame;
and determining the cell segmentation result based on the cell mask prediction result corresponding to each cell detection frame.
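The per-frame segmentation loop above can be sketched as follows; `crop`, `upsample`, `fuse`, and `predict_mask` are caller-supplied stand-ins for the real operations (the patent does not name concrete operators), so this shows only the control flow:

```python
def segment_cells(boxes, feature_maps, crop, upsample, fuse, predict_mask):
    """For each cell detection box, crop every per-scale feature map
    (ordered coarsest to finest), then fold the crops together: up-sample
    the running fusion and fuse it with the crop at the next scale. The
    final fused feature feeds the mask head for that box."""
    results = []
    for box in boxes:
        crops = [crop(f, box) for f in feature_maps]   # per-scale crops
        fused = crops[0]                               # highest-order scale
        for c in crops[1:]:
            fused = upsample(fused)                    # match next scale
            fused = fuse(fused, c)
        results.append(predict_mask(fused, box))       # per-box mask result
    return results
```

Cropping first confines each prediction to its own detection frame, which is what prevents neighbouring cells from interfering with one another.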
According to the cell segmentation method based on key point and size regression, provided by the invention, cell segmentation is realized by a cell segmentation module;
wherein the loss function of the cell segmentation module during training comprises mask loss and edge loss; the mask loss represents the difference between the sample cell mask prediction result obtained by the cell segmentation module and the cell mask marking result; the edge loss characterizes the difference between the cell edge prediction result and the cell edge labeling result of the sample; the sample cell edge prediction result is determined based on the sample cell mask prediction result, and the cell edge labeling result is determined based on the cell mask labeling result.
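Since the text says the edge prediction/labeling results are derived from the corresponding masks, the edge loss can be sketched as below. The 4-neighbour edge definition and the L1 comparison are assumptions; the patent fixes neither:

```python
def mask_edges(mask):
    """Binary mask (2-D list of 0/1) -> edge map: a foreground pixel is an
    edge pixel if any 4-neighbour is background or outside the mask."""
    h, w = len(mask), len(mask[0])
    edge = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                nb = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(not (0 <= rr < h and 0 <= cc < w) or not mask[rr][cc]
                       for rr, cc in nb):
                    edge[r][c] = 1
    return edge

def edge_loss(pred_mask, gt_mask):
    """Pixel-wise L1 between the edge maps derived from the predicted and
    the labelled masks, supplementing the usual mask loss."""
    pe, ge = mask_edges(pred_mask), mask_edges(gt_mask)
    return sum(abs(a - b) for pr, gr in zip(pe, ge) for a, b in zip(pr, gr))
```

Penalising edge disagreement explicitly sharpens boundaries, which matters most where adjacent cells touch.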
The invention also provides a cell segmentation device based on key point and size regression, which comprises:
the key point regression unit is used for extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information;
the corner searching unit is used for searching, within the search range of each piece of central point information, for the corner point information corresponding to that central point information, taking the key point information other than the central points as search objects; wherein the search range of any central point information is determined based on the cell size information associated with that central point information;
a detection frame generating unit, configured to generate a cell detection frame based on the center point information and the corner point information corresponding to the center point information;
and the cell segmentation unit is used for performing cell segmentation on the feature map based on the cell detection frames to obtain the cell segmentation result of the cell image to be segmented.
The invention also provides an electronic device, comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the cell segmentation method based on the key point and the size regression.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for cell segmentation based on key point and size regression as described in any of the above.
The present invention also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the method for cell segmentation based on both key point and size regression as described in any of the above.
The invention provides a cell segmentation method and device based on key point and size regression. Key point detection is performed on the feature map to obtain each piece of key point information and its associated cell size information; the key point information and associated cell size information are then jointly used to search for the corner point information of each piece of central point information, so that a cell detection frame is generated from each central point and its corner points. Constraining the size of the cell detection frame with the cell size information, and searching for the key points of one detection frame only within the constrained range, avoids segmentation failures caused by confusing the key points of different cells' detection frames when cells are dense, and thus improves the accuracy of cell-detection-frame prediction. Cell segmentation is then performed per detection frame, improving segmentation performance for irregular cells in dense scenes.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a cell segmentation method based on key point and size regression according to the present invention;
FIG. 2 is a schematic diagram of a corner point search method provided by the present invention;
FIG. 3 is a schematic diagram of a multi-scale feature extraction method provided by the present invention;
FIG. 4 is a schematic structural diagram of a CBAM module provided by the present invention;
FIG. 5 is a schematic diagram of a multi-scale keypoint regression method provided by the present invention;
FIG. 6 is a schematic diagram of a cell segmentation branch provided by the present invention;
FIG. 7 is a flow chart of a segmentation model construction method provided by the present invention;
FIG. 8 is a block diagram of the overall framework of the segmentation model provided by the present invention;
FIG. 9 is a schematic diagram comparing the effects of the models provided by the present invention;
FIG. 10 is a schematic diagram of a cell segmentation apparatus based on key point and size regression according to the present invention;
fig. 11 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a cell segmentation method based on key point and size regression according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step 110, extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information characterizes size information of a cell detection frame corresponding to the associated keypoint information.
Here, an image feature extraction network may be used to perform feature extraction, capturing the semantic information of each cell in the cell image to be segmented and yielding the feature map corresponding to the image. For example, a convolutional neural network may down-sample the cell image to be segmented multiple times, and the highest-level feature map output by the last down-sampling step may be used as the feature map corresponding to the image. In addition, when cell sizes vary widely, multi-scale feature extraction may be adopted to progressively extract feature maps of the image at multiple scales, so that the semantic information of cells of different sizes is accurately captured.
Key point detection is then performed on the feature map to obtain the key point information in the cell image to be segmented and the cell size information associated with each piece of key point information. Any piece of key point information comprises the position information and type of a key point of a cell detection frame; the key points comprise the center points and/or corner points of cell detection frames, and whether a key point is a center point or a corner point can be determined from the type recorded in its key point information.
In addition, if feature maps of the cell image to be segmented are extracted at each scale, key points can be predicted separately from the feature map at each scale to obtain per-scale key point information. Specifically, starting from the feature map at the highest-order scale, the feature map of the next scale is fused in step by step; after each fusion, key point prediction is performed on the fused feature map, yielding key point information from the higher-order scales down to the lower-order scales. When any feature map is fused, the image semantic information of every feature map from the highest-order scale down to the current scale is integrated into the fused feature map at the current scale, and key point prediction is then performed on that fused map to obtain the key point information at the current scale. Because every prediction uses the semantic information of the current and all coarser scales, the scheme adapts to cells of different sizes and to deformed cells, improving the accuracy and completeness of key point prediction in dense-cell scenes.
It should be noted that whether multi-scale feature extraction and multi-scale key point prediction are adopted may be decided according to the actual application scenario, which is not specifically limited in this embodiment. If cell sizes vary little in the actual scenario, the highest-level feature map output by the feature extraction network may serve as the feature map of the cell image to be segmented, and key point prediction is performed on it alone; if cell sizes vary widely, the feature map produced at each down-sampling step of the feature extraction network may be retained, and key point prediction performed separately on the feature map at each scale to obtain per-scale key point information.
When the key point prediction is performed, cell size regression can be performed according to the feature map, and cell size information associated with each key point information is determined, or cell size information is preset according to the average size of cells in the current application scene. Wherein the cell size information reflects the size of the cell detection frame corresponding to the associated key point information.
Step 120, searching, within the search range of each piece of central point information among the key point information, for the corner point information corresponding to that central point information, taking the key point information other than the central points as search objects; the search range of any central point information is determined based on the cell size information associated with that central point information.
Here, according to the number of cells expected in the actual application scenario, several pieces of key point information with high confidence (i.e., a high probability of truly being key points of cell detection frames) may be selected from all the key point information as the basis for subsequently determining cell detection frames. The cell size information obtained in the preceding step is then used to constrain each cell detection frame, identifying the key point information belonging to the same detection frame so that the frame's size matches its associated cell size information; this improves the accuracy of the detection frames obtained in dense-cell scenes. Specifically, central point information is identified from the type of each piece of key point information. The search range of a central point is then determined from the central point information and its associated cell size information, and other key point information belonging to the same cell detection frame as that central point is searched for within the range. Using the associated cell size information to delimit the search range largely ensures that the corner points most likely to share a detection frame with the central point are included, while the corner points of other cells' detection frames are excluded as far as possible.
Then, within the search range of each piece of central point information, the corner point information corresponding to that central point is searched for, taking the key point information other than the central points as search objects. For any central point, key point information of corner types (e.g., upper-left and lower-right key points) is searched for within its range. If more than one corner of the same type is found (for example, two upper-left key points fall inside the search range), the candidates can be further screened against the cell size information associated with the central point to settle on the corner information corresponding to it. The central point information and its corresponding corner information are then combined into a key point set, from which the corresponding cell detection frame is generated. If multi-scale key point prediction is used, the corner search for any central point takes as search objects only the other key point information at the scale of that central point.
In addition, the cell detection frame can also be determined directly according to the central point information in the key point information and the cell size information related to the central point information, so that the acquisition efficiency of the cell detection frame is improved.
And 130, generating a cell detection frame based on each piece of central point information and the corner point information corresponding to each piece of central point information.
Here, a rectangular frame using the corner information of each piece of center point information (for example, the upper left key point information and the lower right key point information) as corners may be generated as the cell detection frame corresponding to each piece of center point information, respectively, based on each piece of center point information and the corner information corresponding to each piece of center point information.
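Forming the detection frame from a central point and its found corners can be sketched as below; per the preceding passage, a missing corner can fall back to the position implied directly by the central point and the regressed cell size (the function name and tuple layout are illustrative assumptions):

```python
def build_box(center, cell_size, tl=None, br=None):
    """Form the cell detection box for one central point. When both corner
    keypoints were found in the search step, the rectangle spans them; a
    missing corner falls back to the position implied by the centre point
    and the regressed cell size."""
    cx, cy = center
    w, h = cell_size
    x1, y1 = tl if tl else (cx - w / 2, cy - h / 2)   # upper-left corner
    x2, y2 = br if br else (cx + w / 2, cy + h / 2)   # lower-right corner
    return (x1, y1, x2, y2)
```

The fallback path is also what makes the faster center-plus-size variant mentioned earlier possible: with no corners supplied, the box is derived from the regression alone.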
Step 140, performing cell segmentation on the feature map based on the cell detection frames to obtain the cell segmentation result of the cell image to be segmented.
Here, in order to avoid mutual interference between cells, the feature map of the cell image to be segmented may be segmented within the range of each cell detection frame based on each cell detection frame, to obtain the cell region in each cell detection frame as the cell segmentation result of the cell image to be segmented. In addition, if a multi-scale feature extraction scheme is adopted, a traversing segmentation mode can be adopted when cell segmentation is carried out, and the feature maps under all scales are used for carrying out binarization segmentation on potential cell areas under each cell detection frame one by one, so that a cell segmentation result of a cell image to be segmented is obtained.
The method provided by the embodiment of the invention detects key points on the feature map to obtain each piece of key point information and its associated cell size information, and then jointly uses them to search for the corner point information of each piece of central point information, generating a cell detection frame from each central point and its corner points. By constraining the size of the cell detection frame with the cell size information and searching for the key points of one detection frame only within the constrained range, the method avoids segmentation failures caused by confusing the key points of different cells' detection frames when cells are dense, improving the accuracy of cell-detection-frame prediction; cell segmentation is then performed per detection frame, improving segmentation performance for irregular cells in dense scenes.
Based on the above embodiment, the searching, in the search range of each piece of central point information in the key point information, for the corner point information corresponding to each piece of central point information by using key point information other than the central point as a search object, specifically includes:
determining an initial search box for any piece of central point information based on that central point information and its associated cell size information; the initial search box is a rectangular box centered on the central point, with a size adapted to the associated cell size information;
determining the search range of the central point information by taking each corner point of its initial search box as a search center and a preset threshold as the search radius;
and searching, within the search range of the central point information and according to the type of each piece of key point information, for the corner point information corresponding to that central point information.
Specifically, as shown in fig. 2, for any piece of center point information, an initial search box centered on the center point information (as shown by a dashed box in fig. 2, the center point information is a center point of a dashed box) may be defined based on the center point information and cell size information associated with the center point information, and the size of the initial search box is adapted to the cell size information associated with the center point information. Usually, the corner information corresponding to the center point information should be at or near the corner of the initial search box. Therefore, the search range of the center point information (e.g., the solid line frames at the upper left corner and the lower right corner in fig. 2) can be determined by taking the corner point of the initial search box of the center point information as the search center and taking a preset threshold as the search radius. Here, the size of the search radius may be set according to the cell density, and the search radius may be set to be smaller as the cells are denser, so as to avoid enclosing corner points of cell detection frames of too many other cells in the search range.
And then, searching corner point information corresponding to the central point information based on the types of other key point information in the searching range of the central point information. Specifically, for any search range of the center point information, other key point information may be used as the search objects within the search range, and according to the types of the other key point information, the corner point information which is of the same type as the corner serving as the center of the search range (for example, both are upper left corner points or both are lower right corner points) and is closest to the center of the search range is searched, so that the accuracy of the cell detection frame obtained from the search is improved; in particular, in a dense cell scene, the most appropriate cell detection frame corresponding to each cell may be obtained. As shown in fig. 2, it is assumed that the key point information includes upper left corner points tl1 and tl2 and lower right corner points br1 and br2. Taking the search range at the lower right corner as an example, since the center of this search range is the lower right corner of the initial search box, the lower right corner point located within the search range and closest to its center, namely br1, can be found among the key point information of the same scale.
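The search procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not the patent's implementation; the function name, the keypoint type labels ('tl'/'br'), and the sample coordinates are hypothetical.

```python
import math

def find_corner(center, cell_size, keypoints, corner_type, radius):
    """Find the corner of `corner_type` ('tl' or 'br') nearest to the matching
    corner of the initial search box built around `center`.

    center: (x, y) of a detected center point
    cell_size: (w, h) regressed for that center
    keypoints: list of (x, y, type) tuples, type in {'tl', 'br'}
    radius: preset search radius (set smaller when cells are denser)
    """
    cx, cy = center
    w, h = cell_size
    # Corner of the initial search box that this corner type should sit near.
    anchor = (cx - w / 2, cy - h / 2) if corner_type == 'tl' else (cx + w / 2, cy + h / 2)
    best, best_d = None, radius
    for x, y, t in keypoints:
        if t != corner_type:
            continue  # only corners of the same type as the search center
        d = math.hypot(x - anchor[0], y - anchor[1])
        if d <= best_d:
            best, best_d = (x, y), d
    return best

# A center at (50, 50) with regressed size 20x20: the true bottom-right corner
# (61, 59) lies within radius 5 of the search center (60, 60).
kps = [(39, 41, 'tl'), (61, 59, 'br'), (80, 82, 'br')]
```

Passing a small radius keeps corner points of other cells' detection frames out of the search range, as described above.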
Based on any of the embodiments, the performing feature extraction on the cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and performing key point detection based on the feature map to obtain key point information and cell size information associated with the key point information specifically includes:
performing multi-scale feature extraction on a cell image to be segmented to obtain feature maps under all scales;
after the up-sampled feature map output by the up-sampling module of the previous scale is superimposed with the feature map of the same scale, the result is input to the up-sampling module of the current scale to obtain the up-sampled feature map output by the up-sampling module of the current scale; wherein the input of the first up-sampling module is the feature map at the highest-order scale;
and performing key point detection and cell size regression based on the up-sampling characteristic diagram output by the up-sampling module of each scale to obtain key point information and cell size information related to the key point information under each scale.
In particular, a multi-scale feature extraction network can be constructed to realize multi-scale feature extraction for the cell image to be segmented. The multi-scale feature extraction network can be realized by various backbone networks, such as a basic encoder of the Unet, Resnet, DLAnet, Densenet, Imagenet, Efficientnet, and the like. As shown in fig. 3, the multi-scale feature extraction network gradually obtains feature maps at different scales through a 3-5-level down-sampling process. The feature maps at different scales contain image semantic information at different degrees, the feature map corresponding to a higher scale has richer high-level semantic information, such as category information, and the feature map corresponding to a lower scale has richer low-level semantic information, such as position information.
Therefore, subsequent key point prediction operations and cell segmentation operations can be performed using the feature maps at multiple scales, integrating the information contained in feature maps of different scales so that cells of different sizes can be captured and segmented more accurately. In addition, a CBAM attention mechanism may be fused into the backbone network; for example, a CBAM (Convolutional Block Attention Module) module may be integrated into each Resnet Block to improve the accuracy of feature extraction. The CBAM module integrated in each Resnet Block is shown in fig. 4.
As shown in fig. 5, the feature map (1024 × 64) corresponding to the highest-order scale output by the last layer of the multi-scale feature extraction network is input to the first upsampling module (i.e., the upsampling module of the highest-order scale). The first upsampling module upsamples the feature map and outputs the upsampled feature map, which is then superimposed with the feature map of the same scale (512 × 128), and the superimposed result is used as the input of the upsampling module of the next scale, so as to obtain the upsampled feature map output by the upsampling module of the next lower scale. By analogy, the upsampled feature maps output by the upsampling modules of all scales can be obtained in sequence.
The up-sampling feature map output by the up-sampling module of each scale can be input to the key point prediction module, and the plurality of regression heads are utilized to perform key point detection and cell size regression of the cell detection frame, so that key point information and cell size information associated with the key point information under each scale are obtained.
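The cascade of upsampling modules can be illustrated by propagating feature-map shapes through it. This is a hedged sketch assuming each module halves the channel count and doubles the spatial size so that the result matches the encoder map of the same scale; the concrete numbers are illustrative values, not prescribed by the patent.

```python
def decoder_shapes(encoder_shapes):
    """Propagate (channels, height, width) through the cascade of upsampling
    modules: each module halves the channels and doubles the spatial size, and
    the result is superimposed with the encoder feature map of the same scale.

    encoder_shapes: list from the highest-order scale downwards,
                    e.g. [(1024, 64, 64), (512, 128, 128), ...]
    """
    out = []
    c, h, w = encoder_shapes[0]          # input of the first upsampling module
    for enc in encoder_shapes[1:]:
        c, h, w = c // 2, h * 2, w * 2   # upsample: half channels, double size
        assert (c, h, w) == enc          # must match the same-scale encoder map
        out.append((c, h, w))            # superimposed result, fed onward
    return out

shapes = decoder_shapes([(1024, 64, 64), (512, 128, 128), (256, 256, 256)])
```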
Here, since the sizes of cells in the cell image are not completely consistent, and since in a dense cell scene especially cells squeeze one another and become deformed, the sizes of the individual cells differ. Considering that cell size information is the key information for determining the cell detection frame, the cell size information associated with each piece of key point information can be determined by performing regression calculation on the key point information at each scale, so that the cell size information more accurately reflects the size of the cell corresponding to each piece of key point information, and the accuracy of the cell detection frame determined from the cell size information is improved. From the key point information and the associated cell size information obtained at each scale, the pieces of key point information at the same scale (including but not limited to a center point, an upper left corner point, a lower right corner point, and the like) can be combined into key point sets through a distance rule determined based on the cell size information, and corresponding cell detection frames are generated from these key point sets.
Based on any one of the embodiments, the generating a cell detection frame based on the information of each central point and the corner information corresponding to the information of each central point specifically includes:
generating candidate detection frames under each scale based on each piece of central point information under each scale and the corner point information corresponding to each piece of central point information;
fusing the candidate detection frames at each scale by a weighted non-maximum suppression method to obtain the cell detection frames at each scale; wherein a candidate detection frame at a higher-order scale has a larger weight, and a candidate detection frame with higher confidence has a larger weight.
Specifically, since the cell detection frames obtained by the search method at different scales may overlap with each other, candidate detection frames corresponding to each piece of center point information at each scale may be generated from each piece of center point information and its paired corner point information, and the candidate detection frames are then screened. Here, a Non-Maximum Suppression (NMS) method may be used to screen the candidate detection frames at each scale. To further improve the accuracy of the selected cell detection frames, a weighted NMS method may be used to fuse and screen the candidate detection frames to obtain more accurate cell detection frames. Specifically, in the non-maximum suppression method, a weight may be set for each candidate detection frame: the greater the weight of a candidate detection frame, the more likely it is to be selected and retained; conversely, the smaller its weight, the more likely it is to be filtered out.
Here, when setting the weight for each candidate detection frame, the weight of a candidate detection frame at a higher-order scale may be set larger, rather than setting uniform weights for the candidate detection frames at all scales. The image semantic information used by a candidate detection frame obtained through the focused search based on key point information and cell size information derived from an up-sampled feature map of a higher-order scale is richer, so the accuracy of that candidate detection frame is likely to be higher, and its weight is correspondingly set larger. In addition, the higher the confidence, the larger the weight of the candidate detection frame. The confidence of a candidate detection frame can be determined based on the energy values of the key point heat map within the candidate detection frame. For example, the cell detection frame can be selected from the candidate detection frames using the following formula:
b_pre = argmax_i (w_i · b_i)

wherein b_i is the heat map mean value of each of the candidate detection frames (across different scales) formed for the same object, w_i is the weight corresponding to each candidate detection frame, and b_pre is the candidate detection frame for which w_i · b_i attains its maximum value, which is taken as the cell detection frame of the object.
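As a minimal sketch of this selection rule (the candidate values below are invented for illustration), picking the candidate with the largest w_i · b_i can be written as:

```python
def select_box(candidates):
    """Pick b_pre = argmax_i (w_i * b_i) among candidate boxes formed for the
    same object, where b_i is the candidate's heat-map mean and w_i its weight
    (higher for higher-order scales and for higher confidence)."""
    return max(candidates, key=lambda c: c[1] * c[2])[0]

cands = [
    ((10, 10, 30, 30), 0.5, 0.80),   # low-order scale:  w_i * b_i = 0.40
    ((11, 11, 31, 31), 1.0, 0.75),   # high-order scale: w_i * b_i = 0.75
    ((12,  9, 29, 32), 0.7, 0.60),   #                   w_i * b_i = 0.42
]
```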
Based on any of the above embodiments, the performing of the key point detection on the upsampling feature map output by the upsampling module based on each scale specifically includes:
using the key point heat map acquisition branch of the key point prediction module to perform key point detection on the up-sampled feature maps output by the up-sampling modules of the respective scales, so as to obtain the key point heat maps at all scales;
utilizing the offset prediction branch of the key point prediction module to determine the key point offset under each scale based on the up-sampling characteristic diagram output by the up-sampling module of each scale; wherein the keypoint offset in any scale represents the coordinate offset of each keypoint in any scale when the keypoint is mapped to the cell image to be segmented from the keypoint heat map in any scale;
and determining the key point information under each scale based on the key point heat map and the key point offset under each scale.
Specifically, the key point detection task may be split into two branches, namely a key point heat map acquisition branch and an offset prediction branch. The key point heat map acquisition branch is used to perform key point detection on the up-sampled feature maps output by the up-sampling modules of the respective scales, obtaining the key point heat maps at each scale. The key points in a key point heat map may be stored in separate channels by key point type; for example, center points, upper left corner points, and lower right corner points may be stored in separate channels, so that the type of each key point can be identified in subsequent operations.
In addition, the offset prediction branch is used for determining the key point offset under each scale based on the up-sampling feature map output by the up-sampling module of each scale; and the key point offset in any scale represents the coordinate offset of each key point in the scale when the key point is mapped to the cell image to be segmented from the key point heat map in the scale. Based on the key point heat map and the key point offset in each scale, the position of the key point in the cell image to be segmented in each scale can be calculated, and the key point information in each scale is obtained.
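The mapping from heat map position plus offset back to image coordinates can be sketched as follows. The stride value and peak coordinates are hypothetical, and the formula assumes the common convention that the image position is (quantized position + offset) × stride.

```python
def decode_keypoint(hm_pos, offset, stride):
    """Map a heat-map peak back to coordinates in the cell image: the
    quantized heat-map position is corrected by the predicted sub-pixel
    offset and scaled by the downsampling stride of that scale."""
    (u, v), (dx, dy) = hm_pos, offset
    return ((u + dx) * stride, (v + dy) * stride)

# Peak at heat-map cell (12, 7) on a stride-4 scale with predicted
# offset (0.25, 0.5).
pt = decode_keypoint((12, 7), (0.25, 0.5), 4)
```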
In any of the above embodiments, the loss function of the keypoint prediction module during training comprises keypoint heat map loss and offset loss;
wherein the keypoint heat map loss characterizes a difference between an activation result of the sample keypoint heat map and a sample keypoint determined based on an annotation result of the sample cell image; the sample keypoint heat map comprises keypoints in the sample cell image obtained by the offset prediction branch prediction;
the offset loss characterizes an error of a sample keypoint offset predicted by the offset prediction branch.
Specifically, to constrain the formation of key points, the key point heat map loss and the offset loss may be employed to constrain the key point heat map acquisition branch and the offset prediction branch, respectively, when training the key point prediction module. The key point heat map loss characterizes the difference between the activation result of the sample key point heat map and the sample key points determined from the annotation result of the sample cell image. For example, the key point heat map may be constrained by a BCE loss between the predicted key point heat map after Sigmoid activation and the ground truth, according to the following formula:

L_heat = -(1/N) * Σ_i [ y_i * log(ŷ_i) + (1 - y_i) * log(1 - ŷ_i) ]

wherein y_i is the Ground Truth (GT) of pixel point i and ŷ_i is the predicted value of pixel point i. Here the GT is generated by first forming a GT Bbox based on the cell labeling, thereby forming the three key point coordinates, then generating a circle of diameter r around each coordinate, and resampling the circles at the different scales as the GT of the key point heat map acquisition branch at each scale.
The offset loss characterizes the error of the sample key point offsets predicted by the offset prediction branch, and can be used to eliminate the error introduced by discretization. The offset loss can be calculated using the following formula:

L_offset = (1/N) * Σ_p | ô_p - (p/s - p̃) |

The offset loss only computes the offset error at the key point positions, not at other positions. N is the number of cells, p is the absolute position of a key point, p̃ = ⌊p/s⌋ is the quantized position of the key point at a scale with downsampling stride s, and ô_p is the predicted offset; the loss is essentially an L1 loss.
For the constraint on the cell size information (the size of a cell being its length and width), an L1-type loss can likewise be used: the length and width of each cell computed from the GT Bbox serve as the actual cell size s_k, thereby constraining the regressed cell size information ŝ_k:

L_size = (1/N) * Σ_k | ŝ_k - s_k |

wherein s_k = (x_max^(k) - x_min^(k), y_max^(k) - y_min^(k)) is the actual cell size of the k-th cell: x_max^(k) and x_min^(k) are the largest and smallest abscissae of the k-th cell, and their difference is the length of the k-th cell; y_max^(k) and y_min^(k) are the largest and smallest ordinates of the k-th cell, and their difference is the width of the k-th cell. L_size is the mean of the differences between the regressed cell size information and the actual cell sizes.
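A plain-Python sketch of the size loss described above, assuming the regressed size is a (width, height) pair and the GT Bbox is (x_min, y_min, x_max, y_max); the sample numbers are invented:

```python
def size_loss(pred_sizes, boxes):
    """L1 size loss: the actual size of the k-th cell is
    (x_max - x_min, y_max - y_min) from its GT Bbox, and the loss is the mean
    absolute difference between regressed and actual sizes."""
    n = len(boxes)
    total = 0.0
    for (pw, ph), (x0, y0, x1, y1) in zip(pred_sizes, boxes):
        total += abs(pw - (x1 - x0)) + abs(ph - (y1 - y0))
    return total / n

loss = size_loss([(20.0, 11.0)], [(10, 10, 30, 20)])  # actual size is (20, 10)
```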
Based on any one of the embodiments, the cell segmentation on the feature map based on the cell detection frame to obtain the cell segmentation result of the cell image to be segmented specifically includes:
respectively intercepting the feature maps under all scales based on any cell detection frame to obtain intercepted features under all scales;
the current intercepted fusion feature is subjected to upsampling and then fused with the intercepted feature under the corresponding scale, so that the next intercepted fusion feature is obtained; wherein, the first interception fusion feature is the interception feature under the highest order scale;
performing cell mask prediction based on the last intercepted fusion feature to obtain a cell mask prediction result corresponding to any cell detection frame;
and determining the cell segmentation result based on the cell mask prediction result corresponding to each cell detection frame.
Specifically, to avoid mutual interference between cells, the feature maps at the various scales may be cropped based on the cell detection frames described above. As shown in fig. 6, feature interception can be carried out on the feature maps (F0-F4) at each scale according to any Cell detection frame (Cell Bbox), so as to obtain the intercepted features at each scale. Starting from the intercepted feature of the feature map (F4) at the highest-order scale, the intercepted feature is upsampled and then fused with the intercepted feature at the corresponding scale to obtain the first intercepted fusion feature. This intercepted fusion feature is then upsampled and fused with the intercepted feature at the next scale to obtain the next intercepted fusion feature. This is repeated until the intercepted feature of the last scale has been fused, yielding the final intercepted fusion feature.
For example, a deconvolution with a 3 × 3 convolution kernel may be applied to the intercepted feature of the feature map (F4) at the highest-order scale, reducing the number of channels from 1024 to 512 while upsampling; the result is then concatenated with the intercepted feature of the feature map (F3) at the next-highest scale and passed through a two-dimensional convolution with a 1 × 1 kernel to obtain a new intercepted fusion feature (512 × X2 × Y2). These operations are carried out in sequence until the intercepted feature of the feature map (F0) at the lowest-order scale has been concatenated, after which the final intercepted fusion feature is obtained through a 3 × 3 convolution.
And performing cell mask prediction based on the final intercepted fusion characteristics to obtain a cell mask prediction result corresponding to the cell detection frame. For example, the final truncated fusion feature may be activated by using a Sigmoid function, and a cell mask prediction result corresponding to the cell detection frame may be obtained.
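Intercepting the same cell detection frame from feature maps of different scales amounts to rescaling the box by each scale's downsampling stride; a minimal sketch (the stride values are assumed for illustration):

```python
def crop_coords(bbox, stride):
    """Map a cell detection frame from image coordinates onto a feature map
    with the given downsampling stride, so the same cell can be intercepted
    from the feature maps at every scale (F0..F4)."""
    x0, y0, x1, y1 = bbox
    return (x0 // stride, y0 // stride, x1 // stride, y1 // stride)

# One cell box intercepted from the stride-1 (F0) and stride-16 (F4) maps.
regions = [crop_coords((32, 64, 96, 128), s) for s in (1, 16)]
```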
According to any of the above embodiments, the cell segmentation is performed by a cell segmentation module;
wherein the loss function of the cell segmentation module during training comprises mask loss and edge loss; the mask loss represents the difference between the sample cell mask prediction result obtained by the cell segmentation module and the cell mask marking result; the edge loss characterizes the difference between the cell edge prediction result and the cell edge labeling result of the sample; the sample cell edge prediction result is determined based on the sample cell mask prediction result, and the cell edge labeling result is determined based on the cell mask labeling result.
Specifically, the cell segmentation operation described above may be realized by a cell segmentation module. The loss function of the cell segmentation module during training includes a mask loss and an edge loss. The mask loss characterizes the difference between the sample cell mask prediction result obtained by the cell segmentation module and the cell mask labeling result; this loss constrains the cell area via the sample cell mask prediction result predicted by the cell segmentation module and the ground truth. For example, the cell mask can be constrained by a BCE loss and a Dice loss.
For scenes where edges are more important, edge-aware segmentation can further be adopted: after edge extraction is performed on the sample cell mask prediction result and the cell mask labeling result respectively (for example, through a Sobel operator, a gradient operator, or the like), the edges are further constrained through an edge loss (for example, a Hausdorff loss). The edge loss characterizes the difference between the sample cell edge prediction result and the cell edge labeling result; this difference can be used to optimize the cell edge segmentation effect of the cell segmentation module and improve cell segmentation in edge-sensitive scenes.
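A toy illustration of the edge-aware constraint: extract edge maps from the predicted and labeled masks, then compare them. The 4-neighbour edge detector and mean-absolute-difference loss here are simple stand-ins for the Sobel/gradient extraction and Hausdorff loss named above.

```python
def edges(mask):
    """Crude edge map of a binary mask: a pixel is an edge pixel if any
    4-neighbour differs (a stand-in for Sobel/gradient edge extraction)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and mask[ni][nj] != mask[i][j]:
                    out[i][j] = 1
    return out

def edge_loss(pred_mask, gt_mask):
    """Mean absolute difference between the two edge maps, a simple stand-in
    for an edge-aware term such as a Hausdorff loss."""
    pe, ge = edges(pred_mask), edges(gt_mask)
    h, w = len(pe), len(pe[0])
    return sum(abs(pe[i][j] - ge[i][j]) for i in range(h) for j in range(w)) / (h * w)

m = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # a 3x3 mask with a single foreground pixel
```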
Based on any of the above embodiments, the cell segmentation method based on the keypoint and size regression can be implemented by a segmentation model based on multi-scale keypoint regression, as shown in fig. 7, the construction process of the model includes:
s1, data collection, cleaning and labeling
This includes the collection, cleaning and labeling of data. 200 data samples were collected; each sample contains approximately 100 fields of view, with 15 microscopic images under each field of view. Cleaning mainly controls image quality, ensuring that the images are complete and the definition meets the standard; selection can be performed manually against a specific quality-control standard. Labeling needs to be at the cell pixel level and can be done with the Labelme framework. In addition, the data set can be divided into a training set, a test set and a validation set at a ratio of 8:1:1 by number of images.
S2, data preprocessing and standardization
Data preprocessing mainly comprises image histogram equalization, resampling (downsampling from 2048 to 1024), and normalization. If the field of view is larger, a sliding-window splitting mode is used.
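The sliding-window mode for oversized fields of view can be sketched by computing the window start positions along one axis; the window size and stride below are assumptions for illustration:

```python
def sliding_windows(size, win, stride):
    """Top-left coordinates of sliding windows covering a `size`-pixel axis;
    the last window is shifted back so the whole image is covered (the mode
    used when the field of view is larger than the network input)."""
    starts = list(range(0, max(size - win, 0) + 1, stride))
    if starts[-1] + win < size:
        starts.append(size - win)
    return starts

# A 2048-pixel axis split into 1024-pixel windows with a 512-pixel stride.
xs = sliding_windows(2048, 1024, 512)
```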
S3, model construction
The model is built with Python 3.7 and the PyTorch 1.2 framework; the main software packages include Numpy, Pandas, Skimage, and the like. The hardware environment is a DGX station with four Nvidia GTX 1080 Ti GPUs. The overall model framework is shown in fig. 8, and comprises:
1. the multi-scale feature extraction network is constructed, and feature maps (E1, E2, E3 and E4) under different scales can be obtained through various backbone networks.
2. Constructing a multi-scale key point regression network, combining the same-scale feature maps through a plurality of levels of upsampling convolution modules to obtain upsampling feature maps (D1, D2, D3 and D4) under each scale and predict a plurality of key point heat maps (including but not limited to a central point) under the multi-scale, and simultaneously respectively obtaining the predicted offset of each key point and the size of a corresponding cell through two regression heads.
3. Candidate detection box Bbox generation
A two-stage Bbox generation mechanism is adopted. In the first stage, the key points at each scale are ranked by their predicted values and a certain number are selected (for example, the top 150 key points at KP1, the number being determined by the likely number of cells in the scene). In the second stage, triplets of key points are formed by traversal: a current center point is determined first, then the positions of the upper left and lower right points are estimated from the cell size regressed at that center point, and finally a search range is determined adaptively from the cell size; if an upper left key point and a lower right key point exist within the search range, they are retained and paired with the center point to form a triplet. The final Bbox is generated from the triplet pairing.
4. Bbox selection under multi-scale based on weighted NMS
A weighted NMS approach may be employed here: the higher the scale of the feature map, the higher the weight of a Bbox selected from it; and the higher the confidence of a Bbox, the higher its weight.
5. Cell segmentation Module construction
The cell segmentation module intercepts features from the feature map at each scale according to any cell detection frame obtained in the previous step, yielding the intercepted features (C1, C2, C3 and C4) at each scale. Starting from the intercepted feature at the highest-order scale (C4 and S4), a deconvolution with a 3 × 3 kernel is first applied, reducing the number of channels from 1024 to 512; the result is then concatenated with the intercepted feature (C3) at the next-highest scale and passed through a two-dimensional convolution with a 1 × 1 kernel to obtain a new intercepted fusion feature (S3). These operations are carried out in sequence until the intercepted feature (C1) at the lowest-order scale has been concatenated; a 3 × 3 convolution then produces the final intercepted fusion feature (S1), which is activated by a Sigmoid function to obtain the cell mask prediction result corresponding to the cell detection frame.
6. After the model is built and before large-scale data training, a smaller data set can be used for model pre-training to ensure that the dimensions of all modules of the model match the scene and to set initial values for all hyper-parameters, thereby ensuring that the model can converge.
S4, model tuning training and model selection
After the model is built, each parameter in the model needs to reach an optimal value for the scene through training. During training, each epoch is divided into two steps: in the first step, the key point branches are constrained using the bbox, center point and corresponding key point information extracted from the cell mask information; in the second step, the cell mask information and the extracted bbox are used as the input of the cell segmentation branch to train the segmentation branch.
During training, the hyper-parameters need to be selected and tuned, such as the optimizer, the learning-rate schedule, the image size, and the batch size, to ensure that the model neither overfits nor underfits. To avoid overfitting, dynamic image augmentation of the data set can be performed. To avoid underfitting, a more complex backbone network may be employed. In addition, local computing resources must be considered: whether multi-GPU parallel computing is adopted, whether the original image is too large, whether sliding-window training on image tiles is adopted, and so on.
Evaluation indices for the model are then established; mAP and mIoU can be used as evaluation indices. For example, Boundary IoU can be used as an index for evaluating the fine segmentation of cell edges. By continuously measuring these indices on the test set, the model with the best index performance after a certain number of training iterations can be selected as the final application model. Preferably, to improve the generalization capability of the model, multi-fold cross-validation may be considered: the training portion of the data set can be further divided into several training and tuning sets, and a more robust and generalized model is finally obtained through model ensembling.
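The IoU underlying the mIoU index can be computed as follows for a pair of boxes; a minimal sketch with invented coordinates (Boundary IoU would additionally restrict the computation to a band around the contours):

```python
def iou(a, b):
    """Intersection-over-Union of two boxes (x0, y0, x1, y1); averaging such
    per-cell scores over a test set gives evaluation indices like mIoU."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

score = iou((0, 0, 10, 10), (5, 0, 15, 10))  # overlap 50, union 150
```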
The above model training process can be implemented under the PyTorch framework.
S5, model application and cell segmentation
In application the model differs from training, and the process proceeds in three steps. In the first step, a Bbox list is obtained through the multi-scale key point regression network; in the second step, the merging and selection of the Bboxes is completed through weighted NMS (non-maximum suppression) to form the final cell detection frames; in the third step, the instance segmentation of the cells is completed by the cell segmentation module based on the final cell detection frames, forming the final output.
To ensure the performance of the model in application, after deployment is completed it must be ensured in practice that the network input images undergo the same image preprocessing as used in training, so that the cell segmentation of each image in the actual scene can be completed.
Evaluated on the test data set (174 cases), the model provided by the embodiment of the invention (right side of fig. 9) performs better than the Mask-RCNN instance segmentation network (left side of fig. 9), as shown in fig. 9; the comparison of the main indices of the two is shown in Table 1:
TABLE 1 Comparison of the main indices

                 Mask-RCNN   Ours
  mIoU           0.86389     0.8929
  Boundary_mIoU  0.29082     0.40291
Based on any of the above embodiments, fig. 10 is a schematic structural diagram of a cell segmentation apparatus based on key point and size regression according to an embodiment of the present invention, as shown in fig. 10, the apparatus includes: a keypoint regression unit 1010, a corner search unit 1020, a detection frame generation unit 1030, and a cell segmentation unit 1040.
The key point regression unit is used for extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information;
the corner searching unit is used for searching, within the search range of each piece of central point information, for the corner point information corresponding to that central point information, taking the key point information other than the central point as the search objects; wherein the search range of any central point information is determined based on the cell size information associated with that central point information;
a detection frame generating unit, configured to generate a cell detection frame based on the center point information and the corner point information corresponding to the center point information;
and the cell segmentation unit is used for carrying out cell segmentation on the feature map based on the cell detection frame to obtain a cell segmentation result of the cell image to be segmented.
The device provided by the embodiment of the invention performs key point detection on the feature map to obtain each piece of key point information and the cell size information associated with it, and then uses both jointly to search for the corner point information of each piece of center point information, so that a corresponding cell detection frame is generated from each center point and its corner points. Because the cell size information constrains the size of the cell detection frame, and key points belonging to the same cell detection frame are searched for only within that constrained range, key points of the detection frames of different cells are not confused when cells are dense, which would otherwise cause cell segmentation to fail. This improves the accuracy of cell detection frame prediction; cell segmentation is then performed according to each cell detection frame, which improves the cell segmentation performance for irregular cells in dense scenes.
Based on any of the above embodiments, the searching, within the search range of each piece of center point information, for the corner point information corresponding to that center point information, taking the key point information other than the center points as search objects, specifically includes:
determining an initial search box for any piece of center point information based on that center point information and the cell size information associated with it; the initial search box is a rectangular box centered on that center point and sized according to the associated cell size information;
determining the search range of that center point information by taking each corner of its initial search box as a search center and a preset threshold as the search radius;
and searching, within the search range of that center point information and according to the type of each piece of key point information, for the corner point information corresponding to that center point information.
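As an illustrative, non-limiting sketch of the corner search described above (all names are hypothetical; the key point "type" is modeled as a corner label, and only the nearest same-type key point inside the search circle is kept):

```python
import math

def search_corners(center, size, keypoints, radius):
    """Hypothetical sketch of the corner search step.

    center    : (cx, cy) of one detected center key point
    size      : (w, h) regressed cell size associated with that center
    keypoints : list of (x, y, kind) where kind is e.g. 'top_left' or
                'bottom_right' (the key point "type" in the text)
    radius    : the preset threshold used as the search radius
    Returns the matched corner for each expected corner type, or None.
    """
    cx, cy = center
    w, h = size
    # Initial search box: centered on the center point, sized by the
    # regressed cell size; its corners are the expected corner positions.
    expected = {
        'top_left': (cx - w / 2, cy - h / 2),
        'bottom_right': (cx + w / 2, cy + h / 2),
    }
    matched = {}
    for kind, (ex, ey) in expected.items():
        best, best_d = None, radius
        for (x, y, k) in keypoints:
            if k != kind:          # filter by key point type
                continue
            d = math.hypot(x - ex, y - ey)
            if d <= best_d:        # inside the search circle, nearest wins
                best, best_d = (x, y), d
        matched[kind] = best
    return matched
```

Constraining the search to a circle of the preset radius around each corner of the size-adapted initial box is what prevents the corners of neighbouring cells from being matched when cells are dense.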
Based on any of the above embodiments, the performing feature extraction on the cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and performing key point detection based on the feature map to obtain key point information and cell size information associated with the key point information specifically includes:
performing multi-scale feature extraction on a cell image to be segmented to obtain feature maps under all scales;
superimposing the up-sampling feature map output by the up-sampling module of the previous scale onto the feature map of that scale, and inputting the result to the up-sampling module of the current scale to obtain the up-sampling feature map output by the up-sampling module of the current scale; wherein the input of the first up-sampling module is the feature map at the highest-order scale;
and performing key point detection and cell size regression based on the up-sampling characteristic diagram output by the up-sampling module of each scale to obtain key point information and cell size information related to the key point information under each scale.
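The cascade of up-sampling modules described above can be sketched as follows (a minimal numpy illustration assuming 2x nearest-neighbour up-sampling and element-wise addition for the superposition; the actual modules would be learned convolutional layers):

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def top_down_fuse(feature_maps):
    """Sketch of the up-sampling cascade: feature_maps is ordered from
    the highest-order (coarsest) scale to the finest; the first stage
    takes the coarsest feature map as its input."""
    outputs = [feature_maps[0]]
    for lateral in feature_maps[1:]:
        # Superimpose (element-wise add) the previous stage's up-sampled
        # output onto the feature map at the next scale.
        outputs.append(upsample2x(outputs[-1]) + lateral)
    return outputs
```

Key point detection and cell size regression would then run on each entry of the returned list.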
Based on any one of the embodiments, the generating a cell detection frame based on the information of each central point and the corner information corresponding to the information of each central point specifically includes:
generating candidate detection frames under each scale based on each piece of central point information under each scale and the corner point information corresponding to each piece of central point information;
fusing the candidate detection frames at each scale by a weighted non-maximum suppression (NMS) method to obtain the cell detection frames at each scale; wherein a candidate detection frame at a higher-order scale is given a larger weight, and a candidate detection frame with higher confidence is given a larger weight.
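A minimal sketch of this weighted fusion, assuming overlapping candidates are grouped by an IoU threshold and merged by a confidence-times-scale-weight weighted average of their coordinates (the text does not fix these details, so they are assumptions):

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def weighted_fuse(boxes, scores, scale_weights, iou_thr=0.5):
    """Merge overlapping candidate boxes from all scales into one box
    per group, weighting each candidate by score * scale weight."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    used, fused = set(), []
    for i in order:
        if i in used:
            continue
        group = [j for j in order if j not in used
                 and iou(boxes[i], boxes[j]) >= iou_thr]
        used.update(group)
        wsum = sum(scores[j] * scale_weights[j] for j in group)
        fused.append(tuple(
            sum(scores[j] * scale_weights[j] * boxes[j][k] for j in group) / wsum
            for k in range(4)))
    return fused
```

Unlike plain NMS, which keeps only the highest-scoring box of a group, this averaging lets the higher-order scales and more confident candidates pull the fused box toward their prediction.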
Based on any of the above embodiments, the performing key point detection on the up-sampling feature map output by the up-sampling module of each scale to obtain the key point information at each scale specifically includes:
using the key point heat map acquisition branch of the key point prediction module to perform key point detection on the up-sampling feature maps output by the up-sampling modules of the respective scales, obtaining the key point heat map at each scale;
utilizing the offset prediction branch of the key point prediction module to determine the key point offset under each scale based on the up-sampling characteristic diagram output by the up-sampling module of each scale; wherein the key point offset in any scale represents the coordinate offset of each key point in any scale when the key point is mapped to the cell image to be segmented from the key point heat map in any scale;
and determining the key point information under each scale based on the key point heat map and the key point offset under each scale.
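The determination of key point information from a heat map and the offset branch can be sketched as follows (a hypothetical illustration: the peak criterion, threshold, and data layout are all assumptions not stated in the text):

```python
def decode_keypoints(heatmap, offsets, stride, thresh=0.5):
    """Decode key points of one type at one scale.

    heatmap : H x W nested lists of activations for this key point type
    offsets : H x W nested lists of (dx, dy) from the offset branch
    stride  : down-sampling factor from the input image to this heat map
    A pixel is kept if it exceeds `thresh` and is a maximum of its 3x3
    neighbourhood; its image coordinate is recovered by scaling the heat
    map coordinate by the stride and adding the regressed offset.
    """
    H, W = len(heatmap), len(heatmap[0])
    points = []
    for y in range(H):
        for x in range(W):
            v = heatmap[y][x]
            if v < thresh:
                continue
            neigh = [heatmap[yy][xx]
                     for yy in range(max(0, y - 1), min(H, y + 2))
                     for xx in range(max(0, x - 1), min(W, x + 2))]
            if v < max(neigh):     # keep only 3x3 local maxima
                continue
            dx, dy = offsets[y][x]
            points.append((x * stride + dx, y * stride + dy, v))
    return points
```

The offset term is what compensates for the coordinate quantization introduced when the image is down-sampled to heat map resolution.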
In any of the above embodiments, the loss function of the keypoint prediction module during training comprises keypoint heat map loss and offset loss;
wherein the keypoint heat map loss characterizes a difference between an activation result of the sample keypoint heat map and a sample keypoint determined based on an annotation result of the sample cell image; the sample keypoint heat map comprises keypoints in the sample cell image obtained by the offset prediction branch prediction;
the offset loss characterizes an error of a sample keypoint offset predicted by the offset prediction branch.
Based on any of the above embodiments, the performing, based on the cell detection frame, cell segmentation on the feature map to obtain a cell segmentation result of the cell image to be segmented specifically includes:
respectively intercepting the feature maps under all scales based on any cell detection frame to obtain intercepted features under all scales;
up-sampling the current intercepted fusion feature and fusing it with the intercepted feature at the corresponding scale to obtain the next intercepted fusion feature; wherein the first intercepted fusion feature is the intercepted feature at the highest-order scale;
performing cell mask prediction based on the last intercepted fusion feature to obtain a cell mask prediction result corresponding to any cell detection frame;
and determining the cell segmentation result based on the cell mask prediction result corresponding to each cell detection frame.
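The interception and fusion cascade can be sketched as follows (a numpy illustration with hypothetical names; a real implementation would use RoI-aligned crops and a learned mask head rather than raw addition):

```python
import numpy as np

def crop_roi(fmap, box, stride):
    """Intercept the region of a (C, H, W) feature map covered by a
    detection box given in image coordinates."""
    x1, y1, x2, y2 = [int(round(v / stride)) for v in box]
    return fmap[:, y1:y2, x1:x2]

def fuse_crops(crops):
    """Starting from the crop at the highest-order (coarsest) scale,
    repeatedly up-sample the running fusion 2x and add the crop at the
    next finer scale; the final result feeds the mask prediction."""
    fused = crops[0]
    for nxt in crops[1:]:
        fused = fused.repeat(2, axis=1).repeat(2, axis=2) + nxt
    return fused
```

The per-box crops keep the mask prediction local to one cell detection frame, while the multi-scale fusion preserves both coarse context and fine boundary detail.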
According to any of the above embodiments, the cell segmentation is realized by a cell segmentation module;
wherein the loss function of the cell segmentation module during training comprises a mask loss and an edge loss; the mask loss characterizes the difference between the sample cell mask prediction result obtained by the cell segmentation module and the cell mask labeling result; the edge loss characterizes the difference between the sample cell edge prediction result and the cell edge labeling result; the sample cell edge prediction result is determined based on the sample cell mask prediction result, and the cell edge labeling result is determined based on the cell mask labeling result.
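Since both the edge prediction and the edge label are derived from the corresponding masks, one simple way to derive an edge map is sketched below (an assumption; the text does not specify the boundary operator, so a foreground pixel with at least one background 4-neighbour is treated as an edge):

```python
import numpy as np

def mask_edges(mask):
    """Edge map of a binary mask: foreground pixels whose 4-neighbourhood
    is not entirely foreground."""
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    all_fg_nb = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                 & padded[1:-1, :-2] & padded[1:-1, 2:])
    return m & ~all_fg_nb

def edge_loss(pred_mask, gt_mask):
    """Assumed edge loss: mean absolute difference of the two derived
    edge maps (the patent only says it characterizes their difference)."""
    return np.abs(mask_edges(pred_mask).astype(float)
                  - mask_edges(gt_mask).astype(float)).mean()
```

Supervising edges separately from the full mask penalizes boundary errors that a plain mask loss would average away, which matters for irregular cell outlines.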
Fig. 11 illustrates a physical structure diagram of an electronic device, and as shown in fig. 11, the electronic device may include: a processor (processor)1110, a communication Interface (Communications Interface)1120, a memory (memory)1130, and a communication bus 1140, wherein the processor 1110, the communication Interface 1120, and the memory 1130 communicate with each other via the communication bus 1140. Processor 1110 may invoke logic instructions in memory 1130 to perform a method of cell segmentation based on key point and size regression, the method comprising: extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents the size information of the cell detection frame corresponding to the associated key point information; respectively searching the corner information corresponding to each piece of central point information in the searching range of each piece of central point information by taking the key point information except the central point as a searching object; the search range of any central point information is determined based on the cell size information related to any central point information; generating a cell detection frame based on the central point information and the corner point information corresponding to the central point information; and based on the cell detection frame, carrying out cell segmentation on the characteristic diagram to obtain a cell segmentation result of the cell image to be segmented.
In addition, the logic instructions in the memory 1130 may be implemented in software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program, the computer program being stored on a non-transitory computer readable storage medium, wherein when the computer program is executed by a processor, the computer is capable of executing the cell segmentation method based on the key point and size regression provided by the above methods, the method comprising: extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information; respectively searching the corner information corresponding to each piece of central point information in the searching range of each piece of central point information by taking the key point information except the central point as a searching object; the searching range of any central point information is determined and obtained based on cell size information related to any central point information; generating a cell detection frame based on each piece of central point information and the corner point information corresponding to each piece of central point information; and based on the cell detection frame, carrying out cell segmentation on the characteristic diagram to obtain a cell segmentation result of the cell image to be segmented.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium, on which a computer program is stored, the computer program being implemented by a processor to perform the cell segmentation method based on the key point and size regression provided by the above methods, the method comprising: extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information; respectively searching the corner information corresponding to each piece of central point information in the searching range of each piece of central point information by taking the key point information except the central point as a searching object; the search range of any central point information is determined based on the cell size information related to any central point information; generating a cell detection frame based on the central point information and the corner point information corresponding to the central point information; and based on the cell detection frame, carrying out cell segmentation on the characteristic diagram to obtain a cell segmentation result of the cell image to be segmented.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A cell segmentation method based on key point and size regression is characterized by comprising the following steps:
extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information;
respectively searching the corner information corresponding to each piece of central point information in the searching range of each piece of central point information by taking the key point information except the central point as a searching object; the searching range of any central point information is determined and obtained based on cell size information related to any central point information;
generating a cell detection frame based on the central point information and the corner point information corresponding to the central point information;
and performing cell segmentation on the feature map based on the cell detection frame to obtain a cell segmentation result of the cell image to be segmented.
2. The method according to claim 1, wherein the searching, within the search range of each piece of center point information, for the corner point information corresponding to that center point information, taking the key point information other than the center points as search objects, specifically comprises:
determining an initial search box for any piece of center point information based on that center point information and the cell size information associated with it; the initial search box is a rectangular box centered on that center point and sized according to the associated cell size information;
determining the search range of that center point information by taking each corner of its initial search box as a search center and a preset threshold as the search radius;
and searching, within the search range of that center point information and according to the type of each piece of key point information, for the corner point information corresponding to that center point information.
3. The cell segmentation method based on the regression between the key point and the size according to claim 1, wherein the extracting the feature of the cell image to be segmented to obtain the feature map corresponding to the cell image to be segmented, and detecting the key point based on the feature map to obtain the key point information and the cell size information associated with the key point information specifically comprises:
performing multi-scale feature extraction on a cell image to be segmented to obtain feature maps under all scales;
after an up-sampling feature map output by an up-sampling module of a previous scale is overlapped with a feature map of the previous scale, the feature map is input to an up-sampling module of a current scale to obtain an up-sampling feature map output by the up-sampling module of the current scale; wherein, the input of the first up-sampling module is a characteristic diagram under the highest-order scale;
and performing key point detection and cell size regression based on the up-sampling characteristic diagram output by the up-sampling module of each scale to obtain key point information and cell size information related to the key point information under each scale.
4. The method according to claim 3, wherein the generating a cell detection frame based on the center point information and the corner point information corresponding to the center point information specifically comprises:
generating candidate detection frames under each scale based on each piece of central point information under each scale and the corner point information corresponding to each piece of central point information;
fusing the candidate detection frames at each scale by a weighted non-maximum suppression (NMS) method to obtain the cell detection frames at each scale; wherein a candidate detection frame at a higher-order scale is given a larger weight, and a candidate detection frame with higher confidence is given a larger weight.
5. The cell segmentation method based on key point and size regression according to claim 3, wherein the performing key point detection on the up-sampling feature map output by the up-sampling module of each scale specifically comprises:
using the key point heat map acquisition branch of the key point prediction module to perform key point detection on the up-sampling feature maps output by the up-sampling modules of the respective scales, obtaining the key point heat map at each scale;
utilizing the offset prediction branch of the key point prediction module to determine the key point offset under each scale based on the up-sampling characteristic diagram output by the up-sampling module of each scale; the key point offset under any scale represents the coordinate offset of each key point under any scale when the key point is mapped to the cell image to be segmented from the key point heat map under any scale;
and determining the key point information under each scale based on the key point heat map and the key point offset under each scale.
6. The cell segmentation method based on key point and size regression according to claim 5, wherein the loss function of the key point prediction module during training comprises a key point heat map loss and an offset loss;
wherein the keypoint heat map loss characterizes a difference between an activation result of the sample keypoint heat map and a sample keypoint determined based on an annotation result of the sample cell image; the sample keypoint heat map comprises keypoints in the sample cell image obtained by the offset prediction branch prediction;
the offset loss characterizes an error of a sample keypoint offset predicted by the offset prediction branch.
7. The method of claim 3, wherein the cell segmentation is performed on the feature map based on the cell detection frame to obtain a cell segmentation result of the cell image to be segmented, and specifically comprises:
respectively intercepting the feature maps under all scales based on any cell detection frame to obtain intercepted features under all scales;
the current intercepted fusion feature is subjected to upsampling and then fused with the intercepted feature under the corresponding scale, so that the next intercepted fusion feature is obtained; wherein, the first interception fusion feature is the interception feature under the highest order scale;
performing cell mask prediction based on the last intercepted fusion feature to obtain a cell mask prediction result corresponding to any cell detection frame;
and determining the cell segmentation result based on the cell mask prediction result corresponding to each cell detection frame.
8. The method of claim 7, wherein the cell segmentation is performed by a cell segmentation module;
wherein the loss function of the cell segmentation module during training comprises mask loss and edge loss; the mask loss represents the difference between the sample cell mask prediction result obtained by the cell segmentation module and the cell mask marking result; the edge loss characterizes the difference between the cell edge prediction result and the cell edge labeling result of the sample; the sample cell edge prediction result is determined based on the sample cell mask prediction result, and the cell edge labeling result is determined based on the cell mask labeling result.
9. A cell segmentation device based on key point and size regression is characterized by comprising:
the key point regression unit is used for extracting features of a cell image to be segmented to obtain a feature map corresponding to the cell image to be segmented, and detecting key points based on the feature map to obtain key point information and cell size information related to the key point information; wherein, any key point information comprises the position information and the type of the key point corresponding to the cell detection frame; the cell size information represents size information of a cell detection frame corresponding to the associated key point information;
the corner searching unit is used for searching, within the search range of each piece of center point information, for the corner point information corresponding to that center point information, taking the key point information other than the center points as search objects; the search range of any piece of center point information is determined based on the cell size information associated with that center point information;
a detection frame generating unit, configured to generate a cell detection frame based on the center point information and the corner point information corresponding to the center point information;
and the cell segmentation unit is used for carrying out cell segmentation on the feature map based on the cell detection frame to obtain a cell segmentation result of the cell image to be segmented.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, implements the steps of the method for cell segmentation based on key point and size regression according to any one of claims 1 to 8.
CN202210506262.XA 2022-05-11 2022-05-11 Cell segmentation method and device based on key point and size regression Active CN114639102B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210506262.XA CN114639102B (en) 2022-05-11 2022-05-11 Cell segmentation method and device based on key point and size regression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210506262.XA CN114639102B (en) 2022-05-11 2022-05-11 Cell segmentation method and device based on key point and size regression

Publications (2)

Publication Number Publication Date
CN114639102A true CN114639102A (en) 2022-06-17
CN114639102B CN114639102B (en) 2022-07-22

Family

ID=81953005

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210506262.XA Active CN114639102B (en) 2022-05-11 2022-05-11 Cell segmentation method and device based on key point and size regression

Country Status (1)

Country Link
CN (1) CN114639102B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115063797A (en) * 2022-08-18 2022-09-16 珠海横琴圣澳云智科技有限公司 Fluorescence signal segmentation method and device based on weak supervised learning and watershed processing
CN115330808A (en) * 2022-07-18 2022-11-11 广州医科大学 Segmentation-guided automatic measurement method for key parameters of spine of magnetic resonance image

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180137642A1 (en) * 2016-11-15 2018-05-17 Magic Leap, Inc. Deep learning system for cuboid detection
CN110110799A (en) * 2019-05-13 2019-08-09 广州锟元方青医疗科技有限公司 Cell sorting method, device, computer equipment and storage medium
CN110136065A (en) * 2019-05-15 2019-08-16 林伟阳 A kind of method for cell count based on critical point detection
CN110148126A (en) * 2019-05-21 2019-08-20 闽江学院 Blood leucocyte dividing method based on color component combination and contour fitting
CN112164077A (en) * 2020-09-25 2021-01-01 陕西师范大学 Cell example segmentation method based on bottom-up path enhancement
US20210042929A1 (en) * 2019-01-22 2021-02-11 Institute Of Automation, Chinese Academy Of Sciences Three-dimensional object detection method and system based on weighted channel features of a point cloud
CN113869246A (en) * 2021-09-30 2021-12-31 安徽大学 Wheat stripe rust germ summer spore microscopic image detection method based on improved CenterNet technology
CN113989758A (en) * 2021-10-26 2022-01-28 清华大学苏州汽车研究院(相城) Anchor guide 3D target detection method and device for automatic driving

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
G GAO et al.: "An Automatic Geometric Features Extracting Approach for Facial Expression Recognition Based on Corner Detection", 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)
M HASHEMZADEH et al.: "Combining keypoint-based and segment-based features for counting people in crowded scenes", Information Sciences
XIA Xue et al.: "On-tree apple detection model based on a lightweight anchor-free deep convolutional neural network", Smart Agriculture
WEN Tao et al.: "Pedestrian detection method using low-dimensional features", Computer Engineering and Design
DU Xiaohui et al.: "Epithelial cell detection in leucorrhea microscopic images based on LBP texture features", Chinese Journal of Liquid Crystals and Displays
YANG Yiyao: "Research on cervical cancer cell recognition based on deep learning", China Masters' Theses Full-text Database (Medicine and Health Sciences)
WANG Haojie: "Research and implementation of dynamic real-time face recognition in complex scenes", China Masters' Theses Full-text Database (Information Science and Technology)
SHEN Fengcan et al.: "Survey of scale transformation applications in object detection", Journal of Image and Graphics
DI Xinyao et al.: "Research on segmentation of adherent stripe rust spore images based on optimized Fourier descriptors", Computer Applications and Software
CHEN Shuaiyin: "Keypoint-feature-enhanced instance segmentation", China Masters' Theses Full-text Database (Information Science and Technology)

Also Published As

Publication number Publication date
CN114639102B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN110991311B (en) Target detection method based on dense connection deep network
CN110298321B (en) Road blocking information extraction method based on deep learning image classification
CN114639102B (en) Cell segmentation method and device based on key point and size regression
JP2024509411A (en) Defect detection method, device and system
CN108764247B (en) Dense connection-based deep learning object detection method and device
CN109993040A (en) Text recognition method and device
CN112819821B (en) Cell nucleus image detection method
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN112132827A (en) Pathological image processing method and device, electronic equipment and readable storage medium
CN112381763A (en) Surface defect detection method
CN111798469A (en) Digital image small data set semantic segmentation method based on deep convolutional neural network
CN110930378A (en) Emphysema image processing method and system based on low data demand
CN117156442B (en) Cloud data security protection method and system based on 5G network
CN111177135B (en) Landmark-based data filling method and device
CN114998756A (en) Yolov 5-based remote sensing image detection method and device and storage medium
CN117173568A (en) Target detection model training method and target detection method
CN117058079A (en) Thyroid imaging image automatic diagnosis method based on improved ResNet model
CN111027551A (en) Image processing method, apparatus and medium
CN114463300A (en) Steel surface defect detection method, electronic device, and storage medium
CN111582057A (en) Face verification method based on local receptive field
CN117523205B (en) Segmentation and identification method for few-sample ki67 multi-category cell nuclei
CN114882292B (en) Remote sensing image ocean target identification method based on cross-sample attention mechanism graph neural network
CN113657214B (en) Building damage assessment method based on Mask RCNN
CN115546780B (en) License plate recognition method, model and device
CN117974664B (en) Image recognition-based bowl forging flaw detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant