CN116844161B - Cell detection classification method and system based on grouping prompt learning

Cell detection classification method and system based on grouping prompt learning

Info

Publication number
CN116844161B
Authority
CN
China
Prior art keywords
grouping
feature
cell detection
candidate object
prompt
Prior art date
Legal status
Active
Application number
CN202311126734.XA
Other languages
Chinese (zh)
Other versions
CN116844161A (en
Inventor
李灏峰 (Li Haofeng)
黄俊嘉 (Huang Junjia)
万翔 (Wan Xiang)
李冠彬 (Li Guanbin)
Current Assignee
Shenzhen Research Institute of Big Data SRIBD
Original Assignee
Shenzhen Research Institute of Big Data SRIBD
Priority date
Filing date
Publication date
Application filed by Shenzhen Research Institute of Big Data SRIBD
Priority to CN202311126734.XA
Publication of CN116844161A
Application granted
Publication of CN116844161B
Legal status: Active
Anticipated expiration


Classifications

    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698 Matching; Classification
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06V10/52 Scale-space analysis, e.g. wavelet analysis
    • G06V10/764 Image or video recognition or understanding using classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Image or video recognition or understanding using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of cell detection and classification, and provides a cell detection classification method and system based on grouping prompt learning. The method comprises the following steps: obtaining a pathological image; copying the parameter weights of the feature extractor in an initial model to the cell detection classification model and freezing them; initializing the grouping prompt embedding and inputting it, together with the pathological image, into the feature extractor of the cell detection classification model, and encoding the extracted multi-scale features of the pathological image with a feature encoder; preliminarily predicting the cell positions and categories in the pathological image and calculating candidate object features from the preliminary predictions; obtaining fused candidate object features through a cross-attention calculation, then inputting the fused candidate object features together with the initialized grouping prompt embedding into the cell detection classification model for classification and positioning, and outputting the cell detection classification result. By freezing parameters and sharing weights between the prompt embedding and the grouping embedding of the cell detection classification model, the invention effectively improves cell detection and classification performance.

Description

Cell detection classification method and system based on grouping prompt learning
Technical Field
The invention relates to the technical field of cell detection and classification, and in particular to a cell detection classification method and system based on grouping prompt learning.
Background
Digital pathological image analysis plays a very important role in pathological diagnosis: it not only provides rich information for computer-aided diagnosis but also provides data support for clinical practice and research. Most current cell detection and classification methods select a deep neural network model with a U-shaped structure and perform training and prediction based on a semantic segmentation task. Although such methods can extract rich features from cell images and obtain deep semantic information, they have the following defects: 1. pixel-level labeling is relatively costly for training and prediction, and post-processing that relies on watershed algorithms and the like is often required to distinguish different cell instances, so prediction takes relatively long; 2. cells that form groups or clusters through semantic similarity are usually ignored during semantic segmentation, and the mining of cell grouping information is lacking, so cell detection and classification performance is poor; 3. the architecture of such methods is relatively fixed, and deployment and fine-tuning costs are relatively high when detecting different pathological images.
Therefore, there is a need to develop a cell detection classification method and system based on grouping prompt learning that can fully exploit the connections between cells and mine their inherent grouping information, improving cell detection and classification performance and remaining easy to deploy and fine-tune, without high cost or additional post-processing.
Disclosure of Invention
The invention aims to provide a cell detection classification method and system based on grouping prompt learning, which solve the problems of the existing detection and classification methods described in the background art: long prediction time, relatively high cost, and poor detection and classification performance.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
according to one aspect of the invention, a cell detection classification method based on grouping prompt learning is provided, and the method specifically comprises the following steps:
obtaining a pathological image;
copying the parameter weight of the feature extractor in the initial model to the feature extractor of the cell detection classification model based on grouping prompt learning, and freezing the parameter weight;
initializing grouping prompt embedding, inputting the grouping prompt embedding and the pathological image into a feature extractor of the cell detection classification model, and extracting multi-scale features of the pathological image;
encoding the multi-scale features through a feature encoder to obtain encoded multi-scale features;
preliminary prediction is carried out on the cell position and the category in the pathological image through the encoded multi-scale characteristics, and candidate object characteristics are obtained through calculation according to the preliminary prediction result; performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features;
and inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification, while inputting the fused candidate object features into a position prediction network for positioning, and outputting a cell detection classification result.
According to another aspect of the present invention, there is provided a cell detection classification system based on grouping prompt learning, the system comprising: an image acquisition module, a parameter processing module, a feature extraction module, a feature encoding module, a feature decoding module and a detection classification module. Wherein:
the image acquisition module is used for acquiring pathological images;
the parameter processing module is used for copying the parameter weight of the feature extractor in the initial model to the feature extractor of the cell detection classification model based on grouping prompt learning, and freezing the parameter weight;
the feature extraction module is used for initializing grouping prompt embedding, inputting the grouping prompt embedding and the pathological image into a feature extractor of the cell detection classification model, and extracting multi-scale features of the pathological image;
the feature coding module is used for coding the multi-scale features through a feature coder to obtain coded multi-scale features;
the feature decoding module is used for carrying out preliminary prediction on the cell position and the category in the pathological image through the encoded multi-scale features, and calculating to obtain candidate object features according to a preliminary prediction result; performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features;
the detection classification module is used for inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification, while inputting the fused candidate object features into a position prediction network for positioning, and outputting a cell detection classification result.
Based on the foregoing scheme, inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification specifically includes the following steps:
taking the inner product of the initialized grouping prompt embedding and the fused candidate object features, combined with learnable weights, and obtaining the similarity matrix through a Gumbel-Softmax operation:

$$A = \mathrm{Softmax}\left(\frac{(W_q P)(W_k X)^{\top} + \gamma}{\tau}\right)$$

where $A$ is the similarity matrix obtained by the Gumbel-Softmax operation, $P$ is the initialized grouping prompt embedding, $X$ is the fused candidate object feature, $W_q$ and $W_k$ are learnable weights, $\gamma$ is a set of independent identically distributed random samples drawn from the Gumbel(0,1) distribution, $\tau$ is the Softmax temperature coefficient, and $\top$ is the transposition operation;
performing one-hot coding on the similarity matrix to obtain the encoded similarity matrix; performing a point-multiplication operation on the encoded similarity matrix and the fused candidate object features, and combining the initialized grouping prompt embedding to obtain the primary grouping features:

$$\hat{A} = \mathrm{one\mbox{-}hot}(\arg\max(A)) + A - \mathrm{sg}(A)$$

$$Z_i = P_i + W_o\,\frac{\sum_{j=1}^{N}\hat{A}_{ij}\,W_v X_j}{\sum_{j=1}^{N}\hat{A}_{ij}},\quad i = 1,\dots,G$$

where $\hat{A}$ is the encoded similarity matrix, $A$ is the similarity matrix obtained by the Gumbel-Softmax operation, $\mathrm{sg}$ is the gradient-stopping operation, $Z$ is the primary grouping feature, $P$ is the initialized grouping prompt embedding, $W_o$ and $W_v$ are learnable weights, $X$ is the fused candidate object feature, $G$ is the number of groups, and $N$ is the dimension along which one-hot coding of the similarity matrix is carried out;
repeating the above operations, and performing similarity matrix calculation and matching between the primary grouping features and the learnable grouping category features to obtain final category features; the learnable grouping category features are C D-dimensional learnable vectors, and C is the number of categories in the acquired pathological image dataset;
obtaining the final classification result from the final category features and the fused candidate object features:

$$Y = X\,C^{\top}$$

where $Y$ is the final classification result, $C$ is the final category feature, $X$ is the fused candidate object feature, and $\top$ is the transposition operation.
Based on the foregoing scheme, the initial model is a Transformer-based initial model, including a feature extractor, a feature encoder, a feature decoder, a position prediction network, and a class prediction network.
Based on the foregoing scheme, after a pathological image of the corresponding disease of a patient is acquired through professional equipment or network channels, the pathological image undergoes data processing: the cell nucleus centroid or cell centroid and the category are labeled as training data, the Transformer-based initial model is trained on the training data, and the parameter weights of the feature extractor in the initial model are retained.
Based on the foregoing scheme, the grouping prompt embedding consists of G D-dimensional learnable vectors, and initializing the grouping prompt embedding means initializing the G D-dimensional learnable vectors to all-zero vectors.
Based on the above scheme, the construction process of the cell detection classification model based on grouping prompt learning is as follows: on the Transformer-based initial model, the class prediction network is replaced with a group classifier, and the initialized grouping prompt embedding is input both as a prompt embedding into the feature extractor of the cell detection classification model and as a grouping embedding into the group classifier, so that the prompt embedding and grouping embedding in the cell detection classification model share parameters.
Based on the foregoing scheme, the multi-scale features include the second-stage, third-stage and fourth-stage features, together with a fifth-stage feature obtained by convolutionally downsampling the fourth-stage feature to reduce its resolution.
Based on the foregoing scheme, the cross-attention calculation takes the candidate object features as the query matrix Q and the encoded multi-scale features as the key matrix K and value matrix V; the specific calculation formula is:

$$\mathrm{Attention}(Q,K,V) = \mathrm{Softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $\mathrm{Attention}(Q,K,V)$ is the output of the attention mechanism, $K^{\top}$ is the transpose of the key matrix, and $d_k$ is the dimension of the key matrix.
Based on the foregoing scheme, after the cell detection classification result is output, the method further includes:
for the classification result, performing prompt learning based on the classification task using the labeled cell category data;
for the positioning result, performing prompt learning based on a regression task using the labeled coordinate information of the cell nucleus centroid or cell centroid, training and updating the parameters of the cell detection classification model based on grouping prompt learning, and fixing the parameter weights of the cell detection classification model after training is completed.
Compared with the prior art, the invention has at least the following advantages and positive effects:
(1) The invention trains the cell detection classification model using, as training data, pathological images labeled only with the cell nucleus centroid or cell centroid and the category, so pixel-level labeling is not required and no additional post-processing is needed, saving detection cost and shortening classification time.
(2) On the basis of the Transformer-based initial model, the invention replaces the class prediction network with a group classifier to construct the cell detection classification model based on grouping prompt learning, which fully considers the connections among cells, can mine the inherent grouping information of cells, and is convenient to deploy and fine-tune.
(3) The initialized grouping prompt embedding is input both as a prompt embedding into the feature extractor of the cell detection classification model and as a grouping embedding into the group classifier, so the prompt embedding and grouping embedding in the cell detection classification model share parameters; this reduces the amount of parameter updates and introduces low-level feature information into the group classifier, thereby enhancing the overall perception of pathological images and improving cell detection and classification performance.
(4) After the cell detection classification result is output, prompt learning is performed separately on the classification result and the positioning result, and the parameters of the cell detection classification model are updated or fine-tuned, which facilitates subsequent detection of other pathological images and adapts the method to different disease types.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below only relate to some embodiments of the invention and that other drawings can be obtained from them without inventive effort for a person skilled in the art.
FIG. 1 shows a flow chart of a cell detection classification method based on grouping prompt learning according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for prompt classification based on a group classifier according to an embodiment of the present invention;
FIG. 3 is a structural framework diagram of a cell detection classification model based on grouping hint learning according to an embodiment of the present invention;
FIG. 4 is a schematic diagram showing the structure of a cell detection classification system based on grouping prompt learning according to an embodiment of the present invention;
FIG. 5 is a block diagram of a group classifier according to an embodiment of the present invention.
Wherein reference numerals in fig. 3 are explained as follows:
1-pathology image; 2-outputting a cell detection classification result; 3-fully connected layer and normalization operation; 4-block encoding operation; 5-grouping prompt embedding; 6-feature extraction at each stage; 7-convolution and normalization operations; 8-feature extractor; 9-downsampling and stitching alignment operations; 10-deformable attention layer; 11-feature encoder; 12-feature decoder; 13-fused candidate features; 14-grouping Transformer module; 15-position prediction network; 16-cross-attention operation; 17-freezing parameters; 18-training parameters; 19-initial detection result; 20-candidate features.
Wherein reference numerals in fig. 4 are explained as follows:
300-a cell detection classification system based on packet prompt learning; 301-an image acquisition module, 3011-an image acquisition unit, 3012-an image annotation unit; 302-a parameter processing module, 3021-a parameter retention unit, 3022-a parameter copying unit, 3023-a parameter freezing unit; 303-a feature extraction module; 304-a feature encoding module; a 305-feature decoding module, a 3051-prediction calculation unit, a 3052-cross calculation unit; 306-detection classification module, 3061-detection classification unit, 3062-detection positioning unit.
Wherein reference numerals in fig. 5 are explained as follows:
501-embedding a grouping prompt; 502-fused candidate features; 503-primary packet feature; 504-a learnable packet class feature; 505-final category feature; 506-dot multiplication operation; 507-element-by-element addition operation; 508-a learnable weight; 509-Gumbel-Softmax procedure.
Detailed Description
For a clearer explanation of the objects, technical solutions and advantages of the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments, and that the exemplary embodiments can be implemented in various forms and should not be construed as being limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
The invention will be described in detail with reference to specific examples below:
example 1
As shown in fig. 1, the embodiment of the invention provides a cell detection classification method based on grouping prompt learning, which comprises the following specific steps:
s1: obtaining a pathological image;
preferably, in the present embodiment, the pathology image may be either a histopathology image or a cytopathology image.
Specifically, after a pathology image of the corresponding disease of a patient is acquired through professional equipment or network channels, the pathology image undergoes data processing; if the pathology image is a histopathology image, the centroid and category of each cell nucleus are labeled as training data; if the pathology image is a cytopathology image, the centroid and category of each cell are labeled as training data.
S2: copying the parameter weight of the feature extractor in the initial model to the feature extractor of the cell detection classification model based on grouping prompt learning, and freezing the parameter weight;
Preferably, the initial model is a Transformer-based initial model including a feature extractor, a feature encoder, a feature decoder, a position prediction network, and a class prediction network.
Further, before copying the parameter weights of the feature extractor in the initial model, the method further comprises: training the Transformer-based initial model on the training data obtained in step S1, and retaining the parameter weights of the feature extractor in the initial model.
Preferably, the cell detection classification model based on grouping prompt learning mainly comprises: a feature extractor, a feature encoder, a feature decoder, a position prediction network and a group classifier; its structural framework is shown in fig. 3. Specifically, the construction process of the cell detection classification model based on grouping prompt learning is as follows: on the Transformer-based initial model, the class prediction network is replaced with a group classifier, and the initialized grouping prompt embedding is input both as a prompt embedding into the feature extractor of the cell detection classification model and as a grouping embedding into the group classifier, so that the prompt embedding and grouping embedding in the cell detection classification model share parameters.
In this embodiment, making the prompt embedding and grouping embedding in the cell detection classification model share parameters reduces the amount of parameter updates and supplies low-level feature information to the group classifier through the prompt embedding, allowing the group classifier to achieve better cell detection and classification performance.
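To make the parameter-sharing arrangement concrete, the following is a minimal PyTorch sketch of step S2 plus the shared grouping prompt embedding. All module and attribute names (`initial_model.feature_extractor`, `cell_det_model`, `num_groups`, `dim`) are illustrative assumptions, not the patent's actual implementation:

```python
import torch
import torch.nn as nn

def prepare_model(initial_model: nn.Module, cell_det_model: nn.Module,
                  num_groups: int = 16, dim: int = 256) -> nn.Parameter:
    # Copy the pretrained feature-extractor weights into the new model (S2).
    cell_det_model.feature_extractor.load_state_dict(
        initial_model.feature_extractor.state_dict())
    # Freeze the copied weights so only the remaining parameters are tuned.
    for p in cell_det_model.feature_extractor.parameters():
        p.requires_grad = False
    # G D-dimensional grouping prompt embeddings initialized to all zeros;
    # the same Parameter is later fed both to the feature extractor (as a
    # prompt embedding) and to the group classifier (as a grouping
    # embedding), so the two embeddings share parameters.
    return nn.Parameter(torch.zeros(num_groups, dim))
```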
S3: initializing grouping prompt embedding, inputting the grouping prompt embedding and the pathological image into a feature extractor of the cell detection classification model, and extracting multi-scale features of the pathological image;
Preferably, in the present embodiment, the grouping prompt embedding consists of G D-dimensional learnable vectors, and initializing the grouping prompt embedding means initializing the G D-dimensional learnable vectors to all-zero vectors.
Further, after the parameter weights of the feature extractor in the initial model are copied to the feature extractor of the cell detection classification model based on grouping prompt learning and frozen, the grouping prompt embedding is initialized to all-zero vectors and input, together with the pathological image, into the feature extractor of the cell detection classification model to extract the multi-scale features of the pathological image.
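As a sketch of how the all-zero prompt embeddings might enter the extractor, assuming a ViT/Swin-style tokenized input where prompt tokens are concatenated with the patch tokens (in the manner of visual prompt tuning; the patent does not fix this mechanism):

```python
import torch

def prepend_prompts(patch_tokens: torch.Tensor,
                    group_prompts: torch.Tensor) -> torch.Tensor:
    # patch_tokens: (B, N, D) image tokens; group_prompts: (G, D) learnable prompts
    prompts = group_prompts.unsqueeze(0).expand(patch_tokens.size(0), -1, -1)
    return torch.cat([prompts, patch_tokens], dim=1)  # (B, G+N, D)
```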
Preferably, in the present embodiment, the multi-scale features include the second-stage, third-stage and fourth-stage features, together with a fifth-stage feature obtained by convolutionally downsampling the fourth-stage feature to reduce its resolution. The implementation of the feature extractor is not limited to the Swin Transformer-Base model; it may be any Transformer-based model.
S4: encoding the multi-scale features through a feature encoder to obtain encoded multi-scale features;
further, inputting the multi-scale features extracted in the step S3 into a feature encoder, and encoding the multi-scale features through the feature encoder to obtain encoded multi-scale features.
Preferably, the implementation of the feature encoder is not limited to models based on deformable attention layers; it can be any Transformer- or multi-layer-perceptron-based model.
In this embodiment, the feature encoder uses a model with three deformable attention layers. When self-attention is subsequently calculated, the deformable attention mechanism proposes a set of learnable reference points, and each feature is computed and learned only against the features at those reference points, reducing the amount of computation and improving the network fitting speed. A simplified sketch is given below.
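The following is a simplified, single-scale sketch of a deformable attention layer in the spirit of Deformable DETR: each query attends only to a few sampled points around its reference point instead of the whole feature map. The patent does not specify the layer beyond "deformable attention", so treat this as an illustrative approximation (multi-scale handling and multi-head splitting are omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleDeformableAttention(nn.Module):
    """Single-scale deformable attention: each query attends to a few
    sampled points around its reference point instead of the full map."""

    def __init__(self, dim: int = 256, num_points: int = 4):
        super().__init__()
        self.num_points = num_points
        self.offset_proj = nn.Linear(dim, num_points * 2)  # per-point (x, y) offsets
        self.weight_proj = nn.Linear(dim, num_points)      # per-point attention weights
        self.value_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, query, ref_points, value, spatial_shape):
        # query: (B, Nq, D); ref_points: (B, Nq, 2), (x, y) normalized to [0, 1]
        # value: (B, H*W, D) flattened feature map; spatial_shape: (H, W)
        B, Nq, D = query.shape
        H, W = spatial_shape
        v = self.value_proj(value).transpose(1, 2).reshape(B, D, H, W)
        offsets = self.offset_proj(query).reshape(B, Nq, self.num_points, 2)
        weights = self.weight_proj(query).softmax(dim=-1)   # (B, Nq, K)
        scale = torch.tensor([W, H], dtype=query.dtype, device=query.device)
        locs = ref_points.unsqueeze(2) + offsets / scale    # pixel offsets -> [0, 1]
        sampled = F.grid_sample(v, 2 * locs - 1,            # grid in [-1, 1]
                                align_corners=False)        # (B, D, Nq, K)
        out = (sampled * weights.unsqueeze(1)).sum(dim=-1)  # weighted sum over points
        return self.out_proj(out.transpose(1, 2))           # (B, Nq, D)
```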
S5: preliminary prediction is carried out on the cell position and the category in the pathological image through the encoded multi-scale characteristics, and candidate object characteristics are obtained through calculation according to the preliminary prediction result; performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features;
further, the cell positions and the categories of N cells in the pathological image are preliminarily predicted by using the encoded multi-scale features, and candidate object features are calculated in a learnable neural network layer according to the preliminary prediction results. In this embodiment, the feature decoder also includes three deformable attention layers, and after the candidate object feature is obtained by calculation, the candidate object feature is used as an inquiry matrix Q, the encoded multi-scale feature is used as a key value matrix K and a value matrix V, and cross attention calculation is performed to obtain the fused candidate object feature.
Further, the calculation formula of the cross attention is as follows:

$$\mathrm{Attention}(Q,K,V) = \mathrm{Softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $\mathrm{Attention}(Q,K,V)$ is the output of the attention mechanism, $K^{\top}$ is the transpose of the key matrix, and $d_k$ is the dimension of the key matrix.
S6: and embedding the initialized grouping prompt and inputting the merged candidate object features into a group classifier for classification, and simultaneously inputting the merged candidate object features into a position prediction network for positioning, and outputting a cell detection classification result.
Preferably, in this embodiment, the group classifier is a grouping Transformer module, whose structural framework is shown in fig. 5. The group classifier (grouping Transformer module) is not limited to being deployed at the end of the cell detection classification model; it may be deployed at any position within the model and can be adjusted to the actual situation when detecting different types of pathological images, which makes it convenient to deploy. The position prediction network is a fully connected layer that positions the input features.
Further, the fused candidate object features obtained in step S5 are input into the group classifier (grouping Transformer module) and the position prediction network of the cell detection classification model for classification and positioning, and finally the cell detection classification result is output.
Further, for the classification result, prompt learning is performed based on the classification task using the labeled cell category data; for the positioning result, prompt learning is performed based on a regression task using the labeled coordinate information of the cell nucleus centroid or cell centroid; the parameters of the cell detection classification model based on grouping prompt learning are trained and updated, and the parameter weights of the model are fixed after training is completed.
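A hedged sketch of this training objective: cross-entropy for the classification task and a regression loss on the centroid coordinates. The exact regression loss, any loss weighting, and the one-to-one matching between predictions and labels (e.g. Hungarian matching in DETR-style detectors) are not stated in the patent; L1 with unit weights and pre-matched pairs are assumptions here:

```python
import torch
import torch.nn.functional as F

def prompt_tuning_loss(class_logits: torch.Tensor, pred_centroids: torch.Tensor,
                       gt_labels: torch.Tensor, gt_centroids: torch.Tensor) -> torch.Tensor:
    cls_loss = F.cross_entropy(class_logits, gt_labels)  # labeled cell categories
    reg_loss = F.l1_loss(pred_centroids, gt_centroids)   # labeled centroid coordinates
    return cls_loss + reg_loss  # only unfrozen parameters receive these gradients
```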
With the cell detection classification method based on grouping prompt learning provided by the embodiment of the invention, detection for different diseases or diseased sites can be performed after collecting only a small dataset for fine-tuning; meanwhile, the cell detection classification model provided by the embodiment can adapt to image detection for different disease types by fine-tuning only a small number of parameters, making it convenient to deploy and highly practical.
Example 2
As shown in fig. 2, an embodiment of the present invention provides a method for performing prompt classification based on a group classifier, where the method includes the following steps:
s201: embedding the initialized grouping prompts into the fused candidate object features for inner product by combining the learnable weights, and obtaining a similarity matrix through Gumbel-Softmax operation;
Preferably, in the present embodiment, the grouping prompts are embedded as G D-dimensional learnable vectors, and in the subsequent similarity matrix calculation and matching the candidates are correspondingly divided into G groups; the calculation formula of the similarity matrix is:

$$A = \mathrm{Softmax}\left(\frac{(W_q P)(W_k X)^{\top} + \gamma}{\tau}\right)$$

where $A$ is the similarity matrix obtained by the Gumbel-Softmax operation, $P$ is the initialized grouping prompt embedding, $X$ is the fused candidate object feature, $W_q$ and $W_k$ are learnable weights, $\gamma$ is a set of independent identically distributed random samples drawn from the Gumbel(0,1) distribution, $\tau$ is the Softmax temperature coefficient, and $\top$ is the transposition operation.
S202: performing one-hot coding on the similarity matrix to obtain a coded similarity matrix; performing point multiplication operation on the encoded similarity matrix and the fused candidate object characteristics, and combining the initialized grouping prompt for embedding to obtain primary grouping characteristics;
further, after obtaining a similar matrix through Gumbel-Softmax operation, adopting a hard alignment strategy to directly perform one-hot coding on the similar matrix, wherein the formula is as follows:
wherein,for the encoded similarity matrix, +.>For a similarity matrix obtained by gummel-Softmax operation,the gradient stopping operation is used for correctly returning the gradient.
Further, a point-multiplication operation is performed on the encoded similarity matrix and the fused candidate object features, combined with the initialized grouping prompt embedding, to obtain the primary grouping features; the calculation formula is:

$$Z_i = P_i + W_o\,\frac{\sum_{j=1}^{N}\hat{A}_{ij}\,W_v X_j}{\sum_{j=1}^{N}\hat{A}_{ij}},\quad i = 1,\dots,G$$

where $Z$ is the primary grouping feature, $P$ is the initialized grouping prompt embedding, $W_o$ and $W_v$ are learnable weights, $\hat{A}$ is the encoded similarity matrix, $X$ is the fused candidate object feature, $G$ is the number of groups, and $N$ is the dimension along which one-hot coding of the similarity matrix is carried out.
S203: repeating the above operations, and performing similarity matrix calculation and matching between the primary grouping features and the learnable grouping category features to obtain the final category features;
preferably, in this embodiment, the learnable grouping category features are C D-dimensional learnable vectors, where C is the number of categories in the acquired pathological image dataset.
Further, the obtained primary grouping features and the learnable grouping category features undergo similarity matrix calculation and matching; the calculation process is the same as in steps S201 and S202, yielding the final category features.
S204: and obtaining a final classification result according to the final classification characteristic and the fused candidate object characteristic.
Further, a point-multiplication operation is performed, and the final classification result is calculated from the obtained final category features and the fused candidate object features; the calculation formula is:

$$Y = X\,C^{\top}$$

where $Y$ is the final classification result, $C$ is the final category feature, $X$ is the fused candidate object feature, and $\top$ is the transposition operation.
Example 3
As shown in fig. 4, an embodiment of the present invention provides a cell detection classification system 300 based on grouping prompt learning, the system comprising: an image acquisition module 301, a parameter processing module 302, a feature extraction module 303, a feature encoding module 304, a feature decoding module 305, and a detection classification module 306. Wherein:
an image acquisition module 301, configured to acquire a pathology image;
the image acquisition module 301 may acquire a pathology image of the corresponding disease of the patient through a professional device or a network channel, and the pathology image may be a histopathology image or a cytopathology image.
The image acquisition module 301 includes: an image acquisition unit 3011 and an image labeling unit 3012; wherein:
the image acquisition unit 3011 is configured to: obtaining a pathological image;
the image labeling unit 3012 is configured to: label the acquired pathology image and send the labeled pathology image to the parameter processing module 302; if the pathology image is a histopathology image, the centroid and category of each cell nucleus are labeled; if the pathology image is a cytopathology image, the centroid and category of each cell are labeled.
A parameter processing module 302, configured to copy the parameter weights of the feature extractors in the initial model to the feature extractors of the cell detection classification model based on the packet hint learning, and freeze the parameter weights;
Preferably, the initial model is a Transformer-based initial model including a feature extractor, a feature encoder, a feature decoder, a position prediction network, and a class prediction network.
The cell detection classification model based on grouping prompt learning is constructed by replacing the class prediction network with a group classifier (grouping Transformer module) on the Transformer-based initial model.
The parameter processing module 302 includes: a parameter retaining unit 3021, a parameter copying unit 3022, and a parameter freezing unit 3023; wherein:
the above-described parameter retention unit 3021 is configured to: train the Transformer-based initial model on the labeled pathological images sent by the image labeling unit 3012, and retain the parameter weights of the feature extractor in the initial model;
the above-described parameter copy unit 3022 is configured to: copying the parameter weights of the feature extractor in the initial model retained in the parameter retaining unit 3021 to the feature extractor of the cell detection classification model based on the grouping hint learning;
the above-described parameter freezing unit 3023 is configured to: freezing the parameter weight of the feature extractor in the cell detection classification model.
The feature extraction module 303 is configured to initialize packet prompt embedding, input the packet prompt embedding and the pathological image into a feature extractor of the cell detection classification model, and extract multi-scale features of the pathological image;
in the present embodiment, the feature extractor in the feature extraction module 303 extracts multi-scale features of the pathological image, where the multi-scale features include a second stage feature, a third stage feature, and a fourth stage feature, and a fifth stage feature obtained by convolutionally sampling the fourth stage feature and reducing the resolution. The implementation of the feature extractor is not limited to the SwinTransformer-Base model, but may be implemented as any transducer-based model.
The feature encoding module 304 is configured to encode the multi-scale feature by using a feature encoder to obtain an encoded multi-scale feature;
further, after the multi-scale features of the pathology image are extracted, the multi-scale features are encoded by a feature encoder in the feature encoding module 304. In this embodiment, the feature encoder uses a model with three deformable attention layers, and when self-attention is calculated later, the deformable attention mechanism will propose some learnable recommended points, and each feature only performs calculation and learning with the position feature of the recommended point, so as to reduce the calculation amount and improve the network fitting speed.
The feature decoding module 305 is configured to perform preliminary prediction on the cell position and the category in the pathological image according to the encoded multi-scale feature, and calculate a candidate object feature according to a preliminary prediction result; performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features;
preferably, in this embodiment, the feature decoding module 305 has a feature decoder embedded therein, and the feature decoder also includes three deformable attention layers, and the specific decoding process is: after the feature encoding module 304 encodes the multi-scale features, preliminary prediction is performed on the cell positions and the categories of the N cells in the input pathological image through the encoded multi-scale features, and candidate object features are calculated in a learnable neural network layer according to the preliminary prediction result; and taking the candidate object features as an inquiry matrix, taking the encoded multi-scale features as a key value matrix and a value matrix to perform cross attention calculation, and finally obtaining the fused candidate object features.
The feature decoding module 305 includes: a prediction calculation unit 3051 and a cross calculation unit 3052; wherein:
the prediction calculation unit 3051 is configured to: performing preliminary prediction on the cell position and the category in the pathological image through the encoded multi-scale features, and calculating according to a preliminary prediction result to obtain candidate object features;
the cross calculation unit 3052 is configured to: and performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features.
The detection classification module 306 is configured to embed the initialized grouping prompt and input the grouping prompt and the fused candidate object feature together into a group classifier to classify, and input the fused candidate object feature into a position prediction network to position and output a cell detection classification result.
The detection classification module 306 includes: a detection classification unit 3061 and a detection positioning unit 3062; wherein:
the detection classification unit 3061 is configured to: detecting and classifying the input fused candidate object features through a group classifier, and outputting a classification result;
the detection positioning unit 3062 is configured to: and detecting and positioning the input fused candidate object features through a position prediction network, and outputting a positioning result.
And finally, integrating the classification result and the positioning result, and outputting a final cell detection classification result.
In this embodiment, the initialized grouping prompt embedding is input as a prompt embedding into the feature extractor of the feature extraction module 303 and as a grouping embedding into the group classifier of the detection classification module 306, so that the prompt embedding and grouping embedding in the cell detection classification model share parameters, reducing the amount of parameter updates and improving cell detection and classification performance.
Example 4
As shown in the table below, the embodiment of the invention provides results of evaluating the detection and classification performance of the cell detection classification model.
The detection and classification performance of the cell detection classification model was compared experimentally with and without sharing the weights of the prompt embedding and the grouping embedding. The experiment was carried out on the CoNSeP dataset, and the F1 score was adopted to evaluate the detection and classification results of the model. In the table, three classification F1 scores give the model's results on the three cell image categories (inflammatory, epithelial, stromal) of the CoNSeP dataset, a fourth score gives the average classification F1 over the three categories, and a detection F1 score gives the model's cell detection result. [Table of F1 results]
It can be seen from the table that after the parameter weights of the prompt embedding and the grouping embedding are shared, the classification accuracy of the model on the three cell image categories improves by about 8%, 2.6% and 5.9% respectively, and the detection performance of the model on cells improves by about 3.5%, which means that the method disclosed in the embodiment of the invention can effectively improve the cell detection and classification performance of the cell detection classification model.
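For reference, a hedged sketch of how a centroid-based detection F1 of this kind is typically computed: predictions are greedily matched to ground-truth centroids within a pixel radius. The matching rule and radius are not stated in the patent; a 12-pixel threshold is a common convention for CoNSeP-style evaluation and is assumed here:

```python
import numpy as np

def detection_f1(pred: np.ndarray, gt: np.ndarray, radius: float = 12.0) -> float:
    # pred: (Np, 2) predicted centroids; gt: (Ng, 2) ground-truth centroids
    matched, tp = set(), 0
    for p in pred:
        if len(gt) == 0:
            break
        d = np.linalg.norm(gt - p, axis=1)
        j = int(d.argmin())
        if d[j] <= radius and j not in matched:  # one-to-one greedy matching
            matched.add(j)
            tp += 1
    precision = tp / max(len(pred), 1)
    recall = tp / max(len(gt), 1)
    return 2 * precision * recall / max(precision + recall, 1e-9)
```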
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims. It is to be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (7)

1. The cell detection classification method based on grouping prompt learning is characterized by comprising the following steps:
obtaining a pathological image;
copying the parameter weights of the feature extractor in the initial model to the feature extractor of the cell detection classification model based on grouping prompt learning, and freezing the parameter weights; the initial model is a Transformer-based initial model, and before the parameter weights of the feature extractor in the initial model are copied, the method further comprises:
performing data processing on the pathological image, and labeling a cell nucleus centroid or a cell centroid and a class as training data;
training the Transformer-based initial model on the training data, and retaining the parameter weights of the feature extractor of the initial model;
initializing a grouping prompt embedding, inputting the grouping prompt embedding and the pathological image into a feature extractor of the cell detection classification model, and extracting multi-scale features of the pathological image; the grouping prompt embedding consists of G D-dimensional learnable vectors, G being the number of groups, and initializing the grouping prompt embedding means initializing the G D-dimensional learnable vectors to all-zero vectors;
encoding the multi-scale features through a feature encoder to obtain encoded multi-scale features;
preliminary prediction is carried out on the cell position and the category in the pathological image through the encoded multi-scale characteristics, and candidate object characteristics are obtained through calculation according to the preliminary prediction result; performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features;
inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification, while inputting the fused candidate object features into a position prediction network for positioning, and outputting a cell detection classification result;
the step of inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification specifically comprises the following steps:
embedding the initialized grouping prompts into the fused candidate object features for inner product by combining the learnable weights, and obtaining a similarity matrix through Gumbel-Softmax operation;
performing one-hot coding on the similarity matrix to obtain a coded similarity matrix; performing point multiplication operation on the encoded similarity matrix and the fused candidate object characteristics, and combining the initialized grouping prompt for embedding to obtain primary grouping characteristics;
repeating the above operations, and performing similarity matrix calculation and matching between the primary grouping features and the learnable grouping category features to obtain final category features; the learnable grouping category features are C D-dimensional learnable vectors, and C is the number of categories in the acquired pathological image dataset;
and obtaining a final classification result according to the final classification characteristic and the fused candidate object characteristic.
2. The cell detection classification method according to claim 1, wherein the similarity matrix is obtained by the Gumbel-Softmax operation with the calculation formula:

$$A = \mathrm{Softmax}\left(\frac{(W_q P)(W_k X)^{\top} + \gamma}{\tau}\right)$$

where $A$ is the similarity matrix obtained by the Gumbel-Softmax operation, $P$ is the initialized grouping prompt embedding, $X$ is the fused candidate object feature, $W_q$ and $W_k$ are learnable weights, $\gamma$ is a set of independent identically distributed random samples drawn from the Gumbel(0,1) distribution, $\tau$ is the Softmax temperature coefficient, and $\top$ is the transposition operation.
3. The cell detection classification method according to claim 1, wherein the calculation formula of the primary grouping features is:

$$\hat{A} = \mathrm{one\mbox{-}hot}(\arg\max(A)) + A - \mathrm{sg}(A)$$

$$Z_i = P_i + W_o\,\frac{\sum_{j=1}^{N}\hat{A}_{ij}\,W_v X_j}{\sum_{j=1}^{N}\hat{A}_{ij}},\quad i = 1,\dots,G$$

where $\hat{A}$ is the encoded similarity matrix, $A$ is the similarity matrix obtained by the Gumbel-Softmax operation, $\mathrm{sg}$ is the gradient-stopping operation, $Z$ is the primary grouping feature, $P$ is the initialized grouping prompt embedding, $W_o$ and $W_v$ are learnable weights, $X$ is the fused candidate object feature, $G$ is the number of groups, and $N$ is the dimension along which one-hot coding of the similarity matrix is carried out.
4. The cell detection classification method according to claim 1, wherein the final classification result is:

$$Y = X\,C^{\top}$$

where $Y$ is the final classification result, $C$ is the final category feature, $X$ is the fused candidate object feature, and $\top$ is the transposition operation.
5. The method of claim 1, wherein the multi-scale features include the second-, third- and fourth-stage features and a fifth-stage feature obtained by convolutionally downsampling the fourth-stage feature to reduce its resolution.
6. The method according to claim 1, wherein outputting the result of cell detection classification further comprises:
for the classification result, performing prompt learning based on the classification task using the labeled cell category data;
for the positioning result, performing prompt learning based on a regression task using the labeled coordinate information of the cell nucleus centroid or cell centroid, training and updating the parameters of the cell detection classification model based on grouping prompt learning, and fixing the parameter weights of the cell detection classification model after training is completed.
7. A cell detection classification system based on grouping prompt learning, comprising:
the image acquisition module is used for acquiring pathological images;
the parameter processing module is used for copying the parameter weights of the feature extractor in the initial model to the feature extractor of the cell detection classification model based on grouping prompt learning, and freezing the parameter weights; the initial model is a Transformer-based initial model, and before the parameter weights of the feature extractor in the initial model are copied, the system further performs:
performing data processing on the pathological image, and labeling a cell nucleus centroid or a cell centroid and a class as training data;
training the Transformer-based initial model on the training data, and retaining the parameter weights of the feature extractor of the initial model;
the feature extraction module is used for initializing a grouping prompt embedding, inputting the grouping prompt embedding and the pathological image into a feature extractor of the cell detection classification model, and extracting multi-scale features of the pathological image; the grouping prompt embedding consists of G D-dimensional learnable vectors, G being the number of groups, and initializing the grouping prompt embedding means initializing the G D-dimensional learnable vectors to all-zero vectors;
the feature coding module is used for coding the multi-scale features through a feature coder to obtain coded multi-scale features;
the feature decoding module is used for carrying out preliminary prediction on the cell position and the category in the pathological image through the encoded multi-scale features, and calculating to obtain candidate object features according to a preliminary prediction result; performing cross attention calculation on the candidate object features and the encoded multi-scale features to obtain fused candidate object features;
the detection classification module is used for inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification, while inputting the fused candidate object features into a position prediction network for positioning, and outputting a cell detection classification result;
the step of inputting the initialized grouping prompt embedding together with the fused candidate object features into a group classifier for classification specifically comprises the following steps:
embedding the initialized grouping prompts into the fused candidate object features for inner product by combining the learnable weights, and obtaining a similarity matrix through Gumbel-Softmax operation;
performing one-hot coding on the similarity matrix to obtain a coded similarity matrix; performing point multiplication operation on the encoded similarity matrix and the fused candidate object characteristics, and combining the initialized grouping prompt for embedding to obtain primary grouping characteristics;
repeating the above operations, and performing similarity matrix calculation and matching between the primary grouping features and the learnable grouping category features to obtain final category features; the learnable grouping category features are C D-dimensional learnable vectors, and C is the number of categories in the acquired pathological image dataset;
and obtaining a final classification result according to the final classification characteristic and the fused candidate object characteristic.
CN202311126734.XA 2023-09-04 2023-09-04 Cell detection classification method and system based on grouping prompt learning Active CN116844161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311126734.XA CN116844161B (en) 2023-09-04 2023-09-04 Cell detection classification method and system based on grouping prompt learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311126734.XA CN116844161B (en) 2023-09-04 2023-09-04 Cell detection classification method and system based on grouping prompt learning

Publications (2)

Publication Number Publication Date
CN116844161A CN116844161A (en) 2023-10-03
CN116844161B true CN116844161B (en) 2024-03-05

Family

ID=88162084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311126734.XA Active CN116844161B (en) 2023-09-04 2023-09-04 Cell detection classification method and system based on grouping prompt learning

Country Status (1)

Country Link
CN (1) CN116844161B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022243985A1 (en) * 2021-05-21 2022-11-24 Soul Machines Limited Transfer learning in image recognition systems
CN114090780A (en) * 2022-01-20 2022-02-25 宏龙科技(杭州)有限公司 Prompt learning-based rapid picture classification method
KR102506404B1 (en) * 2022-06-10 2023-03-07 큐에라소프트(주) Decision-making simulation apparatus and method using pre-trained language model
CN115761314A (en) * 2022-11-07 2023-03-07 重庆邮电大学 E-commerce image and text classification method and system based on prompt learning
CN115810191A (en) * 2022-12-29 2023-03-17 河海大学 Pathological cell classification method based on multi-attention fusion and high-precision segmentation network
CN115965602A (en) * 2022-12-29 2023-04-14 河海大学 Abnormal cell detection method based on improved YOLOv7 and Swin-Unet
CN116129224A (en) * 2023-02-13 2023-05-16 马上消费金融股份有限公司 Training method, classifying method and device for detection model and electronic equipment
CN116091774A (en) * 2023-02-16 2023-05-09 之江实验室 Weak supervision semantic segmentation method and device based on prompt learning
CN116259060A (en) * 2023-02-17 2023-06-13 马上消费金融股份有限公司 Training method and device for image classification model
CN116310582A (en) * 2023-03-29 2023-06-23 抖音视界有限公司 Classification model training method, image classification method, device, medium and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Automatic Classification of Cervical Cells Using Deep Learning Method; Suxiang Yu et al.; IEEE; pp. 32559-32568 *

Also Published As

Publication number Publication date
CN116844161A (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN109961051B (en) Pedestrian re-identification method based on clustering and block feature extraction
CN107133569B (en) Monitoring video multi-granularity labeling method based on generalized multi-label learning
WO2022001623A1 (en) Image processing method and apparatus based on artificial intelligence, and device and storage medium
CN111046732B (en) Pedestrian re-recognition method based on multi-granularity semantic analysis and storage medium
CN109743642B (en) Video abstract generation method based on hierarchical recurrent neural network
CN109190472B (en) Pedestrian attribute identification method based on image and attribute combined guidance
Bai et al. NHL pathological image classification based on hierarchical local information and GoogLeNet-based representations
Chen et al. SEMEDA: Enhancing segmentation precision with semantic edge aware loss
CN109446897B (en) Scene recognition method and device based on image context information
CN115512103A (en) Multi-scale fusion remote sensing image semantic segmentation method and system
CN114494973B (en) Training method, system, equipment and storage medium of video semantic segmentation network
Oluwasammi et al. Features to text: a comprehensive survey of deep learning on semantic segmentation and image captioning
CN112765370A (en) Entity alignment method and device of knowledge graph, computer equipment and storage medium
CN115019405A (en) Multi-modal fusion-based tumor classification method and system
CN112651940A (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
WO2023123923A1 (en) Human body weight identification method, human body weight identification device, computer device, and medium
CN114842553A (en) Behavior detection method based on residual shrinkage structure and non-local attention
Zhou et al. Attention transfer network for nature image matting
CN113283282A (en) Weak supervision time sequence action detection method based on time domain semantic features
CN111723852A (en) Robust training method for target detection network
Shang et al. Instance-level context attention network for instance segmentation
Zhang Sports action recognition based on particle swarm optimization neural networks
CN113569814A (en) Unsupervised pedestrian re-identification method based on feature consistency
CN117333908A (en) Cross-modal pedestrian re-recognition method based on attitude feature alignment
CN116844161B (en) Cell detection classification method and system based on grouping prompt learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant