CN116385823B - Semi-supervised segmentation model generation method and system for class semantic consistency representation


Info

Publication number: CN116385823B (granted); earlier published as CN116385823A
Application number: CN202310271384.XA
Authority: CN (China); original language: Chinese (zh)
Prior art keywords: scale, semantic, segmentation, image data, perception
Inventors: 张瑞茂, 朱烨, 杨杰, 万翔
Applicant and current assignee: Shenzhen Research Institute of Big Data (SRIBD)
Legal status: Active


Classifications

    • G06V 10/774 - Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/26 - Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques
    • G06V 10/52 - Scale-space analysis, e.g. wavelet analysis
    • G06V 10/764 - Image or video recognition using classification, e.g. of video objects
    • G06V 10/765 - Classification using rules for classification or partitioning the feature space
    • G06V 10/7715 - Feature extraction, e.g. by transforming the feature space; mappings, e.g. subspace methods
    • G06V 10/82 - Image or video recognition using neural networks
    • G06N 3/08 - Computing arrangements based on biological models; neural networks; learning methods
    • G06T 7/0012 - Image analysis; biomedical image inspection
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a semi-supervised segmentation model generation method, system, computer device and storage medium for class semantic consistency representation. A category semantic guidance segmentation module uses multi-scale category semantic perception queries to guide the unlabeled image data toward a multi-scale semantic guidance segmentation map, which undergoes consistency-constraint learning against the prediction segmentation map finally output by the segmentation network, so that a large amount of unlabeled image data can participate in training the whole segmentation network. The training data contain only a small number of annotated medical images together with a large number of unannotated ones; in practical training, accurate segmentation of each organ in a medical image can therefore be achieved with only a small amount of marked image data.

Description

Semi-supervised segmentation model generation method and system for class semantic consistency representation
Technical Field
The invention relates to the technical field of deep learning, and in particular to a semi-supervised segmentation model generation method, system, computer device and storage medium for class semantic consistency representation.
Background
At present, most deep-learning-based methods still rely on a large amount of finely annotated image data; however, acquiring large quantities of manually annotated data is expensive, and this is especially true for the annotation of medical images.
Some existing studies propose self-training approaches that learn a classifier from a small number of labels and then iteratively generate pseudo-labels on the unlabeled image data. Others propose co-training or consistency-regularization schemes. The former exploits the compatibility and complementarity of multi-view data: each view is assumed to contain enough information to train an optimal learner, so the classifier of each view can provide the most reliable pseudo-labels to the other classifiers, and this mutual collaboration promotes training of the whole model. The latter assumes that adding a perturbation to an unlabeled sample should not change its prediction significantly; enforcing this output consistency can improve the generalization ability of the model. However, these methods rarely mine the rich category semantic information for its effectiveness in semi-supervised learning, which limits the generalization ability of the model when processing more unlabeled image data.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, a system, a computer device and a storage medium for generating a semi-supervised segmentation model with semantic consistency representation, so as to solve at least one of the problems in the prior art.
In a first aspect, a semi-supervised segmentation model generation method for class semantic consistency representation is provided, including: acquiring an image data set to be trained, wherein the image data set to be trained comprises marked image data and unmarked image data, and the number of the marked image data is smaller than that of the unmarked image data;
inputting the marked image data and the unmarked image data into a preset hierarchical segmentation network for segmentation processing so as to generate multi-scale intermediate features and a prediction segmentation map;
inputting the multi-scale intermediate features and the original semantic perception query into a preset category semantic perception segmentation module for attention operation so as to generate the multi-scale semantic perception query;
inputting the multi-scale semantic perception query and the intermediate features of the unmarked image data of each scale into a preset category semantic guidance segmentation module for attention operation so as to generate a multi-scale semantic guidance segmentation map;
And carrying out consistency constraint on the multi-scale semantic guided segmentation map and the prediction segmentation map through a first preset loss function so as to generate a semi-supervised segmentation model.
In one embodiment, the multi-scale semantic aware query is obtained by:
patch token processing is carried out on the marked image data intermediate features and the unmarked image data intermediate features of each scale respectively so as to generate multi-scale patch token feature representation;
performing attention interaction operation on the original semantic perception query and the patch token feature representation of the first scale according to a first multi-head attention module on the first scale so as to generate the first-scale semantic perception query; on the nth scale, according to the nth multi-head attention module, performing attention interaction operation on the semantic perception query generated on the (n-1)th scale and the patch token feature representation on the nth scale respectively to generate the nth-scale semantic perception query, wherein n is greater than or equal to 2.
In an embodiment, the preset category semantic perception segmentation module is further configured to generate multi-scale semantic perception segmentation maps, including a multi-scale semantic perception segmentation map for the unmarked image data and one for the marked image data, and before the consistency constraint is applied to the multi-scale semantic guidance segmentation map and the prediction segmentation map through the first preset loss function, the method includes:
applying a consistency constraint, through a second preset loss function, between the multi-scale semantic perception segmentation map of the marked image data and the multi-scale semantic guidance segmentation map.
In an embodiment, the multi-scale semantic aware segmentation map is obtained by:
on different scales, respectively performing attention operation on the input patch token features to generate a multi-scale category semantic perception feature map;
respectively converting the dimension of the multi-scale category semantic perception feature map into a preset dimension;
and respectively removing the second dimension of the multi-scale category semantic perception feature map converted into the preset dimension, and performing up-sampling processing to generate the multi-scale semantic perception segmentation map.
In an embodiment, after generating the multi-scale semantic aware segmentation map, the method includes:
and supervising the training of the semi-supervised segmentation model through the multi-scale semantic perception segmentation map of the marked image data.
In an embodiment, the multi-scale semantic guided segmentation map is obtained by:
performing attention interaction operations between the multi-scale semantic perception queries and the intermediate features of the unmarked image data at the corresponding scales;
And respectively carrying out convolution processing on the feature images after the attention interaction operation under each scale, and removing the second dimension to generate the multi-scale semantic guidance segmentation image.
In an embodiment, the performing, by a first preset loss function, a consistency constraint on the multi-scale semantic guided segmentation map and the prediction segmentation map includes:
respectively carrying out normalization processing on the multi-scale semantic guidance segmentation map and the prediction segmentation map;
and carrying out consistency constraint on the multiscale semantic guided segmentation map and the prediction segmentation map after normalization processing through the first preset loss function.
In a second aspect, a semi-supervised segmentation model generation system for class semantic consistency representation is provided, comprising: an image data set to be trained acquisition unit, configured to acquire an image data set to be trained, wherein the image data set to be trained comprises marked image data and unmarked image data, and the number of the marked image data is smaller than that of the unmarked image data;
the segmentation processing unit is used for inputting the marked image data and the unmarked image data into a preset hierarchical segmentation network for segmentation processing so as to generate multi-scale intermediate features and a prediction segmentation map;
The multi-scale semantic perception query generation unit is used for inputting the multi-scale intermediate features and the original semantic perception query into a preset category semantic perception segmentation module to perform attention operation so as to generate multi-scale semantic perception query;
the multi-scale semantic guidance segmentation map generation unit is used for inputting the multi-scale semantic perception query and the intermediate features of the unmarked image data of each scale into a preset category semantic guidance segmentation module for attention operation, so as to generate a multi-scale semantic guidance segmentation map;
the semi-supervised segmentation model generation unit is used for carrying out consistency constraint on the multi-scale semantic guidance segmentation map and the prediction segmentation map through a first preset loss function so as to generate a semi-supervised segmentation model.
In a third aspect, a computer device is provided, comprising a memory, a processor and computer readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer readable instructions, implements the steps of the semi-supervised segmentation model generation method for class semantic consistency representation described above.
In a fourth aspect, a readable storage medium is provided, the readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of a semi-supervised segmentation model generation method for class semantic consistency representation as described above.
The above semi-supervised segmentation model generation method, system, computer device and storage medium for class semantic consistency representation operate as follows: an image data set to be trained is acquired, the set including marked image data and unmarked image data, with fewer marked than unmarked samples; the marked and unmarked image data are input into a preset hierarchical segmentation network for segmentation processing to generate multi-scale intermediate features and a prediction segmentation map; the multi-scale intermediate features and the original semantic perception query are input into a preset category semantic perception segmentation module for attention operation to generate multi-scale semantic perception queries; the multi-scale semantic perception queries and the intermediate features of the unmarked images at each scale are input into a preset category semantic guidance segmentation module for attention operation to generate a multi-scale semantic guidance segmentation map; and a consistency constraint is applied to the multi-scale semantic guidance segmentation map and the prediction segmentation map through a first preset loss function to generate the semi-supervised segmentation model. In the present application, the category semantic perception segmentation module controllably supervises the generation of the multi-scale category semantic perception queries during training, while producing multi-scale semantic perception segmentation maps of the marked images for constraining the category semantic guidance segmentation module. The category semantic guidance segmentation module uses the multi-scale category semantic perception queries to guide the unmarked image data toward a multi-scale semantic guidance segmentation map, which undergoes consistency-constraint learning against the prediction segmentation map finally output by the segmentation network, so that a large amount of unmarked image data participates in training the whole network. The training data contain only a small number of annotated medical images and a large number of unannotated ones; in practical training, the need to annotate large amounts of medical image data is therefore reduced, and accurate segmentation of each organ in a medical image can be achieved with only a small amount of marked image data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a network architecture generated by a semi-supervised segmentation model based on class semantic consistency according to an embodiment of the present invention;
FIG. 2 is a flow chart of a semi-supervised segmentation model generation method for class semantic consistency representation in accordance with an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a preset category semantic perception segmentation module according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a preset category semantic guidance segmentation module according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a semi-supervised segmentation model generation system for class semantic consistency representation, according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention.
All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The semi-supervised segmentation model generation method for class semantic consistency representation provided in this embodiment can be applied to the overall network architecture shown in fig. 1. The preset hierarchical segmentation network comprises an encoder and a decoder and consists of a plurality of processing stages, each provided with a decoder block; the figure takes 4 stages as an example, where the first 3 stages successively upsample the image data processed by the encoder, and a prediction result is output through the prediction layer of the last stage.
The preset hierarchical segmentation network may be a Transformer-based network, a CNN-based network, or the like, and the semi-supervised segmentation model may process either two-dimensional or three-dimensional input. It can be understood that any hierarchical segmentation model can use this method to mine category semantic consistency between marked and unmarked image data, thereby exploiting a large amount of unmarked image information, bringing the performance of the segmentation model close to that of a fully supervised one, and achieving accurate segmentation of each organ in a medical image.
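As a concreteness aid only, the following PyTorch sketch shows one way such a four-stage hierarchical encoder-decoder could be organised. The class name HierarchicalSegNet, the plain CNN blocks, the channel widths C/2C/4C/8C and the class count are assumptions of this sketch, not details fixed by the patent, which admits Transformer backbones equally.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def block(cin, cout):
    # one conv-ReLU block; a real backbone would be deeper
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.ReLU(inplace=True))

class HierarchicalSegNet(nn.Module):
    """Hypothetical 4-stage encoder-decoder: returns the first three decoder
    features (the multi-scale intermediates, widths 4C/2C/C) plus the
    prediction segmentation map from the last stage's prediction layer."""

    def __init__(self, in_ch=1, C=64, num_classes=14):  # class count assumed
        super().__init__()
        self.e1, self.e2 = block(in_ch, C), block(C, 2 * C)
        self.e3, self.e4 = block(2 * C, 4 * C), block(4 * C, 8 * C)
        self.d1 = block(8 * C, 4 * C)          # deepest decoder stage
        self.d2 = block(4 * C + 4 * C, 2 * C)  # fuses a skip connection
        self.d3 = block(2 * C + 2 * C, C)
        self.d4 = block(C + C, C)
        self.pred = nn.Conv2d(C, num_classes, 1)  # prediction layer

    def forward(self, x):
        f1 = self.e1(x)
        f2 = self.e2(F.max_pool2d(f1, 2))
        f3 = self.e3(F.max_pool2d(f2, 2))
        f4 = self.e4(F.max_pool2d(f3, 2))
        up = lambda t: F.interpolate(t, scale_factor=2, mode="bilinear",
                                     align_corners=False)
        g1 = self.d1(f4)                              # scale-1 intermediate
        g2 = self.d2(torch.cat([up(g1), f3], dim=1))  # scale-2 intermediate
        g3 = self.d3(torch.cat([up(g2), f2], dim=1))  # scale-3 intermediate
        p = self.pred(self.d4(torch.cat([up(g3), f1], dim=1)))
        return [g1, g2, g3], p  # multi-scale intermediates + prediction map
```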
In one embodiment, as shown in fig. 2, a semi-supervised segmentation model generation method for class semantic consistency representation is provided, comprising the steps of:
in step S110, acquiring an image data set to be trained, where the image data set to be trained includes labeled image data and unlabeled image data, and the number of the labeled image data is smaller than the number of the unlabeled image data;
in the embodiment of the application, the image data set to be trained may comprise a large amount of unmarked image data and a small amount of marked image data; the image data may be medical image data, and the marked image data carries manually annotated labels.
Further, referring to FIG. 1, the marked image data are denoted $\{X_l\}$ with labels $\{Y_l\}$, and the unmarked image data are denoted $\{X_u\}$. Both are preprocessed and input into the preset hierarchical segmentation network, where the subscripts $l$ and $u$ indicate that the data come from the marked and unmarked sets, respectively. The marked and unmarked image data may be unpaired medical images.
The annotated image data and the non-annotated image data may be two-dimensional image data or three-dimensional image data.
In step S120, the noted marked image data and unmarked image data are input into a preset hierarchical segmentation network for segmentation processing, so as to generate a multi-scale intermediate feature and a predictive segmentation map;
in the embodiment of the present application, the preset hierarchical segmentation network may include an encoder and a decoder and is not limited to one type, i.e., it may be a Transformer-based or CNN-based network. The marked and unmarked image data are first preprocessed (e.g., image enhancement) and then input into the encoder, which may comprise several stages; after being processed by the multi-stage encoder, the image data are input into the decoder, which may likewise comprise several stages, say $i$ stages. Each of the first $i-1$ stages contains a decoder block and an upsampling module, so that as the marked and unmarked image data pass through each stage, intermediate features of the marked and of the unmarked image data are generated at every stage, forming the multi-scale intermediate features.
A multi-scale intermediate feature refers to a set of intermediate features at different scales. It can be understood that one scale of intermediate features is generated at each of the first $i-1$ decoder stages.
In the decoder, the upsampling modules in the first three stages progressively recover higher resolution and fuse, via skip connections, the multi-scale feature representations of the corresponding encoder stages. The last stage contains a decoder block and a prediction layer, through which pixel-wise prediction is performed on both marked and unmarked medical images, producing the final prediction segmentation map.
In step S130, inputting the intermediate features of the multiple different scales and the original semantic perception query into a preset category semantic perception segmentation module for performing attention operation to generate a multi-scale semantic perception query;
in the embodiment of the application, the preset category semantic perception segmentation module can be understood as an external attention module. At the same time, in order to mine category semantic consistency among different data, a learnable semantic perception query is introduced, which aims at learning a global category semantic representation. The preset category semantic perception segmentation module lets the learnable semantic perception query perform attention interaction operations with the multi-scale patch-tokenized feature representations.
The original semantic perception query may be a network parameter of the preset segmentation network.
In one embodiment of the application, the multi-scale semantic aware query may be obtained by:
patch token processing is carried out on the marked image data intermediate features and the unmarked image data intermediate features of each scale respectively so as to generate multi-scale patch token feature representation;
performing attention interaction operation on the original semantic perception query and patch token feature representation of the first scale according to a first multi-head attention module on the first scale so as to generate the first-scale semantic perception query;
on the nth scale, according to the nth multi-head attention module, performing attention interaction operation on the semantic perception query generated on the (n-1)th scale and the patch token feature representation on the nth scale respectively to generate the nth-scale semantic perception query, wherein n is greater than or equal to 2.
It can be understood that, at the first scale, the original semantic perception query and the first-scale patch token feature representation undergo attention interaction through the first multi-head attention module to generate the first-scale semantic perception query; at the second scale, the first-scale semantic perception query updated at the first scale and the second-scale patch token feature representation undergo attention interaction to generate the second-scale semantic perception query, which in turn participates, as the semantic perception query of the next scale, in the attention interaction, and so on until the penultimate stage of the decoder.
Here n ≥ 2; preferably n takes the values 2 and 3, and the decoder may comprise 4 stages.
Specifically, referring to FIG. 1, the semantic perception query is denoted $Q \in \mathbb{R}^{Z \times 4C}$, where $Z$ is the number of categories ($Q$ corresponds to the hexagon in FIG. 1). $Q$ and the intermediate features output by the first stage of the decoder are simultaneously input into the external category semantic perception segmentation module, where a multi-head attention mechanism produces $Q'_1 \in \mathbb{R}^{Z \times 4C}$. To continue the attention interaction with the sample features of the next stage, a $1 \times 1$ one-dimensional convolution layer is applied to $Q'_1$ to obtain the next stage's query $Q_2 \in \mathbb{R}^{Z \times 2C}$. To process the multi-scale patch token representations recursively, $Q_\lambda$ and the sample features output by the corresponding decoder stage are input into the external attention module to obtain $Q'_\lambda$ and $Q_{\lambda+1}$, where $Q_\lambda$ is the semantic perception query entering stage $\lambda$, $Q'_\lambda$ is the updated semantic perception query of stage $\lambda$, and $Q_{\lambda+1}$ is the semantic perception query passed on to the next stage. In FIG. 1, the output multi-scale semantic perception queries are denoted $Q'_\lambda$, $\lambda \in \{1,2,3\}$.
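To make the recursion concrete, here is a minimal sketch of the query-refinement loop. It assumes Z classes and stage widths 4C/2C/C as above, and it substitutes the standard nn.MultiheadAttention for the patent's external attention, so the module choice and hyperparameters are illustrative only.

```python
import torch.nn as nn

class RecursiveQueryUpdate(nn.Module):
    """Sketch: refine the semantic perception query across three scales.
    Q starts as a learnable (Z, 4C) parameter; between scales a 1x1
    one-dimensional convolution halves the query width (4C -> 2C -> C)."""

    def __init__(self, Z=14, C=64, heads=4):
        super().__init__()
        widths = [4 * C, 2 * C, C]
        self.attn = nn.ModuleList(
            nn.MultiheadAttention(w, heads, batch_first=True) for w in widths)
        self.reduce = nn.ModuleList(
            nn.Conv1d(widths[i], widths[i + 1], kernel_size=1)
            for i in range(2))

    def forward(self, Q, patch_tokens):
        # Q: (B, Z, 4C); patch_tokens: per-scale tensors of shape (B, L, w)
        updated = []
        for lam, F_lam in enumerate(patch_tokens):
            Q_upd, _ = self.attn[lam](Q, F_lam, F_lam)  # Q'_lambda
            updated.append(Q_upd)
            if lam < len(self.reduce):  # prepare Q_{lambda+1} for next scale
                Q = self.reduce[lam](Q_upd.transpose(1, 2)).transpose(1, 2)
        return updated  # [Q'_1, Q'_2, Q'_3]
```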
In the training phase, the generation of the multi-scale semantic perception queries is controllably supervised. Specifically, after the learnable semantic perception query performs attention interaction with the multi-scale patch-tokenized feature representations, the updated semantic perception query of each stage is obtained, together with the attention coefficient map $A_\lambda$ output by each stage. Each attention coefficient map is converted, by adding a corresponding $2 \times 2$ convolution layer, into a supervisable multi-scale semantic perception segmentation map $M_\lambda$. By upsampling these prediction maps of different scales, the shape and size of the annotated ground truth are obtained, so that they can be supervised with the annotation ground truth, thereby indirectly yet effectively supervising the generation of the multi-scale category semantic perception queries.
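One plausible reading of this conversion step is sketched below; it assumes each attention coefficient map arrives as a (B, Z, h, w) tensor, applies the 2 x 2 convolution to each class map independently, and upsamples bilinearly to the label size. The class name AttnToSegMap is hypothetical.

```python
import torch.nn as nn
import torch.nn.functional as F

class AttnToSegMap(nn.Module):
    """Sketch: attention coefficient map -> supervisable segmentation map."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, kernel_size=2)  # the 2x2 convolution

    def forward(self, A, out_hw):       # A: (B, Z, h, w); out_hw: (H, W)
        B, Z, h, w = A.shape
        M = self.conv(A.reshape(B * Z, 1, h, w))    # classes folded into batch
        M = M.reshape(B, Z, h - 1, w - 1)           # singleton dim removed
        return F.interpolate(M, size=out_hw,        # up to ground-truth shape
                             mode="bilinear", align_corners=False)
```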
The multi-scale category semantic perception query refers to a plurality of category semantic perception queries with different scales, and the multi-scale semantic perception segmentation map can comprise a multi-scale semantic perception segmentation map without marked image data and a multi-scale semantic perception segmentation map with marked image data.
The multi-scale semantic perception segmentation map can be obtained by the following steps:
on different scales, respectively performing attention operation on the input patch token features to generate a multi-scale category semantic perception feature map;
respectively converting the dimension of the multi-scale category semantic perception feature map into a preset dimension;
and respectively removing the second dimension of the multi-scale category semantic perception feature map converted into the preset dimension, and performing up-sampling processing to generate the multi-scale semantic perception segmentation map.
Referring to fig. 3, arrow A indicates the processing path of the marked image data, and arrow B that of the unmarked image data. Taking 4 stages as an example, the marked and unmarked intermediate features of the first 3 stages are input into the category semantic perception segmentation module and first patch-tokenized into features $F$, so that the input meets the input requirements of the attention module. In this module, the marked and unmarked multi-scale intermediate features undergo exactly the same operations up to the final output; only the use made of the outputs differs. For ease of understanding, the description therefore proceeds from the perspective of the first-scale marked image data:
Assume the input feature $F_l$ is the patch-tokenized output of the first stage of the decoder. A linear projection is applied to $F_l$ to compute the keys and values of the external attention operation, while its query is computed from $Q \in \mathbb{R}^{Z \times 4C}$:

$$q = Q W_Q,\qquad k = F W_K,\qquad v = F W_V,$$

where $W_Q, W_K, W_V \in \mathbb{R}^{4C \times 4C'}$ are the parameter matrices of the linear projections. The single-head external attention operation EA is

$$A_1 = \mathrm{Softmax}\!\left(\frac{q k^{\top}}{\sqrt{d_k}}\right),\qquad \mathrm{EA}(Q, F) = A_1 v,$$

where $d_k$ is the feature dimension of $q$ and $k$, $\mathrm{Softmax}(\cdot)$ is taken along the spatial dimension, and $A_1$ is the semantic perception attention map extracted from the first-scale input patch token features under a single-head external attention mechanism. The multi-head external attention mechanism (MEA) concatenates $N$ independent EA operations and projects the output:

$$\mathrm{MEA}(Q, F) = \mathrm{Concat}\big(\mathrm{EA}_1(Q, F), \ldots, \mathrm{EA}_N(Q, F)\big) W_O,$$

where $\mathrm{Concat}(\cdot)$ is the concatenation operation and $W_O \in \mathbb{R}^{4C \times 4C'}$ is a learnable parameter matrix. $Q$ can thus be updated through the multi-head external attention mechanism followed by an MLP, where $\mathrm{MLP}(\cdot)$ abbreviates multilayer perceptron, and the updated category semantic perception query $Q'_1$ is saved. Furthermore, to extract the high-resolution patch token semantic representation of the next scale, a $1 \times 1$ convolution reduces $Q'_1$ to dimension $Z \times 2C$, giving the category semantic perception query input of the next attention module. The category semantic perception feature map extracted from the first-scale input patch token features by the multi-head attention mechanism is $A_1$, where $Z$ is the number of categories. To obtain a map usable for supervising the multi-scale semantic perception segmentation, a $2 \times 2$ convolution is applied to $A_1$; finally the second dimension is removed and, after upsampling to the shape of the annotated ground truth, the final multi-scale semantic perception segmentation map $M_{l,1} \in \mathbb{R}^{1 \times H \times W}$ is obtained.
In an embodiment of the application, the training of the semi-supervised segmentation model is supervised through the multi-scale semantic perception segmentation maps of the marked image data. That is, the annotation ground truth can be used to supervise the model as follows:

$$\mathcal{L}_{sup} = \mathrm{CE}(M_{l,\lambda}, Y) + \mathrm{DICE}(M_{l,\lambda}, Y),$$

where $Y$ denotes the annotation ground truth, CE the cross-entropy loss function, and DICE the Dice loss function, applied to the semantic perception segmentation map of each scale.
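A straightforward reading of this CE + DICE supervision is sketched below. The smoothing constant eps and the averaging of the Dice term over classes and batch are common choices assumed here, not values the patent fixes.

```python
import torch
import torch.nn.functional as F

def ce_dice_loss(logits, target, eps=1e-5):
    """logits: (B, Z, H, W); target: (B, H, W) integer class labels."""
    ce = F.cross_entropy(logits, target)
    prob = torch.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes=logits.shape[1])  # (B, H, W, Z)
    one_hot = one_hot.permute(0, 3, 1, 2).float()             # (B, Z, H, W)
    inter = (prob * one_hot).sum(dim=(2, 3))
    denom = prob.sum(dim=(2, 3)) + one_hot.sum(dim=(2, 3))
    dice = 1.0 - ((2 * inter + eps) / (denom + eps)).mean()
    return ce + dice
```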
Similarly, after the intermediate features of the unmarked image data are input into the category semantic perception segmentation module, the multi-scale category semantic perception feature maps $M_{u,\lambda}$ of the unmarked images are obtained through the same attention operations.
In step S140, inputting the intermediate features of the multi-scale semantic perception query and the unlabeled image data of each scale into a preset category semantic guidance segmentation module for performing attention operation so as to generate a multi-scale semantic guidance segmentation map;
in the embodiment of the present application, the preset category semantic guidance segmentation module may be an external attention module. The multi-scale category semantic perception queries perform attention interaction operations with the intermediate features of the unmarked image data at the corresponding scales, so that the relevant semantic regions are responded to.
The multi-scale category semantic perception queries passed from the preset category semantic perception segmentation module serve as guidance and are input into the preset category semantic guidance segmentation module to interact with the unmarked image data, yielding attention coefficient maps $A_{u,\lambda}$ at different scales. Adding the corresponding $2 \times 2$ convolution layer to the multi-scale attention coefficient maps $A_{u,\lambda}$ converts them into the multi-scale category semantic guidance segmentation maps $M'_{u,\lambda}$, which can subsequently be used for consistency-regularization training, so that the category semantic information of a large amount of unmarked image data is used efficiently. Furthermore, for the final outputs $\{P_l, P_u\} \in \mathbb{R}^{Z \times H \times W}$ of the prediction layer of the decoder in the segmentation network, the prediction output of the marked image data is explicitly supervised with the annotation ground truth:

$$\mathcal{L}_{pred} = \mathrm{CE}(P_l, Y) + \mathrm{DICE}(P_l, Y),$$

where $Y$ denotes the annotation ground truth, CE the cross-entropy loss function, and DICE the Dice loss function.
Specifically, the multi-scale semantic guidance segmentation map is obtained by the following method:
performing attention interaction operation on the multi-scale semantic perception query and the middle features of the unlabeled image data of the corresponding scale respectively;
and respectively carrying out convolution processing on the feature images after the attention interaction operation under each scale, and removing the second dimension to generate the multi-scale semantic guidance segmentation image.
Referring to fig. 4, the category semantic guidance segmentation module differs from the category semantic perception segmentation module in that the multi-scale category semantic perception queries passed in from that module perform attention operations with the intermediate features of the unmarked image data at each scale, rather than recursively updating the semantic perception queries. The specific attention mechanism is the same as in the category semantic perception segmentation module; reference may be made to the attention operation flow above, which is not repeated here.
In the category semantic guidance segmentation module, the multi-scale category semantic perception queries are used to guide category semantic segmentation; the attention operation makes the relevant semantic regions of the unmarked intermediate features respond, and the multi-scale semantic guidance segmentation maps $M'_{u,\lambda}$ are obtained with the same $2 \times 2$ convolution operation followed by removal of the second dimension.
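Reusing the MultiHeadExternalAttention and AttnToSegMap sketches above, the guidance module could be wired roughly as follows. Averaging the per-head attention maps and assuming the tokens lie on a square grid are choices of this sketch, not of the patent.

```python
import torch

def guided_seg_maps(mea_modules, to_map_modules, queries,
                    unlabeled_feats, out_hw):
    """queries: [Q'_1, Q'_2, Q'_3] (used as-is, never updated here);
    unlabeled_feats: per-scale patch tokens of the unmarked images."""
    maps = []
    for mea, to_map, Q, Fu in zip(mea_modules, to_map_modules,
                                  queries, unlabeled_feats):
        _, attns = mea(Q, Fu)                  # attention coefficients A_u
        A = torch.stack(attns, dim=0).mean(0)  # fuse heads (assumed choice)
        h = w = int(A.shape[-1] ** 0.5)        # square token grid assumed
        A = A.reshape(A.shape[0], A.shape[1], h, w)
        maps.append(to_map(A, out_hw))         # 2x2 conv + upsample -> M'_u
    return maps
```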
In an embodiment of the present application, before the consistency constraint is performed on the multi-scale semantic guided segmentation map and the predictive segmentation map output by the hierarchical segmentation network by using a preset loss function, the method includes:
a consistency constraint is applied, through a second preset loss function, between the multi-scale semantic perception segmentation map of the marked image data and the multi-scale semantic guidance segmentation map.
Specifically, the category semantic perception segmentation module can generate the multi-scale category semantic perception feature maps $M_{u,\lambda}$ of the unmarked image data from the input unmarked images. The gradient of $M_{u,\lambda}$ is separated (detached) from the whole computation graph, and consistency-constraint learning against $M'_{u,\lambda}$ is performed through the mean squared error (MSE), helping the category semantic guidance segmentation module make fuller use of the labeled-data information; concretely,

$$\mathcal{L}_{c2} = \mathrm{MSE}\big(\mathrm{detach}(M_{u,\lambda}),\, M'_{u,\lambda}\big).$$
The second predetermined loss function may be a mean square error MSE, or other loss functions may be used.
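Under the MSE choice, the second consistency loss reduces to a few lines; detach() implements the gradient separation described above, and summing over scales is this sketch's reading of "multi-scale".

```python
import torch.nn.functional as F

def second_consistency_loss(M_aware, M_guided):
    """M_aware, M_guided: per-scale map lists of matching shapes."""
    return sum(F.mse_loss(a.detach(), g)   # gradients flow only into M_guided
               for a, g in zip(M_aware, M_guided))
```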
In step S150, consistency constraint is performed on the multi-scale semantic guided segmentation map and the predictive segmentation map output by the hierarchical segmentation network through a first preset loss function, so as to generate a semi-supervised segmentation model.
In the embodiment of the present application, the final stage of the decoder includes a prediction layer, that is, the non-labeled image data and labeled image data are sequentially processed by each stage of the encoder and the decoder, and then the segmentation prediction can be performed in the prediction layer of the final stage of the decoder, and a prediction segmentation map P is output u
After the prediction partition map is obtained, the following steps may be further performed:
respectively carrying out normalization processing on the multi-scale semantic guidance segmentation map and the prediction segmentation map;
applying, through the first preset loss function, a consistency constraint between the normalized multi-scale semantic guidance segmentation map and the prediction segmentation map output by the hierarchical segmentation network.
That is, once the prediction segmentation map $P_u$ and the multi-scale guidance segmentation maps $M'_{u,\lambda}$ have been obtained, in order to reduce the problem of the semi-supervised model predicting labels over-confidently during training, a softmax operation $\sigma(\cdot)$ may be applied to both $P_u$ and $M'_{u,\lambda}$, and the first preset loss function then performs consistency-constraint learning on the multi-scale semantic guidance segmentation maps, specifically

$$\mathcal{L}_{c1} = \mathrm{MSE}\big(\sigma(M'_{u,\lambda}),\, \sigma(P_u)\big).$$

The first preset loss function may be the mean square error MSE, or other loss functions may be used. The hierarchical segmentation network can be iteratively trained under the consistency constraint until the segmentation accuracy reaches a preset value, e.g., 90%, and the semi-supervised segmentation model is generated.
In the embodiment of the application, the final prediction output of the unmarked images in the segmentation network is used so that a large amount of unmarked image data participates in training. It can be understood that $P_u \in \mathbb{R}^{Z \times H \times W}$ is decoupled from the computation graph, i.e., the set of all tensors participating in gradient computation, so that the parameter gradient of $P_u$ does not take part in backpropagation; meanwhile, the multi-scale semantic guidance segmentation maps $M'_{u,\lambda}$ are converted to shape $Z \times H \times W$ by an upsampling operation.
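Putting the normalization, gradient detachment and upsampling together, the first consistency constraint could look like the sketch below; detaching P_u and upsampling each guided map to Z x H x W follow the description, while bilinear interpolation and the sum over scales are assumptions.

```python
import torch
import torch.nn.functional as F

def first_consistency_loss(P_u, M_guided):
    """P_u: (B, Z, H, W) prediction map; M_guided: per-scale M'_u maps."""
    target = torch.softmax(P_u.detach(), dim=1)  # sigma(P_u), gradient cut
    loss = 0.0
    for M in M_guided:
        M = F.interpolate(M, size=P_u.shape[-2:], mode="bilinear",
                          align_corners=False)   # up to Z x H x W
        loss = loss + F.mse_loss(torch.softmax(M, dim=1), target)
    return loss
```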
In the embodiment of the application, the category semantic perception segmentation module controllably supervises the generation of the multi-scale category semantic perception queries in the training phase, while generating the multi-scale semantic perception segmentation maps of the marked images for constraining the category semantic guidance segmentation module. The category semantic guidance segmentation module uses the multi-scale category semantic perception queries to guide the unmarked image data toward a multi-scale semantic guidance segmentation map, which undergoes consistency-constraint learning against the prediction segmentation map finally output by the segmentation network, so that a large amount of unmarked image data participates in training the whole network. The training data contain only a small number of annotated medical images and a large number of unannotated ones; in practical training, the need to annotate large amounts of medical image data is therefore reduced, and accurate segmentation of each organ in medical images is achieved with only a small amount of marked image data.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present application.
In an embodiment, a semi-supervised segmentation model generation system for class semantic consistency representation is provided, where the semi-supervised segmentation model generation system for class semantic consistency representation corresponds to the semi-supervised segmentation model generation method for class semantic consistency representation in the above embodiment one-to-one. As shown in fig. 5, the semi-supervised segmentation model generation system for the class semantic consistency representation includes an image dataset to be trained acquisition unit 10, a segmentation processing unit 20, a multi-scale semantic aware query generation unit 30, a multi-scale semantic guided segmentation map generation unit 40, and a semi-supervised segmentation model generation unit 50. The functional modules are described in detail as follows:
an image data set to be trained obtaining unit 10, configured to obtain an image data set to be trained, where the image data set to be trained includes labeled image data and non-labeled image data, and the number of the labeled image data is smaller than the number of the non-labeled image data;
the segmentation processing unit 20 is configured to input the labeled image data and the unlabeled image data into a preset hierarchical segmentation network for segmentation processing, so as to generate a multi-scale intermediate feature and a prediction segmentation map;
The multi-scale semantic perception query generation unit 30 is configured to input the multi-scale intermediate feature and the original semantic perception query into a preset category semantic perception segmentation module for performing attention operation to generate a multi-scale semantic perception query;
the multi-scale semantic guidance segmentation map generating unit 40 is configured to input the multi-scale semantic perception query and intermediate features of the non-labeled image data of each scale into a preset category semantic guidance segmentation module for performing attention operation, so as to generate a multi-scale semantic guidance segmentation map;
a semi-supervised segmentation model generation unit 50, configured to perform consistency constraint on the multi-scale semantic guided segmentation map and the predictive segmentation map through a first preset loss function, so as to generate a semi-supervised segmentation model.
In an embodiment, the multi-scale semantic aware query generation unit 30 is further configured to:
patch token processing is carried out on the marked image data intermediate features and the unmarked image data intermediate features of each scale respectively so as to generate multi-scale patch token feature representation;
performing attention interaction operation on the original semantic perception query and patch token feature representation of the first scale according to a first multi-head attention module on the first scale so as to generate the first-scale semantic perception query;
On the nth scale, according to the nth multi-head attention module, performing attention interaction operation on the semantic perception query generated on the n-1 scale and the patch token feature representation on the nth scale respectively to generate the nth scale semantic perception query, wherein n is more than or equal to 2.
In an embodiment, the preset category semantic perception segmentation module is further configured to generate multi-scale semantic perception segmentation maps, including a multi-scale semantic perception segmentation map for the unmarked image data and one for the marked image data, and the system further includes a consistency constraint unit, configured to
apply a consistency constraint, through a second preset loss function, between the multi-scale semantic perception segmentation map of the marked image data and the multi-scale semantic guidance segmentation map.
In an embodiment, the multi-scale semantic aware query generation unit 30 is further configured to:
on different scales, respectively performing attention operation on the input patch token features to generate a multi-scale category semantic perception feature map;
respectively converting the dimension of the multi-scale category semantic perception feature map into a preset dimension;
and respectively removing the second dimension of the multi-scale category semantic perception feature map converted into the preset dimension, and performing up-sampling processing to generate the multi-scale semantic perception segmentation map.
In an embodiment, the multi-scale semantic guidance segmentation map generating unit 40 is further configured to:
performing attention interaction operations between the semantic perception query of each scale and the intermediate features of the unmarked image data at the corresponding scale;
and respectively carrying out convolution processing on the feature images after the attention interaction operation under each scale, and removing the second dimension to generate the multi-scale semantic guidance segmentation image.
In an embodiment, the multi-scale semantic guidance segmentation map generating unit 40 is further configured to:
and supervising the training of the semi-supervised segmentation model through the multi-scale semantic perception segmentation map of the marked image data.
In an embodiment, the semi-supervised segmentation model generation unit 50 is further configured to:
performing attention interaction operations between the multi-scale semantic perception queries and the intermediate features of the unmarked image data at the corresponding scales;
and respectively carrying out convolution processing on the feature images after the attention interaction operation under each scale, and removing the second dimension to generate the multi-scale semantic guidance segmentation image.
In the embodiment of the application, the category semantic perception segmentation module controllably supervises the generation of the multi-scale category semantic perception queries in the training phase, while generating the multi-scale semantic perception segmentation maps used subsequently to better constrain the category semantic guidance segmentation module. The category semantic guidance segmentation module uses the multi-scale category semantic perception queries to guide the unmarked image data toward a multi-scale semantic guidance segmentation map, which undergoes consistency-constraint learning against the prediction segmentation map finally output by the segmentation network, so that a large amount of unmarked image data participates in training the whole network. The training data contain only a small number of annotated medical images and a large number of unannotated ones; in practical training, the need to annotate large amounts of medical image data is therefore reduced, and accurate segmentation of each organ in medical images is achieved with only a small amount of marked image data.
For specific limitations of the semi-supervised segmentation model generation system for the class semantic consistency representation, reference may be made to the above limitations of the semi-supervised segmentation model generation method for the class semantic consistency representation, which are not described in detail herein. The various modules in the semi-supervised segmentation model generation system for category semantic consistency representation described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a terminal device, and the internal structure thereof may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a readable storage medium. The readable storage medium stores computer readable instructions. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer readable instructions, when executed by a processor, implement a semi-supervised segmentation model generation method for class semantic consistency representation. The readable storage medium provided by the present embodiment includes a nonvolatile readable storage medium and a volatile readable storage medium.
In one embodiment, a computer device is provided that includes a memory, a processor, and computer readable instructions stored in the memory and executable on the processor that when executed implement the steps of a semi-supervised segmentation model generation method for class semantic consistency representation as described above.
In an embodiment, a readable storage medium is provided, storing computer readable instructions that when executed by a processor implement the steps of a semi-supervised segmentation model generation method for class semantic consistency representation as described above.
Those skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by computer readable instructions that instruct the relevant hardware; the instructions may be stored on a non-volatile or volatile readable storage medium and, when executed, may include the procedures of the above method embodiments. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory.
By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (9)

1. A method for generating a semi-supervised segmentation model for class semantic consistency representation, the method comprising:
acquiring an image data set to be trained, wherein the image data set to be trained comprises labeled image data and unlabeled image data, and the amount of labeled image data is smaller than the amount of unlabeled image data;
inputting the labeled image data and the unlabeled image data into a preset hierarchical segmentation network for segmentation processing, to generate multi-scale intermediate features and a prediction segmentation map;
inputting the multi-scale intermediate features and an original semantic perception query into a preset class semantic perception segmentation module for an attention operation, to generate a multi-scale semantic perception query;
inputting the multi-scale semantic perception query and the intermediate features of the unlabeled image data at each scale into a preset class semantic guidance segmentation module for an attention operation, to generate a multi-scale semantic guidance segmentation map;
applying a consistency constraint to the multi-scale semantic guidance segmentation map and the prediction segmentation map through a first preset loss function, to generate a semi-supervised segmentation model;
wherein the multi-scale semantic perception query is obtained by the following steps:
performing patch token processing on the intermediate features of the labeled image data and of the unlabeled image data at each scale, respectively, to generate multi-scale patch token feature representations;
on the first scale, performing an attention interaction operation between the original semantic perception query and the patch token feature representation of the first scale according to a first multi-head attention module, to generate a first-scale semantic perception query; and on the n-th scale, performing an attention interaction operation between the semantic perception query generated on the (n-1)-th scale and the patch token feature representation of the n-th scale according to an n-th multi-head attention module, to generate an n-th-scale semantic perception query, wherein n is greater than or equal to 2.
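By way of illustration and not limitation, the cascaded query update recited above may be sketched in PyTorch as follows. The embedding widths, the per-scale linear projection, and the use of nn.MultiheadAttention are assumptions of this sketch, not features of the claimed method:

import torch
import torch.nn as nn

class MultiScaleSemanticQuery(nn.Module):
    # Illustrative sketch: one learnable token per class is refined scale by
    # scale through cross-attention with the patch-token features, and each
    # scale's refined query is both emitted and passed to the next scale.
    def __init__(self, num_classes, dims=(64, 128, 256), num_heads=4):
        super().__init__()
        self.attn = nn.ModuleList(
            nn.MultiheadAttention(d, num_heads, batch_first=True) for d in dims)
        self.proj = nn.ModuleList(
            nn.Linear(dims[i - 1], dims[i]) if i > 0 else nn.Identity()
            for i in range(len(dims)))
        # Original semantic perception query (assumed learnable).
        self.query0 = nn.Parameter(torch.randn(1, num_classes, dims[0]))

    def forward(self, patch_tokens):
        # patch_tokens: list of (B, N_i, dims[i]) patch token feature
        # representations, one per scale, flattened from the labeled and
        # unlabeled intermediate features.
        queries = []
        q = self.query0.expand(patch_tokens[0].size(0), -1, -1)
        for i, tokens in enumerate(patch_tokens):
            q = self.proj[i](q)                     # match this scale's width
            q, _ = self.attn[i](q, tokens, tokens)  # query attends to tokens
            queries.append(q)                       # n-th scale semantic query
        return queries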
2. The method for generating a semi-supervised segmentation model for class semantic consistency representation as recited in claim 1, wherein the preset class semantic perception segmentation module is further configured to generate a multi-scale semantic perception segmentation map, the multi-scale semantic perception segmentation map including a multi-scale semantic perception segmentation map of the unlabeled image data and a multi-scale semantic perception segmentation map of the labeled image data, and wherein, before applying the consistency constraint to the multi-scale semantic guidance segmentation map and the prediction segmentation map through the first preset loss function, the method further comprises:
applying a consistency constraint to the multi-scale semantic perception segmentation map of the labeled image data and the multi-scale semantic guidance segmentation map through a second preset loss function.
3. The method for generating a semi-supervised segmentation model for class semantic consistency representation as recited in claim 2, wherein the multi-scale semantic perception segmentation map is obtained by:
performing, on different scales, attention operations on the input patch token features, respectively, to generate multi-scale class semantic perception feature maps;
converting the dimension of each multi-scale class semantic perception feature map into a preset dimension, respectively; and
removing the second dimension of each multi-scale class semantic perception feature map converted into the preset dimension and performing up-sampling, to generate the multi-scale semantic perception segmentation map.
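By way of illustration and not limitation, one plausible reading of the above steps is sketched below, assuming the attention operation reduces to a scaled similarity between the class query and the patch tokens and that the removed second dimension is folded into the reshape; both assumptions go beyond what the claim specifies:

import torch
import torch.nn.functional as F

def semantic_perception_seg_map(query, tokens, hw, out_size):
    # query:  (B, K, D) class semantic perception query at this scale
    # tokens: (B, N, D) patch token features at this scale, N = H * W
    # hw:     (H, W) spatial size of this scale; out_size: target resolution
    B, K, D = query.shape
    # Attention-style similarity giving one response map per class.
    sim = torch.einsum("bkd,bnd->bkn", query, tokens) / D ** 0.5
    sim = sim.reshape(B, K, *hw)   # conversion to the preset dimension
    # Up-sample each class map to the target resolution to obtain the
    # semantic perception segmentation map at this scale.
    return F.interpolate(sim, size=out_size, mode="bilinear",
                         align_corners=False)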
4. The method for generating a semi-supervised segmentation model for class semantic consistency representation as recited in claim 3, wherein generating the multi-scale semantic perception segmentation map comprises:
supervising the training of the semi-supervised segmentation model through the multi-scale semantic perception segmentation map of the labeled image data.
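By way of illustration and not limitation, this supervision may be realized as a pixel-wise cross-entropy between each labeled-branch perception map and the ground-truth mask; the loss choice is an assumption of the sketch:

import torch.nn.functional as F

def supervised_perception_loss(labeled_maps, gt_mask):
    # labeled_maps: list of (B, K, H, W) semantic perception segmentation
    # maps of the labeled image data, one per scale, already up-sampled;
    # gt_mask: (B, H, W) integer class labels.
    return sum(F.cross_entropy(m, gt_mask) for m in labeled_maps) / len(labeled_maps)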
5. The method for generating a semi-supervised segmentation model for class semantic consistency representation as recited in claim 1, wherein the multi-scale semantic guidance segmentation map is obtained by:
performing an attention interaction operation between the multi-scale semantic perception query and the intermediate features of the unlabeled image data at the corresponding scale, respectively; and
performing convolution processing on the feature maps obtained after the attention interaction operation at each scale and removing the second dimension, to generate the multi-scale semantic guidance segmentation map.
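By way of illustration and not limitation, the guidance branch may be sketched as below, assuming the attention interaction is a scaled similarity between the scale-matched query and the flattened unlabeled features, followed by a 1x1 convolution; the kernel size and the shape handling are assumptions:

import torch
import torch.nn as nn

class SemanticGuidedHead(nn.Module):
    def __init__(self, dim, num_classes):
        super().__init__()
        self.scale = dim ** -0.5
        self.conv = nn.Conv2d(num_classes, num_classes, kernel_size=1)

    def forward(self, query, unlabeled_feat):
        # query: (B, K, D) semantic perception query at this scale;
        # unlabeled_feat: (B, D, H, W) unlabeled intermediate features.
        B, D, H, W = unlabeled_feat.shape
        tokens = unlabeled_feat.flatten(2).transpose(1, 2)   # (B, H*W, D)
        # Attention interaction: a per-class response at every location.
        attn = torch.einsum("bkd,bnd->bkn", query, tokens) * self.scale
        attn = attn.reshape(B, -1, H, W)                     # (B, K, H, W)
        # Convolution processing yields the semantic guidance segmentation
        # map; the claim's second-dimension removal is folded into reshape.
        return self.conv(attn)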
6. The method for generating a semi-supervised segmentation model for class semantic consistency representation as recited in any one of claims 1-5, wherein applying the consistency constraint to the multi-scale semantic guidance segmentation map and the prediction segmentation map through the first preset loss function comprises:
normalizing the multi-scale semantic guidance segmentation map and the prediction segmentation map, respectively; and
applying the consistency constraint to the normalized multi-scale semantic guidance segmentation map and the normalized prediction segmentation map through the first preset loss function.
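By way of illustration and not limitation, the normalization may be a softmax over the class channel and the first preset loss function a mean squared difference; the claim fixes neither choice, so both are assumptions of this sketch:

import torch.nn.functional as F

def consistency_loss(guided_maps, pred_map):
    # guided_maps: list of (B, K, H, W) multi-scale semantic guidance maps;
    # pred_map: (B, K, H, W) prediction segmentation map of the hierarchical
    # segmentation network.
    target = F.softmax(pred_map, dim=1)
    loss = pred_map.new_zeros(())
    for g in guided_maps:
        if g.shape[-2:] != pred_map.shape[-2:]:
            g = F.interpolate(g, size=pred_map.shape[-2:],
                              mode="bilinear", align_corners=False)
        loss = loss + F.mse_loss(F.softmax(g, dim=1), target)
    return loss / len(guided_maps)

In practice a stop-gradient on one side (e.g., target.detach()) is common in consistency training; whether the claimed method does so is not specified by the claim.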
7. A semi-supervised segmentation model generation system for class semantic consistency representation, the system comprising:
an image data set acquisition unit, configured to acquire an image data set to be trained, wherein the image data set to be trained comprises labeled image data and unlabeled image data, and the amount of labeled image data is smaller than the amount of unlabeled image data;
a segmentation processing unit, configured to input the labeled image data and the unlabeled image data into a preset hierarchical segmentation network for segmentation processing, to generate multi-scale intermediate features and a prediction segmentation map;
a multi-scale semantic perception query generation unit, configured to input the multi-scale intermediate features and an original semantic perception query into a preset class semantic perception segmentation module for an attention operation, to generate a multi-scale semantic perception query;
a multi-scale semantic guidance segmentation map generation unit, configured to input the multi-scale semantic perception query and the intermediate features of the unlabeled image data at each scale into a preset class semantic guidance segmentation module for an attention operation, to generate a multi-scale semantic guidance segmentation map;
a semi-supervised segmentation model generation unit, configured to apply a consistency constraint to the multi-scale semantic guidance segmentation map and the prediction segmentation map through a first preset loss function, to generate a semi-supervised segmentation model;
wherein the multi-scale semantic perception query generation unit is further configured to:
perform patch token processing on the intermediate features of the labeled image data and of the unlabeled image data at each scale, respectively, to generate multi-scale patch token feature representations; and
on the first scale, perform an attention interaction operation between the original semantic perception query and the patch token feature representation of the first scale according to a first multi-head attention module, to generate a first-scale semantic perception query; and on the n-th scale, perform an attention interaction operation between the semantic perception query generated on the (n-1)-th scale and the patch token feature representation of the n-th scale according to an n-th multi-head attention module, to generate an n-th-scale semantic perception query, wherein n is greater than or equal to 2.
8. A computer device comprising a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer readable instructions, implements the steps of the semi-supervised segmentation model generation method for class semantic consistency representation as recited in any one of claims 1-6.
9. A readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of the semi-supervised segmentation model generation method for class semantic consistency representation as recited in any one of claims 1-6.
CN202310271384.XA 2023-03-20 2023-03-20 Semi-supervised segmentation model generation method and system for class semantic consistency representation Active CN116385823B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310271384.XA CN116385823B (en) 2023-03-20 2023-03-20 Semi-supervised segmentation model generation method and system for class semantic consistency representation

Publications (2)

Publication Number Publication Date
CN116385823A CN116385823A (en) 2023-07-04
CN116385823B (en) 2023-12-01

Family

ID=86974259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310271384.XA Active CN116385823B (en) 2023-03-20 2023-03-20 Semi-supervised segmentation model generation method and system for class semantic consistency representation

Country Status (1)

Country Link
CN (1) CN116385823B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020156303A1 (en) * 2019-01-30 2020-08-06 广州市百果园信息技术有限公司 Method and apparatus for training semantic segmentation network, image processing method and apparatus based on semantic segmentation network, and device and storage medium
CN112001939A (en) * 2020-08-10 2020-11-27 浙江大学 Image foreground segmentation algorithm based on edge knowledge conversion
CN115760869A (en) * 2022-10-17 2023-03-07 复旦大学 Attention-guided non-linear disturbance consistency semi-supervised medical image segmentation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Semi-supervised spatial consistency constrained network for cardiac MRI segmentation; 李才子, 刘瑞强, 司伟鑫, 袁志勇, 王平安; Journal of Computer-Aided Design & Computer Graphics, (07), pp. 132-140 *

Similar Documents

Publication Publication Date Title
WO2021159714A1 (en) Data processing method and related device
US20210279595A1 (en) Methods, devices and media providing an integrated teacher-student system
GB2571825A (en) Semantic class localization digital environment
DE102022119386A1 (en) METHOD AND APPARATUS FOR PERFORMING DENSE PREDICTION USING TRANSFORMER BLOCKS
CN111831901B (en) Data processing method, device, equipment and storage medium
WO2021208799A1 (en) Transfer model training method and apparatus and fault detection method and apparatus
US20230176903A1 (en) Dynamic batching for inference system for transformer-based generation tasks
JP7536893B2 (en) Image Processing Using Self-Attention Based Neural Networks
WO2024087858A1 (en) Image processing model training method and apparatus, electronic device, computer program product, and computer storage medium
US20240046067A1 (en) Data processing method and related device
DE102022132015A1 (en) PERFORM TRAINING OF SEMANTIC SEGMENTATION WITH IMAGE/TEXT PAIRS
CN112115744A (en) Point cloud data processing method and device, computer storage medium and electronic equipment
WO2024046144A1 (en) Video processing method and related device thereof
CN116385823B (en) Semi-supervised segmentation model generation method and system for class semantic consistency representation
CN116911361A (en) Method, device and equipment for training network model based on deep learning framework network
CN116740078A (en) Image segmentation processing method, device, equipment and medium
CN114972293B (en) Video polyp segmentation method and device based on semi-supervised space-time attention network
DE102022129634A1 (en) PERFORMING SIMULATIONS USING MACHINE LEARNING
CN114842312B (en) Generation and segmentation method and device for unpaired cross-modal image segmentation model
CN118334752B (en) Behavior recognition model training method and system integrating 3DCNN and attention mechanism
CN111815631B (en) Model generation method, device, equipment and readable storage medium
CN118398155B (en) Medical report generation method, model training method, system, equipment and medium
US20240161237A1 (en) Electronic device and method with image processing.
CN113191936B (en) Interactive image texture migration conversion method, device, computer equipment and storage medium
CN116579403A (en) Data processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant