CN117253044B - A method for farmland remote sensing image segmentation based on semi-supervised interactive learning - Google Patents

A method for farmland remote sensing image segmentation based on semi-supervised interactive learning

Info

Publication number
CN117253044B
CN117253044B (application CN202311334268.4A; earlier publication CN117253044A)
Authority
CN
China
Prior art keywords: image, loss, CNN, images, loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311334268.4A
Other languages
Chinese (zh)
Other versions
CN117253044A (en)
Inventor
文思鉴
王永梅
王芃力
张友华
吴雷
吴海涛
轩亚恒
郑雪瑞
张世豪
潘海瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Agricultural University AHAU
Original Assignee
Anhui Agricultural University AHAU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Agricultural University AHAU filed Critical Anhui Agricultural University AHAU
Priority to CN202311334268.4A
Publication of CN117253044A
Application granted
Publication of CN117253044B
Legal status: Active


Classifications

    • G06V 10/26 Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06N 3/0455 Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06V 10/40 Extraction of image or video features
    • G06V 10/765 Classification using rules for classification or partitioning the feature space
    • G06V 10/806 Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 Image or video recognition or understanding using neural networks
    • G06V 20/188 Terrestrial scenes; vegetation


Abstract

The invention belongs to the technical field of agricultural image analysis and provides a farmland remote sensing image segmentation method based on semi-supervised interactive learning. First, a CNN and a Transformer cooperate through interactive learning: self-supervised training on unlabeled data lets them exchange the local and global features of pixels, reducing the need for annotated data while avoiding the potential shortcomings of either existing approach. Second, a directional contrast loss function is introduced into the CNN and fully supervised training is performed on the labeled data, ensuring that identical identity features remain consistent across different scenes and thereby improving the generalization capability and robustness of the model.

Description

Farmland remote sensing image segmentation method based on semi-supervised interactive learning
Technical Field
The invention belongs to the technical field of agricultural image analysis, and particularly relates to a farmland remote sensing image segmentation method based on semi-supervised interactive learning.
Background
Farmland remote sensing image segmentation is an important task whose goal is to classify farmland remote sensing images at the pixel level, thereby improving the efficiency of agricultural land production and management.
Conventional deep-learning-based farmland remote sensing image segmentation methods generally require a large amount of annotated data for training, but annotated data is costly to acquire and this requirement is often hard to meet in practical applications. Semi-supervised learning is therefore one of the effective approaches to this problem.
Current semi-supervised learning methods train with a small amount of labeled data and a large amount of unlabeled data to improve model performance. In addition, because such models contain a large number of parameters, overfitting occurs easily: the model performs well on the training set but poorly on the test set. The generalization capability of a farmland remote sensing image segmentation model is therefore an important consideration when it is applied to real scenes.
Existing frameworks for improving the generalization capability and robustness of semi-supervised agricultural image segmentation algorithms fall into two main categories: agricultural image segmentation methods based on convolutional neural networks (CNNs) and methods based on Transformers. The former extract features in image space through convolution operations; their drawback is that CNNs use local receptive fields and progressively reduce image resolution from lower to higher layers through convolution and pooling, and this local-receptive-field limitation can cause loss of detail and of global context information in the image, particularly for fine-grained segmentation of large-scale farmland areas. The latter model global relationships in sequence space through self-attention; their drawback is that the Transformer aims to model the dependency between each pixel and every other pixel through global context information and is limited when handling local features. In a farmland remote sensing image, different crops or land types may appear at different scales, and some fine feature details require finer perception; a Transformer may fail to capture these details accurately when processing features at different scales, which reduces the accuracy and robustness of the segmentation result.
Disclosure of Invention
The embodiments of the invention aim to provide a farmland remote sensing image segmentation method based on semi-supervised interactive learning. First, the CNN and the Transformer cooperate through interactive learning: self-supervised training on unlabeled data lets them exchange the local and global features of pixels, which reduces the need for annotated data while effectively avoiding the potential shortcomings of the two existing approaches. Second, a directional contrast loss function is introduced into the CNN and fully supervised training is performed on the labeled data to keep identical identity features consistent across different scenes, thereby improving the generalization capability and robustness of the model.
In view of the above, the invention provides a farmland remote sensing image segmentation method based on semi-supervised interactive learning, which comprises the following steps:
Step S10: m input images divided with labels And N images without labels
Step S20: train the CNN and the Transformer separately using the labeled image data;
Step S30: apply weak-enhancement processing (Gaussian filtering and brightness adjustment) to the unlabeled images and randomly crop, from each image, two new images with an overlapping region, x_U1 = {x_11, x_21, ..., x_N1} and x_U2 = {x_12, x_22, ..., x_N2}; at the same time, project the pixels of the unlabeled images between the encoder and decoder of the CNN and introduce a directional contrast loss function to keep identical identity features consistent across different scenes; use the Transformer prediction as a pseudo label to compute the context-aware consistency loss, and use the CNN prediction as a pseudo label for the Transformer prediction to compute the consistency regularization loss;
Step S40: use the trained CNN model as the backbone network to segment the test-set images and evaluate the accuracy of the results.
As a further limitation of the technical solution of the present invention, the step of performing the weak enhancement processing of gaussian filtering and brightness adjustment on the unlabeled image, and randomly cropping two new images with overlapping areas in the same image includes:
Step S31: apply Gaussian filtering to reduce noise and detail in the unlabeled image by taking a weighted average over the neighborhood of each pixel; for each pixel (x, y), filtering is performed with a Gaussian kernel of size k, and the filtered pixel value is:
I(x, y) = Σ ( G(x', y') * I(x', y') )   (1)
where I(x', y') is a pixel value in the neighborhood and G(x', y') is the weight of the Gaussian kernel;
finally, the brightness of the image is adjusted;
step S32: for a given weakly enhanced unlabeled image, randomly selecting the size and position of a cropping window;
the cropping window is then shifted upward and to the left by a certain distance, yielding two new images x_u1 and x_u2 with an overlapping region that are used to train the model;
Step S33: a bicubic interpolation algorithm scales every image x to 513 px * 513 px, and a bilinear interpolation algorithm scales the corresponding label y to the same size, so that the input images meet the input specification of the DeepLab v3+ network.
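For reference, the following is a minimal sketch of the weak-enhancement pipeline of steps S31-S33, assuming 8-bit NumPy/OpenCV images; the kernel size, brightness factor, crop size and shift distance are illustrative placeholders rather than values fixed by the invention (only the 513 px target size and the interpolation modes come from the text).

```python
import numpy as np
import cv2


def weak_enhance(image: np.ndarray, kernel_size: int = 5,
                 brightness: float = 1.1) -> np.ndarray:
    """Step S31: Gaussian filtering followed by a simple brightness adjustment
    (assumes an 8-bit image)."""
    blurred = cv2.GaussianBlur(image, (kernel_size, kernel_size), 0)
    return np.clip(blurred.astype(np.float32) * brightness, 0, 255).astype(np.uint8)


def overlapping_crops(image: np.ndarray, crop: int = 321, shift: int = 64):
    """Step S32: place one crop window at random, then shift it up and to the
    left so that the two crops x_u1, x_u2 share an overlapping region
    (assumes the image is larger than crop + shift in both dimensions)."""
    h, w = image.shape[:2]
    y = np.random.randint(shift, h - crop + 1)
    x = np.random.randint(shift, w - crop + 1)
    crop1 = image[y:y + crop, x:x + crop]
    crop2 = image[y - shift:y - shift + crop, x - shift:x - shift + crop]
    return crop1, crop2


def resize_pair(image: np.ndarray, label: np.ndarray, size: int = 513):
    """Step S33: bicubic for images, bilinear for labels, to match the
    DeepLab v3+ input size."""
    img = cv2.resize(image, (size, size), interpolation=cv2.INTER_CUBIC)
    lab = cv2.resize(label, (size, size), interpolation=cv2.INTER_LINEAR)
    return img, lab
```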
As a further limitation of the technical solution of the present invention, the training process of the tagged image data includes:
use the labeled image matrix x_L = {x_1, x_2, ..., x_M} and labels y_L = {y_1, y_2, ..., y_M} to train the two backbone network models, CNN and Transformer, and compute the supervised loss against the true labels.
As a further limitation of the technical scheme of the invention, computing the loss function against the true labels comprises the following steps:
the labeled data x_L is fed into the CNN to obtain the prediction probability of each pixel and into the Transformer to obtain its prediction probability, and the loss between these predictions and the corresponding ground-truth values is then computed as follows:
compared with the true label y_l, the loss function of the CNN branch is given by formula (2):
(2)
and the loss function of the Transformer branch is given by formula (3):
(3)
where σ denotes the ReLU activation function and l_FL denotes the Focal Loss, whose expression is shown in formula (4):
l_FL = -α_l (1 - p_l)^γ log(p_l)   (4)
where α_l and γ are hyperparameters, set here to α_l = 0.25 and γ = 2;
the total loss of the supervised learning model is computed as shown in formula (5):
(5)
where y_l = 1 when l is the true label and y_l = 0 otherwise, and p_l is a real number between 0 and 1 denoting the probability that the image belongs to the class annotated in the label.
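For reference, a minimal PyTorch sketch of the supervised losses of this step. The exact forms of formulas (2), (3) and (5) are not reproduced in the text above, so combining a pixel-wise cross-entropy term with the Focal Loss of formula (4) is an assumption; α_l = 0.25 and γ = 2 follow the text.

```python
import torch
import torch.nn.functional as F


def focal_loss(logits: torch.Tensor, target: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """Focal Loss in the spirit of formula (4): down-weights easy pixels."""
    ce = F.cross_entropy(logits, target, reduction="none")  # -log p_t per pixel
    pt = torch.exp(-ce)                                     # p_t
    return (alpha * (1.0 - pt) ** gamma * ce).mean()


def supervised_loss(cnn_logits: torch.Tensor, trans_logits: torch.Tensor,
                    target: torch.Tensor) -> torch.Tensor:
    """Assumed combination for formula (5): each branch contributes a
    cross-entropy term plus a Focal Loss term against the true labels."""
    l_cnn = F.cross_entropy(cnn_logits, target) + focal_loss(cnn_logits, target)
    l_trans = F.cross_entropy(trans_logits, target) + focal_loss(trans_logits, target)
    return l_cnn + l_trans
```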
As a further limitation of the technical solution of the present invention, the training process of the label-free image data includes:
the two groups of randomly cropped, weakly enhanced unlabeled images are fed into the CNN network to obtain predictions that serve as pseudo labels for the Transformer model's predictions, from which the consistency regularization loss is computed;
predictions obtained through the Transformer network serve as pseudo labels for the intermediate projection, from which the context-aware consistency loss is computed; this guarantees the interactive transfer of local image information and global context information during training, so that the model fully learns the consistency regularization capability.
As a further limitation of the technical scheme of the invention, during training on the unlabeled images, the two groups of weakly enhanced unlabeled input images x_u1 and x_u2 are passed through the CNN model framework to produce two groups of predicted values; likewise, two groups of predicted values are produced through the Transformer model framework:
(6)
where the two functions denote the CNN network model and the Transformer network model, respectively;
each pseudo label is computed as shown in formula (7):
(7)
where argmax(p) denotes the label at which the predicted probability value p is maximal;
the CNN prediction is used as the pseudo label for the Transformer, and the consistency regularization loss is computed as shown in formula (8):
(8)
where σ denotes the ReLU activation function and l_dice denotes the Dice loss function;
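For reference, a minimal sketch of the cross pseudo-labelling of formulas (7) and (8), assuming softmax outputs from both branches; since formula (8) is not reproduced above, the soft-Dice formulation used here is a common variant and an assumption.

```python
import torch
import torch.nn.functional as F


def dice_loss(prob: torch.Tensor, target_onehot: torch.Tensor,
              eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss between predicted probabilities and a one-hot target."""
    dims = (0, 2, 3)                                   # batch and spatial dims
    inter = (prob * target_onehot).sum(dims)
    union = prob.sum(dims) + target_onehot.sum(dims)
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()


def consistency_regularization(cnn_logits: torch.Tensor,
                               trans_logits: torch.Tensor) -> torch.Tensor:
    """Formulas (7)-(8) in spirit: the argmax of the CNN prediction serves as
    the pseudo label for the Transformer prediction."""
    num_classes = cnn_logits.shape[1]
    pseudo = cnn_logits.argmax(dim=1).detach()         # pseudo label, formula (7)
    pseudo_onehot = F.one_hot(pseudo, num_classes).permute(0, 3, 1, 2).float()
    trans_prob = trans_logits.softmax(dim=1)
    return dice_loss(trans_prob, pseudo_onehot)
```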
x_u1 and x_u2 are passed through the DeepLab v3+ encoder to obtain the feature maps M_u1 and M_u2, which are then projected by a nonlinear projector into M_o1 and M_o2; a directional contrast loss function encourages the overlapping region x_o to align, under different backgrounds, toward the contrastive features with higher confidence, so that the two views finally stay consistent;
for the i-th unlabeled image, the directional contrast loss is computed as in formulas (9)-(11):
(9)
(10)
(11)
where N denotes the number of spatial positions in the overlapping feature region; r computes feature similarity; h and w denote a two-dimensional spatial position; M_u denotes the set of negative image samples, m denotes a negative sample in M_u, and C(*) denotes the classifier;
the prediction of the Transformer is used as the pseudo label to compute the consistency loss after the context-aware constraint, as in formula (12):
(12)
where l_dice denotes the Dice loss function and C(*) denotes the classifier.
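For reference, a heavily simplified sketch in the spirit of the directional contrast loss of formulas (9)-(11): features of the overlapping region are pulled toward the corresponding features of the other crop only where that crop's classifier confidence is higher, with the remaining positions acting as negatives. The InfoNCE-style form and the temperature are assumptions, since the exact formulas are not reproduced above; the context-aware consistency loss of formula (12) would additionally compare the classifier output on the projected features with the Transformer pseudo labels.

```python
import torch
import torch.nn.functional as F


def directional_contrast_loss(feat1: torch.Tensor, feat2: torch.Tensor,
                              conf1: torch.Tensor, conf2: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """feat1/feat2: [N, D] projected features of the overlapping region x_o from
    the two crops; conf1/conf2: [N] classifier confidences at the same positions."""
    feat1 = F.normalize(feat1, dim=1)
    feat2 = F.normalize(feat2, dim=1)
    # positives: matching positions; negatives: every other position of feat2
    logits = feat1 @ feat2.t() / temperature            # [N, N] cosine similarities
    targets = torch.arange(feat1.shape[0], device=feat1.device)
    per_position = F.cross_entropy(logits, targets, reduction="none")
    # directional weighting: only pull toward the view with higher confidence
    direction = (conf2 > conf1).float()
    return (direction * per_position).sum() / direction.sum().clamp(min=1.0)
```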
As a further limitation of the technical scheme of the invention, the overall loss is minimized: during training, the parameters of the classification network are first initialized, then forward and backward propagation are performed with the training data, the gradients of the loss function are computed, and the network parameters are updated with an optimization algorithm such as gradient descent until the total loss function reaches a preset convergence condition;
the total loss is a linear combination of the total supervised loss, the consistency regularization loss, the directional contrast loss and the context-aware consistency loss, computed as in formula (13):
(13)
where λ and λ_w are weight factors that control the proportion of the directional contrast loss and the consistency regularization loss, respectively, in the total loss function.
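For reference, a minimal sketch of the optimisation described above, assuming the four loss terms are computed per batch as in the earlier sketches; the SGD optimiser, the learning rate and the weight factors λ and λ_w used here are placeholder choices.

```python
import torch


def fit(model_params, compute_losses, data_loader, epochs: int = 50,
        lam: float = 0.1, lam_w: float = 1.0, lr: float = 1e-3):
    """Minimise the total loss of formula (13): supervised loss plus the weighted
    directional contrast and consistency regularization losses plus the
    context-aware consistency loss."""
    optimizer = torch.optim.SGD(model_params, lr=lr, momentum=0.9)
    for _ in range(epochs):
        for batch in data_loader:
            l_sup, l_cr, l_dc, l_cac = compute_losses(batch)
            loss = l_sup + lam * l_dc + lam_w * l_cr + l_cac   # formula (13)
            optimizer.zero_grad()
            loss.backward()     # backward propagation of the total loss
            optimizer.step()    # gradient-descent parameter update
    return optimizer
```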
As a further limitation of the technical scheme of the present invention, step S40 specifically includes model testing: for a given test-set farmland remote sensing image, the CNN serves as the backbone network model to extract features, the probability of the class to which each pixel of the target image belongs is output by the segmentation, and a threshold is set to mark each pixel as segmentation target or background.
As a further limitation of the present invention, the step of marking a pixel as segmentation target or background with the set threshold includes: the segmentation model finds the maximum gray value P_max and minimum gray value P_min of the image and sets the initial threshold T_0 = (P_max + P_min)/2; according to T(k), k = 0, 1, 2, ..., the image is divided into foreground and background and their average gray values H_1 and H_2 are computed; the new threshold is T(k+1) = (H_1 + H_2)/2, and the iteration continues until T(k) = T(k+1), which yields the threshold; a prediction probability above the threshold is marked as foreground and one below it as background, producing the final segmentation mask.
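For reference, a minimal sketch of the iterative threshold selection described above, assuming prob is the per-pixel foreground probability map produced by the CNN backbone; the convergence tolerance is an added practical detail.

```python
import numpy as np


def iterative_threshold_mask(prob: np.ndarray, tol: float = 1e-4) -> np.ndarray:
    """Iterate T(k+1) = (H1 + H2) / 2 until the threshold stabilises, then mark
    pixels above the threshold as foreground and the rest as background."""
    t = (prob.max() + prob.min()) / 2.0        # initial threshold T0
    while True:
        fg, bg = prob[prob > t], prob[prob <= t]
        h1 = fg.mean() if fg.size else t       # mean value of the foreground
        h2 = bg.mean() if bg.size else t       # mean value of the background
        t_new = (h1 + h2) / 2.0
        if abs(t_new - t) < tol:               # T(k) == T(k+1) up to tolerance
            break
        t = t_new
    return (prob > t).astype(np.uint8)         # final segmentation mask
```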
Compared with the prior art, the farmland remote sensing image segmentation method based on semi-supervised interactive learning has the following beneficial effects:
First, a direction-aware consistency constraint module is inserted into the interactive learning network of the CNN and the Transformer. The CNN excels at image processing and can effectively extract the spatial features of farmland remote sensing images, capturing local and global features through convolution and pooling operations; the Transformer excels in natural language processing and handles sequence data and long-range dependencies well. Combining the CNN's strength in spatial feature extraction with the Transformer's strength in long-range dependency modeling allows the features in farmland remote sensing images to be extracted and modeled better, improving segmentation accuracy.
Second, the Transformer uses a self-attention mechanism in the encoder-decoder framework and can effectively capture the contextual information in the image, which is very important for accurate farmland remote sensing image segmentation; because crops and background in the image often have wide spatial correlation, the relationship between pixels can be modeled better, improving segmentation accuracy. On the other hand, farmland remote sensing images usually have high resolution, detail information needs to be accurately recovered during segmentation, and a conventional CNN decoder has high computation and memory demands on high-resolution images; by introducing the Transformer layer between the encoder and the decoder, the invention can gradually restore the image resolution and reduce the consumption of computation and memory resources, thereby processing high-resolution images more effectively.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the following description will briefly introduce the drawings that are needed in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the present invention.
FIG. 1 is a system architecture diagram of a farmland remote sensing image segmentation method based on semi-supervised interactive learning;
FIG. 2 is a flow chart of an implementation of a farmland remote sensing image segmentation method based on semi-supervised interactive learning;
FIG. 3 is a sub-flow of a farmland remote sensing image segmentation method based on semi-supervised interactive learning;
FIG. 4 is a block diagram of a farmland remote sensing image segmentation system provided by the invention;
Fig. 5 is a block diagram of a computer device according to the present invention.
Detailed Description
The present application will be further described with reference to the accompanying drawings and detailed description, wherein it is to be understood that, on the premise of no conflict, the following embodiments or technical features may be arbitrarily combined to form new embodiments.
In order to make the objects, technical solutions and advantages of the present application more apparent, the following embodiments of the present application will be described in further detail with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that, in the embodiments of the present invention, all the expressions "first" and "second" are used to distinguish two non-identical entities with the same name or non-identical parameters, and it is noted that the "first" and "second" are only used for convenience of expression, and should not be construed as limiting the embodiments of the present invention. Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such as a process, method, system, article, or other step or unit that comprises a list of steps or units.
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
At present, existing frameworks for improving the generalization capability and robustness of semi-supervised agricultural image segmentation algorithms fall into two main categories: agricultural image segmentation methods based on convolutional neural networks (CNNs) and methods based on Transformers. The former extract features in image space through convolution operations; their drawback is that CNNs use local receptive fields and progressively reduce image resolution from lower to higher layers through convolution and pooling, and this local-receptive-field limitation can cause loss of detail and of global context information in the image, particularly for fine-grained segmentation of large-scale farmland areas. The latter model global relationships in sequence space through self-attention; their drawback is that the Transformer aims to model the dependency between each pixel and every other pixel through global context information and is limited when handling local features. In a farmland remote sensing image, different crops or land types may appear at different scales, and some fine feature details require finer perception; a Transformer may fail to capture these details accurately when processing features at different scales, which reduces the accuracy and robustness of the segmentation result.
In order to solve these problems, the invention designs a farmland remote sensing image segmentation method based on semi-supervised interactive learning. First, the CNN and the Transformer cooperate through interactive learning: self-supervised training on unlabeled data lets them exchange the local and global features of pixels, which reduces the need for annotated data while effectively avoiding the potential shortcomings of the two existing approaches. Second, a directional contrast loss function is introduced into the CNN and fully supervised training is performed on the labeled data to keep identical identity features consistent across different scenes, thereby improving the generalization capability and robustness of the model.
Specific implementations of the invention are described in detail below in connection with specific embodiments.
Example 1
FIG. 1 illustrates an exemplary system architecture for implementing a semi-supervised interactive learning based farmland remote sensing image segmentation method.
FIG. 2 shows the implementation flow of the farmland remote sensing image segmentation method based on semi-supervised interactive learning;
As shown in fig. 1 and fig. 2, in an embodiment of the present invention, a farmland remote sensing image segmentation method based on semi-supervised interactive learning includes the following steps:
Step S10: divide the data into M labeled input images x_L = {x_1, x_2, ..., x_M} and N unlabeled images x_U = {x_1, x_2, ..., x_N};
Step S20: train the CNN and the Transformer separately using the labeled image data;
Step S30: apply weak-enhancement processing (Gaussian filtering and brightness adjustment) to the unlabeled images and randomly crop two new images with an overlapping region from the same image, i.e. each of the N unlabeled images is randomly cropped into two groups of new images x_U1 and x_U2 with an overlapping region, and all images are scaled to a unified size; at the same time, the pixels of the unlabeled images are projected between the encoder and decoder of the CNN and a directional contrast loss function is introduced to keep identical identity features consistent across different scenes; the Transformer prediction is used as a pseudo label to compute the context-aware consistency loss, and the CNN prediction is used as a pseudo label for the Transformer prediction to compute the consistency regularization loss.
Step S40: use the trained CNN model as the backbone network to segment the test-set images and evaluate the accuracy of the results.
Further, as shown in fig. 3, in the step S30, the step of performing weak enhancement processing of gaussian filtering and brightness adjustment on the unlabeled image, and randomly cropping two new images with overlapping areas in the same image includes:
Step S31: apply Gaussian filtering to reduce noise and detail in the unlabeled image by taking a weighted average over the neighborhood of each pixel; for each pixel (x, y), filtering is performed with a Gaussian kernel of size k, and the filtered pixel value is:
I(x, y) = Σ ( G(x', y') * I(x', y') )   (1)
where I(x', y') is a pixel value in the neighborhood and G(x', y') is the weight of the Gaussian kernel;
finally, the brightness of the image is adjusted;
Step S32: for a given weakly enhanced unlabeled image, randomly select the size and position of a cropping window;
the cropping window is then shifted upward and to the left by a certain distance, yielding two new images x_u1 and x_u2 with an overlapping region that are used to train the model;
Step S33: a bicubic interpolation algorithm scales every image x to 513 px * 513 px, and a bilinear interpolation algorithm scales the corresponding label y to the same size, so that the input images meet the input specification of the DeepLab v3+ network.
Further, in an embodiment of the present invention, the training process of the tagged image data includes:
Use the labeled image matrix x_L = {x_1, x_2, ..., x_M} and labels y_L = {y_1, y_2, ..., y_M} to train the two backbone network models, CNN and Transformer, and compute the supervised loss against the true labels.
Computing the loss function against the true labels comprises the following steps:
the labeled data x_L is fed into the CNN to obtain the prediction probability of each pixel and into the Transformer to obtain its prediction probability, and the loss between these predictions and the corresponding ground-truth values is then computed as follows:
compared with the true label y_l, the loss function of the CNN branch is given by formula (2):
(2)
and the loss function of the Transformer branch is given by formula (3):
(3)
where σ denotes the ReLU activation function and l_FL denotes the Focal Loss, whose expression is shown in formula (4):
l_FL = -α_l (1 - p_l)^γ log(p_l)   (4)
where α_l and γ are hyperparameters, set here to α_l = 0.25 and γ = 2;
the total loss of the supervised learning model is computed as shown in formula (5):
(5)
where y_l = 1 when l is the true label and y_l = 0 otherwise, and p_l is a real number between 0 and 1 denoting the probability that the image belongs to the class annotated in the label.
Further, in an embodiment of the present invention, the training process of the label-free image data includes:
the two groups of randomly cropped, weakly enhanced unlabeled images are fed into the CNN network to obtain predictions that serve as pseudo labels for the Transformer model's predictions, from which the consistency regularization loss is computed;
predictions obtained through the Transformer network serve as pseudo labels for the intermediate projection, from which the context-aware consistency loss is computed; this guarantees the interactive transfer of local image information and global context information during training, so that the model fully learns the consistency regularization capability.
Further, in the embodiment of the invention, during training on the unlabeled images the two backbone network models focus on learning local features and global features respectively, information interaction is used to transfer feature knowledge so that their weaknesses complement each other, and a direction-aware consistency constraint is introduced to ensure that the CNN module retains good robustness and generalization capability even with only a small amount of data; specifically:
for the two groups of weakly enhanced unlabeled input images x_u1 and x_u2, two groups of predicted values are generated through the CNN model framework; similarly, two groups of predicted values are generated through the Transformer model framework:
(6)
where the two functions denote the CNN network model and the Transformer network model, respectively;
the CNN is good at capturing local characteristics and spatial correlation in image processing, extracting the local structure of the image through convolution operations with local receptive fields;
the Transformer is better suited to modeling global dependencies and long-range relations, establishing global information interaction across the whole input sequence through its self-attention mechanism;
these predictions therefore have essentially different properties at the output level, and the pseudo labels are computed as shown in formula (7):
(7)
where argmax(p) denotes the label at which the predicted probability value p is maximal;
the CNN prediction is used as the pseudo label for the Transformer, and the consistency regularization loss is computed as shown in formula (8):
(8)
where σ denotes the ReLU activation function and l_dice denotes the Dice loss function;
x_u1 and x_u2 are passed through the DeepLab v3+ encoder to obtain the feature maps M_u1 and M_u2, which are then projected by a nonlinear projector into M_o1 and M_o2; a directional contrast loss function encourages the overlapping region x_o to align, under different backgrounds, toward the contrastive features with higher confidence, so that the two views finally stay consistent;
for the i-th unlabeled image, the directional contrast loss is computed as in formulas (9)-(11):
(9)
(10)
(11)
where N denotes the number of spatial positions in the overlapping feature region; r computes feature similarity; h and w denote a two-dimensional spatial position; M_u denotes the set of negative image samples, m denotes a negative sample in M_u, and C(*) denotes the classifier;
the prediction of the Transformer is used as the pseudo label to compute the consistency loss after the context-aware constraint, as in formula (12):
(12)
where l_dice denotes the Dice loss function and C(*) denotes the classifier.
Further, in the embodiment of the present invention, the overall loss is minimized: during training, the parameters of the classification network are first initialized, then forward and backward propagation are performed with the training data, the gradients of the loss function are computed, and the network parameters are updated with an optimization algorithm such as gradient descent until the total loss function reaches a preset convergence condition;
by iteratively updating the network parameters, the classification network is expected to learn a suitable feature representation that minimizes the difference between the predictions and the true labels;
the total loss is a linear combination of the total supervised loss, the consistency regularization loss, the directional contrast loss and the context-aware consistency loss, computed as in formula (13):
(13)
where λ and λ_w are weight factors that control the proportion of the directional contrast loss and the consistency regularization loss, respectively, in the total loss function.
As a further limitation of the technical scheme of the present invention, step S40 specifically includes model testing: for a given test-set farmland remote sensing image, the CNN serves as the backbone network model to extract features, the probability of the class to which each pixel of the target image belongs is output by the segmentation, and a threshold is set to mark each pixel as segmentation target or background.
As a further limitation of the present invention, the step of marking a pixel as segmentation target or background with the set threshold includes: the segmentation model finds the maximum gray value P_max and minimum gray value P_min of the image and sets the initial threshold T_0 = (P_max + P_min)/2; according to T(k), k = 0, 1, 2, ..., the image is divided into foreground and background and their average gray values H_1 and H_2 are computed; the new threshold is T(k+1) = (H_1 + H_2)/2, and the iteration continues until T(k) = T(k+1), which yields the threshold; a prediction probability above the threshold is marked as foreground and one below it as background, producing the final segmentation mask.
In summary, the invention inserts a direction-aware consistency constraint module into the interactive learning network of the CNN and the Transformer. The CNN excels at image processing and can effectively extract the spatial features of farmland remote sensing images, capturing local and global features through convolution and pooling operations; the Transformer excels in natural language processing and handles sequence data and long-range dependencies well. Combining the CNN's strength in spatial feature extraction with the Transformer's strength in long-range dependency modeling allows the features in farmland remote sensing images to be extracted and modeled better, improving segmentation accuracy.
In addition, the Transformer uses a self-attention mechanism in the encoder-decoder framework and can effectively capture contextual information in the image, which is very important for accurate farmland remote sensing image segmentation; because crops and background in the image often have wide spatial correlation, the relationship between pixels can be modeled better, improving segmentation accuracy.
On the other hand, farmland remote sensing images usually have high resolution, detail information needs to be accurately recovered during segmentation, and a conventional CNN decoder has high computation and memory demands on high-resolution images; by introducing the Transformer layer between the encoder and the decoder, the invention can gradually restore the image resolution and reduce the consumption of computation and memory resources, thereby processing high-resolution images more effectively.
Example 2
As shown in fig. 4, in an exemplary embodiment provided by the present disclosure, the present invention further provides a farmland remote sensing image segmentation system, the farmland remote sensing image segmentation system 50 includes:
a preprocessing module 51 for dividing the data into M labeled input images x_L = {x_1, x_2, ..., x_M} and N unlabeled images x_U = {x_1, x_2, ..., x_N};
a first training module 52 for training the CNN and the Transformer separately using the labeled image data;
a second training module 53 for applying weak-enhancement processing (Gaussian filtering and brightness adjustment) to the unlabeled images and randomly cropping two new images with an overlapping region from the same image, i.e. each of the N unlabeled images is randomly cropped into two groups of new images x_U1 and x_U2 with an overlapping region, and all images are scaled to a unified size; at the same time, the pixels of the unlabeled images are projected between the encoder and decoder of the CNN and a directional contrast loss function is introduced to keep identical identity features consistent across different scenes; the Transformer prediction is used as a pseudo label to compute the context-aware consistency loss, and the CNN prediction is used as a pseudo label for the Transformer prediction to compute the consistency regularization loss;
a model test module 54 for segmenting the test-set images with the trained CNN model as the backbone network and evaluating the accuracy of the results.
Example 3
As shown in fig. 5, in an embodiment of the present invention, the present invention further provides a computer device.
The computer device 60 comprises a memory 61, a processor 62 and computer-readable instructions stored in the memory 61 and executable on the processor 62; when executing the computer-readable instructions, the processor 62 implements the farmland remote sensing image segmentation method based on semi-supervised interactive learning as provided by Embodiment 1.
The farmland remote sensing image segmentation method based on semi-supervised interactive learning comprises the following steps:
Step S10: divide the data into M labeled input images x_L = {x_1, x_2, ..., x_M} and N unlabeled images x_U = {x_1, x_2, ..., x_N};
Step S20: train the CNN and the Transformer separately using the labeled image data;
Step S30: apply weak-enhancement processing (Gaussian filtering and brightness adjustment) to the unlabeled images and randomly crop two new images with an overlapping region from the same image, i.e. each of the N unlabeled images is randomly cropped into two groups of new images x_U1 and x_U2 with an overlapping region, and all images are scaled to a unified size; at the same time, the pixels of the unlabeled images are projected between the encoder and decoder of the CNN and a directional contrast loss function is introduced to keep identical identity features consistent across different scenes; the Transformer prediction is used as a pseudo label to compute the context-aware consistency loss, and the CNN prediction is used as a pseudo label for the Transformer prediction to compute the consistency regularization loss.
Step S40: use the trained CNN model as the backbone network to segment the test-set images and evaluate the accuracy of the results.
In addition, the device 60 according to the embodiment of the present invention may further have a communication interface 63 for receiving a control command.
Example 4
In an exemplary embodiment provided by the present disclosure, a computer-readable storage medium is also provided.
Specifically, in an exemplary embodiment of the present disclosure, the storage medium stores computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the farmland remote sensing image segmentation method based on semi-supervised interactive learning as provided by Embodiment 1.
The farmland remote sensing image segmentation method based on semi-supervised interactive learning comprises the following steps:
Step S10: divide the data into M labeled input images x_L = {x_1, x_2, ..., x_M} and N unlabeled images x_U = {x_1, x_2, ..., x_N};
Step S20: train the CNN and the Transformer separately using the labeled image data;
Step S30: apply weak-enhancement processing (Gaussian filtering and brightness adjustment) to the unlabeled images and randomly crop two new images with an overlapping region from the same image, i.e. each of the N unlabeled images is randomly cropped into two groups of new images x_U1 and x_U2 with an overlapping region, and all images are scaled to a unified size; at the same time, the pixels of the unlabeled images are projected between the encoder and decoder of the CNN and a directional contrast loss function is introduced to keep identical identity features consistent across different scenes; the Transformer prediction is used as a pseudo label to compute the context-aware consistency loss, and the CNN prediction is used as a pseudo label for the Transformer prediction to compute the consistency regularization loss.
Step S40: use the trained CNN model as the backbone network to segment the test-set images and evaluate the accuracy of the results.
In various embodiments of the present invention, it should be understood that the size of the sequence numbers of the processes does not mean that the execution sequence of the processes is necessarily sequential, and the execution sequence of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present invention, or a part contributing to the prior art or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, comprising several requests for a computer device (which may be a personal computer, a server or a network device, etc., in particular may be a processor in a computer device) to execute some or all of the steps of the method according to the embodiments of the present invention.
Those of ordinary skill in the art will appreciate that some or all of the steps of the various methods of the described embodiments may be implemented by hardware associated with a program that may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, tape memory, or any other medium capable of being used to carry or store data.
The farmland remote sensing image segmentation method based on semi-supervised interactive learning disclosed by the embodiment of the invention is described in detail, and specific examples are applied to explain the principle and the implementation mode of the invention, and the description of the above examples is only used for helping to understand the method and the core idea of the invention; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present invention, the present description should not be construed as limiting the present invention in view of the above.

Claims (3)

1.一种基于半监督交互学习的农田遥感图像分割方法,其特征在于,所述的农田遥感图像分割方法包括如下步骤:1. A farmland remote sensing image segmentation method based on semi-supervised interactive learning, characterized in that the farmland remote sensing image segmentation method comprises the following steps: 步骤S10:划分有标签的M个输入图像xL={x1,x2,...,xM}和无标签的N个图像xU={x1,x2,...,xN};Step S10: Divide into M labeled input images x L ={x 1 , x 2 , ..., x M } and N unlabeled images x U ={x 1 , x 2 , ..., x N }; 步骤S20:使用有标签的图像数据分别训练CNN和Transformer;Step S20: Use the labeled image data to train CNN and Transformer respectively; 步骤S30:对无标签的图像进行高斯滤波和亮度调整的弱增强处理,在同一图像中随机裁剪出具有重叠区域的两张新图像xU1={x11,x21,...,xN1},xU2={x12,x22,...,xN2},同时把无标签图像的像素投影到CNN的编码器和解码器之间,引入方向性对比损失函数,保证图片中具有相同身份特征在不同场景下的一致性,利用Transformer预测结果作为伪标签计算上下文感知一致性损失;利用CNN预测结果作为Transformer预测结果的伪标签计算一致性正则化损失;Step S30: Perform weak enhancement processing of Gaussian filtering and brightness adjustment on the unlabeled image, randomly crop two new images x U1 = {x 11 , x 21 , ..., x N1 }, x U2 = {x 12 , x 22 , ..., x N2 } with overlapping areas from the same image, and project the pixels of the unlabeled image between the encoder and decoder of CNN, introduce a directional contrast loss function to ensure the consistency of the same identity features in different scenes in the picture, use the Transformer prediction result as the pseudo label to calculate the context-aware consistency loss; use the CNN prediction result as the pseudo label of the Transformer prediction result to calculate the consistency regularization loss; 步骤S40:由训练好的CNN模型作为主干网络,对测试集图像进行分割,并评估结果的准确性;Step S40: Using the trained CNN model as the backbone network, the test set images are segmented and the accuracy of the results is evaluated; 对无标签的图像进行高斯滤波和亮度调整的弱增强处理,在同一图像中随机裁剪出具有重叠区域的两张新图像的步骤包括:The steps of performing weak enhancement processing of Gaussian filtering and brightness adjustment on the unlabeled image and randomly cropping two new images with overlapping areas from the same image include: 步骤S31:应用高斯滤波来减少无标签图像中的噪声和细节,对每个像素周围的邻域进行加权平均;对于每个像素(x,y),使用大小为k的高斯核进行滤波,滤波后的像素值为:Step S31: Apply Gaussian filtering to reduce noise and details in the unlabeled image, and perform weighted averaging on the neighborhood around each pixel; for each pixel (x, y), use a Gaussian kernel of size k to filter, and the pixel value after filtering is: I(x,y)=∑(G(x′,y′)*I(x′,y′)) (1)I(x, y) = ∑(G(x′, y′) * I(x′, y′)) (1) 其中,I(x′,y′)是邻域内的像素值,G(x′,y′)是高斯核的权重值;Among them, I(x′, y′) is the pixel value in the neighborhood, and G(x′, y′) is the weight value of the Gaussian kernel; 最后调整图像亮度;Finally, adjust the image brightness; 步骤S32:对于给定的弱增强后的无标签图像,随机选择一个裁剪窗口的大小和位置;Step S32: for a given weakly enhanced unlabeled image, randomly select a size and position of a cropping window; 将该裁剪窗口分别向上和向左移动一定的距离,得到两个具有重叠区域的新图像xu1、xu2来训练模型;Move the cropping window upward and leftward by a certain distance to obtain two new images x u1 and x u2 with overlapping areas to train the model; 步骤S33:使用双三次插值算法将所有图像χ都缩放成尺寸为513px*513px的图像,并使用双线插值算法将对应标签y缩放成相同尺寸,使得输入图像符合DeepLab v3+网络的输入规格;Step S33: Use the bicubic interpolation algorithm to scale all images x to images of size 513px*513px, and use the bilinear interpolation algorithm to scale the corresponding labels y to the same size, so that the input image meets the input specifications of the DeepLab v3+ network; 所述有标签图像数据的训练过程包括:The training process of the labeled image data includes: 使用有标签图像编码矩阵xL={χ1,χ2,...,χM}及标签yL={y1,y2,...,yM}分别训练CNN和Transformer两大骨干网络模型,并计算与真实标签的损失函数 Use the labeled image encoding matrix x L = {χ 1 , χ 2 , 
..., χ M } and the label y L = {y 1 , y 2 , ..., y M } to train the CNN and Transformer backbone network models respectively, and calculate the loss function with the real label 计算与真实标签的损失函数的步骤包括:Calculate the loss function with the true label The steps include: 将有标签的数据χL输入CNN得到每个像素点对应的预测概率输入Transformer得到预测概率/>计算与对应真实值之间的损失函数/> Input the labeled data χ L into CNN to obtain the predicted probability corresponding to each pixel Input Transformer to get predicted probability/> Calculate the loss function between the corresponding true value/> 所述计算与对应真实值之间的损失函数的过程如下:The loss function between the calculation and the corresponding true value The process is as follows: 与真实标签yl做对比,CNN部分的损失函数如公式(2)所示:Compared with the true label y l , the loss function of the CNN part As shown in formula (2): Transformer部分的损失函数如公式(3)所示:Loss function of Transformer part As shown in formula (3): 其中σ代表ReLU激活函数,lFL代表Focal Loss,表达式如公式(4)所示:Where σ represents the ReLU activation function, l FL represents the Focal Loss, and the expression is shown in formula (4): 其中αl和γ为超参数,这里设置为αl=0.25,γ=2;Where α l and γ are hyper parameters, which are set as α l = 0.25 and γ = 2 here; 监督学习模型总的损失计算如公式(5)所示:The total loss calculation of the supervised learning model is shown in formula (5): 其中当l为真实标签时,yl=1,反之yl=0是范围在0到1之间实数,表示图像属于标签中所标注类别的概率;When l is the true label, y l = 1 , otherwise y l = 0 ; It is a real number between 0 and 1, indicating the probability that the image belongs to the category marked in the label; 所述无标签图像数据的训练过程包括:The training process of the unlabeled image data includes: 通过输入随机裁剪的两组弱增强无标签图像,经过CNN网络框架得到预测结果并作为Transformer模型预测的伪标签,计算一致性正则化损失 By inputting two sets of randomly cropped weakly enhanced unlabeled images, the prediction results are obtained through the CNN network framework and used as pseudo labels predicted by the Transformer model to calculate the consistency regularization loss. 经过Transformer网络框架得到预测结果并作为中间投影的伪标签,来计算上下文感知一致性损失 The prediction results are obtained through the Transformer network framework and used as pseudo labels for intermediate projections to calculate the context-aware consistency loss. 
在无标签图像训练过程中,对于输入的两组经过弱增强后的无标签图像xu1,xu2,经过CNN模型框架生成两组预测值同理,经过Transformer模型框架生成两组预测值 In the unlabeled image training process, for the two sets of weakly enhanced unlabeled images x u1 and x u2 , the CNN model framework generates two sets of prediction values Similarly, two sets of prediction values are generated through the Transformer model framework 其中,表示CNN网络模型,/>表示Transformer网络模型;in, Represents the CNN network model, /> Represents the Transformer network model; 伪标签的计算方法如公式(7)所示:Pseudo Labels The calculation method of is shown in formula (7): 其中,argmax(p)表示使得预测概率值p达到最大时对应的标签;Among them, argmax(p) represents the label corresponding to the maximum predicted probability value p; CNN的预测结果作为Transformer的伪标签,一致性正则化损失的计算方法如公式(8)所示:CNN prediction results as Transformer pseudo labels, consistency regularization loss The calculation method of is shown in formula (8): 其中,σ代表ReLU激活函数,ldice代表Dice损失函数;Among them, σ represents the ReLU activation function, l dice represents the Dice loss function; xu1,xu2通过DeepLab v3+的编码器Encoder得到特征图Mu1和Mu2,然后被非线性投影仪投影为Mo1和Mo2,使用方向性对比损失函数,鼓励重叠区域xo在不同背景下向置信度高的对比特征对齐,最终保持一致;x u1 , x u2 are passed through the DeepLab v3+ encoder to obtain feature maps M u1 and M u2 , and then are projected by the nonlinear projector Projected into Mo1 and Mo2 , using directional contrast loss function, the overlapping region xo is encouraged to align to the contrast features with high confidence in different backgrounds and finally remain consistent; 对于第i个无标签图像,方向性对比损失的计算公式如下:For the i-th unlabeled image, the directional contrast loss The calculation formula is as follows: 其中,N表示重叠的特征区域的空间位置数;r计算特征相似度;h,w表示二维空间位置;Mu表示图像负样本集,m表示Mu中的负样本,C(*)表示分类器;Where N represents the number of spatial positions of overlapping feature regions; r calculates feature similarity; h, w represent two-dimensional spatial positions; Mu represents the image negative sample set, m represents the negative sample in Mu , and C(*) represents the classifier; 用Transformer的预测结果作为伪标签,计算上下文感知约束后的一致性损失函数公式如下:Use the Transformer prediction results as pseudo labels and calculate the consistency loss function after context-aware constraints The formula is as follows: 其中,ldice代表Dice损失函数;C(*)表示分类器;Where, l dice represents the Dice loss function; C(*) represents the classifier; 还包括最小化整体损失,在训练过程中,初始化分类网络的参数,然后使用训练数据进行前向传播和反向传播,计算损失函数的梯度,并利用梯度下降优化算法来更新网络参数,直到总损失函数达到预设的收敛条件;It also includes minimizing the overall loss. During the training process, the parameters of the classification network are initialized, and then the training data is used for forward propagation and back propagation, the gradient of the loss function is calculated, and the gradient descent optimization algorithm is used to update the network parameters until the total loss function Reach the preset convergence condition; 所述总损失为监督学习模型总损失/>一致性正则化损失/>方向性对比损失/>上下文感知一致性损失/>的线性组合,其中,所述总损失/>的计算公式如下:The total loss is the total loss of the supervised learning model/> Consistency Regularization Loss/> Directional contrast loss/> Context-aware consistency loss/> A linear combination of , where the total loss/> The calculation formula is as follows: 其中,λ和λw是权重因子,目的是控制方向性对比损失和一致性正则化损失/>在总损失函数/>中的占比。Among them, λ and λ w are weight factors, the purpose is to control the directional contrast loss and consistency regularization loss/> In the total loss function/> The proportion of . 2.根据权利要求1所述的基于半监督交互学习的农田遥感图像分割方法,其特征在于,所述步骤S40具体包括模型测试,对于给定测试集农田遥感图像,以CNN作为主干网络模型提取特征,分割输出目标图像每个像素点所属类别概率,并设定阈值将该像素点标记为分割目标或背景。2. 
2. The farmland remote sensing image segmentation method based on semi-supervised interactive learning according to claim 1, characterized in that step S40 specifically includes model testing: for a given test set of farmland remote sensing images, the CNN is used as the backbone network model to extract features, the segmentation outputs the probability of the class to which each pixel of the target image belongs, and a threshold is set to mark each pixel as segmentation target or background.

3. The farmland remote sensing image segmentation method based on semi-supervised interactive learning according to claim 2, characterized in that the step of setting a threshold to mark a pixel as segmentation target or background includes: the segmentation model computes the maximum grey value Pmax and the minimum grey value Pmin of the image and sets the initial threshold T0 = (Pmax + Pmin)/2; according to T(k), k = 0, 1, 2, ..., the image is split into foreground and background and their average grey values H1 and H2 are computed, giving the new threshold T(k+1) = (H1 + H2)/2; this is iterated until T(k) = T(k+1), at which point the result is the threshold. Pixels whose predicted probability is greater than the threshold are marked as foreground and those below it as background, yielding the final segmentation mask (a sketch of this iteration is given below).
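Claim 3 describes a classic iterative mean threshold: start from the midpoint of the extreme values, split the pixels into foreground and background, and reset the threshold to the mean of the two class averages until it stops changing. A minimal NumPy sketch under that reading follows; applying the iteration directly to the predicted probability map, rather than to raw grey values, is one interpretation of how the claim combines the grey-value iteration with the probability comparison.

```python
import numpy as np

def iterative_threshold(values, tol=1e-6, max_iter=100):
    """Claim 3 style threshold selection: T0 = (Pmax + Pmin) / 2, then repeatedly
    set the threshold to the mean of the foreground and background averages."""
    t = (values.max() + values.min()) / 2.0
    for _ in range(max_iter):
        fg, bg = values[values > t], values[values <= t]
        if fg.size == 0 or bg.size == 0:   # degenerate split, keep current threshold
            break
        t_new = (fg.mean() + bg.mean()) / 2.0
        if abs(t_new - t) < tol:           # T(k) == T(k+1): converged
            return t_new
        t = t_new
    return t

def segmentation_mask(prob_map):
    """Mark pixels whose predicted probability exceeds the converged threshold
    as foreground (farmland) and the rest as background."""
    t = iterative_threshold(prob_map)
    return (prob_map > t).astype(np.uint8)
```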
CN202311334268.4A 2023-10-16 2023-10-16 A method for farmland remote sensing image segmentation based on semi-supervised interactive learning Active CN117253044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311334268.4A CN117253044B (en) 2023-10-16 2023-10-16 A method for farmland remote sensing image segmentation based on semi-supervised interactive learning

Publications (2)

Publication Number Publication Date
CN117253044A (en) 2023-12-19
CN117253044B (en) 2024-05-24

Family

ID=89134963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311334268.4A Active CN117253044B (en) 2023-10-16 2023-10-16 A method for farmland remote sensing image segmentation based on semi-supervised interactive learning

Country Status (1)

Country Link
CN (1) CN117253044B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117437426B (en) * 2023-12-21 2024-09-10 苏州元瞰科技有限公司 Semi-supervised semantic segmentation method for high-density representative prototype guidance
CN118155284A (en) * 2024-03-20 2024-06-07 飞虎互动科技(北京)有限公司 Signature action detection method, signature action detection device, electronic equipment and readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507343A (en) * 2019-01-30 2020-08-07 广州市百果园信息技术有限公司 Training of semantic segmentation network and image processing method and device thereof
CN113469283A (en) * 2021-07-23 2021-10-01 山东力聚机器人科技股份有限公司 Image classification method, and training method and device of image classification model
CN114943831A (en) * 2022-07-25 2022-08-26 安徽农业大学 Knowledge distillation-based mobile terminal pest target detection method and mobile terminal equipment
WO2023024920A1 (en) * 2021-08-24 2023-03-02 华为云计算技术有限公司 Model training method and system, cluster, and medium
CN116051574A (en) * 2022-12-28 2023-05-02 河南大学 A semi-supervised segmentation model construction and image analysis method, device and system
CN116258730A (en) * 2023-05-16 2023-06-13 先进计算与关键软件(信创)海河实验室 A Semi-supervised Medical Image Segmentation Method Based on Consistency Loss Function
CN116258695A (en) * 2023-02-03 2023-06-13 浙江大学 Semi-supervised medical image segmentation method based on interaction of Transformer and CNN
CN116402838A (en) * 2023-06-08 2023-07-07 吉林大学 Semi-supervised image segmentation method and system for intracranial hemorrhage

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
S-RPN: Sampling-balanced region proposal network for small crop pest detection; Rujing Wang et al.; Computers and Electronics in Agriculture; 2021-08-31; Vol. 187; pp. 1-11 *
Semantic segmentation of complex scenes fusing ASPP-Attention and context; Yang Xin; Yu Chongchong; Wang Xin; Chen Xiuxin; Computer Simulation; 2020-09-15 (No. 09); pp. 209-213 *

Also Published As

Publication number Publication date
CN117253044A (en) 2023-12-19

Similar Documents

Publication Publication Date Title
US11176381B2 (en) Video object segmentation by reference-guided mask propagation
CN117253044B (en) A method for farmland remote sensing image segmentation based on semi-supervised interactive learning
WO2020238560A1 (en) Video target tracking method and apparatus, computer device and storage medium
US10019652B2 (en) Generating a virtual world to assess real-world video analysis performance
CN112634296B (en) RGB-D image semantic segmentation method and terminal for gate mechanism guided edge information distillation
CN113344932B (en) A Semi-Supervised Single-Object Video Segmentation Method
CN112395951B (en) Complex scene-oriented domain-adaptive traffic target detection and identification method
CN110163188B (en) Video processing and method, device and equipment for embedding target object in video
KR102321998B1 (en) Method and system for estimating position and direction of image
CN116453121B (en) Training method and device for lane line recognition model
CN107918776A (en) A kind of plan for land method, system and electronic equipment based on machine vision
CN114787828B (en) Inference or training of artificial intelligence neural networks using imagers with intentionally controlled distortion
CN113744280B (en) Image processing method, device, equipment and medium
CN115862119A (en) Human face age estimation method and device based on attention mechanism
KR20240159462A (en) Method for determining pose of target object in query image and electronic device performing same method
CN118781502A (en) A vehicle detection data enhancement method for UAV remote sensing images
CN117693768A (en) Optimization methods and devices for semantic segmentation models
CN116935332A (en) Fishing boat target detection and tracking method based on dynamic video
CN115577768A (en) Semi-supervised model training method and device
CN118096800B (en) Training method, device, equipment and medium for small sample semantic segmentation model
CN117911900A (en) A method and system for detecting obstacles and targets of substation inspection drones
CN111144422A (en) Positioning identification method and system for aircraft component
CN119068080A (en) Method, electronic device and computer program product for generating an image
CN116309618A (en) Artificial intelligence-based hip joint image segmentation method and system
CN116310304A (en) A water area image segmentation method and its segmentation model training method and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant