CN116778335B - Method and system for detecting collapsed building based on cross-domain teacher-student training - Google Patents

Info

Publication number
CN116778335B
Authority
CN
China
Prior art keywords
network
training
remote sensing
sensing data
aviation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310812000.0A
Other languages
Chinese (zh)
Other versions
CN116778335A (en
Inventor
尹鹏宇
潘洁
谭骏翔
王旻罡
杨宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a collapsed building detection method and system based on cross-domain teacher-student training. The method comprises: training a teacher network with aerial optical remote sensing data and their manual annotations as inputs to obtain pseudo labels; training an aerial-to-satellite style transfer network with aerial optical remote sensing data and satellite optical remote sensing data as inputs to generate pseudo-satellite optical remote sensing data; training a student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs; applying an EMA algorithm to transfer the parameters of the student network after each training round into the teacher network, thereby updating the teacher network's parameters, and iterating the training, the trained teacher network being the final target detection model; and inputting aerial optical remote sensing data into the target detection model to detect collapsed buildings. The scheme can effectively detect damaged buildings from aerial and satellite remote sensing image data.

Description

Method and system for detecting collapsed building based on cross-domain teacher-student training
Technical Field
The invention belongs to the field of disaster response and collapse building detection, and particularly relates to a collapse building detection method and system based on cross-domain teacher-student training.
Background
Detection of damaged or collapsed buildings is critical to earthquake emergency response. Algorithms for collapsed building detection fall into two general categories: those based on change detection between pre-disaster and post-disaster images, and those based on post-disaster images only. Change detection requires extensive preprocessing, and because earthquakes are highly unpredictable, aerial data from immediately before a disaster are difficult to acquire. Most algorithms and techniques therefore build the detection model on post-disaster images.
In recent years, deep learning has been widely used in object detection, and many mature deep-learning-based detection models, such as Faster R-CNN and YOLO, have been proposed and applied to detecting collapsed buildings. However, training deep learning models typically requires a large amount of labeled data. Because aerial remote sensing data are costly to collect, and data on buildings collapsed or damaged by earthquakes are scarce, datasets suitable for training deep learning models are rare. Furthermore, the damaged area within a single scene is limited, so feature diversity is insufficient. Under these limitations, a collapsed building detection model obtained with the traditional training paradigm, in which data are first labeled and the model is then trained with supervision, has the following two problems:
1. The model recognition accuracy is not high due to insufficient training samples;
2. Because aerial data and satellite data belong to different image domains, the performance of a model trained on aerial data degrades markedly on satellite data.
These problems limit the development of deep learning in earthquake disaster response applications.
In the prior art of deep learning, semi-supervised learning is often used to cope with the shortage of labeled samples. Semi-supervised object detection techniques train the object detector with labeled, weakly labeled, or unlabeled data. In addition, aerial data and satellite data can be regarded as data from different image domains, and domain adaptation algorithms can help a model improve its average performance across image domains. Currently, however, no technique combines semi-supervised learning and domain adaptation for detecting buildings collapsed by earthquakes.
Technical solution of the first prior art
The paper "Earthquake-Induced Building Damage Mapping Based on Multi-Task Deep Learning Framework" proposes feeding labeled post-earthquake image data into a deep learning semantic segmentation model for training, and requiring the model to detect normal buildings in addition to collapsed buildings, which strengthens the network's feature learning.
Disadvantages of the first prior art
1. The technique still follows the traditional supervised learning paradigm and has no mechanism for using unlabeled data.
2. It cannot solve the overfitting problem that arises when labeled samples are few, nor the domain shift problem of the model between aerial data and satellite data.
Technical solution of the second prior art
A similar teacher-student framework is proposed in the study "Cross-Domain Adaptive Teacher for Object Detection" to solve the domain adaptation problem in object detection. In this study, the source domain is labeled and the target domain is unlabeled, and there is a gap between the two domains. The study proposes a self-training framework named "Adaptive Teacher", which attempts to solve the domain shift problem of the model and improves the quality of pseudo labels in the target domain through adversarial learning and mutual learning. The model includes two separate modules: a target-specific teacher model and a cross-domain student model. The study also applies weak-strong augmentation and uses Faster R-CNN as the base detector.
Disadvantages of the second prior art
1. In solving the domain shift problem between source domain data and target domain data, the feature extraction module (Feature Encoder) is constrained by the loss function and the Discriminator only at the feature map level; differences between image domains are not aligned or bridged at the level of the original input image.
2. It has not been verified or applied on aerial and satellite remote sensing images, in particular for collapsed building detection.
3. It does not specifically address cross-domain object detection between aerial remote sensing images and satellite remote sensing images.
4. The data augmentation is not designed specifically for the image characteristics of the two domains; conventional strong random data augmentation tends to make training unstable and increase bias.
Disclosure of Invention
To solve the above technical problems, the invention provides a collapsed building detection method based on cross-domain teacher-student training.
A first aspect of the invention discloses a collapsed building detection method based on cross-domain teacher-student training, which comprises the following steps:
S1, constructing a dataset comprising aerial optical remote sensing data and satellite optical remote sensing data of collapsed buildings;
S2, training a teacher network with the aerial optical remote sensing data and their manual annotations as inputs, and obtaining pseudo labels;
S3, training an aerial-to-satellite style transfer network with the aerial optical remote sensing data and the satellite optical remote sensing data as inputs to generate pseudo-satellite optical remote sensing data; the style transfer network is a generative adversarial network for unpaired image-to-image translation;
S4, training a student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs;
S5, applying an exponential moving average (EMA) algorithm to transfer the parameters of the student network after this training round into the teacher network, thereby updating the parameters of the teacher network;
S6, repeating steps S2 to S5 to iteratively train the teacher network, the trained teacher network being the final target detection model;
S7, inputting aerial optical remote sensing data into the target detection model to detect collapsed buildings.
According to the method of the first aspect of the invention, in step S2, the two-stage model Faster R-CNN is selected as the teacher network.
According to the method of the first aspect of the invention, in step S2, the teacher network is trained with the loss function $L = L_{cls}^{rpn} + L_{reg}^{rpn} + L_{cls}^{roi} + L_{reg}^{roi}$;
where $L_{cls}^{rpn}$ denotes the RPN classification loss, $L_{reg}^{rpn}$ denotes the RPN regression loss, $L_{cls}^{roi}$ denotes the ROI classification loss, and $L_{reg}^{roi}$ denotes the ROI regression loss.
According to the method of the first aspect of the invention, in step S3, the aerial-to-satellite style transfer network learns the mappings $G: X_1 \to X_2$ and $F: X_2 \to X_1$;
where $X_1$ denotes the aerial optical remote sensing data, $X_2$ denotes the satellite optical remote sensing data, and $G$ and $F$ are mapping functions;
The cycle-consistency loss is combined with the adversarial losses on $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation, which is used to train the aerial-to-satellite style transfer network.
According to the method of the first aspect of the invention, in step S3, a Cycle-GAN network is selected as the aerial-to-satellite style transfer network.
According to the method of the first aspect of the invention, in step S4, training the student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs includes:
combining the pseudo-satellite optical remote sensing data with the manual annotations and with the pseudo labels, respectively, to form two sets of training data pairs, and training the student network with the two sets of training data pairs as input.
According to the method of the first aspect of the invention, in step S4, the student network model consists of an R-CNN network and a Cycle-GAN network; in the Cycle-GAN network, ResNet is used as the base network of the generators and discriminators.
A second aspect of the invention discloses a collapsed building detection system based on cross-domain teacher-student training, which comprises:
a first processing module configured to construct a dataset comprising aerial optical remote sensing data and satellite optical remote sensing data of collapsed buildings;
a second processing module configured to train a teacher network with the aerial optical remote sensing data and their manual annotations as inputs, and to obtain pseudo labels;
a third processing module configured to train an aerial-to-satellite style transfer network with the aerial optical remote sensing data and the satellite optical remote sensing data as inputs to generate pseudo-satellite optical remote sensing data; the style transfer network is a generative adversarial network for unpaired image-to-image translation;
a fourth processing module configured to train a student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs;
a fifth processing module configured to apply an exponential moving average (EMA) algorithm to transfer the parameters of the student network after this training round into the teacher network, thereby updating the parameters of the teacher network;
a sixth processing module configured to repeat the operations of the second to fifth processing modules to iteratively train the teacher network, the trained teacher network being the final target detection model;
and a seventh processing module configured to input aerial optical remote sensing data into the target detection model to detect collapsed buildings.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor; the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the collapsed building detection method based on cross-domain teacher-student training according to any implementation of the first aspect of the disclosure.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the collapsed building detection method based on cross-domain teacher-student training according to any implementation of the first aspect of the disclosure.
In summary, the proposed scheme can make effective use of large amounts of unlabeled remote sensing image data, reduce the model's dependence on manual annotation, achieve good accuracy when detecting collapsed buildings in both aerial and satellite data, and improve the model's generalization and cross-domain transfer capability. By effectively detecting damaged buildings from aerial and satellite remote sensing images, it can improve earthquake emergency response and help rescue workers quickly locate damaged buildings.
Drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method for detecting a collapsed building based on cross-domain teacher-student training according to an embodiment of the invention;
FIG. 2 is a diagram of teacher network training according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an aviation satellite style migration network training in accordance with an embodiment of the present invention;
FIG. 4 is a diagram of a student network training according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a teacher network parameter update according to an embodiment of the present invention;
FIG. 6 is a graph of output results during training of an aviation satellite style migration network according to an embodiment of the present invention;
FIG. 7 is a block diagram of a collapsed building detection system based on cross-domain teacher-student training in accordance with an embodiment of the present invention;
Fig. 8 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A first aspect of the invention discloses a collapsed building detection method based on cross-domain teacher-student training. FIG. 1 is a flowchart of the method according to an embodiment of the invention; as shown in FIG. 1, the method includes:
S1, constructing a dataset comprising aerial optical remote sensing data and satellite optical remote sensing data of collapsed buildings;
S2, training a teacher network with the aerial optical remote sensing data and their manual annotations as inputs, and obtaining pseudo labels;
S3, training an aerial-to-satellite style transfer network with the aerial optical remote sensing data and the satellite optical remote sensing data as inputs to generate pseudo-satellite optical remote sensing data; the style transfer network is a generative adversarial network for unpaired image-to-image translation;
S4, training a student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs;
S5, applying an exponential moving average (EMA) algorithm to transfer the parameters of the student network after this training round into the teacher network, thereby updating the parameters of the teacher network;
S6, repeating steps S2 to S5 to iteratively train the teacher network, the trained teacher network being the final target detection model;
S7, inputting aerial optical remote sensing data into the target detection model to detect collapsed buildings.
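The control flow of steps S2 to S6 can be sketched as a plain loop. This is only an illustrative skeleton, not the patented implementation: the function name `run_training_rounds`, the hook names, and the toy `Params` dictionary standing in for network weights are all hypothetical, and re-initializing the student from the teacher in every round is an assumption. The manual annotations are assumed to be captured inside the `train_teacher` and `train_student` hooks.

```python
from typing import Callable, Dict, List

Params = Dict[str, float]  # toy stand-in for network weights (assumption)


def run_training_rounds(
    teacher: Params,
    rounds: int,
    train_teacher: Callable[[Params], Params],
    predict_pseudo_labels: Callable[[Params], List[str]],
    generate_pseudo_satellite: Callable[[], List[str]],
    train_student: Callable[[Params, List[str], List[str]], Params],
    update_teacher: Callable[[Params, Params], Params],
) -> Params:
    """Repeat steps S2-S5 for a fixed number of rounds (S6); return the teacher."""
    for _ in range(rounds):
        teacher = train_teacher(teacher)                # S2: supervised training
        pseudo_labels = predict_pseudo_labels(teacher)  # S2: pseudo labels
        pseudo_satellite = generate_pseudo_satellite()  # S3: style transfer output
        # S4: the student starts from a copy of the teacher (assumption)
        student = train_student(dict(teacher), pseudo_satellite, pseudo_labels)
        teacher = update_teacher(teacher, student)      # S5: e.g. an EMA update
    return teacher
```

The returned teacher parameters correspond to the target detection model used in step S7.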
In step S1, a dataset comprising aerial optical remote sensing data and satellite optical remote sensing data of collapsed buildings is constructed.
Specifically, a dataset DB-ARSD was constructed, comprising 3000 satellite images and 1000 aerial images. These images were collected after natural disasters such as earthquakes and hurricanes, and the bounding boxes of damaged buildings are annotated in them.
In step S2, as shown in FIG. 2, the teacher network $f(\theta)$ is trained with the aerial optical remote sensing data and their manual annotations as inputs, and pseudo labels $\hat{y}_1$ are obtained.
In some embodiments, in step S2, the two-stage model Faster R-CNN is selected as the teacher network.
The teacher network is trained with the loss function $L = L_{cls}^{rpn} + L_{reg}^{rpn} + L_{cls}^{roi} + L_{reg}^{roi}$;
where $L_{cls}^{rpn}$ denotes the RPN classification loss, $L_{reg}^{rpn}$ denotes the RPN regression loss, $L_{cls}^{roi}$ denotes the ROI classification loss, and $L_{reg}^{roi}$ denotes the ROI regression loss.
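As a minimal sketch of how the four loss terms named above might be combined, the snippet below sums them with equal weights. The equal weighting is an assumption (the formula itself did not survive in this text), and the dictionary keys and the function name `total_detection_loss` are hypothetical, not the API of any particular detection library.

```python
from typing import Mapping

# Hypothetical key names for the four terms described in the text.
LOSS_KEYS = ("loss_rpn_cls", "loss_rpn_reg", "loss_roi_cls", "loss_roi_reg")


def total_detection_loss(losses: Mapping[str, float]) -> float:
    """Sum RPN and ROI classification/regression losses with equal weights."""
    missing = [key for key in LOSS_KEYS if key not in losses]
    if missing:
        raise KeyError(f"missing loss terms: {missing}")
    return sum(losses[key] for key in LOSS_KEYS)
```

Checking for missing terms up front avoids silently training on a partial loss if one branch of the detector fails to report its value.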
In step S3, as shown in FIG. 3, the aerial-to-satellite style transfer network $g(\theta)$ is trained with the aerial optical remote sensing data and the satellite optical remote sensing data as inputs to generate pseudo-satellite optical remote sensing data $\hat{x}_2$. The style transfer network is a generative adversarial network for unpaired image-to-image translation.
In some embodiments, in step S3, the aerial-to-satellite style transfer network learns the mappings $G: X_1 \to X_2$ and $F: X_2 \to X_1$;
where $X_1$ denotes the aerial optical remote sensing data, $X_2$ denotes the satellite optical remote sensing data, and $G$ and $F$ are mapping functions;
The cycle-consistency loss is combined with the adversarial losses on $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation, which is used to train the aerial-to-satellite style transfer network.
A Cycle-GAN network is selected as the aerial-to-satellite style transfer network.
Specifically, the cycle-consistency loss encourages $F(G(x_1)) \approx x_1$ and $G(F(x_2)) \approx x_2$. Combining this loss with the adversarial losses on $X_1$ and $X_2$ gives the complete objective function for unpaired image-to-image translation used to train the aerial-to-satellite style transfer network. The adversarial loss for the mapping $G$ and its discriminator $D_Y$ is:
$L_{GAN}(G, D_Y, X_1, X_2) = \mathbb{E}_{x_2 \sim p(x_2)}[\log D_Y(x_2)] + \mathbb{E}_{x_1 \sim p(x_1)}[\log(1 - D_Y(G(x_1)))]$
where $G$ attempts to generate images $G(x_1)$ that look like images from domain $X_2$, and $D_Y$ attempts to distinguish translated samples $G(x_1)$ from real samples $x_2$.
For the mapping $F: X_2 \to X_1$ and its discriminator $D_X$, a similar loss function is used. The cycle-consistency loss narrows the space of possible mapping functions by enforcing forward and backward consistency:
$L_{cyc}(G, F) = \mathbb{E}_{x_1 \sim p(x_1)}[\lVert F(G(x_1)) - x_1 \rVert_1] + \mathbb{E}_{x_2 \sim p(x_2)}[\lVert G(F(x_2)) - x_2 \rVert_1]$
The complete objective function is:
$L(G, F, D_X, D_Y) = L_{GAN}(G, D_Y, X_1, X_2) + L_{GAN}(F, D_X, X_2, X_1) + \lambda L_{cyc}(G, F).$
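To make the structure of this objective concrete, the following toy sketch evaluates an adversarial term for each mapping plus a weighted cycle-consistency term on tiny flattened "images". It assumes the standard Cycle-GAN form of the losses; the function names, the use of a per-pixel mean in the L1 term, the small `eps` guard inside the logarithms, and the caller-supplied weight `lam` are illustrative assumptions, since the text gives no value for λ.

```python
import math
from typing import Callable, List, Sequence

Image = List[float]  # flattened toy image (assumption)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_loss(G: Callable, F: Callable, x1_batch: List[Image], x2_batch: List[Image]) -> float:
    """Forward (x1 -> G -> F) plus backward (x2 -> F -> G) reconstruction error."""
    forward = sum(l1_distance(F(G(x)), x) for x in x1_batch) / len(x1_batch)
    backward = sum(l1_distance(G(F(x)), x) for x in x2_batch) / len(x2_batch)
    return forward + backward


def gan_loss(gen: Callable, disc: Callable, src_batch: List[Image],
             tgt_batch: List[Image], eps: float = 1e-12) -> float:
    """E[log D(real)] + E[log(1 - D(gen(source)))] for one translation direction."""
    real = sum(math.log(disc(y) + eps) for y in tgt_batch) / len(tgt_batch)
    fake = sum(math.log(1.0 - disc(gen(x)) + eps) for x in src_batch) / len(src_batch)
    return real + fake


def full_objective(G: Callable, F: Callable, D_X: Callable, D_Y: Callable,
                   x1_batch: List[Image], x2_batch: List[Image], lam: float) -> float:
    """Adversarial terms for both directions plus lam times the cycle term."""
    return (gan_loss(G, D_Y, x1_batch, x2_batch)
            + gan_loss(F, D_X, x2_batch, x1_batch)
            + lam * cycle_loss(G, F, x1_batch, x2_batch))
```

In actual training the generators try to minimize this value while the discriminators try to maximize it; the sketch only evaluates the objective and does not perform that optimization.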
During training of the aerial-to-satellite style transfer network, the generated style transfer images gradually change from the appearance of aerial optical remote sensing data to that of satellite optical remote sensing data as the number of training rounds increases, as shown in FIG. 6.
In step S4, as shown in FIG. 4, the student network $f(\varepsilon)$ is trained with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs.
In some embodiments, in step S4, training the student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs includes:
combining the pseudo-satellite optical remote sensing data with the manual annotations and with the pseudo labels, respectively, to form two sets of training data pairs, and training the student network with the two sets of training data pairs as input.
The student network model consists of an R-CNN network and a Cycle-GAN network; in the Cycle-GAN network, ResNet is used as the base network of the generators and discriminators.
Specifically, in student network training, the initial structure and parameters of the student network are inherited from the teacher network trained in step S2. With the parameters of the $f(\theta)$ and $g(\theta)$ networks fixed, $x_1$ is input to $f(\theta)$ to obtain $\hat{y}_1$ and to $g(\theta)$ to obtain $\hat{x}_2$. The manual annotation $y_1$ is then taken from the original training data pair to form two sets of training data pairs, $(\hat{x}_2, y_1)$ and $(\hat{x}_2, \hat{y}_1)$. Finally, the combined new training batch is input into $f(\varepsilon)$ for training.
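The pairing described in this step can be sketched as follows. This is an illustrative sketch only: `build_student_batch`, `teacher_predict` and `style_transfer` are hypothetical names standing in for the frozen teacher network and the frozen style transfer network, and images and labels are represented by arbitrary Python objects.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Img = TypeVar("Img")
Lab = TypeVar("Lab")


def build_student_batch(
    x1_batch: Sequence[Img],
    y1_batch: Sequence[Lab],
    teacher_predict: Callable[[Img], Lab],
    style_transfer: Callable[[Img], Img],
) -> List[Tuple[Img, Lab]]:
    """Pair each pseudo-satellite image with its manual label and its pseudo label."""
    if len(x1_batch) != len(y1_batch):
        raise ValueError("each aerial image needs exactly one manual annotation")
    manual_pairs: List[Tuple[Img, Lab]] = []
    pseudo_pairs: List[Tuple[Img, Lab]] = []
    for x1, y1 in zip(x1_batch, y1_batch):
        x2_hat = style_transfer(x1)   # pseudo-satellite image from the frozen generator
        y1_hat = teacher_predict(x1)  # pseudo label from the frozen teacher
        manual_pairs.append((x2_hat, y1))
        pseudo_pairs.append((x2_hat, y1_hat))
    return manual_pairs + pseudo_pairs
```

Reusing the aerial labels on the translated image presumes that style transfer changes appearance but leaves building positions unchanged, so the bounding boxes remain valid.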
The aerial-to-satellite style transfer network and the student network are trained synchronously, so the data generated by the style transfer network in each training round can serve as training data for the student network, allowing the student network to learn more features and information.
In step S5, as shown in FIG. 5, an exponential moving average (EMA) algorithm is applied to transfer the parameters of the student network after this training round into the teacher network, thereby updating the parameters of the teacher network.
Specifically, $\theta \leftarrow m\theta + (1-m)\varepsilon$, $m \in [0, 1]$, where $m$ is typically 0.999 and controls how quickly the student network passes its parameters to the teacher network through the EMA; it is adjusted according to the actual training situation.
The EMA algorithm is introduced to enable the network to keep useful information in the training data when the parameters are updated, so that the training effect of the model is finally improved.
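A minimal sketch of this EMA update, applied parameter by parameter, is given below. Representing the network weights as a dictionary of floats and the function name `ema_update` are simplifying assumptions for illustration; a real detector would apply the same rule to each weight tensor.

```python
from typing import Dict


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               m: float = 0.999) -> Dict[str, float]:
    """Return updated teacher weights: theta <- m * theta + (1 - m) * epsilon."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie in [0, 1]")
    if teacher.keys() != student.keys():
        raise KeyError("teacher and student must share the same parameter names")
    return {name: m * teacher[name] + (1.0 - m) * student[name] for name in teacher}
```

With m = 0.999, each update moves the teacher only 0.1% of the way toward the student, so the teacher changes slowly and smooths out round-to-round fluctuations in the student.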
During model training, the Adam optimizer is used for parameter optimization, with an initial learning rate of 0.001 and a weight decay coefficient of 0.0005. The batch size is set to 4 and the number of epochs to 50. To avoid overfitting, dropout and data augmentation methods such as rotation, flipping and scaling are used.
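The hyperparameters listed above can be collected in a small configuration object, sketched below. The class name `TrainConfig`, the field names and the helper `steps_per_epoch` are hypothetical; the dropout rate is left out because the text does not give one.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Training settings stated in the description (field names are illustrative)."""
    optimizer: str = "adam"
    learning_rate: float = 1e-3   # initial learning rate
    weight_decay: float = 5e-4    # weight decay coefficient
    batch_size: int = 4
    epochs: int = 50
    augmentations: Tuple[str, ...] = ("rotation", "flip", "scale")


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of optimizer steps per epoch, counting a final partial batch."""
    return -(-num_images // batch_size)  # ceiling division
```

For example, if all 1000 aerial images of the dataset were used for training (an assumption, since no train/test split is given), a batch size of 4 would give 250 steps per epoch.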
In summary, the proposed scheme can make effective use of large amounts of unlabeled remote sensing image data, reduce the model's dependence on manual annotation, achieve good accuracy when detecting collapsed buildings in both aerial and satellite data, and improve the model's generalization and cross-domain transfer capability. By effectively detecting damaged buildings from aerial and satellite remote sensing images, it can improve earthquake emergency response and help rescue workers quickly locate damaged buildings.
A second aspect of the invention discloses a collapsed building detection system based on cross-domain teacher-student training. FIG. 7 is a block diagram of the system according to an embodiment of the invention; as shown in FIG. 7, the system 100 includes:
a first processing module 101 configured to construct a dataset comprising aerial optical remote sensing data and satellite optical remote sensing data of collapsed buildings;
a second processing module 102 configured to train a teacher network with the aerial optical remote sensing data and their manual annotations as inputs, and to obtain pseudo labels;
a third processing module 103 configured to train an aerial-to-satellite style transfer network with the aerial optical remote sensing data and the satellite optical remote sensing data as inputs to generate pseudo-satellite optical remote sensing data; the style transfer network is a generative adversarial network for unpaired image-to-image translation;
a fourth processing module 104 configured to train a student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs;
a fifth processing module 105 configured to apply an exponential moving average (EMA) algorithm to transfer the parameters of the student network after this training round into the teacher network, thereby updating the parameters of the teacher network;
a sixth processing module 106 configured to repeat the operations of the second to fifth processing modules to iteratively train the teacher network, the trained teacher network being the final target detection model;
a seventh processing module 107 configured to input aerial optical remote sensing data into the target detection model to detect collapsed buildings.
In the system of the second aspect of the invention, the first processing module 101 is specifically configured to construct a dataset DB-ARSD comprising 3000 satellite images and 1000 aerial images. These images were collected after natural disasters such as earthquakes and hurricanes, and the bounding boxes of damaged buildings are annotated in them.
In the system of the second aspect of the invention, the second processing module 102 is specifically configured to select the two-stage model Faster R-CNN as the teacher network.
The teacher network is trained with the loss function $L = L_{cls}^{rpn} + L_{reg}^{rpn} + L_{cls}^{roi} + L_{reg}^{roi}$;
where $L_{cls}^{rpn}$ denotes the RPN classification loss, $L_{reg}^{rpn}$ denotes the RPN regression loss, $L_{cls}^{roi}$ denotes the ROI classification loss, and $L_{reg}^{roi}$ denotes the ROI regression loss.
In the system of the second aspect of the invention, the third processing module 103 is specifically configured so that the aerial-to-satellite style transfer network learns the mappings $G: X_1 \to X_2$ and $F: X_2 \to X_1$;
where $X_1$ denotes the aerial optical remote sensing data, $X_2$ denotes the satellite optical remote sensing data, and $G$ and $F$ are mapping functions;
The cycle-consistency loss is combined with the adversarial losses on $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation, which is used to train the aerial-to-satellite style transfer network.
A Cycle-GAN network is selected as the aerial-to-satellite style transfer network.
Specifically, the cycle-consistency loss encourages $F(G(x_1)) \approx x_1$ and $G(F(x_2)) \approx x_2$. Combining this loss with the adversarial losses on $X_1$ and $X_2$ gives the complete objective function for unpaired image-to-image translation used to train the aerial-to-satellite style transfer network. The adversarial loss for the mapping $G$ and its discriminator $D_Y$ is:
$L_{GAN}(G, D_Y, X_1, X_2) = \mathbb{E}_{x_2 \sim p(x_2)}[\log D_Y(x_2)] + \mathbb{E}_{x_1 \sim p(x_1)}[\log(1 - D_Y(G(x_1)))]$
where $G$ attempts to generate images $G(x_1)$ that look like images from domain $X_2$, and $D_Y$ attempts to distinguish translated samples $G(x_1)$ from real samples $x_2$.
For the mapping $F: X_2 \to X_1$ and its discriminator $D_X$, a similar loss function is used. The cycle-consistency loss narrows the space of possible mapping functions by enforcing forward and backward consistency:
$L_{cyc}(G, F) = \mathbb{E}_{x_1 \sim p(x_1)}[\lVert F(G(x_1)) - x_1 \rVert_1] + \mathbb{E}_{x_2 \sim p(x_2)}[\lVert G(F(x_2)) - x_2 \rVert_1]$
The complete objective function is:
$L(G, F, D_X, D_Y) = L_{GAN}(G, D_Y, X_1, X_2) + L_{GAN}(F, D_X, X_2, X_1) + \lambda L_{cyc}(G, F).$
During training of the aerial-to-satellite style transfer network, the generated style transfer images gradually change from the appearance of aerial optical remote sensing data to that of satellite optical remote sensing data as the number of training rounds increases.
In the system of the second aspect of the invention, the fourth processing module 104 is specifically configured to train the student network with the pseudo-satellite optical remote sensing data, the manual annotations and the pseudo labels as inputs, which includes:
combining the pseudo-satellite optical remote sensing data with the manual annotations and with the pseudo labels, respectively, to form two sets of training data pairs, and training the student network with the two sets of training data pairs as input.
The student network model consists of an R-CNN network and a Cycle-GAN network; in the Cycle-GAN network, ResNet is used as the base network of the generators and discriminators.
Specifically, in student network training, the initial structure and parameters of the student network are inherited from the teacher network trained in the second processing module 102. With the parameters of the $f(\theta)$ and $g(\theta)$ networks fixed, $x_1$ is input to $f(\theta)$ to obtain $\hat{y}_1$ and to $g(\theta)$ to obtain $\hat{x}_2$. The manual annotation $y_1$ is then taken from the original training data pair to form two sets of training data pairs, $(\hat{x}_2, y_1)$ and $(\hat{x}_2, \hat{y}_1)$. Finally, the combined new training batch is input into $f(\varepsilon)$ for training.
The aerial-to-satellite style transfer network and the student network are trained synchronously, so the data generated by the style transfer network in each training round can serve as training data for the student network, allowing the student network to learn more features and information.
In the system of the second aspect of the invention, the fifth processing module 105 is specifically configured to apply the update $\theta \leftarrow m\theta + (1-m)\varepsilon$, $m \in [0, 1]$, where $m$ is typically 0.999 and controls how quickly the student network passes its parameters to the teacher network through the EMA; it is adjusted according to the actual training situation.
The EMA algorithm is introduced to enable the network to keep useful information in the training data when the parameters are updated, so that the training effect of the model is finally improved.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the collapsed building detection method based on cross-domain teacher-student training according to any implementation of the first aspect of the invention.
Fig. 8 is a block diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 8, the electronic device includes a processor, a memory, a communication interface, a display screen, and an input device connected through a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the electronic device is used for wired or wireless communication with an external terminal; the wireless communication can be achieved through Wi-Fi, an operator network, Near Field Communication (NFC), or other technologies. The display screen of the electronic device can be a liquid crystal display screen or an electronic ink display screen. The input device of the electronic device can be a touch layer covering the display screen; keys, a trackball, or a touchpad arranged on the housing of the electronic device; or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in Fig. 8 is merely a block diagram of the portion related to the technical solution of the present disclosure and does not limit the electronic device to which the technical solution is applied; a specific electronic device may include more or fewer components than those shown in the drawing, may combine some components, or may have a different arrangement of components.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the collapsed building detection method based on cross-domain teacher-student training according to any implementation of the first aspect of the present disclosure.
Note that the technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be regarded as falling within the scope of this description. The foregoing examples illustrate only a few embodiments of the application; they are described in detail but are not to be construed as limiting the scope of the application. Those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (5)

1. A method for detecting collapsed buildings based on cross-domain teacher-student training, characterized by comprising the following steps:
S1, constructing a data set comprising aviation optical remote sensing data and satellite optical remote sensing data of collapsed buildings;
S2, training a teacher network with the aviation optical remote sensing data and the manual annotations of the aviation optical remote sensing data as inputs, to obtain pseudo-labels;
in step S2, the two-stage model Faster R-CNN is selected as the teacher network;
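in step S2, the teacher network is trained with the loss function
$$L = L_{cls}^{rpn} + L_{reg}^{rpn} + L_{cls}^{roi} + L_{reg}^{roi}$$
wherein $L_{cls}^{rpn}$ represents the RPN classification loss, $L_{reg}^{rpn}$ represents the RPN regression loss, $L_{cls}^{roi}$ represents the ROI classification loss, and $L_{reg}^{roi}$ represents the ROI regression loss;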
S3, training an aviation satellite style migration network with the aviation optical remote sensing data and the satellite optical remote sensing data as inputs, to generate pseudo-satellite optical remote sensing data; the aviation satellite style migration network is a generative adversarial network for unpaired image-to-image translation;
in step S3, the aviation satellite style migration network learns the mappings $G: X_1 \rightarrow X_2$ and $F: X_2 \rightarrow X_1$;
$X_1$ represents the aviation optical remote sensing data, $X_2$ represents the satellite optical remote sensing data, and $G$ and $F$ represent mapping functions;
the cycle consistency loss function is combined with the adversarial loss functions of $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation used to train the aviation satellite style migration network;
a Cycle-GAN network is selected as the aviation satellite style migration network;
specifically, the cycle consistency loss function encourages $F(G(x_1)) \approx x_1$ and $G(F(x_2)) \approx x_2$; this loss function is combined with the adversarial loss functions of $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation used to train the aviation satellite style migration network; for the mapping $G: X_1 \rightarrow X_2$ and its discriminator $D_Y$, writing $X = X_1$ and $Y = X_2$, the objective function is:
$$L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))]$$
where $G$ attempts to generate images $G(x_1)$ that look similar to images from domain $X_2$, and $D_Y$ attempts to distinguish between translated samples $G(x_1)$ and real samples $x_2$;
for the mapping $F: X_2 \rightarrow X_1$ and its discriminator $D_X$, a similar loss function is used; the cycle consistency loss reduces the space of possible mapping functions by enforcing forward and backward consistency:
$$L_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}[\lVert F(G(x)) - x \rVert_1] + \mathbb{E}_{y \sim p_{data}(y)}[\lVert G(F(y)) - y \rVert_1]$$
the complete objective function is:
$$L_{GAN}(G, F, D_X, D_Y) = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda L_{cyc}(G, F);$$
during training of the aviation satellite style migration network, the generated style-migrated images gradually shift from the appearance of aviation optical remote sensing data to that of satellite optical remote sensing data as the number of training rounds increases;
S4, training a student network with the pseudo-satellite optical remote sensing data, the manual annotations, and the pseudo-labels as inputs;
in step S4, training the student network with the pseudo-satellite optical remote sensing data, the manual annotations, and the pseudo-labels as inputs includes:
pairing the pseudo-satellite optical remote sensing data with the manual annotations and with the pseudo-labels, respectively, to form two sets of training data pairs, and training the student network with the two sets of training data pairs as input;
in step S4, the student network model consists of an R-CNN network and a Cycle-GAN network; in the Cycle-GAN network, ResNet is used as the backbone network of the generators and discriminators;
specifically, in the student network training, the initial structure and parameters of the student network are inherited from the teacher network trained in step S2; the parameters of the networks $f(\theta)$ and $g(\theta)$ are fixed; inputting $x_1$ to $f(\theta)$ yields the pseudo-label $\hat{y}_1$, and inputting $x_1$ to $g(\theta)$ yields the pseudo-satellite image $\hat{x}_1$; the manual annotation $y_1$ is then taken from the original training data pair to form the two sets of training data pairs $(\hat{x}_1, y_1)$ and $(\hat{x}_1, \hat{y}_1)$; finally, the combined new training batch is input to $f(\varepsilon)$ for training;
the aviation satellite style migration network and the student network are trained synchronously, so that the data generated by the aviation satellite style migration network in each training round can serve as training data for the student network, allowing the student network to learn more features and information;
S5, applying an EMA algorithm to transfer the parameters of the student network after the current training round to the teacher network, thereby updating the parameters of the teacher network;
specifically, $m\theta + (1-m)\varepsilon \rightarrow \theta$, $m \in [0,1]$, where $\theta$ denotes the teacher network parameters, $\varepsilon$ denotes the student network parameters, and $m$ is usually 0.999; the value of $m$ is adjusted according to the actual training situation to control the speed at which the student network transfers parameters to the teacher network through the EMA;
during model training, an Adam optimizer is used for parameter optimization, with an initial learning rate of 0.001 and a weight decay coefficient of 0.0005; the batch size is set to 4 and the number of epochs to 50; to avoid overfitting, dropout and data augmentation by rotation, flipping, and scaling are used;
S6, repeating steps S2-S5 to iteratively train the teacher network, the trained teacher network being the finally obtained target detection model;
S7, inputting aviation optical remote sensing data into the target detection model to detect collapsed buildings.
2. The method for detecting collapsed buildings based on cross-domain teacher-student training according to claim 1, wherein in step S3, a Cycle-GAN network is selected as the aviation satellite style migration network.
3. A collapsed building detection system based on cross-domain teacher-student training, the system comprising:
a first processing module configured to construct a data set comprising aviation optical remote sensing data and satellite optical remote sensing data of collapsed buildings;
a second processing module configured to train a teacher network with the aviation optical remote sensing data and the manual annotations of the aviation optical remote sensing data as inputs, to obtain pseudo-labels;
the two-stage model Faster R-CNN is selected as the teacher network;
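the teacher network is trained with the loss function
$$L = L_{cls}^{rpn} + L_{reg}^{rpn} + L_{cls}^{roi} + L_{reg}^{roi}$$
wherein $L_{cls}^{rpn}$ represents the RPN classification loss, $L_{reg}^{rpn}$ represents the RPN regression loss, $L_{cls}^{roi}$ represents the ROI classification loss, and $L_{reg}^{roi}$ represents the ROI regression loss;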
a third processing module configured to train an aviation satellite style migration network with the aviation optical remote sensing data and the satellite optical remote sensing data as inputs, to generate pseudo-satellite optical remote sensing data; the aviation satellite style migration network is a generative adversarial network for unpaired image-to-image translation;
a Cycle-GAN network is selected as the aviation satellite style migration network;
specifically, the cycle consistency loss function encourages $F(G(x_1)) \approx x_1$ and $G(F(x_2)) \approx x_2$; this loss function is combined with the adversarial loss functions of $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation used to train the aviation satellite style migration network; for the mapping $G: X_1 \rightarrow X_2$ and its discriminator $D_Y$, writing $X = X_1$ and $Y = X_2$, the objective function is:
$$L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))]$$
where $G$ attempts to generate images $G(x_1)$ that look similar to images from domain $X_2$, and $D_Y$ attempts to distinguish between translated samples $G(x_1)$ and real samples $x_2$;
for the mapping $F: X_2 \rightarrow X_1$ and its discriminator $D_X$, a similar loss function is used; the cycle consistency loss reduces the space of possible mapping functions by enforcing forward and backward consistency:
$$L_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}[\lVert F(G(x)) - x \rVert_1] + \mathbb{E}_{y \sim p_{data}(y)}[\lVert G(F(y)) - y \rVert_1]$$
the complete objective function is:
$$L_{GAN}(G, F, D_X, D_Y) = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda L_{cyc}(G, F);$$
during training of the aviation satellite style migration network, the generated style-migrated images gradually shift from the appearance of aviation optical remote sensing data to that of satellite optical remote sensing data as the number of training rounds increases;
the aviation satellite style migration network learns the mappings $G: X_1 \rightarrow X_2$ and $F: X_2 \rightarrow X_1$;
$X_1$ represents the aviation optical remote sensing data, $X_2$ represents the satellite optical remote sensing data, and $G$ and $F$ represent mapping functions;
the cycle consistency loss function is combined with the adversarial loss functions of $X_1$ and $X_2$ to obtain the complete objective function for unpaired image-to-image translation used to train the aviation satellite style migration network;
a fourth processing module configured to train a student network with the pseudo-satellite optical remote sensing data, the manual annotations, and the pseudo-labels as inputs;
training the student network with the pseudo-satellite optical remote sensing data, the manual annotations, and the pseudo-labels as inputs includes:
pairing the pseudo-satellite optical remote sensing data with the manual annotations and with the pseudo-labels, respectively, to form two sets of training data pairs, and training the student network with the two sets of training data pairs as input;
the student network model consists of an R-CNN network and a Cycle-GAN network; in the Cycle-GAN network, ResNet is used as the backbone network of the generators and discriminators;
specifically, in the student network training, the initial structure and parameters of the student network are inherited from the teacher network trained in step S2; the parameters of the networks $f(\theta)$ and $g(\theta)$ are fixed; inputting $x_1$ to $f(\theta)$ yields the pseudo-label $\hat{y}_1$, and inputting $x_1$ to $g(\theta)$ yields the pseudo-satellite image $\hat{x}_1$; the manual annotation $y_1$ is then taken from the original training data pair to form the two sets of training data pairs $(\hat{x}_1, y_1)$ and $(\hat{x}_1, \hat{y}_1)$; finally, the combined new training batch is input to $f(\varepsilon)$ for training;
the aviation satellite style migration network and the student network are trained synchronously, so that the data generated by the aviation satellite style migration network in each training round can serve as training data for the student network, allowing the student network to learn more features and information;
a fifth processing module configured to apply an EMA algorithm to transfer the parameters of the student network after the current training round to the teacher network, thereby updating the parameters of the teacher network;
specifically, $m\theta + (1-m)\varepsilon \rightarrow \theta$, $m \in [0,1]$, where $\theta$ denotes the teacher network parameters, $\varepsilon$ denotes the student network parameters, and $m$ is usually 0.999; the value of $m$ is adjusted according to the actual training situation to control the speed at which the student network transfers parameters to the teacher network through the EMA;
during model training, an Adam optimizer is used for parameter optimization, with an initial learning rate of 0.001 and a weight decay coefficient of 0.0005; the batch size is set to 4 and the number of epochs to 50; to avoid overfitting, dropout and data augmentation by rotation, flipping, and scaling are used;
a sixth processing module configured to repeat the operations of the second to fifth processing modules to iteratively train the teacher network, the trained teacher network being the finally obtained target detection model;
and a seventh processing module configured to input aviation optical remote sensing data into the target detection model to detect collapsed buildings.
4. An electronic device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method for detecting collapsed buildings based on cross-domain teacher-student training according to any one of claims 1 to 2.
5. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method for detecting collapsed buildings based on cross-domain teacher-student training according to any one of claims 1 to 2.
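As an illustration of the style migration objective recited in claims 1 and 3, the following Python sketch evaluates the adversarial and cycle consistency terms on toy data. It is a sketch under stated assumptions, not the patented implementation: images are short lists of floats, the mappings and discriminators are hypothetical placeholder functions, and the weight `lam=10.0` is an arbitrary placeholder because the claims do not give a value for λ.

```python
import math
from typing import Callable, List

Image = List[float]
Mapping = Callable[[Image], Image]
Discriminator = Callable[[Image], float]  # probability that an image is real, in (0, 1)


def adversarial_loss(disc: Discriminator, real: List[Image], fake: List[Image]) -> float:
    """E[log D(real)] + E[log(1 - D(fake))], averaged over the samples."""
    real_term = sum(math.log(disc(r)) for r in real) / len(real)
    fake_term = sum(math.log(1.0 - disc(f)) for f in fake) / len(fake)
    return real_term + fake_term


def l1_distance(a: Image, b: Image) -> float:
    return sum(abs(p - q) for p, q in zip(a, b))


def cycle_loss(G: Mapping, F: Mapping, xs: List[Image], ys: List[Image]) -> float:
    """E[|F(G(x)) - x|_1] + E[|G(F(y)) - y|_1]."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


def full_objective(G, F, D_X, D_Y, xs, ys, lam: float = 10.0) -> float:
    """Adversarial terms for both mappings plus the weighted cycle term."""
    return (
        adversarial_loss(D_Y, ys, [G(x) for x in xs])
        + adversarial_loss(D_X, xs, [F(y) for y in ys])
        + lam * cycle_loss(G, F, xs, ys)
    )


# Toy placeholders: G and F are exact inverses, and both discriminators are undecided.
G = lambda img: [p + 1.0 for p in img]   # aerial -> pseudo-satellite
F = lambda img: [p - 1.0 for p in img]   # satellite -> pseudo-aerial
D_X = lambda img: 0.5
D_Y = lambda img: 0.5

aerial = [[0.5, 0.25]]
satellite = [[1.5, 1.25]]
total = full_objective(G, F, D_X, D_Y, aerial, satellite)
```

Because the toy mappings invert each other exactly, the cycle term vanishes and only the adversarial terms contribute; in real training the generators minimize this objective while the discriminators maximize it.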
CN202310812000.0A 2023-07-04 2023-07-04 Method and system for detecting collapsed building based on cross-domain teacher-student training Active CN116778335B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310812000.0A CN116778335B (en) 2023-07-04 2023-07-04 Method and system for detecting collapsed building based on cross-domain teacher-student training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310812000.0A CN116778335B (en) 2023-07-04 2023-07-04 Method and system for detecting collapsed building based on cross-domain teacher-student training

Publications (2)

Publication Number Publication Date
CN116778335A CN116778335A (en) 2023-09-19
CN116778335B true CN116778335B (en) 2024-04-26

Family

ID=88008008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310812000.0A Active CN116778335B (en) 2023-07-04 2023-07-04 Method and system for detecting collapsed building based on cross-domain teacher-student training

Country Status (1)

Country Link
CN (1) CN116778335B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112529774A (en) * 2020-12-28 2021-03-19 南开大学 Remote sensing simulation image generation method based on cycleGAN
CN114399686A (en) * 2021-11-26 2022-04-26 中国科学院计算机网络信息中心 Remote sensing image ground feature identification and classification method and device based on weak supervised learning
WO2023116635A1 (en) * 2021-12-24 2023-06-29 中国科学院深圳先进技术研究院 Mutual learning-based semi-supervised medical image segmentation method and system
CN114626461A (en) * 2022-03-16 2022-06-14 西安理工大学 Cross-domain target detection method based on domain self-adaptation
CN114943689A (en) * 2022-04-27 2022-08-26 河钢数字技术股份有限公司 Method for detecting components of steel cold-rolling annealing furnace based on semi-supervised learning
CN116310655A (en) * 2023-04-23 2023-06-23 中国人民解放军国防科技大学 Infrared dim target detection method and device based on semi-supervised mixed domain adaptation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
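"Cross-Domain Adaptive Teacher for Object Detection"; Yu-Jhe Li et al.; 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); pp. 7581-7590 *
"Research on Domain-Adaptive Weakly Supervised Object Detection Algorithms Based on Deep Learning" (《基于深度学习的域适应弱监督目标检测算法研究》); 欧阳胜雄; China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》); see pp. 15-18, 27-52 *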

Also Published As

Publication number Publication date
CN116778335A (en) 2023-09-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant