CN111144234A - Video SAR target detection method based on deep learning - Google Patents

Video SAR target detection method based on deep learning

Info

Publication number
CN111144234A
CN111144234A
Authority
CN
China
Prior art keywords
network
rcnn
frame
layer
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911257311.5A
Other languages
Chinese (zh)
Inventor
秦尉博
闫贺
黄佳
黄智杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN201911257311.5A priority Critical patent/CN111144234A/en
Publication of CN111144234A publication Critical patent/CN111144234A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G06V20/13 - Satellite images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 - Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a video SAR target detection method based on deep learning, which comprises the following steps: preprocessing and dividing a video data set to obtain a training set and a test set; constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR images; constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions; and constructing a Faster-RCNN network, and inputting the results output by the RPN into the Faster-RCNN network to obtain the video SAR target detection results. The invention is simple to implement, achieves high detection accuracy, and is applicable to a wide range of scenes.

Description

Video SAR target detection method based on deep learning
Technical Field
The invention belongs to the technical field of radars, and particularly relates to a video SAR target detection method.
Background
Synthetic Aperture Radar (SAR) is an active earth-observation system that can be mounted on flight platforms such as aircraft, satellites and spacecraft. It can observe the earth all day and in all weather conditions and has a certain capability to penetrate the ground surface. The SAR system therefore has unique advantages in applications such as disaster monitoring, environmental monitoring, marine monitoring, resource exploration, crop estimation, mapping and military use, and can play a role that other remote sensing means can hardly play, so it has received increasing attention from countries around the world.
Currently, the mainstream SAR target detection methods fall into three categories: target detection based on the statistical distribution of background clutter, target detection based on polarization decomposition, and target detection based on polarization characteristics. These methods perform target detection from the perspective of the imaging mechanism and require manual modelling to extract SAR image features; the detection procedures are complex and the detection accuracy is low.
Disclosure of Invention
In order to solve the technical problems mentioned in the background art, the invention provides a video SAR target detection method based on deep learning.
In order to achieve the technical purpose, the technical scheme of the invention is as follows:
a video SAR target detection method based on deep learning comprises the following steps:
(1) preprocessing and dividing a video data set to obtain a training set and a test set;
(2) constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR images; in the process of constructing the ResNet101 residual network, introducing an FPN architecture, which combines feature maps of different scales before and after the pooling layers to provide multi-scale fused image features for the subsequent steps;
(3) constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions;
(4) constructing a Faster-RCNN network, and inputting the results output by the RPN into the Faster-RCNN network to obtain the video SAR target detection results.
Further, in step (1), a data set is constructed from the video: each frame of the video is read and stored in sequence, the data set images are first calibrated, and each target box is given position coordinates (x_k, y_k, w_k, h_k), where x_k and y_k are the horizontal and vertical coordinates of the upper-left corner of the target box and w_k and h_k are its width and height;
data enhancement is then carried out: the vertical pixels of the data set images are left unchanged and the images are flipped horizontally, yielding new position coordinates (x_k', y_k', w_k', h_k'), where x_k' = W - x_k - w_k, y_k' = y_k, w_k' = w_k and h_k' = h_k, with W the image width;
and finally the data set is divided into a training set and a test set in the ratio m : n, where m > n.
Further, in step (2), an initial neural network model is constructed by the VGG network construction method, and residual structures and skip connections are introduced so that the number of network layers can be deepened; an FPN structure is introduced, in which the feature maps of different scales before and after the pooling layers are down-sampled, the corresponding elements are summed, and the result is fed into a convolution layer whose output is taken as the fused feature.
Further, in step (2), the constructed residual network is trained using an SAR image classification dataset.
Further, in step (3), the K-Means clustering algorithm is used to compute the height-to-width ratios of the targets in the data set, and N cluster centres are obtained and used as the aspect ratios of the subsequent prior anchor boxes; anchor boxes of different sizes with these N ratios are placed on the data set images, the selected regions are mapped onto the feature map according to the correspondence of the SPP-net algorithm, and the resulting feature maps are fed into a classification layer and a regression layer respectively, where the classification layer distinguishes whether the current anchor box contains a target and outputs a confidence S, and the regression layer outputs the position coordinates of the candidate prediction boxes.
Further, in step (4), an ROI Align layer of the Faster-RCNN network is constructed; the ROI Align layer traverses each candidate region while keeping the floating-point boundaries unquantized, divides each candidate region into k x k cells (k a positive integer) whose boundaries are likewise not quantized, fixes four sampling positions in each cell, computes the values at these four positions by bilinear interpolation, and then applies max pooling, so that candidate regions of different sizes are mapped to a fixed size.
Further, in step (4), an RCNN layer of the Faster-RCNN network is constructed, and the fixed-size candidate regions output by the ROI Align layer are fed into two neural networks: a classification network that predicts the object class against the background, and a regression network that outputs the position coordinates of the target box.
Further, in step (4), a staged learning-rate method is used to train the Faster-RCNN under its loss function.
Further, the loss function L of the Faster-RCNN is as follows:
$$L=\frac{1}{N_{cls1}}\sum_i L_{cls1}\left(P_{1i},P_{1i}^{*}\right)+\frac{1}{N_{cls2}}\sum_i L_{cls2}\left(P_{2i},P_{2i}^{*}\right)+\lambda\frac{1}{N_{reg}}\sum_i P_i^{*}L_{reg}\left(t_i,t_i^{*}\right)$$
In the above equation, the first term of L is the RPN classification loss: N_cls1 is the number of prediction boxes output by the RPN, P_1i is the predicted probability that the i-th prediction box is foreground, P*_1i = 1 if the i-th prediction box is foreground and P*_1i = 0 otherwise, and L_cls1 is the binary cross-entropy loss. The second term of L is the RCNN classification loss: N_cls2 is the number of prediction boxes output by the RCNN layer, P_2i is the probability predicted for each class for the i-th prediction box, P*_2i takes 1 for the actual class and 0 for the remaining classes, and L_cls2 is the multi-class cross-entropy loss. The third term of L is the regression loss of the RPN and the RCNN: λ is a preset weight coefficient, N_reg is the area of the output feature map, P*_i is the probability that there is an object in the i-th prediction box, t_i is the coordinate vector (x, y, w, h) of the i-th prediction box and t*_i is the coordinate vector of the actual box, where x and y are the horizontal and vertical coordinates of the upper-left corner of the box and w and h are its width and height. The regression loss L_reg is defined as
$$L_{reg}\left(t_i,t_i^{*}\right)=\sum_{j\in\{x,y,w,h\}}\mathrm{smooth}_{L1}\left(t_{ij}-t_{ij}^{*}\right)$$
where
$$\mathrm{smooth}_{L1}(x)=\begin{cases}0.5x^{2}, & |x|<1\\|x|-0.5, & \text{otherwise}\end{cases}$$
further, in the testing process, for the result output by the fast-RCNN network, a Soft-NMS method is adopted to eliminate the overlapped frames.
The above technical scheme brings the following beneficial effects:
the method utilizes a residual error network to extract high-dimensional characteristics of an original SAR image, introduces an FPN network architecture, provides multi-scale combined image characteristics for a subsequent algorithm through characteristic graphs of different scales before and after a combined pooling layer, inputs the image characteristics into a subsequent RPN network, and finally outputs results through the RCNN network. Meanwhile, the invention also adjusts the initial parameters of the residual error network and trains the initial parameters by utilizing the SAR class data set, thereby realizing the high-precision end-to-end target detection aiming at the multi-scale target in the video SAR. The invention has the characteristics of simple realization, high detection precision and wide applicable scenes.
Drawings
FIG. 1 is a basic flow diagram of the present invention.
Detailed Description
The technical scheme of the invention is explained in detail below with reference to the accompanying drawings.
The invention provides a video SAR target detection method based on deep learning; as shown in FIG. 1, the steps are as follows:
Step 1: preprocess and divide a video data set to obtain a training set and a test set;
Step 2: construct a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR images; in the process of constructing the ResNet101 residual network, introduce an FPN architecture that combines feature maps of different scales before and after the pooling layers to provide multi-scale fused image features for the subsequent steps;
Step 3: construct an RPN (Region Proposal Network), input the image features output by the ResNet101 residual network into the RPN, and output candidate regions;
Step 4: construct a Faster-RCNN network, and input the results output by the RPN into the Faster-RCNN network to obtain the video SAR target detection results.
In this embodiment, the step 1 is implemented by the following preferred scheme:
constructing a data set through a video, sequentially reading and storing each frame of the video, firstly calibrating the data set for a data set image, wherein the position coordinate of a target frame is (x)k,yk,wk,hk) Wherein x isk、ykIs the horizontal and vertical coordinates, w, of the upper left corner of the target framek、hkThe width and height of the target frame. And then data enhancement is carried out, so that generalization capability and robustness are improved. The vertical pixel of the data set image is unchanged, the horizontal position is overturned, and a new position coordinate is obtained
Figure BDA0002310641830000051
Wherein the content of the first and second substances,
Figure BDA0002310641830000052
the data set is enlarged by a factor of two. Finally, dividing the data set into a training set and a testing set according to the ratio of m to n, wherein m is>n。
In this embodiment, the step 2 is implemented by the following preferred scheme:
firstly, a Resnet101 residual error network is used as a feature extractor, and the Resnet101 structure is constructed to be similar to a common deep convolution neural network. An initial neural network model is constructed by adopting a traditional VGG network construction method, and a residual error structure and a jump structure are introduced, so that the problem of accuracy reduction under the condition of increasing the number of layers is solved, the number of network layers is deepened, and the extraction precision of high-dimensional features is improved. And secondly, introducing an FPN structure, performing down-sampling on the feature maps with different scales before and after the pooling layer, performing corresponding element summation operation, inputting the 3 x 3 convolution layer, and outputting. And finally, aiming at the defects that the traditional residual error network extracts features and SAR generates offset, training the constructed residual error network by utilizing an SAR image classification data set to obtain a better parameter design.
In this embodiment, the step 3 is implemented by the following preferred scheme:
calculating the length-width ratio of a target in a data set by using a K-Means clustering algorithm to obtain N clustering centers as the height-width ratio of a subsequent prior anchor point frame, performing frame selection on anchor point frames with different sizes and the ratio of N clustering centers on a data set image, corresponding frame selection areas to feature maps according to the corresponding relation of an SPP-net algorithm, respectively inputting the corresponding obtained feature maps into a classification layer and a regression layer, wherein the classification layer is used for distinguishing whether the anchor point frame at present contains the target or not, the output result is a confidence coefficient S, and the regression layer is used for outputting the position coordinates of a candidate prediction frame. Defining the prediction result of IOU >0.7 as positive sample, IOU <0.3 as negative sample.
In this embodiment, the step 4 is implemented by adopting the following preferred scheme:
the ROI Align layer in the fast-RCNN network was constructed. The ROI Pooling layer in the original fast-RCNN is that feature graphs with different sizes are changed into fixed-scale feature graphs through cutting for pre-selected frames, the feature graphs are input into a subsequent classification network, if the size of an original target area is decimal after four times of down-sampling, the ROI Pooling rounds the original target area, and two round-up processes are contained in the whole network frame, so that deviation is generated between a result frame after down-sampling and an original image. The optimization utilizes an ROI Align layer to traverse each candidate region, floating point number boundaries are kept not to be quantized, the candidate regions are divided into k multiplied by k units, k is a positive integer, the boundaries of each unit are not quantized, four coordinate positions are calculated and fixed in each unit, values of the four positions are calculated by a bilinear interpolation method, then the maximum pooling operation is carried out, and the candidate regions with different sizes are mapped to fixed sizes.
The RCNN layer of the Faster-RCNN network is constructed, and the fixed-size candidate regions output by the ROI Align layer are fed into two neural networks: a classification network that predicts the object class against the background, and a regression network that outputs the position coordinates of the target box.
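A hedged sketch of such a two-branch head is given below; the fully connected layer sizes, the shared trunk and the single foreground class are illustrative assumptions rather than details specified by the method.

```python
# Sketch (assumed sizes) of the RCNN head: one branch for class scores, one for box regression.
import torch
import torch.nn as nn

class DetectionHead(nn.Module):
    def __init__(self, in_features, num_classes):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_features, 1024), nn.ReLU(inplace=True))
        self.cls_branch = nn.Linear(1024, num_classes + 1)   # +1 for the background class
        self.reg_branch = nn.Linear(1024, 4 * num_classes)   # (x, y, w, h) per foreground class

    def forward(self, roi_features):
        x = self.shared(roi_features.flatten(start_dim=1))
        return self.cls_branch(x), self.reg_branch(x)

head = DetectionHead(in_features=256 * 7 * 7, num_classes=1)  # e.g. a single "vehicle" class
scores, deltas = head(torch.randn(8, 256, 7, 7))              # 8 fixed-size RoI features
```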
The loss function of the Faster-RCNN is trained with an initial learning rate LS1. Because SAR images tend to drive the optimization into local minima, a staged learning-rate method is adopted to reduce the probability of getting stuck in a local minimum: when the number of training iterations reaches N1 and N2, the learning rate is reduced to 0.1 x LS1 and 0.01 x LS1 respectively.
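A sketch of this staged schedule using PyTorch's MultiStepLR follows; the concrete values of LS1, N1 and N2 are placeholders, since the method leaves them to be chosen in practice.

```python
# Sketch of the staged learning rate: multiply the rate by 0.1 at iterations N1 and N2.
import torch
from torch.optim.lr_scheduler import MultiStepLR

LS1, N1, N2 = 1e-3, 60000, 80000                        # placeholder values for illustration
model_params = [torch.nn.Parameter(torch.zeros(1))]     # stand-in for the Faster-RCNN parameters
optimizer = torch.optim.SGD(model_params, lr=LS1, momentum=0.9)
scheduler = MultiStepLR(optimizer, milestones=[N1, N2], gamma=0.1)

# Inside the training loop, scheduler.step() is called after each optimizer.step(), so the
# learning rate becomes 0.1*LS1 once N1 iterations are reached and 0.01*LS1 after N2.
```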
The loss function L of the Faster-RCNN is as follows:
$$L=\frac{1}{N_{cls1}}\sum_i L_{cls1}\left(P_{1i},P_{1i}^{*}\right)+\frac{1}{N_{cls2}}\sum_i L_{cls2}\left(P_{2i},P_{2i}^{*}\right)+\lambda\frac{1}{N_{reg}}\sum_i P_i^{*}L_{reg}\left(t_i,t_i^{*}\right)$$
In the above equation, the first term of L is the RPN classification loss: N_cls1 is the number of prediction boxes output by the RPN, P_1i is the predicted probability that the i-th prediction box is foreground, P*_1i = 1 if the i-th prediction box is foreground and P*_1i = 0 otherwise, and L_cls1 is the binary cross-entropy loss. The second term of L is the RCNN classification loss: N_cls2 is the number of prediction boxes output by the RCNN layer, P_2i is the probability predicted for each class for the i-th prediction box, P*_2i takes 1 for the actual class and 0 for the remaining classes, and L_cls2 is the multi-class cross-entropy loss. The third term of L is the regression loss of the RPN and the RCNN: λ is a preset weight coefficient, N_reg is the area of the output feature map, P*_i is the probability that there is an object in the i-th prediction box, t_i is the coordinate vector (x, y, w, h) of the i-th prediction box and t*_i is the coordinate vector of the actual box, where x and y are the horizontal and vertical coordinates of the upper-left corner of the box and w and h are its width and height. The regression loss L_reg is defined as
$$L_{reg}\left(t_i,t_i^{*}\right)=\sum_{j\in\{x,y,w,h\}}\mathrm{smooth}_{L1}\left(t_{ij}-t_{ij}^{*}\right)$$
where
$$\mathrm{smooth}_{L1}(x)=\begin{cases}0.5x^{2}, & |x|<1\\|x|-0.5, & \text{otherwise}\end{cases}$$
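A short numerical sketch of the regression term, smooth-L1 applied per coordinate, weighted by P*_i and normalised by N_reg, is given below; it is illustrative only and not the full training loss code.

```python
# Sketch of the regression loss term matching the piecewise smooth-L1 definition above.
import torch

def smooth_l1(x):
    absx = x.abs()
    return torch.where(absx < 1, 0.5 * x ** 2, absx - 0.5)

def reg_loss(t_pred, t_true, is_object, n_reg, lam=1.0):
    """t_pred, t_true: (N, 4) box vectors; is_object: (N,) weights P*_i; n_reg: feature-map area."""
    per_box = smooth_l1(t_pred - t_true).sum(dim=1)
    return lam * (is_object * per_box).sum() / n_reg
```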
in the embodiment, in the test process, for the result output by the fast-RCNN network, a Soft-NMS method is adopted to eliminate the overlapped frames.
The embodiments described above are only intended to illustrate the technical idea of the present invention and do not limit its scope of protection; any modification made to the technical scheme on the basis of this technical idea falls within the scope of protection of the present invention.

Claims (10)

1. A video SAR target detection method based on deep learning is characterized by comprising the following steps:
(1) preprocessing and dividing a video data set to obtain a training set and a test set;
(2) constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR images; in the process of constructing the ResNet101 residual network, introducing an FPN architecture, which combines feature maps of different scales before and after the pooling layers to provide multi-scale fused image features for the subsequent steps;
(3) constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions;
(4) constructing a Faster-RCNN network, and inputting the results output by the RPN into the Faster-RCNN network to obtain the video SAR target detection results.
2. The deep-learning-based video SAR target detection method according to claim 1, wherein in step (1) a data set is constructed from the video: each frame of the video is read and stored in sequence, the data set images are first calibrated, and each target box is given position coordinates (x_k, y_k, w_k, h_k), where x_k and y_k are the horizontal and vertical coordinates of the upper-left corner of the target box and w_k and h_k are its width and height;
data enhancement is then carried out: the vertical pixels of the data set images are left unchanged and the images are flipped horizontally, yielding new position coordinates (x_k', y_k', w_k', h_k'), where x_k' = W - x_k - w_k, y_k' = y_k, w_k' = w_k and h_k' = h_k, with W the image width;
and finally the data set is divided into a training set and a test set in the ratio m : n, where m > n.
3. The deep-learning-based video SAR target detection method according to claim 1, wherein in step (2) an initial neural network model is constructed by the VGG network construction method, and residual structures and skip connections are introduced so that the number of network layers can be deepened; an FPN structure is introduced, in which the feature maps of different scales before and after the pooling layers are down-sampled, the corresponding elements are summed, and the result is fed into a convolution layer whose output is taken as the fused feature.
4. The deep-learning-based video SAR target detection method according to claim 1, wherein in step (2) the constructed residual network is trained using an SAR image classification dataset.
5. The deep-learning-based video SAR target detection method according to claim 1, wherein in step (3) the K-Means clustering algorithm is used to compute the height-to-width ratios of the targets in the data set, and N cluster centres are obtained and used as the aspect ratios of the subsequent prior anchor boxes; anchor boxes of different sizes with these N ratios are placed on the data set images, the selected regions are mapped onto the feature map according to the correspondence of the SPP-net algorithm, and the resulting feature maps are fed into a classification layer and a regression layer respectively, where the classification layer distinguishes whether the current anchor box contains a target and outputs a confidence S, and the regression layer outputs the position coordinates of the candidate prediction boxes.
6. The deep-learning-based video SAR target detection method according to claim 1, wherein in step (4) an ROI Align layer of the Faster-RCNN network is constructed; the ROI Align layer traverses each candidate region while keeping the floating-point boundaries unquantized, divides each candidate region into k x k cells (k a positive integer) whose boundaries are likewise not quantized, fixes four sampling positions in each cell, computes the values at these four positions by bilinear interpolation, and then applies max pooling, so that candidate regions of different sizes are mapped to a fixed size.
7. The deep-learning-based video SAR target detection method according to claim 6, wherein in step (4) an RCNN layer of the Faster-RCNN network is constructed, and the fixed-size candidate regions output by the ROI Align layer are fed into two neural networks: a classification network that predicts the object class against the background, and a regression network that outputs the position coordinates of the target box.
8. The deep-learning-based video SAR target detection method according to claim 1, wherein in step (4) a staged learning-rate method is adopted to train the Faster-RCNN under its loss function.
9. The deep-learning-based video SAR target detection method according to claim 8, wherein the loss function L of the Faster-RCNN is as follows:
$$L=\frac{1}{N_{cls1}}\sum_i L_{cls1}\left(P_{1i},P_{1i}^{*}\right)+\frac{1}{N_{cls2}}\sum_i L_{cls2}\left(P_{2i},P_{2i}^{*}\right)+\lambda\frac{1}{N_{reg}}\sum_i P_i^{*}L_{reg}\left(t_i,t_i^{*}\right)$$
in the above equation, the first term of L is the RPN classification loss: N_cls1 is the number of prediction boxes output by the RPN, P_1i is the predicted probability that the i-th prediction box is foreground, P*_1i = 1 if the i-th prediction box is foreground and P*_1i = 0 otherwise, and L_cls1 is the binary cross-entropy loss; the second term of L is the RCNN classification loss: N_cls2 is the number of prediction boxes output by the RCNN layer, P_2i is the probability predicted for each class for the i-th prediction box, P*_2i takes 1 for the actual class and 0 for the remaining classes, and L_cls2 is the multi-class cross-entropy loss; the third term of L is the regression loss of the RPN and the RCNN: λ is a preset weight coefficient, N_reg is the area of the output feature map, P*_i is the probability that there is an object in the i-th prediction box, t_i is the coordinate vector (x, y, w, h) of the i-th prediction box and t*_i is the coordinate vector of the actual box, where x and y are the horizontal and vertical coordinates of the upper-left corner of the box and w and h are its width and height; the regression loss L_reg is defined as
$$L_{reg}\left(t_i,t_i^{*}\right)=\sum_{j\in\{x,y,w,h\}}\mathrm{smooth}_{L1}\left(t_{ij}-t_{ij}^{*}\right)$$
where
$$\mathrm{smooth}_{L1}(x)=\begin{cases}0.5x^{2}, & |x|<1\\|x|-0.5, & \text{otherwise}\end{cases}$$
10. the deep learning-based video SAR target detection method according to claim 1, characterized in that in the test process, for the result output by the fast-RCNN network, a Soft-NMS method is adopted to eliminate the overlapped box.
CN201911257311.5A 2019-12-10 2019-12-10 Video SAR target detection method based on deep learning Pending CN111144234A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911257311.5A CN111144234A (en) 2019-12-10 2019-12-10 Video SAR target detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911257311.5A CN111144234A (en) 2019-12-10 2019-12-10 Video SAR target detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN111144234A true CN111144234A (en) 2020-05-12

Family

ID=70517869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911257311.5A Pending CN111144234A (en) 2019-12-10 2019-12-10 Video SAR target detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN111144234A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584227A (en) * 2018-11-27 2019-04-05 山东大学 A kind of quality of welding spot detection method and its realization system based on deep learning algorithm of target detection
CN110110783A (en) * 2019-04-30 2019-08-09 天津大学 A kind of deep learning object detection method based on the connection of multilayer feature figure
CN110321815A (en) * 2019-06-18 2019-10-11 中国计量大学 A kind of crack on road recognition methods based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王慧玲, 綦小龙, 武港山: "Research progress of object detection techniques based on deep convolutional neural networks" *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111680655A (en) * 2020-06-15 2020-09-18 深延科技(北京)有限公司 Video target detection method for aerial images of unmanned aerial vehicle
CN113836985A (en) * 2020-06-24 2021-12-24 富士通株式会社 Image processing apparatus, image processing method, and computer-readable storage medium
CN112016594A (en) * 2020-08-05 2020-12-01 中山大学 Collaborative training method based on domain self-adaptation
CN112016594B (en) * 2020-08-05 2023-06-09 中山大学 Collaborative training method based on field self-adaption
CN112200115A (en) * 2020-10-21 2021-01-08 平安国际智慧城市科技股份有限公司 Face recognition training method, recognition method, device, equipment and storage medium
CN112200115B (en) * 2020-10-21 2024-04-19 平安国际智慧城市科技股份有限公司 Face recognition training method, recognition method, device, equipment and storage medium
CN112686340A (en) * 2021-03-12 2021-04-20 成都点泽智能科技有限公司 Dense small target detection method based on deep neural network
CN113673534A (en) * 2021-04-22 2021-11-19 江苏大学 RGB-D image fruit detection method based on fast RCNN
CN113673534B (en) * 2021-04-22 2024-06-11 江苏大学 RGB-D image fruit detection method based on FASTER RCNN

Similar Documents

Publication Publication Date Title
CN111144234A (en) Video SAR target detection method based on deep learning
CN110135267B (en) Large-scene SAR image fine target detection method
CN109934282B (en) SAGAN sample expansion and auxiliary information-based SAR target classification method
CN108596101B (en) Remote sensing image multi-target detection method based on convolutional neural network
CN108647655B (en) Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network
CN111191566B (en) Optical remote sensing image multi-target detection method based on pixel classification
CN110929607B (en) Remote sensing identification method and system for urban building construction progress
CN112132093B (en) High-resolution remote sensing image target detection method and device and computer equipment
CN107358260B (en) Multispectral image classification method based on surface wave CNN
CN108428220B (en) Automatic geometric correction method for ocean island reef area of remote sensing image of geostationary orbit satellite sequence
CN107909015A (en) Hyperspectral image classification method based on convolutional neural networks and empty spectrum information fusion
CN107067405B (en) Remote sensing image segmentation method based on scale optimization
CN110598600A (en) Remote sensing image cloud detection method based on UNET neural network
CN110766058B (en) Battlefield target detection method based on optimized RPN (Region Proposal Network)
CN110163213B (en) Remote sensing image segmentation method based on disparity map and multi-scale depth network model
CN111914924B (en) Rapid ship target detection method, storage medium and computing equipment
CN110163207B (en) Ship target positioning method based on Mask-RCNN and storage device
CN110728197B (en) Single-tree-level tree species identification method based on deep learning
CN106295613A (en) A kind of unmanned plane target localization method and system
CN111368935B (en) SAR time-sensitive target sample amplification method based on generation countermeasure network
CN113850129A (en) Target detection method for rotary equal-variation space local attention remote sensing image
CN113435253A (en) Multi-source image combined urban area ground surface coverage classification method
CN113486819A (en) Ship target detection method based on YOLOv4 algorithm
CN112464745A (en) Ground feature identification and classification method and device based on semantic segmentation
CN109558803B (en) SAR target identification method based on convolutional neural network and NP criterion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination