CN111144234A - Video SAR target detection method based on deep learning - Google Patents
- Publication number
- CN111144234A CN111144234A CN201911257311.5A CN201911257311A CN111144234A CN 111144234 A CN111144234 A CN 111144234A CN 201911257311 A CN201911257311 A CN 201911257311A CN 111144234 A CN111144234 A CN 111144234A
- Authority
- CN
- China
- Prior art keywords
- network
- rcnn
- frame
- layer
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/464—Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
Abstract
The invention discloses a video SAR target detection method based on deep learning, which comprises the following steps: preprocessing and dividing a video data set to obtain a training set and a test set; constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR image; constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions; and constructing a Faster-RCNN network and inputting the RPN output into it to obtain the video SAR target detection result. The invention is simple to implement, achieves high detection accuracy, and is applicable to a wide range of scenes.
Description
Technical Field
The invention belongs to the technical field of radars, and particularly relates to a video SAR target detection method.
Background
Synthetic Aperture Radar (SAR), an active earth-observation system, can be mounted on flight platforms such as airplanes, satellites and spacecraft, can observe the earth all day and in all weather conditions, and has a certain surface-penetration capability. The SAR system therefore has unique advantages in disaster monitoring, environmental monitoring, marine monitoring, resource exploration, crop yield estimation, mapping, military applications and other fields, can play a role that other remote-sensing means can hardly play, and has received increasing attention from countries around the world.
Currently, mainstream SAR target detection methods fall into three categories: target detection based on the statistical distribution of background clutter, target detection based on polarization decomposition, and target detection based on polarization characteristics. These methods approach detection from the perspective of the imaging mechanism and require manual modeling to extract SAR image features; the detection procedures are therefore complex and the detection accuracy is low.
Disclosure of Invention
To solve the technical problems mentioned in the Background section, the invention provides a video SAR target detection method based on deep learning.
In order to achieve the technical purpose, the technical scheme of the invention is as follows:
A video SAR target detection method based on deep learning comprises the following steps:
(1) preprocessing and dividing a video data set to obtain a training set and a test set;
(2) constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR image; in the process of constructing the ResNet101 residual network, introducing an FPN architecture that combines feature maps of different scales before and after the pooling layers to provide multi-scale combined image features for the subsequent steps;
(3) constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions;
(4) constructing a Faster-RCNN network, and inputting the RPN output into the Faster-RCNN network to obtain the video SAR target detection result.
Further, in step (1), a data set is constructed from the video: each frame of the video is read and stored in sequence, and the data-set images are first annotated, the position coordinates of a target frame being (x_k, y_k, w_k, h_k), where x_k, y_k are the horizontal and vertical coordinates of the upper-left corner of the target frame and w_k, h_k are its width and height;
then data enhancement is carried out: the vertical pixels of each data-set image are kept unchanged while the horizontal positions are flipped, giving new position coordinates (x'_k, y_k, w_k, h_k), where x'_k = W - x_k - w_k and W is the image width;
and finally the data set is divided into a training set and a test set at a ratio of m:n, where m > n.
Further, in step (2), an initial neural network model is constructed using the VGG network construction method, and residual and skip structures are introduced so that the number of network layers can be deepened; an FPN structure is introduced, the feature maps of different scales before and after the pooling layers are down-sampled, the corresponding elements are summed, and the result is passed through a convolution layer and output.
Further, in step (2), the constructed residual network is trained using an SAR image classification data set.
Further, in step (3), the K-Means clustering algorithm is used to compute the height-to-width ratios of the targets in the data set, and N cluster centres are obtained as the aspect ratios of the subsequent prior anchor boxes. Anchor boxes of different sizes with the N clustered ratios are placed over the data-set images, the selected regions are mapped onto the feature map according to the correspondence of the SPP-net algorithm, and the resulting feature maps are fed into a classification layer and a regression layer respectively: the classification layer distinguishes whether the current anchor box contains a target and outputs a confidence S, and the regression layer outputs the position coordinates of the candidate prediction boxes.
Further, in step (4), an ROI Align layer in the Faster-RCNN network is constructed. The ROI Align layer traverses each candidate region, keeps the floating-point boundaries unquantized, and divides each candidate region into k × k cells (k a positive integer) whose boundaries are likewise not quantized; four fixed sampling positions are computed in each cell, their values are obtained by bilinear interpolation, and a max-pooling operation is then applied so that candidate regions of different sizes are mapped to a fixed size.
Further, in step (4), an RCNN layer in the Faster-RCNN network is constructed, and the fixed-size candidate regions output by the ROI Align layer are fed into two convolutional neural networks: a classification neural network for predicting the object class against the background, and a regression neural network for outputting the position coordinates of the target frame.
Further, in step (4), a staged learning-rate method is used to train the loss function of the Faster-RCNN.
Further, the loss function L of the Faster-RCNN is as follows:

L = (1/N_cls1) Σ_i L_cls1(P_1i, P*_1i) + (1/N_cls2) Σ_i L_cls2(P_2i, P*_2i) + (λ/N_reg) Σ_i P*_i · L_reg(t_i, t*_i)

In the above equation, the first term of L is the RPN classification loss: N_cls1 is the number of prediction boxes output by the RPN, P_1i is the predicted probability that the i-th prediction box is foreground, P*_1i = 1 if the i-th prediction box is foreground and P*_1i = 0 otherwise, and L_cls1 is the binary cross-entropy loss function. The second term of L is the RCNN classification loss: N_cls2 is the number of prediction boxes output by the RCNN layer, P_2i is the probability predicted for each class for the i-th prediction box, P*_2i takes 1 for the actual class and 0 for the other classes, and L_cls2 is the multi-class cross-entropy loss function. The third term of L is the regression loss of the RPN and the RCNN: λ is a preset weight coefficient, N_reg is the area of the output feature map, P*_i is the probability that an object is in the i-th prediction box, t_i is the coordinate vector (x, y, w, h) of the i-th prediction box, and t*_i is the coordinate vector of the actual frame, where x and y are the horizontal and vertical coordinates of the upper-left corner of the frame and w and h are its width and height. The loss function L_reg is defined as the smooth-L1 loss:

L_reg(t_i, t*_i) = Σ_j smooth_L1(t_ij - t*_ij), where smooth_L1(d) = 0.5 d² if |d| < 1, and |d| - 0.5 otherwise.
Further, during testing, the Soft-NMS method is applied to the output of the Faster-RCNN network to eliminate overlapping boxes.
The above technical scheme brings the following beneficial effects:
The method uses a residual network to extract high-dimensional features from the original SAR image and introduces an FPN architecture that combines feature maps of different scales before and after the pooling layers to provide multi-scale combined image features for the subsequent algorithm; the image features are fed into the subsequent RPN, and results are finally output through the RCNN network. Meanwhile, the invention adjusts the initial parameters of the residual network and trains them on an SAR-class data set, thereby achieving high-precision end-to-end detection of multi-scale targets in video SAR. The invention is simple to implement, achieves high detection accuracy, and is applicable to a wide range of scenes.
Drawings
FIG. 1 is a basic flow diagram of the present invention.
Detailed Description
The technical scheme of the invention is explained in detail below with reference to the accompanying drawings.
The invention designs a video SAR target detection method based on deep learning; as shown in FIG. 1, the steps are as follows:
Step 1: preprocessing and dividing a video data set to obtain a training set and a test set;
Step 2: constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR image; in the process of constructing the ResNet101 residual network, introducing an FPN architecture that combines feature maps of different scales before and after the pooling layers to provide multi-scale combined image features for the subsequent steps;
Step 3: constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions;
Step 4: constructing a Faster-RCNN network, and inputting the RPN output into the Faster-RCNN network to obtain the video SAR target detection result.
In this embodiment, step 1 is implemented by the following preferred scheme:
constructing a data set through a video, sequentially reading and storing each frame of the video, firstly calibrating the data set for a data set image, wherein the position coordinate of a target frame is (x)k,yk,wk,hk) Wherein x isk、ykIs the horizontal and vertical coordinates, w, of the upper left corner of the target framek、hkThe width and height of the target frame. And then data enhancement is carried out, so that generalization capability and robustness are improved. The vertical pixel of the data set image is unchanged, the horizontal position is overturned, and a new position coordinate is obtainedWherein the content of the first and second substances,the data set is enlarged by a factor of two. Finally, dividing the data set into a training set and a testing set according to the ratio of m to n, wherein m is>n。
In this embodiment, step 2 is implemented by the following preferred scheme:
firstly, a Resnet101 residual error network is used as a feature extractor, and the Resnet101 structure is constructed to be similar to a common deep convolution neural network. An initial neural network model is constructed by adopting a traditional VGG network construction method, and a residual error structure and a jump structure are introduced, so that the problem of accuracy reduction under the condition of increasing the number of layers is solved, the number of network layers is deepened, and the extraction precision of high-dimensional features is improved. And secondly, introducing an FPN structure, performing down-sampling on the feature maps with different scales before and after the pooling layer, performing corresponding element summation operation, inputting the 3 x 3 convolution layer, and outputting. And finally, aiming at the defects that the traditional residual error network extracts features and SAR generates offset, training the constructed residual error network by utilizing an SAR image classification data set to obtain a better parameter design.
In this embodiment, step 3 is implemented by the following preferred scheme:
calculating the length-width ratio of a target in a data set by using a K-Means clustering algorithm to obtain N clustering centers as the height-width ratio of a subsequent prior anchor point frame, performing frame selection on anchor point frames with different sizes and the ratio of N clustering centers on a data set image, corresponding frame selection areas to feature maps according to the corresponding relation of an SPP-net algorithm, respectively inputting the corresponding obtained feature maps into a classification layer and a regression layer, wherein the classification layer is used for distinguishing whether the anchor point frame at present contains the target or not, the output result is a confidence coefficient S, and the regression layer is used for outputting the position coordinates of a candidate prediction frame. Defining the prediction result of IOU >0.7 as positive sample, IOU <0.3 as negative sample.
In this embodiment, step 4 is implemented by the following preferred scheme:
the ROI Align layer in the fast-RCNN network was constructed. The ROI Pooling layer in the original fast-RCNN is that feature graphs with different sizes are changed into fixed-scale feature graphs through cutting for pre-selected frames, the feature graphs are input into a subsequent classification network, if the size of an original target area is decimal after four times of down-sampling, the ROI Pooling rounds the original target area, and two round-up processes are contained in the whole network frame, so that deviation is generated between a result frame after down-sampling and an original image. The optimization utilizes an ROI Align layer to traverse each candidate region, floating point number boundaries are kept not to be quantized, the candidate regions are divided into k multiplied by k units, k is a positive integer, the boundaries of each unit are not quantized, four coordinate positions are calculated and fixed in each unit, values of the four positions are calculated by a bilinear interpolation method, then the maximum pooling operation is carried out, and the candidate regions with different sizes are mapped to fixed sizes.
An RCNN layer in the Faster-RCNN network is constructed, and the fixed-size candidate regions output by the ROI Align layer are fed into two convolutional neural networks: a classification neural network for predicting the object class against the background, and a regression neural network for outputting the position coordinates of the target frame.
The loss function of the Faster-RCNN is trained with an initial learning rate LS1. Because SAR images easily drive training into local minima, a staged learning-rate method is adopted to reduce this probability: when the number of training iterations reaches N1 and then N2, the learning rate is reduced to 0.1 × LS1 and 0.01 × LS1 respectively.
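The staged schedule can be sketched as a piecewise-constant function of the iteration count; LS1, N1 and N2 are the patent's own placeholders, not concrete values.

```python
def staged_lr(step, base_lr, n1, n2):
    """Staged learning-rate schedule from the embodiment: base LR until
    n1 iterations, then 0.1x until n2, then 0.01x afterwards."""
    if step < n1:
        return base_lr
    if step < n2:
        return 0.1 * base_lr
    return 0.01 * base_lr
```

With base_lr = 0.01, n1 = 100 and n2 = 200 (illustrative numbers only), the rate drops from 0.01 to 0.001 and finally to 0.0001.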
The loss function of the Faster-RCNN is as follows:

L = (1/N_cls1) Σ_i L_cls1(P_1i, P*_1i) + (1/N_cls2) Σ_i L_cls2(P_2i, P*_2i) + (λ/N_reg) Σ_i P*_i · L_reg(t_i, t*_i)

In the above equation, the first term of L is the RPN classification loss: N_cls1 is the number of prediction boxes output by the RPN, P_1i is the predicted probability that the i-th prediction box is foreground, P*_1i = 1 if the i-th prediction box is foreground and P*_1i = 0 otherwise, and L_cls1 is the binary cross-entropy loss function. The second term of L is the RCNN classification loss: N_cls2 is the number of prediction boxes output by the RCNN layer, P_2i is the probability predicted for each class for the i-th prediction box, P*_2i takes 1 for the actual class and 0 for the other classes, and L_cls2 is the multi-class cross-entropy loss function. The third term of L is the regression loss of the RPN and the RCNN: λ is a preset weight coefficient, N_reg is the area of the output feature map, P*_i is the probability that an object is in the i-th prediction box, t_i is the coordinate vector (x, y, w, h) of the i-th prediction box, and t*_i is the coordinate vector of the actual frame, where x and y are the horizontal and vertical coordinates of the upper-left corner of the frame and w and h are its width and height. The loss function L_reg is defined as the smooth-L1 loss:

L_reg(t_i, t*_i) = Σ_j smooth_L1(t_ij - t*_ij), where smooth_L1(d) = 0.5 d² if |d| < 1, and |d| - 0.5 otherwise.
in the embodiment, in the test process, for the result output by the fast-RCNN network, a Soft-NMS method is adopted to eliminate the overlapped frames.
The embodiments merely illustrate the technical idea of the present invention, which is not limited thereto; any modification made on the basis of the technical scheme according to the technical idea of the invention falls within the scope of the invention.
Claims (10)
1. A video SAR target detection method based on deep learning is characterized by comprising the following steps:
(1) preprocessing and dividing a video data set to obtain a training set and a test set;
(2) constructing a ResNet101 residual network as a feature extractor for extracting high-dimensional features of the SAR image; in the process of constructing the ResNet101 residual network, introducing an FPN architecture that combines feature maps of different scales before and after the pooling layers to provide multi-scale combined image features for the subsequent steps;
(3) constructing an RPN (Region Proposal Network), inputting the image features output by the ResNet101 residual network into the RPN, and outputting candidate regions;
(4) constructing a Faster-RCNN network, and inputting the RPN output into the Faster-RCNN network to obtain the video SAR target detection result.
2. The deep learning-based video SAR target detection method according to claim 1, wherein in step (1) a data set is constructed from the video: each frame of the video is read and stored in sequence, the data-set images are first annotated, the position coordinates of a target frame being (x_k, y_k, w_k, h_k), where x_k, y_k are the horizontal and vertical coordinates of the upper-left corner of the target frame and w_k, h_k are its width and height;
data enhancement is then carried out: the vertical pixels of each data-set image are kept unchanged while the horizontal positions are flipped, giving new position coordinates (x'_k, y_k, w_k, h_k), where x'_k = W - x_k - w_k and W is the image width;
and finally the data set is divided into a training set and a test set at a ratio of m:n, where m > n.
3. The deep learning-based video SAR target detection method according to claim 1, wherein in step (2) an initial neural network model is constructed using the VGG network construction method, and residual and skip structures are introduced so that the number of network layers can be deepened; an FPN structure is introduced, the feature maps of different scales before and after the pooling layers are down-sampled, the corresponding elements are summed, and the result is passed through a convolution layer and output.
4. The deep learning-based video SAR target detection method according to claim 1, wherein in step (2) the constructed residual network is trained using an SAR image classification data set.
5. The deep learning-based video SAR target detection method according to claim 1, wherein in step (3) the K-Means clustering algorithm is used to compute the height-to-width ratios of the targets in the data set, and N cluster centres are obtained as the aspect ratios of the subsequent prior anchor boxes; anchor boxes of different sizes with the N clustered ratios are placed over the data-set images, the selected regions are mapped onto the feature map according to the correspondence of the SPP-net algorithm, and the resulting feature maps are fed into a classification layer and a regression layer respectively, the classification layer distinguishing whether the current anchor box contains a target and outputting a confidence S, and the regression layer outputting the position coordinates of the candidate prediction boxes.
6. The deep learning-based video SAR target detection method according to claim 1, wherein in step (4) an ROI Align layer in the Faster-RCNN network is constructed; the ROI Align layer traverses each candidate region, keeps the floating-point boundaries unquantized, and divides each candidate region into k × k cells, k being a positive integer, whose boundaries are likewise not quantized; four fixed sampling positions are computed in each cell, their values are obtained by bilinear interpolation, and a max-pooling operation is then applied so that candidate regions of different sizes are mapped to a fixed size.
7. The deep learning-based video SAR target detection method according to claim 6, wherein in step (4) an RCNN layer in the Faster-RCNN network is constructed, and the fixed-size candidate regions output by the ROI Align layer are fed into two convolutional neural networks, one being a classification neural network for predicting the object class against the background, and the other being a regression neural network for outputting the position coordinates of the target frame.
8. The deep learning-based video SAR target detection method according to claim 1, wherein in step (4) a staged learning-rate method is adopted to train the loss function of the Faster-RCNN.
9. The deep learning-based video SAR target detection method according to claim 8, wherein the loss function L of the Faster-RCNN is as follows:

L = (1/N_cls1) Σ_i L_cls1(P_1i, P*_1i) + (1/N_cls2) Σ_i L_cls2(P_2i, P*_2i) + (λ/N_reg) Σ_i P*_i · L_reg(t_i, t*_i)

in which the first term of L is the RPN classification loss: N_cls1 is the number of prediction boxes output by the RPN, P_1i is the predicted probability that the i-th prediction box is foreground, P*_1i = 1 if the i-th prediction box is foreground and P*_1i = 0 otherwise, and L_cls1 is the binary cross-entropy loss function; the second term of L is the RCNN classification loss: N_cls2 is the number of prediction boxes output by the RCNN layer, P_2i is the probability predicted for each class for the i-th prediction box, P*_2i takes 1 for the actual class and 0 for the other classes, and L_cls2 is the multi-class cross-entropy loss function; the third term of L is the regression loss of the RPN and the RCNN: λ is a preset weight coefficient, N_reg is the area of the output feature map, P*_i is the probability that an object is in the i-th prediction box, t_i is the coordinate vector (x, y, w, h) of the i-th prediction box, and t*_i is the coordinate vector of the actual frame, where x and y are the horizontal and vertical coordinates of the upper-left corner of the frame and w and h are its width and height; and the loss function L_reg is defined as:

L_reg(t_i, t*_i) = Σ_j smooth_L1(t_ij - t*_ij), where smooth_L1(d) = 0.5 d² if |d| < 1, and |d| - 0.5 otherwise.
10. The deep learning-based video SAR target detection method according to claim 1, wherein during testing the Soft-NMS method is applied to the result output by the Faster-RCNN network to eliminate overlapping boxes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911257311.5A CN111144234A (en) | 2019-12-10 | 2019-12-10 | Video SAR target detection method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911257311.5A CN111144234A (en) | 2019-12-10 | 2019-12-10 | Video SAR target detection method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111144234A true CN111144234A (en) | 2020-05-12 |
Family
ID=70517869
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911257311.5A Pending CN111144234A (en) | 2019-12-10 | 2019-12-10 | Video SAR target detection method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111144234A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111680655A (en) * | 2020-06-15 | 2020-09-18 | 深延科技(北京)有限公司 | Video target detection method for aerial images of unmanned aerial vehicle |
CN112016594A (en) * | 2020-08-05 | 2020-12-01 | 中山大学 | Collaborative training method based on domain self-adaptation |
CN112200115A (en) * | 2020-10-21 | 2021-01-08 | 平安国际智慧城市科技股份有限公司 | Face recognition training method, recognition method, device, equipment and storage medium |
CN112686340A (en) * | 2021-03-12 | 2021-04-20 | 成都点泽智能科技有限公司 | Dense small target detection method based on deep neural network |
CN113673534A (en) * | 2021-04-22 | 2021-11-19 | 江苏大学 | RGB-D image fruit detection method based on fast RCNN |
CN113836985A (en) * | 2020-06-24 | 2021-12-24 | 富士通株式会社 | Image processing apparatus, image processing method, and computer-readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109584227A (en) * | 2018-11-27 | 2019-04-05 | Shandong University | Welding-spot quality detection method and implementation system based on a deep-learning target detection algorithm |
CN110110783A (en) * | 2019-04-30 | 2019-08-09 | Tianjin University | Deep-learning object detection method based on multi-layer feature map connections |
CN110321815A (en) * | 2019-06-18 | 2019-10-11 | China Jiliang University | Road crack recognition method based on deep learning |
- 2019-12-10 CN CN201911257311.5A patent/CN111144234A/en active Pending
Non-Patent Citations (1)
Title |
---|
WANG Huiling, QI Xiaolong, WU Gangshan: "Research progress of object detection techniques based on deep convolutional neural networks" * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111680655A (en) * | 2020-06-15 | 2020-09-18 | 深延科技(北京)有限公司 | Video target detection method for aerial images of unmanned aerial vehicle |
CN113836985A (en) * | 2020-06-24 | 2021-12-24 | 富士通株式会社 | Image processing apparatus, image processing method, and computer-readable storage medium |
CN112016594A (en) * | 2020-08-05 | 2020-12-01 | 中山大学 | Collaborative training method based on domain self-adaptation |
CN112016594B (en) * | 2020-08-05 | 2023-06-09 | 中山大学 | Collaborative training method based on field self-adaption |
CN112200115A (en) * | 2020-10-21 | 2021-01-08 | 平安国际智慧城市科技股份有限公司 | Face recognition training method, recognition method, device, equipment and storage medium |
CN112200115B (en) * | 2020-10-21 | 2024-04-19 | 平安国际智慧城市科技股份有限公司 | Face recognition training method, recognition method, device, equipment and storage medium |
CN112686340A (en) * | 2021-03-12 | 2021-04-20 | 成都点泽智能科技有限公司 | Dense small target detection method based on deep neural network |
CN113673534A (en) * | 2021-04-22 | 2021-11-19 | 江苏大学 | RGB-D image fruit detection method based on fast RCNN |
CN113673534B (en) * | 2021-04-22 | 2024-06-11 | 江苏大学 | RGB-D image fruit detection method based on FASTER RCNN |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111144234A (en) | Video SAR target detection method based on deep learning | |
CN110135267B (en) | Large-scene SAR image fine target detection method | |
CN109934282B (en) | SAGAN sample expansion and auxiliary information-based SAR target classification method | |
CN108596101B (en) | Remote sensing image multi-target detection method based on convolutional neural network | |
CN108647655B (en) | Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network | |
CN111191566B (en) | Optical remote sensing image multi-target detection method based on pixel classification | |
CN110929607B (en) | Remote sensing identification method and system for urban building construction progress | |
CN112132093B (en) | High-resolution remote sensing image target detection method and device and computer equipment | |
CN107358260B (en) | Multispectral image classification method based on surface wave CNN | |
CN108428220B (en) | Automatic geometric correction method for ocean island reef area of remote sensing image of geostationary orbit satellite sequence | |
CN107909015A (en) | Hyperspectral image classification method based on convolutional neural networks and empty spectrum information fusion | |
CN107067405B (en) | Remote sensing image segmentation method based on scale optimization | |
CN110598600A (en) | Remote sensing image cloud detection method based on UNET neural network | |
CN110766058B (en) | Battlefield target detection method based on an optimized RPN (Region Proposal Network) | |
CN110163213B (en) | Remote sensing image segmentation method based on disparity map and multi-scale depth network model | |
CN111914924B (en) | Rapid ship target detection method, storage medium and computing equipment | |
CN110163207B (en) | Ship target positioning method based on Mask-RCNN and storage device | |
CN110728197B (en) | Single-tree-level tree species identification method based on deep learning | |
CN106295613A (en) | A kind of unmanned plane target localization method and system | |
CN111368935B (en) | SAR time-sensitive target sample amplification method based on generation countermeasure network | |
CN113850129A (en) | Target detection method for rotary equal-variation space local attention remote sensing image | |
CN113435253A (en) | Multi-source image combined urban area ground surface coverage classification method | |
CN113486819A (en) | Ship target detection method based on YOLOv4 algorithm | |
CN112464745A (en) | Ground feature identification and classification method and device based on semantic segmentation | |
CN109558803B (en) | SAR target identification method based on convolutional neural network and NP criterion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||