CN114973023B - High-resolution SAR image vehicle target key part extraction method based on fast RCNN - Google Patents

High-resolution SAR image vehicle target key part extraction method based on fast RCNN

Info

Publication number
CN114973023B
CN114973023B
Authority
CN
China
Prior art keywords
network model
vehicle
data set
training
faster rcnn
Prior art date
Legal status
Active
Application number
CN202210913427.5A
Other languages
Chinese (zh)
Other versions
CN114973023A (en)
Inventor
张月婷
胡玉新
郭嘉逸
Current Assignee
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN202210913427.5A
Publication of CN114973023A
Application granted
Publication of CN114973023B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/08 Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for extracting key parts of a vehicle target from a high-resolution SAR image based on Faster RCNN, which relates to the technical field of image processing and comprises the following steps: preprocessing a plurality of SAR sensing images to obtain a plurality of slice image data; labeling the slice image data, and dividing the labeled slice image data into a training data set and a test data set according to a preset proportion; constructing a Faster RCNN network model based on a VGG16 network and an RPN network, setting training parameters of the Faster RCNN network model, and constructing a loss function of the Faster RCNN network model; and sequentially inputting the training data set and the test data set into the Faster RCNN network model for training and testing, and storing the tested network model parameters to obtain a tested Faster RCNN network model used for extracting the key parts of the vehicle target. The invention also provides a Faster RCNN based high-resolution SAR image vehicle target key part extraction device, an electronic device and a storage medium.

Description

High-resolution SAR image vehicle target key part extraction method based on fast RCNN
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a device for extracting key parts of a vehicle target from a high-resolution SAR image based on fast RCNN, electronic equipment and a storage medium.
Background
A vehicle is a typical man-made target, generally divided into a vehicle body and a vehicle head. The analysis and information extraction of key parts of vehicle targets in SAR (Synthetic Aperture Radar) images is of great significance in both military and civil applications. Extracting the key parts of a vehicle target from a high-resolution SAR image requires solving the following problem: given high-resolution SAR image data, extract the vehicle head and vehicle body parts of the vehicle targets contained therein, using image slices each containing a single vehicle together with the image observation data.
With earlier low- and medium-resolution SAR images, this problem could not be solved, because vehicle targets are highly mobile and small in size. With the rapid improvement of SAR image resolution in recent years, large amounts of data with a resolution better than 0.5 m have been acquired, and vehicle targets show abundant detail features in high-resolution SAR images, so that detection and refined information extraction of vehicle targets from SAR images have become feasible.
In the field of SAR image target recognition, traditional methods follow the same line of research as for optical images, achieving detection and recognition by extracting brightness, texture, geometric and other features of image regions. In recent years, deep-learning-based methods have gradually been developed and applied to image processing. Deep learning based on convolutional neural networks has advanced rapidly and has been applied successfully to speech recognition, signal extraction and image applications, with particularly good performance in face recognition, target detection in natural images and information extraction. The Convolutional Neural Network (CNN), one type of deep learning network, extracts and learns local image features through multiple convolutional layers and fits the features through pooling, full connection and similar operations. CNNs have strong feature-characterization and adaptive-learning capabilities and are well suited to target detection and recognition.
However, most CNN-based methods for SAR image target detection and recognition still follow the application forms designed for natural images and do not take into account the specific characteristics of targets in SAR images, such as vehicle target structure characteristics, satellite observation conditions and other prior conditions.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present invention provide a method, an apparatus, an electronic device, a storage medium and a program product for extracting key parts of a vehicle target from high-resolution SAR images based on Faster RCNN (Faster Region-based Convolutional Neural Network), which are intended to improve the feature extraction accuracy for the vehicle head and vehicle body regions.
A first aspect of the present invention provides a method for extracting key parts of a vehicle target from a high-resolution SAR image based on Faster RCNN, which comprises the following steps: preprocessing a plurality of SAR sensing images to obtain a plurality of slice image data; labeling the slice image data, and dividing the labeled slice image data into a training data set and a test data set according to a preset proportion; constructing a Faster RCNN network model based on a VGG16 (Visual Geometry Group 16) network and an RPN (Region Proposal Network), setting training parameters of the Faster RCNN network model and constructing a loss function of the Faster RCNN network model; and sequentially inputting the training data set and the test data set into the Faster RCNN network model for training and testing, and storing the tested network model parameters to obtain a tested Faster RCNN network model used for extracting the key parts of the vehicle target.
Further, constructing a loss function of the Faster RCNN network model, comprising: constructing a loss function of a fast RCNN network model according to the L1 norm, parameter loss between the position structure parameter and the true value of the predicted target and a characteristic function established based on the observation condition and the scattering characteristic of the target; the characteristic function is obtained through image space change and represents strong scattering characteristics of interaction between the target vehicle and the ground.
Further, preprocessing a plurality of SAR sensing images to obtain a plurality of slice image data, including: carrying out geocoding geometric correction and slice segmentation processing on a plurality of SAR sensing images acquired by an SAR sensor at different shooting angles in the same working mode to obtain a plurality of slice image data; the plurality of slice image data are equal in size.
Further, the labeling processing of the slice image data includes: and according to the three-dimensional model of the vehicle with the known type, labeling the slice image data by using the minimum circumscribed rectangle frame and the vehicle main shaft according to the image areas corresponding to the vehicle head and the vehicle body to obtain the labeled slice image data.
Further, dividing the marked slice image data into a training data set and a testing data set according to a preset proportion, comprising: generating corresponding xml parameter files for each marked slice image data, and recording parameter information of the vehicle area; and dividing the slice image data with the recorded parameters into a training data set and a test data set according to a preset proportion.
Further, the training parameters include at least: the method comprises the following steps of inputting the number of channels, outputting the number of channels of a full connection layer, the size of a ROI (Region of Interest) feature layer, a Classification parameter, a model learning rate, the number of extraction frames, the number of data samples captured by one training and the total iteration number.
Further, the loss function Loss satisfies the following relationship:

Loss = λ1·L_para + λ2·L_iou + λ3·L_d

wherein λ1, λ2 and λ3 all take the value 0.1; L_para represents the L1 norm of the parameter estimates; L_iou represents the parameter loss between the position and structure parameters of the predicted target and their true values and the IOU (Intersection over Union) loss between the predicted box and the ground-truth box; and L_d represents a feature function established on the basis of the observation conditions and the scattering features of the target, computed over the N samples from the tilde-marked parameter estimates, the angle value θ_i of the calculation process of the i-th sample data, and the angle value calculated from θ_i.
Further, the preset ratio is 3:1 or 4:1.
Further, the vehicle target key part comprises: the key parts of the vehicle head and the vehicle body.
A second aspect of the present invention provides a Faster RCNN based high-resolution SAR image vehicle target key part extraction device, comprising: an image preprocessing module for preprocessing a plurality of SAR sensing images to obtain a plurality of slice image data; a data set generating module for labeling the slice image data and dividing the labeled slice image data into a training data set and a test data set according to a preset proportion; a model building module for building a Faster RCNN network model based on a VGG16 network and an RPN network, setting training parameters of the Faster RCNN network model and building a loss function of the Faster RCNN network model; and a model training and testing module for sequentially inputting the training data set and the test data set into the Faster RCNN network model for training and testing and storing the tested network model parameters to obtain a tested Faster RCNN network model, the tested Faster RCNN network model being used for extracting the key parts of the vehicle target.
A third aspect of the present invention provides an electronic apparatus comprising: the device comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein when the processor executes the computer program, the method for extracting the key parts of the vehicle target based on the high-resolution SAR image of the Faster RCNN provided by the first aspect of the invention is realized.
A fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method for extracting vehicle key parts from high-resolution SAR images based on fast RCNN according to the first aspect of the present invention.
The method constructs a Faster RCNN network model based on a VGG16 network and an RPN network and, taking the observation characteristics of the SAR image into account, builds constraint conditions from the observation angle and the strong scattering features of the interaction between the vehicle target and the ground, thereby optimizing the network parameters. Compared with traditional methods that extract specific hand-crafted features, the method improves the level of automation and efficiency, and constitutes a new key part extraction method for vehicle targets in high-resolution SAR images.
Drawings
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart schematically illustrating a method for extracting key portions of a vehicle target from a high-resolution SAR image based on fast RCNN according to an embodiment of the present invention;
FIG. 2 schematically illustrates a vehicle labeling and geometry diagram according to an embodiment of the present invention;
FIG. 3 schematically shows a block diagram of a Faster RCNN based high-resolution SAR image vehicle target key part extraction apparatus according to an embodiment of the present invention;
fig. 4 schematically shows a block diagram of an electronic device adapted to implement the above described method according to an embodiment of the invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. It is to be understood that such description is merely illustrative and not intended to limit the scope of the present invention. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "A, B and at least one of C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include, but not be limited to, systems that have a alone, B alone, C alone, a and B together, a and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "A, B or at least one of C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include, but not be limited to, systems that have a alone, B alone, C alone, a and B together, a and C together, B and C together, and/or A, B, C together, etc.).
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of the present invention may be implemented in hardware and/or in software (including firmware, microcode, etc.). Furthermore, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon for use by or in connection with an instruction execution system.
Fig. 1 schematically shows a flowchart of a method for extracting key parts of a vehicle target from a high-resolution SAR image based on Faster RCNN according to an embodiment of the present invention. As shown in Fig. 1, the method includes operations S101 to S104.
In operation S101, a plurality of SAR sensing images are preprocessed to obtain a plurality of slice image data.
In the embodiment of the invention, the plurality of SAR sensing images can be a plurality of SAR sensing images acquired by the SAR sensor at different shooting angles in the same working mode. The preprocessing of the multiple SAR sensing images may specifically include: and carrying out geocoding geometric correction on the plurality of SAR sensing images to obtain a plurality of SAR secondary images, and then carrying out slice segmentation processing on the plurality of SAR secondary images to obtain a plurality of slice image data. It should be noted that the plurality of SAR secondary images may be secondary images obtained by image product level selection.
In particular, the plurality of slice images may be slices I(x, y) cut from the SAR image containing the vehicle targets. Preferably, the number of slice images is not less than 200, the resolution is better than 0.3 m, and the slice image size is 128 × 128.

It should be noted that, in some other embodiments, to meet the requirements of image processing in different application scenarios, the slice image size may also be 256 × 256, 512 × 512, etc., which is not limited by the embodiments of the present invention.
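By way of example only, the slice-segmentation step can be sketched as follows in Python, assuming the geocoded SAR amplitude image is available as a two-dimensional NumPy array; the function name, the non-overlapping stride and the border handling are illustrative assumptions rather than part of the claimed method.

```python
import numpy as np

def slice_sar_image(image: np.ndarray, slice_size: int = 128, stride: int = 128):
    """Cut a geocoded SAR amplitude image into equal-size square slices.

    Slices that would run past the image border are discarded so that all
    returned slices have the same size, as required by the method.
    """
    slices = []
    rows, cols = image.shape
    for r in range(0, rows - slice_size + 1, stride):
        for c in range(0, cols - slice_size + 1, stride):
            slices.append(image[r:r + slice_size, c:c + slice_size])
    return slices

# Example: 64 non-overlapping 128 x 128 slices from a dummy 1024 x 1024 image.
dummy = np.random.rand(1024, 1024).astype(np.float32)
chips = slice_sar_image(dummy, slice_size=128)
print(len(chips))
```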
In operation S102, the slice image data is labeled, and the labeled slice image data is divided into a training data set and a test data set according to a preset ratio.
In the embodiment of the present invention, data labeling is performed on the plurality of slice image data generated in step S101, and the slice image data is divided into a training data set and a test data set according to a preset ratio, wherein the data labeling may be performed based on vehicles of known model types.
For each vehicle target slice, according to the three-dimensional model of the vehicle of known type and as shown in Fig. 2, the slice image data is annotated using the minimum circumscribed rectangular boxes of the image areas corresponding to the vehicle head and the vehicle body, together with the vehicle main axis a_i (the included angle between the vehicle center axis and the observation direction y in the image region), to obtain the annotated slice image data.
Specifically, the labeled parameters are as follows. The minimum circumscribed rectangular boxes of the vehicle head and the vehicle body enclose the image area covered by the vehicle. The labeling format of the minimum circumscribed rectangular box of the vehicle head is: the coordinates of the head center point on the image P_i(P_xi, P_yi), the head width w_i and the head length l_i. The labeling format of the minimum circumscribed rectangular box of the vehicle body is: the body width W_i and the body length L_i. The vehicle main axis a_i is shown in Fig. 2.
Then, after the image slice labeling is finished, the labeled slice image data is divided into a training data set and a test data set according to a preset proportion. First, a corresponding xml parameter file is generated for each image slice, recording the parameter information of the vehicle region, namely the 7 parameters P_i(P_xi, P_yi), w_i, l_i, W_i, L_i and a_i. Second, the slice image data with recorded parameters is divided into a training data set and a test data set according to a preset proportion. It should be noted that the preset proportion includes, but is not limited to, 3:1, 4:1, etc.
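By way of example only, the annotation-file generation and the train/test split can be sketched as follows; the xml element names, the dictionary keys and the random seed are illustrative assumptions.

```python
import random
import xml.etree.ElementTree as ET

def write_annotation(xml_path: str, params: dict) -> None:
    """Write one xml parameter file for an annotated vehicle slice.

    params holds the seven labeled values, e.g.
    {"Px": 63.0, "Py": 70.0, "w": 12.0, "l": 18.0, "W": 14.0, "L": 46.0, "a": 32.5}.
    """
    root = ET.Element("vehicle")
    for key, value in params.items():
        ET.SubElement(root, key).text = str(value)
    ET.ElementTree(root).write(xml_path)

def split_dataset(samples, ratio=(3, 1), seed=0):
    """Randomly split the annotated slices into training and test sets (e.g. 3:1)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_train = len(samples) * ratio[0] // sum(ratio)
    return samples[:n_train], samples[n_train:]
```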
In operation S103, a fast RCNN network model is constructed based on the VGG16 network and the RPN network, and training parameters of the fast RCNN network model and a loss function of the fast RCNN network model are set.
In the embodiment of the present invention, following the above-mentioned embodiment, the parameter set to be estimated by the network model consists of the seven labeled parameters P_xi, P_yi, w_i, l_i, W_i, L_i and a_i. A Faster RCNN network model is selected for extracting the key parts of the vehicle target and is specifically constructed from a VGG16 network and an RPN network. Then the model training parameters are set; the main parameters are set as follows:
1) VGG16 network parameter setting: VGG model convolutional layer parameters, for example convolution kernel_size = 3, padding pad = 1, stride = 1; pooling layer parameters: kernel_size = 2, stride = 2, no padding.
2) RPN network parameter setting: 256 input channels and 256 feature layer channels; an anchor generator is created with anchor scales Scales = [8], aspect ratios Ratios = [0.5, 1.0, 2.0] and anchor strides Strides = [4, 8, 16, 32, 64].
3) The number of input channels in_channels = 256; the number of fully connected layer output channels fc_out_channels = 1024; the ROI feature layer size roi_feat_size = 7.
4) Classification parameter setting: num_classes = 2; three additional parameter vectors are extended after the bounding box regression, and the detection box parameters are output as the result.
5) Training parameter setting: the learning rate is 2×10⁻⁴; the number of extraction boxes Box is set to 20; the number of data samples captured in one training batch (batch_size) is 8; the total number of iterations is 200 epochs. An illustrative construction example based on these settings is given below.
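By way of example only, a construction along these lines can be approximated with torchvision's Faster R-CNN implementation, using the VGG16 convolutional layers as the backbone, an RPN anchor generator with the 0.5/1.0/2.0 aspect ratios, a 7 x 7 ROI feature layer, two foreground key-part classes and the stated learning rate of 2×10⁻⁴; the anchor sizes, the SGD momentum and the use of torchvision itself are simplifying assumptions, not the patented configuration.

```python
import torch
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# VGG16 convolutional layers as the feature extractor; the last feature map has 512 channels.
backbone = torchvision.models.vgg16(weights=None).features  # pretrained weights could be loaded instead
backbone.out_channels = 512

# RPN anchor generator over the single VGG16 feature map, with the three aspect ratios listed above.
anchor_generator = AnchorGenerator(
    sizes=((32, 64, 128, 256),),       # illustrative anchor sizes
    aspect_ratios=((0.5, 1.0, 2.0),),
)

# 7 x 7 ROI feature layer, matching roi_feat_size = 7.
roi_pooler = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)

# Two key-part classes (vehicle head, vehicle body) plus the background class required by torchvision.
model = FasterRCNN(
    backbone,
    num_classes=3,
    rpn_anchor_generator=anchor_generator,
    box_roi_pool=roi_pooler,
)

# Optimiser with the stated learning rate of 2e-4 (momentum value is an assumption).
optimizer = torch.optim.SGD(model.parameters(), lr=2e-4, momentum=0.9)
```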
The training parameters of the Faster RCNN network model are set according to the above values, and then the loss function of the Faster RCNN network model is constructed. In the embodiment of the present invention, the loss function of the Faster RCNN network model is constructed from the L1 norm of the parameter estimates, the parameter loss between the position and structure parameters of the predicted target and their true values, and a feature function established on the basis of the observation conditions and the scattering features of the target. The feature function is obtained through an image space transformation and represents the strong scattering features of the interaction between the target vehicle and the ground.
In particular, the constructed loss function Loss satisfies the following relationship:

Loss = λ1·L_para + λ2·L_iou + λ3·L_d

wherein L_para represents the L1 norm of the parameter estimates; L_iou represents the parameter loss between the position and structure parameters of the predicted target and their true values, together with the IOU loss between the predicted boxes and the ground-truth boxes; L_d represents the feature function term established on the basis of the observation conditions and the scattering features of the target; and λ1, λ2 and λ3 all take the value 0.1.
In the specific expression of the loss function Loss, N represents the number of samples; i denotes the i-th sample data; IoU denotes the intersection-over-union ratio; Box_1-i denotes the ground-truth vehicle head detection box of the i-th sample data; Box_1-pre denotes the predicted vehicle head detection box of the i-th sample data; Box_2-i denotes the ground-truth vehicle body detection box of the i-th sample data; and Box_2-pre denotes the predicted vehicle body detection box of the i-th sample data.
The third term L_d can be calculated as follows. First, an image space transformation is carried out: an image space transformation matrix T, parameterized by a rotation angle and by the tilde-marked parameter estimates, is applied to the slice to obtain a transformed image I', in which the coordinates (m, n) are the transformed counterparts of the original coordinates (x, y). The one-dimensional accumulated data of the transformed image I' is then computed approximately, and the angle value θ_i of the calculation process of the i-th sample data is obtained by solving the corresponding equation on this accumulated data. Finally, the third term L_d of the loss function is computed over the N samples by comparing the angle value θ_i calculated for each sample with the corresponding tilde-marked estimated angle parameter.
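By way of example only, a composite loss of this structure can be sketched as follows, combining an L1 term over the structure parameters, an IoU-based term over the head and body boxes, and an angle-consistency term, with λ1 = λ2 = λ3 = 0.1 as stated above; the concrete expressions (including the 1 − IoU form and the use of torchvision.ops.box_iou) are illustrative assumptions, since the exact patented formulas are given only in figure form.

```python
import torch
from torchvision.ops import box_iou

def composite_loss(pred_params, true_params,
                   pred_head, true_head, pred_body, true_body,
                   pred_angle, calc_angle,
                   lambdas=(0.1, 0.1, 0.1)):
    """Illustrative composite loss: L1 parameter term, IoU box term, angle term.

    All inputs are batched tensors; boxes are in (x1, y1, x2, y2) format as
    expected by torchvision.ops.box_iou.
    """
    l_para = torch.abs(pred_params - true_params).mean()      # L1 norm of the parameter estimates
    iou_head = box_iou(pred_head, true_head).diagonal()       # per-sample head-box IoU
    iou_body = box_iou(pred_body, true_body).diagonal()       # per-sample body-box IoU
    l_iou = ((1.0 - iou_head) + (1.0 - iou_body)).mean()      # penalise low overlap with ground truth
    l_d = torch.abs(pred_angle - calc_angle).mean()           # angle consistency (observation/scattering term)
    l1, l2, l3 = lambdas
    return l1 * l_para + l2 * l_iou + l3 * l_d
```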
the fast RCNN network model and the loss function are constructed through the method, the trained network model is obtained through data training, the network model considers the characteristics of the SAR image and the scattering characteristics of the vehicle target, the loss function is constructed based on the strong scattering characteristics of the interaction between the vehicle target and the ground, and the optimization of network parameters is realized.
It should be noted that the model training parameters are not limited to the specific values given above and may be adjusted according to actual application requirements; the parameter settings are merely exemplary and do not limit the embodiments of the present invention.
In operation S104, the training data set and the testing data set are sequentially input into the fast RCNN network model for training and testing, and parameters of the tested network model are stored to obtain the tested fast RCNN network model, which is used for extracting key parts of the vehicle target.
In the embodiment of the invention, based on the model training parameters, the training data set and the test data set are sequentially input into the Faster RCNN network model for training and testing, and the tested network model parameters are stored to obtain the tested Faster RCNN network model. Therefore, the extraction network model of the vehicle target head and the vehicle body is established.
Specifically, during model training it can be judged whether the output value of the model loss function is smaller than a threshold value; if so, the training result is considered good and the training is finished. Otherwise, the model training parameters are optimized and training continues.
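By way of example only, the training-and-saving procedure with a loss threshold can be sketched as follows, assuming a torchvision-style detection model that returns a dictionary of losses in training mode; the threshold value, checkpoint path and data-loader format are illustrative assumptions.

```python
import torch

def train_and_save(model, train_loader, optimizer, epochs=200,
                   loss_threshold=0.05, ckpt_path="faster_rcnn_vehicle.pth"):
    """Train until the mean epoch loss falls below the threshold (or epochs run out),
    then store the network parameters for later key-part extraction."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device).train()
    for epoch in range(epochs):
        epoch_loss = 0.0
        for images, targets in train_loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)   # torchvision detection models return losses in train mode
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        epoch_loss /= max(len(train_loader), 1)
        if epoch_loss < loss_threshold:          # training considered finished
            break
    torch.save(model.state_dict(), ckpt_path)
```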
According to the method for extracting the key parts of the vehicle target from the high-resolution SAR image based on Faster RCNN provided by the embodiment of the present invention, a Faster RCNN network model is constructed based on a VGG16 network and an RPN network, the observation characteristics of the SAR image are considered, and the observation angle and target information are used to construct constraint conditions and optimize the network parameters. Compared with traditional methods that extract specific hand-crafted features, the method improves the level of automation and efficiency, and is a novel key part extraction method for vehicle targets in high-resolution SAR images.
Fig. 3 schematically shows a block diagram of a Faster RCNN based high-resolution SAR image vehicle target key part extraction apparatus according to an embodiment of the present invention.
As shown in fig. 3, the high resolution SAR image vehicle target key part extracting device 300 based on the fast RCNN includes: an image preprocessing module 310, a dataset generation module 320, a model construction module 330, and a model training and testing module 340. The device 300 can be used for implementing the method for extracting key parts of the vehicle target based on the fast RCNN high-resolution SAR image described with reference to fig. 1.
The image preprocessing module 310 is configured to preprocess the plurality of SAR sensing images to obtain a plurality of slice image data. The image preprocessing module 310 may be configured to perform the step S101 described above with reference to fig. 1, for example, and is not described herein again.
And the data set generating module 320 is configured to label the slice image data, and divide the labeled slice image data into a training data set and a test data set according to a preset ratio. The data set generating module 320 may be configured to perform the step S102 described above with reference to fig. 1, for example, and is not described herein again.
The model building module 330 is configured to build a fast RCNN network model based on the VGG16 network and the RPN network, set training parameters of the fast RCNN network model, and build a loss function of the fast RCNN network model. The model building module 330 may be used to perform the step S103 described above with reference to fig. 1, for example, and will not be described herein again.
And the model training and testing module 340 is used for sequentially inputting the training data set and the testing data set into the Faster RCNN network model for training and testing, storing the parameters of the tested network model and obtaining the tested Faster RCNN network model, wherein the tested Faster RCNN network model is used for extracting the key part of the vehicle target. The model training and testing module 340 may be used to perform the step S104 described above with reference to fig. 1, for example, and will not be described herein again.
Any of the modules, sub-modules, units, sub-units, or at least part of the functionality of any of them according to embodiments of the invention may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present invention may be implemented by being divided into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present invention may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a device on a chip, a device on a substrate, a device on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the present invention may be at least partially implemented as computer program modules, which, when executed, may perform the corresponding functions.
For example, any of the image preprocessing module 310, the data set generation module 320, the model construction module 330, and the model training and testing module 340 may be combined in one module for implementation, or any one of them may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present invention, at least one of the image preprocessing module 310, the data set generation module 320, the model construction module 330, and the model training and testing module 340 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a device on a chip, a device on a substrate, a device on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in any suitable combination of any of them. Alternatively, at least one of the image pre-processing module 310, the dataset generation module 320, the model construction module 330 and the model training and testing module 340 may be at least partially implemented as a computer program module, which when executed may perform a corresponding function.
Fig. 4 schematically shows a block diagram of an electronic device adapted to implement the above described method according to an embodiment of the present invention. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 4, the electronic device 400 described in this embodiment includes: a processor 401, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. Processor 401 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 401 may also include onboard memory for caching purposes. Processor 401 may include a single processing unit or a plurality of processing units for performing the various actions of the method flows in accordance with embodiments of the present invention.
In the RAM 403, various programs and data necessary for the operation of the electronic apparatus 400 are stored. The processor 401, ROM 402 and RAM 403 are connected to each other by a bus 404. The processor 401 performs various operations of the method flow according to the embodiment of the present invention by executing programs in the ROM 402 and/or the RAM 403. Note that the programs may also be stored in one or more memories other than the ROM 402 and RAM 403. The processor 401 may also perform various operations of method flows according to embodiments of the present invention by executing programs stored in the one or more memories.
According to an embodiment of the invention, electronic device 400 may also include an input/output (I/O) interface 405, input/output (I/O) interface 405 also being connected to bus 404. Electronic device 400 may also include one or more of the following components connected to I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. A driver 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 410 as needed, so that a computer program read out therefrom is mounted in the storage section 408 as needed.
According to an embodiment of the invention, the method flow according to an embodiment of the invention may be implemented as a computer software program. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable storage medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program performs the above-described functions defined in the apparatus of the embodiment of the present invention when executed by the processor 401. According to an embodiment of the present invention, the above-described apparatuses, devices, apparatuses, modules, units, etc. may be realized by computer program modules.
An embodiment of the present invention further provides a computer-readable storage medium, which may be included in the apparatus/device/apparatus described in the foregoing embodiment; or may be separate and not incorporated into the apparatus/device/apparatus. The computer-readable storage medium carries one or more programs which, when executed, implement a method for extracting key parts of a vehicle based on fast RCNN high-resolution SAR images according to an embodiment of the present invention.
According to embodiments of the present invention, the computer readable storage medium may be a non-volatile computer readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution apparatus, device, or apparatus. For example, according to embodiments of the invention, a computer-readable storage medium may include ROM 402 and/or RAM 403 and/or one or more memories other than ROM 402 and RAM 403 as described above.
Embodiments of the invention also include a computer program product comprising a computer program comprising program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer device, the program code is used for enabling the computer device to realize the method for extracting the key parts of the vehicle target based on the fast RCNN high-resolution SAR image provided by the embodiment of the invention.
The program code, when executed by the processor 401, performs the above-described functions defined in the apparatus of the embodiments of the present invention. The above-described apparatuses, devices, modules, units, etc. may be implemented by computer program modules according to embodiments of the present invention.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, and the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 409, and/or installed from the removable medium 411. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
According to embodiments of the present invention, the program code for carrying out the computer program provided by embodiments of the present invention may be written in any combination of one or more programming languages; in particular, the computer program may be implemented using a high-level procedural and/or object-oriented programming language, and/or an assembly/machine language. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
It should be noted that each functional module in each embodiment of the present invention may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based apparatus that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be appreciated by a person skilled in the art that various combinations and/or combinations of features described in the various embodiments of the invention are possible, even if such combinations or combinations are not explicitly described in the present invention. In particular, various combinations and/or subcombinations of the features described in various embodiments of the invention may be made without departing from the spirit and teachings of the invention. All such combinations and/or associations are within the scope of the present invention.
While the invention has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims and their equivalents. Therefore, the scope of the present invention should not be limited to the above-described embodiments, but should be defined not only by the appended claims but also by equivalents thereof.

Claims (10)

1. A high-resolution SAR image vehicle target key part extraction method based on fast RCNN is characterized by comprising the following steps:
preprocessing a plurality of SAR sensing images to obtain a plurality of slice image data;
labeling the slice image data, and dividing the labeled slice image data into a training data set and a test data set according to a preset proportion;
building a Faster RCNN network model based on a VGG16 network and an RPN network, setting training parameters of the Faster RCNN network model and building a loss function of the Faster RCNN network model;
and sequentially inputting the training data set and the testing data set into the Faster RCNN network model for training and testing, and storing the tested network model parameters to obtain the tested Faster RCNN network model, wherein the tested Faster RCNN network model is used for extracting the key parts of the vehicle target.
2. The method for extracting key parts of a vehicle from a high-resolution SAR image based on fast RCNN according to claim 1, wherein the constructing the loss function of the fast RCNN network model comprises:
constructing a loss function of the Faster RCNN network model according to the L1 norm, parameter loss between the position structure parameter and the true value of the predicted target and a characteristic function established based on the observation condition and the scattering characteristic of the target; the characteristic function is obtained through image space change and represents strong scattering characteristics of interaction between the target vehicle and the ground.
3. The method for extracting key parts of a vehicle target from a high resolution SAR image based on Faster RCNN according to claim 1, wherein the preprocessing the plurality of SAR sensing images to obtain a plurality of slice image data comprises:
carrying out geocoding geometric correction and slice segmentation processing on a plurality of SAR sensing images acquired by an SAR sensor at different shooting angles in the same working mode to obtain a plurality of slice image data; wherein the plurality of slice image data are the same size.
4. The method for extracting key parts of a vehicle target from a high-resolution SAR image based on fast RCNN as claimed in claim 1, wherein the labeling process of the slice image data comprises:
and according to the three-dimensional model of the vehicle with the known type, labeling the slice image data by using a minimum external rectangular frame and a vehicle main shaft according to the image areas corresponding to the vehicle head and the vehicle body to obtain the labeled slice image data.
5. The method for extracting key parts of a vehicle target from a high-resolution SAR image based on fast RCNN as claimed in claim 1, wherein the step of dividing the labeled slice image data into a training data set and a testing data set according to a preset proportion comprises the steps of:
generating corresponding xml parameter files for each marked slice image data, and recording parameter information of the vehicle area;
and dividing the slice image data with the recorded parameters into a training data set and a test data set according to a preset proportion.
6. The method for extracting key parts of a vehicle target from a high resolution SAR image based on Faster RCNN as claimed in claim 1, wherein the training parameters at least comprise: the method comprises the following steps of inputting the number of channels, outputting the number of channels of a full connection layer, the size of an ROI feature layer, a Classification parameter, a model learning rate, the number of extraction frames, the number of data samples captured by one-time training and the total iteration number.
7. The method for extracting key portions of vehicle targets from high resolution SAR images based on Faster RCNN as claimed in claim 2, wherein the loss function Loss satisfies the following relationship:

Loss = λ1·L_para + λ2·L_iou + λ3·L_d

wherein λ1, λ2 and λ3 all take the value 0.1; L_para represents the L1 norm of the parameter estimates; L_iou represents the parameter loss between the position and structure parameters of the predicted target and their true values and the IOU loss between the predicted box and the ground-truth box; and L_d represents a feature function established on the basis of the observation conditions and the scattering features of the target, wherein L_d is computed over the N samples from the tilde-marked parameter estimates, the angle value θ_i of the calculation process of the i-th sample data, and the angle value calculated from θ_i.
8. The method for extracting key parts of the vehicle target from the high-resolution SAR image based on fast RCNN as claimed in claim 1, wherein the preset proportion is 3:1 or 4:1.
9. The method for extracting key parts of a vehicle target from a high-resolution SAR image based on Faster RCNN according to claim 1, wherein the key parts of the vehicle target comprise: the key parts of the vehicle head and the vehicle body.
10. A high resolution SAR image vehicle target key part extraction device based on Faster RCNN is characterized by comprising:
the image preprocessing module is used for preprocessing the SAR sensing images to obtain a plurality of slice image data;
the data set generating module is used for labeling the slice image data and dividing the labeled slice image data into a training data set and a test data set according to a preset proportion;
the model building module is used for building a Faster RCNN model based on a VGG16 network and an RPN network, setting training parameters of the Faster RCNN model and building a loss function of the Faster RCNN model;
and the model training and testing module is used for sequentially inputting the training data set and the testing data set into the Faster RCNN network model for training and testing, storing parameters of the tested network model and obtaining the tested Faster RCNN network model, wherein the tested Faster RCNN network model is used for extracting key parts of a vehicle target.
CN202210913427.5A 2022-08-01 2022-08-01 High-resolution SAR image vehicle target key part extraction method based on fast RCNN Active CN114973023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210913427.5A CN114973023B (en) 2022-08-01 2022-08-01 High-resolution SAR image vehicle target key part extraction method based on fast RCNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210913427.5A CN114973023B (en) 2022-08-01 2022-08-01 High-resolution SAR image vehicle target key part extraction method based on fast RCNN

Publications (2)

Publication Number Publication Date
CN114973023A CN114973023A (en) 2022-08-30
CN114973023B true CN114973023B (en) 2022-10-04

Family

ID=82970204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210913427.5A Active CN114973023B (en) 2022-08-01 2022-08-01 High-resolution SAR image vehicle target key part extraction method based on fast RCNN

Country Status (1)

Country Link
CN (1) CN114973023B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284704A (en) * 2018-09-07 2019-01-29 中国电子科技集团公司第三十八研究所 Complex background SAR vehicle target detection method based on CNN
CN112084897A (en) * 2020-08-25 2020-12-15 西安理工大学 Rapid traffic large-scene vehicle target detection method of GS-SSD
CN114511780A (en) * 2022-01-21 2022-05-17 南京航空航天大学 Multi-mode small target detection method and system based on remote sensing image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9978013B2 (en) * 2014-07-16 2018-05-22 Deep Learning Analytics, LLC Systems and methods for recognizing objects in radar imagery

Also Published As

Publication number Publication date
CN114973023A (en) 2022-08-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant