CN112926457B - SAR image recognition method based on fusion frequency domain and space domain network model - Google Patents


Info

Publication number
CN112926457B
CN112926457B
Authority
CN
China
Prior art keywords
image
frequency domain
domain
frequency
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110220080.1A
Other languages
Chinese (zh)
Other versions
CN112926457A (en)
Inventor
李雪松
李晓冬
杜记川
罗子娟
吴蔚
杨东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 28 Research Institute
Original Assignee
CETC 28 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 28 Research Institute filed Critical CETC 28 Research Institute
Priority to CN202110220080.1A
Publication of CN112926457A
Application granted
Publication of CN112926457B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/48 Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of image recognition and discloses a SAR image recognition method based on a fused frequency-domain and spatial-domain network model, comprising the following steps: converting the original spatial domain image into a frequency-domain image; performing channel selection on the frequency-domain image to obtain an effective frequency-domain signal; inputting the effective frequency-domain signal into a frequency-domain backbone network to extract frequency-domain features; inputting the original spatial domain image into a spatial-domain backbone network to extract spatial-domain features; fusing the spatial-domain and frequency-domain features through the network model; and inputting the fused features into a classifier to recognize and classify targets in the SAR image. The invention designs an end-to-end network model fusing the frequency domain and the spatial domain: it not only considers the pixel characteristics of the SAR image's spatial domain, but also extracts frequency-domain features tailored to the imaging characteristics that distinguish SAR from visible light; fusing the spatial-domain and frequency-domain features further improves the effectiveness and robustness of the SAR image recognition model.

Description

SAR image recognition method based on fusion frequency domain and space domain network model
Technical Field
The invention belongs to the technical field of computer image recognition, and particularly relates to an SAR image recognition method based on a fusion frequency domain and space domain network model.
Background
Synthetic Aperture Radar (SAR) image recognition distinguishes specific targets using their characteristic information, enabling interpretation and analysis of SAR images.
SAR image target recognition is widely applied in the military field, resource exploration, environmental monitoring and other areas. Compared with visible-light and infrared images, SAR images offer all-weather operation, strong penetrability and richer image information. However, SAR images are formed from microwave reflections of the target and typically contain a large amount of noise interference and geometric deformation. Target recognition in SAR images is therefore very challenging.
SAR image recognition has attracted extensive research at home and abroad. The traditional SAR image recognition framework comprises: (1) an image preprocessing module: speckle noise usually exists in SAR images, and this noise interference degrades recognition performance; the preprocessing module suppresses the noise; (2) a feature extraction module: the extraction and selection of features plays a critical role in recognition performance, and SAR target features mainly comprise geometric, scattering and transform features; (3) a classification and recognition module: the extracted features are mapped to a feature space through a classifier to realize target classification and recognition. With the deepening study and wide application of deep learning in computer vision, many deep neural network models have been migrated to SAR image recognition and achieve better results than traditional methods. This is mainly because traditional methods rely on manually designed feature extractors, require expert knowledge and complex parameter tuning, and each method targets only a specific application and fixed scene, so the generalization and robustness of such models are poor. Deep learning performs feature extraction in a data-driven manner by constructing a deep neural network; from a large number of samples it learns deep feature representations closely related to the task, expresses the dataset more efficiently and accurately, and the extracted abstract features generalize better, with stronger model robustness in an end-to-end manner.
Although some deep learning methods designed for visible-light image recognition also achieve good performance in SAR image recognition, there are differences between SAR and visible-light images. On one hand, the complex-valued data of each SAR pixel can be transformed in the frequency domain to extract the corresponding amplitude and phase information: the amplitude information, strongly correlated with the gray-scale information of a visible-light image, is the backscattering intensity of the ground target to the radar wave; the phase information encodes the round-trip propagation distance between the sensor and the ground target. On the other hand, visible-light recognition models generally only model the spatial-domain pixels and the higher-order relations between them, without considering characteristics of the SAR image and of the target to be recognized, such as the non-uniformity of strong background scattering clutter. Considering only spatial-domain feature extraction and model construction is therefore not suitable for SAR image recognition.
In the process of implementing the invention, the inventors found at least the following problems in the prior art. Due to the imaging characteristics of SAR, not only can a spatial-domain amplitude image be obtained, but backscattering characteristics in the frequency domain are also present. Most existing deep learning methods only build network models of the spatial-domain characteristics of a SAR target and do not mine the frequency-domain characteristics of the SAR image, causing a loss of key information. Some deep learning methods do mine the frequency-domain characteristics of the SAR image, but do not verify the validity of the frequency-domain signal. On one hand, because background information exists in the spatial domain image, the frequency-domain signals converted from the background and from the foreground are distributed across different frequency-domain channels; that is, the signals of some frequency-domain channels exist as noise, which is detrimental to SAR image recognition. On the other hand, speckle noise in the SAR image is also distributed across different frequency-domain channels and can interfere with recognition performance. Therefore, how to separate out the effective frequency-domain signal is crucial.
Disclosure of Invention
Purpose of the invention: the invention aims to solve the above technical problems of the prior art and provides a SAR image recognition method based on a fused frequency-domain and spatial-domain network model, in which the spatial-domain pixel information and the frequency-domain characteristics of the SAR image are extracted in an end-to-end manner and fused to obtain the essential characteristics of the target, further improving the accuracy and robustness of SAR image recognition and enhancing the interpretability of the model.
In order to solve the technical problem, the invention discloses an SAR image recognition method based on a fusion frequency domain and space domain network model, which comprises the following steps:
step 1, acquiring an SAR image to be identified, and performing image data enhancement on the acquired SAR image, wherein the enhanced image is used as a spatial domain image;
Step 2, resizing the spatial domain image of step 1 to N×N and dividing the resized image into n×n blocks, obtaining (N/n)×(N/n) blocks; transforming the spatial-domain image signal of each block into a structural form expressed by frequency components through a frequency-domain conversion method, each block having n×n different frequency components; taking the frequency components at the same position across the different blocks as one channel of the constructed frequency-domain image, all n² channels forming a new frequency-domain image F_1 through dimension transformation;
Step 3, inputting the spatial domain image of step 1 into a spatial-domain backbone network to extract a spatial-domain feature vector F_space;
Step 4, different frequency channels influence model performance differently: some play a key role in recognition, while others do not help and only increase training and inference time. Therefore, channel selection is performed on the frequency-domain image F_1 obtained in step 2 to obtain the effective frequency-domain signal F_2;
Step 5, inputting the effective frequency-domain signal F_2 obtained in step 4 into a frequency-domain backbone network to extract a frequency-domain feature vector F_frequency;
Step 6, performing feature fusion on the spatial-domain feature vector F_space obtained in step 3 and the frequency-domain feature vector F_frequency obtained in step 5 to obtain the fused target feature vector F_fusion;
Step 7, inputting the fused feature vector F_fusion obtained in step 6 into a subsequent network for feature dimension reduction and class-probability prediction, realizing recognition and classification of the target.
In one implementation, the image data enhancement of the SAR image in step 1 includes: in the training stage, data enhancement comprises image standardization, image scale transformation, geometric transformation (translation, flipping and the like) and random cropping; in the testing stage, only image scale transformation and image standardization are applied, and the enhanced image is used as the spatial domain image.
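As a sketch of the enhancement chain just described (not the patent's exact pipeline; the crop size, flip probability, and nearest-neighbour rescaling below are illustrative assumptions), the training- and testing-stage modes might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def standardize(img):
    # image standardization: zero mean, unit variance
    return (img - img.mean()) / (img.std() + 1e-8)

def rescale(img, size):
    # image scale transformation (nearest-neighbour; bilinear would work equally well)
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def random_flip(img):
    # geometric transformation: horizontal flip with probability 0.5
    return img[:, ::-1] if rng.random() < 0.5 else img

def random_crop(img, size):
    # random cropping to the network input size
    h, w = img.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

sar = rng.random((500, 480))  # a stand-in SAR amplitude image
# training stage: standardization + scale transform + geometric transform + random crop
train_img = random_crop(random_flip(standardize(rescale(sar, 464))), 448)
# testing stage: only scale transform + standardization
test_img = standardize(rescale(sar, 448))
```

The testing stage deliberately omits the random operations so that inference is deterministic.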
in one implementation, the step 2 includes:
Step 2.1, resizing the spatial domain image to N×N, i.e. both the length and width of the resized image are N, and dividing it into n×n blocks; resizing to N×N ensures that the frequency-domain signals obtained from all images in the dataset are consistent, and dividing into n×n blocks yields (N/n)×(N/n) blocks used to separate frequency-domain signals of different frequencies;
Step 2.2, converting the spatial-domain image signal of each block into a structural form expressed by frequency components using a two-dimensional discrete cosine transform (DCT), computed as:
Y = C_n · X · (C_n)^T
where X is the n×n spatial-domain image signal of a block, Y is the output frequency-domain signal of that block, and C_n is the n×n transform coefficient matrix:
C_n(j, k) = √(α_j / n) · cos((2k + 1)jπ / (2n))
where j, k ∈ {0, 1, 2, …, n−1} denote the positions of a pixel within the block along the vertical and horizontal axes; α_j = 1 when j = 0, and α_j = 2 when j > 0;
Converting each block's spatial-domain image signal into frequency components via the two-dimensional DCT gives better frequency-domain energy compaction, so unimportant frequency-domain regions can be filtered out;
Step 2.3, for the frequency-domain signals Y obtained from the different blocks, extracting and concatenating the frequency components at the same position to serve as one channel of the constructed frequency-domain image; since the frequency-domain signal Y has n² positions in total, the constructed frequency-domain image is a two-dimensional feature vector of size n² × (N/n)², where the same position refers to the i-th position in the frequency-domain signal Y of each block, i ∈ {1, 2, …, n²}; the two-dimensional frequency-domain image is expanded through a dimension transformation into a new three-dimensional frequency-domain image F_1 of size n² × (N/n) × (N/n), where n² is the number of channels. This restores the spatial layout of the frequency-domain signal, benefiting subsequent frequency-domain feature extraction and the fusion of frequency-domain and spatial-domain features.
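Steps 2.1–2.3 can be sketched in numpy as follows (a sketch of the described procedure, not the patent's own code; the function names are illustrative, and the block transform uses the coefficient matrix defined above):

```python
import numpy as np

def dct_matrix(n):
    # C[j, k] = sqrt(alpha_j / n) * cos((2k + 1) j pi / (2n)); alpha_0 = 1, alpha_j = 2 otherwise
    j = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    alpha = np.where(j == 0, 1.0, 2.0)
    return np.sqrt(alpha / n) * np.cos((2 * k + 1) * j * np.pi / (2 * n))

def to_frequency_image(img, n=8):
    # img: (N, N) spatial-domain image with N divisible by n
    N = img.shape[0]
    C = dct_matrix(n)
    # split into (N/n) x (N/n) blocks of size n x n
    blocks = img.reshape(N // n, n, N // n, n).transpose(0, 2, 1, 3)
    # per-block 2-D DCT: Y = C · X · C^T (broadcast over all blocks)
    Y = C @ blocks @ C.T
    # gather each of the n*n frequency positions across all blocks into one channel
    return Y.transpose(2, 3, 0, 1).reshape(n * n, N // n, N // n)

img = np.random.default_rng(0).random((448, 448))
F1 = to_frequency_image(img)
print(F1.shape)  # (64, 56, 56)
```

Channel 0 of F_1 is the DC component of every block, i.e. (up to the scale factor n) the mean of each 8×8 block, which is why the channels retain the blocks' spatial layout.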
In one implementation, in step 3, the spatial domain image is input into a ResNet50-based spatial-domain backbone network to extract the spatial-domain feature vector F_space. This step models the spatial-domain characteristics of the SAR image by mining pixel values and the relations between pixels.
In one implementation, step 4 takes the frequency-domain image as input and obtains the effective frequency-domain signal with an attention-based channel selection method, comprising the following steps:
Step 4.1, inputting the frequency-domain image F_1 into the attention network to obtain the attention feature vector Mask:
Mask = Sigmoid(BN(Conv(ReLU(Conv(F_1)))))
where Conv denotes a 1×1 convolution, BN denotes batch normalization, and Sigmoid and ReLU denote activation functions;
Step 4.2, fusing the attention feature vector Mask with the frequency-domain image F_1 and selecting effective frequencies with a convolutional network model to obtain the effective frequency-domain signal F_2:
F_2 = Conv(Mask ⊙ F_1)
where ⊙ denotes element-wise multiplication.
Different frequency channels influence model performance differently: some play a key role in recognition, while others do not help and only increase training and inference time. Step 4.1 models the importance of the different frequency-channel features with an attention mechanism, capturing channel-level dependencies; step 4.2 further models channel-feature importance with the convolutional neural network and filters out low-importance frequency channels, yielding the effective frequency-domain features.
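A minimal numpy sketch of steps 4.1–4.2 (the weights are random, batch normalization is omitted, and the 16 hidden channels are an illustrative assumption, so this only shows the shape of the computation, not a trained selector):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(x, w):
    # a 1x1 convolution is a per-pixel linear map over channels; x: (C_in, H, W), w: (C_out, C_in)
    return np.einsum('oc,chw->ohw', w, x)

def select_channels(F1, w1, w2, w3):
    # step 4.1: Mask = Sigmoid(Conv(ReLU(Conv(F1)))) -- batch normalization omitted in this sketch
    mask = sigmoid(conv1x1(np.maximum(conv1x1(F1, w1), 0.0), w2))
    # step 4.2: F2 = Conv(Mask * F1) -- gate each frequency channel, then mix with a 1x1 conv
    return conv1x1(mask * F1, w3)

F1 = rng.standard_normal((64, 56, 56))
w1 = rng.standard_normal((16, 64)) * 0.1   # illustrative weights; 16 hidden channels assumed
w2 = rng.standard_normal((64, 16)) * 0.1
w3 = rng.standard_normal((64, 64)) * 0.1
F2 = select_channels(F1, w1, w2, w3)
print(F2.shape)  # (64, 56, 56)
```

In training, the mask weights are learned end-to-end, so channels whose mask values stay near zero are effectively suppressed as noise.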
In one implementation, in step 5, the frequency-domain backbone network is a modified ResNet50 in which the first convolutional layer and the pooling layer of the residual network are removed, so that the network model accepts the frequency-domain signal as input. This step realizes deep learning on frequency-domain signals by adjusting the ResNet50 network, automatically mining the frequency distribution and scattering characteristics to extract the frequency-domain features of the SAR image. The spatial-domain and frequency-domain backbone networks keep the same model structure, which facilitates the subsequent fusion of spatial-domain and frequency-domain features in the same dimensional space.
In one implementation, in step 6, the spatial-domain feature vector F_space and the frequency-domain feature vector F_frequency are fused by concatenation (Concat) along the channel dimension to obtain the feature vector F_fusion:
F_fusion = Concat(F_space, F_frequency, dim=1)
the complementation of the frequency domain characteristic and the spatial domain characteristic is realized through the characteristic fusion of the channel dimensions, and the discriminability and the robustness of the characteristics are enhanced.
In one implementation, in step 7, the subsequent network is a classifier composed of a fully connected layer and a Softmax activation function. The fused feature vector F_fusion is input into the fully connected layer for feature dimension reduction, yielding the feature vector of the recognition target, whose dimension equals the number of target categories; this feature vector is input into the Softmax activation function to predict the probability of each category, and the category with the highest probability is taken as the predicted category, realizing target-category prediction. In the training stage of the subsequent network, the predicted category and the labeled category information are input, and the subsequent network model is trained in a supervised manner with a cross-entropy loss function.
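The classifier of step 7 can be sketched as follows (the weights are random and the 10-class count is an illustrative assumption; in practice both are determined by training and the dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 10  # illustrative number of target categories

def softmax(z):
    z = z - z.max()   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

W = rng.standard_normal((num_classes, 4096)) * 0.01  # fully connected layer: 4096 -> num_classes
b = np.zeros(num_classes)

F_fusion = rng.standard_normal(4096)     # fused feature vector from step 6
probs = softmax(W @ F_fusion + b)        # class-probability prediction
pred = int(np.argmax(probs))             # category with the highest probability

label = 3                                # ground-truth category (illustrative)
loss = -np.log(probs[label] + 1e-12)     # cross-entropy loss for supervised training
```

During training the cross-entropy loss is backpropagated through the whole model, which is what makes the pipeline end-to-end.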
Beneficial effects: the invention discloses a SAR image recognition method based on a fused frequency-domain and spatial-domain network model. An end-to-end network model is designed to fuse spatial-domain pixel information with frequency-domain scattering characteristics, making features from different domains complementary and mining deeper key feature information. Meanwhile, the attention-based frequency-domain channel selection method selects the effective frequency-domain signal, reducing the interference of noisy frequency-domain signals and further improving the effect and performance of SAR image recognition.
Drawings
The foregoing and other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
FIG. 1 is a flow chart of the implementation steps of the present invention;
FIG. 2 is a network diagram of an embodiment of the present invention;
FIG. 3 is a schematic diagram of a channel selection method based on attention mechanism according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a spatial domain backbone network and a frequency domain backbone network according to an embodiment of the present invention.
Detailed Description
The invention is further elucidated with reference to the drawings and the embodiments.
FIG. 1 is a flow chart of the implementation steps of the present invention, which includes the following steps:
step 1, acquiring an SAR image to be identified, and performing image data enhancement on the acquired SAR image, wherein the enhanced image is used as a spatial domain image;
Step 2, resizing the spatial domain image of step 1 to 448×448 and dividing it into 8×8 blocks, obtaining 56×56 blocks; transforming the spatial-domain image signal of each block into a structural form expressed by frequency components through a frequency-domain conversion method, each block having 64 different frequency components; taking the frequency components at the same position across the different blocks as one channel of the constructed frequency-domain image, all 64 channels forming a new frequency-domain image F_1 through dimension transformation;
Step 3, inputting the spatial domain image of step 1 into a spatial-domain backbone network to extract a spatial-domain feature vector F_space;
Step 4, performing channel selection on the frequency-domain image F_1 obtained in step 2 to obtain the effective frequency-domain signal F_2;
Step 5, inputting the effective frequency-domain signal F_2 obtained in step 4 into a frequency-domain backbone network to extract a frequency-domain feature vector F_frequency;
Step 6, performing feature fusion on the spatial-domain feature vector F_space obtained in step 3 and the frequency-domain feature vector F_frequency obtained in step 5 to obtain the fused target feature vector F_fusion;
Step 7, inputting the fused feature vector F_fusion obtained in step 6 into a subsequent network for feature dimension reduction and class-probability prediction, realizing recognition and classification of the target.
In this embodiment, the image data enhancement of the SAR image in step 1 includes: in the training stage, image standardization, image scale transformation and random cropping are adopted; in the testing stage, only image scale transformation and image standardization are applied, and the enhanced image is used as the spatial domain image.
fig. 2 is a schematic network diagram according to an embodiment of the present invention, where step 2 includes:
Step 2.1, resizing the spatial domain image to 448×448, i.e. both the length and width of the resized image are 448 pixels, and dividing it into 8×8 blocks, obtaining 56×56 blocks;
Step 2.2, converting the spatial-domain image signal of each block into a structural form expressed by frequency components using a two-dimensional discrete cosine transform, computed as:
Y = C_n · X · (C_n)^T
where X is the 8×8 spatial-domain image signal of a block, Y is the output frequency-domain signal of that block, and C_n is the 8×8 transform coefficient matrix:
C_n(j, k) = √(α_j / n) · cos((2k + 1)jπ / (2n))
where n = 8 and j, k ∈ {0, 1, 2, …, 7} denote the positions of a pixel within the block along the vertical and horizontal axes; α_j = 1 when j = 0, and α_j = 2 when j > 0;
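The coefficient matrix described above is the orthonormal DCT-II basis; a short numpy check (with n = 8 per block, as in this embodiment) confirms that C·Cᵀ = I, so each block can be recovered exactly from its frequency-domain signal:

```python
import numpy as np

def dct_matrix(n):
    # C[j, k] = sqrt(alpha_j / n) * cos((2k + 1) j pi / (2n)); alpha_0 = 1, alpha_j = 2 otherwise
    j = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    alpha = np.where(j == 0, 1.0, 2.0)
    return np.sqrt(alpha / n) * np.cos((2 * k + 1) * j * np.pi / (2 * n))

C = dct_matrix(8)
X = np.random.rand(8, 8)   # one 8x8 spatial-domain block
Y = C @ X @ C.T            # forward transform: Y = C_n · X · (C_n)^T
X_back = C.T @ Y @ C       # orthonormality (C C^T = I) makes the transform invertible
```

Invertibility matters here: the frequency-domain image F_1 carries exactly the same information as the spatial-domain image, only reorganized by frequency.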
Step 2.3, extracting frequency components at corresponding positions for the frequency domain signals Y obtained from the blocks with different sizes to connect, and using the frequency components as a channel of the constructed frequency domain image; since there are 64 positions in the frequency domain signal Y in total, the constructed frequency domain image is a two-dimensional feature vector with a size of 64 × 3136, and the corresponding position refers to the ith position in the frequency domain signal Y obtained by each size block, i belongs to {1,2, …, 64 }; the two-dimensional frequency domain image is expanded through dimensionality to form a new three-dimensional frequency domain image F 1 The image size is 64 × 56, where 64 is the number of channels in the frequency domain image.
Fig. 4 is a schematic structural diagram of the spatial-domain backbone network and the frequency-domain backbone network according to an embodiment of the present invention. In step 3, the spatial domain image is input into a ResNet50-based spatial-domain backbone network to extract the spatial-domain feature vector F_space; the dimension of this feature vector is 2048.
Fig. 3 is a schematic diagram of a channel selection method based on the attention mechanism according to an embodiment of the present invention, where in step 4, a frequency domain image is input, and an effective frequency domain signal is obtained by using the channel selection method based on the attention mechanism. The method comprises the following steps:
Step 4.1, inputting the frequency-domain image F_1 into the attention network to obtain the attention feature vector Mask:
Mask = Sigmoid(BN(Conv(ReLU(Conv(F_1)))))
where Conv denotes a 1×1 convolution, BN denotes batch normalization, and Sigmoid and ReLU denote activation functions;
Step 4.2, fusing the attention feature vector Mask with the frequency-domain image F_1 and selecting effective frequencies with the network model to obtain the effective frequency-domain signal F_2:
F_2 = Conv(Mask ⊙ F_1)
where ⊙ denotes element-wise multiplication.
Fig. 4 is a schematic structural diagram of the spatial-domain backbone network and the frequency-domain backbone network according to an embodiment of the present invention. In step 5, the frequency-domain backbone network is a modified ResNet50 in which the first convolutional layer and the pooling layer of the residual network are removed, so that the network model accepts the frequency-domain signal as input. The frequency-domain image is input into this modified ResNet50 frequency-domain backbone network to extract the frequency-domain feature vector F_frequency; the dimension of this feature vector is 2048.
In this embodiment, in step 6, the spatial-domain feature vector F_space and the frequency-domain feature vector F_frequency are fused by concatenation (Concat) along the channel dimension to obtain the feature vector F_fusion; the dimension of the fused feature vector is 4096:
F_fusion = Concat(F_space, F_frequency, dim=1)
In this embodiment, in step 7, the subsequent network is a classifier composed of a fully connected layer and a Softmax activation function. The fused feature vector F_fusion is input into the fully connected layer for feature dimension reduction, yielding the feature vector of the recognition target; the input dimension of the fully connected layer is 4096, and the output dimension equals the number of target categories. The feature vector of the recognition target is input into the Softmax activation function to predict the probability of each category, and the category with the highest probability is taken as the predicted category, realizing target-category prediction. In the training stage of the subsequent network, the predicted category and the labeled category information are input, and the subsequent network model is trained in a supervised manner with a cross-entropy loss function.
The present invention provides an SAR image recognition method based on a fused frequency domain and spatial domain network model. There are many specific methods and approaches for implementing this technical solution, and the above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art may make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention. All components not specified in this embodiment can be implemented by the prior art.

Claims (8)

1. A SAR image recognition method based on a fusion frequency domain and space domain network model is characterized by comprising the following steps:
step 1, acquiring an SAR image to be identified, and performing image data enhancement on the acquired SAR image, wherein the enhanced image is used as a spatial domain image;
step 2, converting the size of the spatial domain image in step 1 into N*N, and dividing the converted spatial domain image into blocks of size n*n, obtaining (N/n)*(N/n) blocks; transforming the spatial-domain image signal of each block into a structural form expressed in frequency components by a frequency domain conversion method, each block yielding n*n different frequency components; taking the frequency components at corresponding positions across the blocks as one channel of the constructed frequency domain image, all n*n channels forming, through dimension transformation, a new frequency domain image F_1;
step 3, inputting the spatial domain image in step 1 into a spatial domain backbone network to extract a space domain feature vector F_space;
step 4, performing channel selection on the frequency domain image F_1 obtained in step 2 to obtain an effective frequency domain signal F_2;
step 5, inputting the effective frequency domain signal F_2 obtained in step 4 into a frequency domain backbone network to extract a frequency domain feature vector F_frequency;
step 6, performing feature fusion on the space domain feature vector F_space obtained in step 3 and the frequency domain feature vector F_frequency obtained in step 5 to obtain a fused target feature vector F_fusion;
step 7, inputting the fused feature vector F_fusion obtained in step 6 into a subsequent network for feature dimension reduction and class probability prediction, thereby realizing target recognition and classification.
2. The method for identifying the SAR image based on the fusion frequency domain and space domain network model as claimed in claim 1, wherein the image data enhancement of the SAR image in step 1 comprises: in the training stage, the data enhancement modes include image standardization, image scale transformation, geometric transformation, and random cropping; in the testing stage, image scale transformation and image standardization are adopted, and the enhanced image is used as the spatial domain image.
3. The method for SAR image recognition based on the fusion frequency domain and space domain network model according to claim 1, wherein step 2 comprises:
step 2.1, converting the size of the spatial domain image into N*N, namely the length and the width of the converted spatial domain image are both N, and dividing the spatial domain image into (N/n)*(N/n) blocks of size n*n;
step 2.2, converting the spatial-domain image signal of each block into a structural form expressed in frequency components by two-dimensional discrete cosine transform, whose calculation formula is as follows:
Y = C_N · X · (C_N)^T
wherein X is the spatial-domain image signal of each block, Y is the output frequency domain signal of each block, and C_N is the transform coefficient matrix, expressed as follows:
C_N(j, k) = sqrt(α_j / N) · cos((2k + 1)jπ / (2N))
wherein j, k ∈ {0, 1, 2, …, N-1}, and j and k respectively denote the horizontal-axis and vertical-axis positions of a pixel in the spatial domain image signal; when j = 0, α_j = 1; when j > 0, α_j = 2;
step 2.3, extracting the frequency components at corresponding positions from the frequency domain signals Y obtained from the different blocks and connecting them as one channel of the constructed frequency domain image; since the frequency domain signal Y has n² positions in total, the constructed frequency domain image is a two-dimensional feature vector of size n² * (N/n)², where the corresponding position refers to the i-th position in the frequency domain signal Y obtained from each block, i ∈ {1, 2, …, n²}; the frequency domain image is expanded through dimension transformation into a new three-dimensional frequency domain image F_1 of size n² * (N/n) * (N/n), where n² is the number of channels of the frequency domain image.
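As a numerical check of step 2.2, the coefficient matrix can be built directly; with the α_j convention of the claim the matrix is orthonormal, so the per-block transform is exactly invertible. A NumPy sketch (the 8×8 block size is an illustrative choice, not a value from the patent):

```python
import numpy as np

def dct_matrix(N):
    """Transform coefficient matrix C_N of the claim:
    C_N[j, k] = sqrt(alpha_j / N) * cos((2k + 1) j pi / (2N)),
    with alpha_j = 1 for j = 0 and alpha_j = 2 for j > 0 (DCT-II)."""
    j = np.arange(N)[:, None]
    k = np.arange(N)[None, :]
    alpha = np.where(j == 0, 1.0, 2.0)
    return np.sqrt(alpha / N) * np.cos((2 * k + 1) * j * np.pi / (2 * N))

C = dct_matrix(8)                                     # illustrative 8x8 block
X = np.random.default_rng(2).standard_normal((8, 8))  # spatial-domain block
Y = C @ X @ C.T                                       # forward: Y = C_N X C_N^T
X_back = C.T @ Y @ C                                  # inverse: C_N is orthonormal
```

Each entry of Y is one frequency component of the block; gathering the same entry across all blocks yields one channel of F_1, as described in step 2.3.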
4. The SAR image recognition method based on the fusion frequency domain and space domain network model as claimed in claim 1, wherein in step 3, the spatial domain image is input into a ResNet50-based spatial domain backbone network to obtain the space domain feature vector F_space.
5. The method for SAR image recognition based on the fusion frequency domain and space domain network model according to claim 1, wherein step 4 comprises:
step 4.1, inputting the frequency domain image F_1 into an attention network to obtain an attention feature vector Mask, whose expression formula is as follows:
Mask = Sigmoid(BN(Conv(ReLU(Conv(F_1)))))
wherein Conv represents 1 × 1 convolution operation, BN represents batch normalization, and Sigmoid and ReLU represent activation functions;
step 4.2, fusing the attention feature vector Mask with the frequency domain image F_1, and selecting effective frequencies with the convolutional network model to obtain the effective frequency signal F_2, whose expression formula is as follows:
F_2 = Mask ⊙ F_1
wherein ⊙ represents element-by-element multiplication.
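A minimal NumPy sketch of the channel attention of steps 4.1-4.2 (batch normalization is omitted and the two 1×1 convolutions are written as channel-mixing matrix products; the channel counts 16 and 8 and the random weights are illustrative assumptions, not values from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F1, w1, w2):
    """Mask = Sigmoid(Conv(ReLU(Conv(F1)))), with BN omitted; a 1x1
    convolution on (C, H, W) data is a matrix product over channels.
    Returns F2 = Mask * F1, the element-wise gating of frequencies."""
    C, H, W = F1.shape
    x = F1.reshape(C, H * W)                   # flatten spatial positions
    h = np.maximum(w1 @ x, 0.0)                # first 1x1 Conv + ReLU
    mask = sigmoid((w2 @ h).reshape(C, H, W))  # second 1x1 Conv + Sigmoid
    return mask * F1

rng = np.random.default_rng(3)
F1 = rng.standard_normal((16, 4, 4))     # toy frequency-domain image
w1 = rng.standard_normal((8, 16)) * 0.1  # illustrative conv weights
w2 = rng.standard_normal((16, 8)) * 0.1
F2 = channel_attention(F1, w1, w2)
```

Because the mask lies in (0, 1), the gating can only attenuate each frequency component, never amplify it; channels whose mask value approaches zero are effectively deselected.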
6. The method as claimed in claim 1, wherein in step 5, the frequency domain backbone network is a modified ResNet50 frequency domain backbone network, the modification being removal of the first convolutional layer and the pooling layer of the ResNet50 residual network, so that the input of the network model is adapted to the input of the frequency domain signal.
7. The SAR image recognition method based on the fusion frequency domain and space domain network model as claimed in claim 1, wherein in step 6, Concat is used along the channel dimension to fuse the space domain feature vector F_space and the frequency domain feature vector F_frequency into the fused feature vector F_fusion, whose expression formula is as follows:
F_fusion = Concat(F_space, F_frequency, dim=1)
wherein dim denotes the dimension along which concatenation is performed, and dim=1 denotes the channel dimension.
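The fusion of claim 7 is a plain channel-dimension concatenation. A NumPy sketch with a leading batch axis, so that axis 1 plays the role of dim=1 (the 2048-d per-branch size follows the embodiment; the arange values are toy data):

```python
import numpy as np

# A 2048-d spatial feature and a 2048-d frequency feature, each with a
# batch axis so that axis 1 is the channel dimension (dim=1).
f_space = np.arange(2048, dtype=float)[None, :]
f_frequency = -np.arange(2048, dtype=float)[None, :]

# Concat(F_space, F_frequency, dim=1): the fused vector is 4096-d.
f_fusion = np.concatenate([f_space, f_frequency], axis=1)
```

The fused vector simply places the frequency-branch channels after the spatial-branch channels; the subsequent fully-connected layer learns how to weight the two halves.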
8. The SAR image recognition method based on the fusion frequency domain and space domain network model as claimed in claim 1, wherein in step 7, the subsequent network is a classifier composed of a fully-connected layer network and a Softmax activation function; the fused feature vector F_fusion is input into the fully-connected layer network for feature dimension reduction to obtain the feature vector of the recognition target; the feature vector of the recognition target is input into the Softmax activation function to predict the probability corresponding to each category, and the category with the maximum probability value is taken as the predicted category; in the training stage of the subsequent network model, the predicted category and the labeled category information are input, and the subsequent network model is trained in a supervised manner with a cross entropy loss function.
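The supervised training signal of claim 8 can be sketched as follows (NumPy; the three-class logits are toy values, and in the real model this loss would be backpropagated through both backbones and the classifier head):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(logits, label):
    """Cross entropy between the Softmax prediction and the labeled
    class: the negative log-probability assigned to the true class."""
    p = softmax(logits)
    return float(-np.log(p[label]))

logits = np.array([2.0, 0.5, -1.0])      # toy class scores
loss_correct = cross_entropy(logits, 0)  # true class has the top score
loss_wrong = cross_entropy(logits, 2)    # true class scored lowest
```

The loss is small when the predicted probability of the labeled class is high and large otherwise, which is exactly the supervision the claim describes.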
CN202110220080.1A 2021-02-26 2021-02-26 SAR image recognition method based on fusion frequency domain and space domain network model Active CN112926457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110220080.1A CN112926457B (en) 2021-02-26 2021-02-26 SAR image recognition method based on fusion frequency domain and space domain network model

Publications (2)

Publication Number Publication Date
CN112926457A CN112926457A (en) 2021-06-08
CN112926457B true CN112926457B (en) 2022-09-06

Family

ID=76172421

Country Status (1)

Country Link
CN (1) CN112926457B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537151B (en) * 2021-08-12 2023-10-17 北京达佳互联信息技术有限公司 Training method and device for image processing model, image processing method and device
CN113643261B (en) * 2021-08-13 2023-04-18 江南大学 Lung disease diagnosis method based on frequency attention network
CN114049551B (en) * 2021-10-22 2022-08-05 南京航空航天大学 ResNet 18-based SAR raw data target identification method
CN113903075A (en) * 2021-12-10 2022-01-07 中科视语(北京)科技有限公司 Category estimation method, category estimation device, electronic equipment and storage medium
CN114240935B (en) * 2022-02-24 2022-05-20 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Space-frequency domain feature fusion medical image feature identification method and device
CN116433770B (en) * 2023-04-27 2024-01-30 东莞理工学院 Positioning method, positioning device and storage medium
CN117435764B (en) * 2023-12-21 2024-03-15 中国海洋大学 Ocean remote sensing image-text retrieval method and system based on frequency domain and space domain double perception

Citations (4)

Publication number Priority date Publication date Assignee Title
WO2017190574A1 (en) * 2016-05-04 2017-11-09 北京大学深圳研究生院 Fast pedestrian detection method based on aggregation channel features
CN111311563A (en) * 2020-02-10 2020-06-19 北京工业大学 Image tampering detection method based on multi-domain feature fusion
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN111915542A (en) * 2020-08-03 2020-11-10 汪礼君 Image content description method and system based on deep learning


Similar Documents

Publication Publication Date Title
CN112926457B (en) SAR image recognition method based on fusion frequency domain and space domain network model
CN110472627B (en) End-to-end SAR image recognition method, device and storage medium
Cui et al. Image data augmentation for SAR sensor via generative adversarial nets
Venugopal Automatic semantic segmentation with DeepLab dilated learning network for change detection in remote sensing images
CN113033520B (en) Tree nematode disease wood identification method and system based on deep learning
CN111027497B (en) Weak and small target rapid detection method based on high-resolution optical remote sensing image
CN113240040B (en) Polarized SAR image classification method based on channel attention depth network
CN114037891A (en) High-resolution remote sensing image building extraction method and device based on U-shaped attention control network
CN115853173A (en) Building curtain wall for construction and installation
Venugopal Sample selection based change detection with dilated network learning in remote sensing images
Zhang et al. Learning an SAR image despeckling model via weighted sparse representation
Mathias et al. Deep Neural Network Driven Automated Underwater Object Detection.
Venu Object Detection in Motion Estimation and Tracking analysis for IoT devices
Jiang et al. Semantic segmentation network combined with edge detection for building extraction in remote sensing images
CN116386042A (en) Point cloud semantic segmentation model based on three-dimensional pooling spatial attention mechanism
CN115861669A (en) Infrared dim target detection method based on clustering idea
CN113902975B (en) Scene perception data enhancement method for SAR ship detection
Amjadipour et al. Building detection using very high resolution SAR images with multi-direction based on weighted-morphological indexes
CN114663916A (en) Thermal infrared human body target identification method based on depth abstract features
CN115223033A (en) Synthetic aperture sonar image target classification method and system
CN114863235A (en) Fusion method of heterogeneous remote sensing images
Yu et al. A lightweight ship detection method in optical remote sensing image under cloud interference
Wang et al. Sonar Objective Detection Based on Dilated Separable Densely Connected CNNs and Quantum‐Behaved PSO Algorithm
Rimavičius et al. Automatic benthic imagery recognition using a hierarchical two-stage approach
Amjadipour et al. Estimation of free parameters of morphological profiles for building extraction using SAR images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB02 Change of applicant information

Address after: 210000 No.1, Lingshan South Road, Qixia District, Nanjing City, Jiangsu Province

Applicant after: THE 28TH RESEARCH INSTITUTE OF CHINA ELECTRONICS TECHNOLOGY Group Corp.

Address before: 210007 1 East Street, alfalfa garden, Qinhuai District, Nanjing, Jiangsu.

Applicant before: THE 28TH RESEARCH INSTITUTE OF CHINA ELECTRONICS TECHNOLOGY Group Corp.