CN113537120B - Complex convolution neural network target identification method based on complex coordinate attention - Google Patents

Complex convolution neural network target identification method based on complex coordinate attention

Info

Publication number: CN113537120B
Authority: CN (China)
Prior art keywords: complex, output, module, input, attention
Legal status: Active
Application number: CN202110858271.0A
Other languages: Chinese (zh)
Other versions: CN113537120A
Inventor
张袁鹏
解岩
张雷
陈一畅
姚汉英
李槟槟
范亚
朱振波
余方利
汤子跃
Current Assignee: Air Force Early Warning Academy
Original Assignee: Air Force Early Warning Academy
Application filed by Air Force Early Warning Academy
Priority to CN202110858271.0A
Publication of CN113537120A
Application granted
Publication of CN113537120B

Classifications

    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/24 Classification techniques
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06F2218/02 Preprocessing (pattern recognition specially adapted for signal processing)
    • G06F2218/12 Classification; Matching (pattern recognition specially adapted for signal processing)
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a complex convolutional neural network target identification method based on complex coordinate attention, and relates to the field of target identification. The convolutional neural network comprises an input layer, N basic units, a classification unit, and an output layer; a processing unit maps complex numbers to corresponding real numbers through a modulus operation for classification and identification. The N basic units comprise first to N-th basic units, each comprising a first complex convolution module, a first complex batch normalization module, a first complex activation module, and a first complex pooling module; one of the N basic units further comprises a complex coordinate attention module, which comprises a complex coordinate attention embedding unit and a complex coordinate attention generating unit. The invention realizes high-precision identification of similar space cone targets.

Description

Complex convolution neural network target identification method based on complex coordinate attention
Technical Field
The invention relates to the field of target identification, in particular to a target identification method of a complex convolutional neural network based on complex coordinate attention.
Background
During penetration, a ballistic missile releases decoys that closely resemble the warhead, so the warhead and the decoys must be distinguished during the midcourse phase in order to reduce interception cost. The warhead and the decoys can be regarded as similar space cone targets: their shapes and motion forms are identical, and only their motion parameters differ slightly. Identification of similar space cone targets therefore plays an important role in space resource utilization, space surveillance, and military applications.
In recent years, studies applying convolutional neural networks (CNN) to space target identification have been increasing, based on the idea of extracting micro-motion features in the image domain and then using the extracted features for identification. Li et al., based on CNN, investigated the identification of space targets with different shapes and different precession frequencies using a multi-mode fusion approach: one-dimensional range profiles and time-frequency spectrograms of the targets in the S band and X band are generated with an ideal point-scattering model, and the multi-mode data are then used as CNN input to identify three targets (cone, small cone, and cylinder). Bai et al., Xu et al., and Han et al. all take the time-frequency spectrogram of the target as the network input, effectively converting the problem of identifying targets by micro-motion features into an image recognition problem. Bai et al. designed a CNN with a depth of three layers: time-frequency spectrograms of three micro-motion forms (spin, precession, and nutation) are generated with an ideal point-scattering model, suitably cropped, and used as input to the designed CNN to identify the three common micro-motion forms. Xu et al. designed a CNN with a depth of six layers: echo signals of four micro-motion forms (spin, rolling, precession, and nutation) are generated with a scattering-point model, time-frequency spectrograms spanning several micro-motion periods are obtained through the Wigner-Ville distribution (WVD), and the spectrograms are used as CNN input to identify the four micro-motion forms. Han et al. designed a deep learning network consisting of one-dimensional parallel structures (1-D parallel structures) and Long Short-Term Memory (LSTM) layers: echo data of five targets with different structural parameters and different micro-motion forms are simulated by electromagnetic computation, time-frequency analysis with the Short-Time Fourier Transform (STFT) yields spectrograms of several micro-motion periods, and these are sent to the designed network to identify the five targets. Wang et al., based on electromagnetic computation data, obtained range-slow-time images covering more than one precession period for three targets with different geometries but the same micro-motion form (cone, cone-cylinder, and cone-cylinder-skirt) and sent them to a designed CNN to realize target identification.
As the prior art listed above shows, the main approach is to keep the processing from the "echo data domain" to the "image domain" as preprocessing and to replace the extraction of micro-motion features from the "image domain" with a deep convolutional neural network. This has the following problems: (1) preprocessing such as time-frequency analysis or range-slow-time imaging is required, which costs considerable signal-processing time; (2) the target must be observed continuously for a long time to obtain a complete periodic image; (3) these methods only address targets with different shapes or different micro-motion forms and do not achieve identification of similar space cone targets.
Disclosure of Invention
In order to solve the three problems of current image-domain CNN-based space cone target identification, the invention combines the advantages of CV-CNN and the attention mechanism, introduces real-domain coordinate attention into the complex domain, and constructs a convolutional neural network based on a complex coordinate attention module together with a target identification method. The aim is to operate directly on radar echo complex data as input, make full use of amplitude and phase information, and realize high-precision identification of similar space cone targets that have the same geometry and micro-motion form and only slightly different micro-motion parameters.
To achieve the above object, the present invention provides a convolutional neural network based on a complex coordinate attention module, the convolutional neural network comprising:
an input layer, N basic units, a classification unit, and an output layer;
the processing unit is used for mapping complex numbers into corresponding real numbers through a modulus operation and performing classification and identification; the N basic units comprise first to N-th basic units, the first basic unit is connected with the input layer, the output of the first basic unit is the input of the second basic unit, the input of the N-th basic unit is the output of the (N-1)-th basic unit, N is an integer greater than 1, the output of the N-th basic unit is the input of the processing unit, the output of the processing unit is the input of the classifier, and the classifier is connected with the output layer; each of the N basic units comprises: a first complex convolution module, a first complex batch normalization module, a first complex activation module and a first complex pooling module; one of the N basic units further comprises a complex coordinate attention module; the complex coordinate attention module comprises: a complex coordinate attention embedding unit and a complex coordinate attention generating unit, wherein, for each channel, the complex coordinate attention embedding unit is used for encoding a first complex input feature map of the channel along the horizontal direction and the vertical direction respectively, and generating, in that channel, first output feature information of the first complex input feature map encoded along the horizontal direction and second output feature information of the first complex input feature map encoded along the vertical direction;
for each channel, the complex coordinate attention generating unit is to: splicing the first output characteristic information and the second output characteristic information to generate a characteristic information splicing result of the channel; performing feature dimensionality reduction on the feature information splicing result of the channel to obtain feature information after dimensionality reduction, and activating the feature information after dimensionality reduction to obtain a first complex output feature map of the channel; splitting the first complex output profile into a first tensor and a second tensor along a spatial dimension; adjusting the dimensions of the first tensor and the second tensor to be the same as the dimensions of the first complex input feature map, and obtaining a second complex output feature map of the channel in the horizontal direction and a third complex output feature map of the channel in the vertical direction; obtaining a third tensor and a fourth tensor, wherein the third tensor is the set of the second complex output characteristic maps of all the channels, and the fourth tensor is the set of the third complex output characteristic maps of all the channels;
expressing each element in the third tensor and the fourth tensor in a polar coordinate form, constraining the amplitude of the polar coordinate by using a constraint function, respectively obtaining a fourth complex output feature map and a fifth complex output feature map in the horizontal and vertical spatial directions, expanding the fourth complex output feature map and the fifth complex output feature map to generate attention weight distribution in the horizontal and vertical spatial directions, and applying the attention weight distribution to a complex input feature map of the complex coordinate attention module to obtain a complex output feature map of the complex coordinate attention module;
wherein, the complex input characteristic diagram and the complex output characteristic diagram are both complex characteristic diagrams.
When the convolutional neural network based on the complex coordinate attention module is used for target identification, no preprocessing such as time-frequency analysis or range-slow-time imaging is needed, so no extra signal-processing time is required and efficiency is higher; the target also does not need to be observed continuously for a long time to obtain a complete periodic image, which further improves efficiency; and the network realizes identification of similar space cone targets.
Preferably, in the basic unit not including the complex coordinate attention module, the output of the complex convolution module in the basic unit is the input of the complex batch normalization module, the output of the complex batch normalization module is the input of the complex activation module, and the output of the complex activation module is the input of the complex pooling module.
Preferably, in the basic unit including the complex coordinate attention module, the output of the complex convolution module in the basic unit is the input of the complex batch normalization module, the output of the complex batch normalization module is the input of the complex coordinate attention module, the output of the complex coordinate attention module is the input of the complex activation module, and the output of the complex activation module is the input of the complex pooling module.
Preferably, the classification unit includes:
the second complex convolution module, the second complex batch normalization module, the second complex activation module, the third complex convolution module and the classifier; the output of the second complex convolution module is the input of the second complex batch normalization module, the output of the second complex batch normalization module is the input of the second complex activation module, the output of the second complex activation module is the input of the third complex convolution module, and the output of the third complex convolution module is the input of the classifier.
Preferably, the convolutional neural network includes first to sixth basic units.
Preferably, the sixth base unit comprises said complex coordinate attention module.
Preferably, an optimizer is arranged in the convolutional neural network and used for updating the network weight and the bias term.
Preferably, the numbers of convolution kernels of the first to sixth basic units are 64, 128, 256 and 256, respectively, the sizes of the convolution kernels are all 1 × 3, the sizes of the sampling windows of the first complex pooling modules are all 1 × 2, the sliding stride of the convolutions is 1, and the padding is 1.
Preferably, the complex input feature map is a complex input feature map of the spatial target identification signal, and the complex output feature map is a complex output feature map of the spatial target identification signal.
On the one hand, the complex coordinate attention module (CV-CA) uses a complex convolutional neural network to obtain the amplitude and phase features of the signal through associated learning of the real and imaginary parts of complex numbers; on the other hand, through complex coordinate attention it attends to spatial information and channel information in the horizontal and vertical directions simultaneously, better models long-range dependencies in the feature information, and enhances the feature representation capability of the target object.
In channel attention, global pooling is usually adopted to encode global spatial information, but it compresses the global spatial information into a single channel descriptor, making it difficult to preserve position information, which is particularly important for capturing spatial structure. Therefore, in the coordinate attention module, the operation of decomposing global pooling into two one-dimensional feature encodings is extended to the complex domain: the complex feature map X of each channel is encoded along the horizontal and vertical directions respectively (the direction-related encodings, abbreviated as the horizontal and vertical directions), generating direction-related complex feature maps so that features in the two spatial directions are aggregated separately.
The complex coordinate attention embedding unit outputs accurate spatial position information aggregated under a global receptive field. Based on the encoding result of the complex coordinate attention embedding unit, the complex coordinate attention module applies a second transformation, called the complex coordinate attention generating unit. This transformation comprises three parts: (1) direction-related feature information aggregation; (2) direction-related complex feature map splitting; (3) automatic complex coordinate attention allocation.
Preferably, in the present invention, X is the complex input feature map of the complex coordinate attention module, $X = [x_1, x_2, \ldots, x_C] \in \mathbb{C}^{C \times W \times H}$, where $x_c$ is the complex input feature map of the c-th channel, $\mathbb{C}$ denotes the complex space, C is the number of channels of the input feature maps, W is the width of each input feature map, and H is the height of each input feature map; Y is the complex output feature map of the complex coordinate attention module, $Y = [y_1, y_2, \ldots, y_C]$, where $y_c$ is the complex output feature map of the c-th channel, c is an integer with $1 \le c \le C$, and X and Y have the same dimensions.

The output of the p-th channel of the complex input feature map X after encoding along the horizontal direction is $z_p^h(h)$, and the output after encoding along the vertical direction is $z_p^w(w)$, where:

$$z_p^h(h) = \frac{1}{W}\sum_{0 \le j < W}\operatorname{Re}\big(x_p(h,j)\big) + \mathrm{j}\,\frac{1}{W}\sum_{0 \le j < W}\operatorname{Im}\big(x_p(h,j)\big)$$

$$z_p^w(w) = \frac{1}{H}\sum_{0 \le i < H}\operatorname{Re}\big(x_p(i,w)\big) + \mathrm{j}\,\frac{1}{H}\sum_{0 \le i < H}\operatorname{Im}\big(x_p(i,w)\big)$$

where $\mathrm{j}$ denotes the imaginary unit, $\operatorname{Re}(\cdot)$ denotes the real part of a complex number, $\operatorname{Im}(\cdot)$ denotes the imaginary part, h is the pixel index in the horizontal direction of the input feature map, $x_p(h,j)$ is the value in row h and column j of the p-th channel of the complex input feature map, i is the pixel index in the vertical direction, and $x_p(i,w)$ is the value in row i and column w of the p-th channel of the complex input feature map.
Preferably, in the invention, $z^h$ and $z^w$ are concatenated to obtain the feature-information concatenation result $M \in \mathbb{C}^{C \times 1 \times (W+H)}$, and each tensor in M is written as $m_q = [(z_q^h)^T, z_q^w]$, where $[\cdot,\cdot]$ denotes the concatenation operation and T denotes transposition.

The complex coordinate attention generating unit performs feature dimension reduction on the concatenation result with 1 × 1 complex convolution kernels; the feature dimension reduction lowers the number of parameters while realizing cross-channel information interaction and integration. Let $U = [u^1, u^2, \ldots, u^{C/r}]$ be the 1 × 1 complex convolution kernels shared by the convolutional layer, where $u^k$ denotes the k-th complex convolution kernel, $k = 1, 2, \ldots, C/r$, $u_C^k$ denotes the C-th 1 × 1 complex kernel in $u^k$, and $u_q^k$ denotes the q-th kernel in $u^k$, $q = 1, 2, \ldots, C$; r denotes a scaling coefficient controlling the number of channels of the convolution output feature map, s denotes the stride of the convolution operation, and the k-th feature map of the convolution output is $v_k(i,j)$, where:

$$v_k(i,j) = \sum_{q=1}^{C} u_q^k\, m_q(i \cdot s,\, j \cdot s)$$

$$f_k(i,j) = \sigma\big(v_k(i,j)\big)$$

where $m_q$ is the q-th tensor in M, $m_q \in \mathbb{C}^{1 \times (W+H)}$, $m_q(i \cdot s, j \cdot s)$ is the value in row $i \cdot s$ and column $j \cdot s$ of the q-th tensor after feature-information concatenation, $v_k(i,j)$ denotes the unactivated complex output feature map of the k-th channel, $f_k$ denotes the complex output feature map of the k-th channel, the set of complex feature maps of all channels is written $f = [f_1, f_2, \ldots, f_{C/r}] \in \mathbb{C}^{(C/r) \times 1 \times (W+H)}$, $f_{C/r}$ is the complex output feature map of the (C/r)-th channel, and $\sigma(\cdot)$ denotes the complex activation function; the complex activation function is the CReLU:

$$\sigma(z) = \mathrm{CReLU}(z) = \mathrm{ReLU}\big(\operatorname{Re}(z)\big) + \mathrm{j}\,\mathrm{ReLU}\big(\operatorname{Im}(z)\big)$$

where z is a complex variable.
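As a concrete illustration, here is a minimal sketch of the CReLU above for a PyTorch complex tensor; the function name is illustrative and not code from the patent:

```python
import torch
import torch.nn.functional as F

def crelu(z: torch.Tensor) -> torch.Tensor:
    """CReLU(z) = ReLU(Re(z)) + j*ReLU(Im(z)) for a complex-dtype tensor z."""
    return torch.complex(F.relu(z.real), F.relu(z.imag))

# example: crelu(torch.tensor([1 - 2j, -3 + 4j])) -> tensor([1.+0.j, 0.+4.j])
```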
Preferably, in the invention, the set f of complex feature maps of all channels is split along the spatial dimension into the first tensor $f^h \in \mathbb{C}^{(C/r) \times 1 \times H}$ and the second tensor $f^w \in \mathbb{C}^{(C/r) \times 1 \times W}$:

$$f^h = [f_1^h, f_2^h, \ldots, f_{C/r}^h]$$

$$f^w = [f_1^w, f_2^w, \ldots, f_{C/r}^w]$$

where $f_k^h$ is the complex output feature map of the k-th channel in the horizontal direction, $f_k^w$ is the complex output feature map of the k-th channel in the vertical direction, and $f_{C/r}^h$ and $f_{C/r}^w$ are the complex output feature maps of the (C/r)-th channel in the horizontal and vertical directions, respectively.
Preferably, the present invention uses 1 × 1 complex convolution kernels to restore $f^h$ and $f^w$ to the same dimensions as X, obtaining $v^h$ and $v^w$. Let $w_o^{h,l}$ denote the o-th 1 × 1 kernel ($o = 1, 2, \ldots, C/r$) of the l-th complex convolution kernel of the convolution in the horizontal direction, and $w_o^{w,l}$ the corresponding kernel of the convolution in the vertical direction ($l = 1, 2, \ldots, C$); then:

$$v_l^h = \sum_{o=1}^{C/r} w_o^{h,l}\, f_o^h$$

$$v_l^w = \sum_{o=1}^{C/r} w_o^{w,l}\, f_o^w$$

where $f_o^h$ and $f_o^w$ are the complex output feature maps of the o-th channel in the horizontal and vertical directions, $v_l^h$ is the second complex output feature map of the l-th channel in the horizontal direction, $v^h = [v_1^h, \ldots, v_C^h] \in \mathbb{C}^{C \times 1 \times H}$ is the set of the second complex output feature maps of all channels, $v_l^w$ is the third complex output feature map of the l-th channel in the vertical direction, and $v^w = [v_1^w, \ldots, v_C^w] \in \mathbb{C}^{C \times 1 \times W}$ is the set of the third complex output feature maps of all channels.
Preferably, in the present invention, each element of $v^h$ and $v^w$ is expressed in polar form, and the magnitude of the polar form is constrained with a Sigmoid function, specifically:

$$v_l^h = \left|v_l^h\right| e^{\mathrm{j}\theta_l^h}, \qquad v_l^w = \left|v_l^w\right| e^{\mathrm{j}\theta_l^w}$$

$$g_l^h = \mathrm{Sig}\!\left(\left|v_l^h\right|\right) e^{\mathrm{j}\theta_l^h}, \qquad g_l^w = \mathrm{Sig}\!\left(\left|v_l^w\right|\right) e^{\mathrm{j}\theta_l^w}$$

where $g_l^h$ and $g_l^w$ are the results of constraining the magnitudes of the complex output feature maps of the l-th channel in the horizontal and vertical directions with the Sigmoid function, $\theta_l^h$ and $\theta_l^w$ are the phases of the complex output feature maps of the l-th channel in the horizontal and vertical directions, $|v_l^h|$ and $|v_l^w|$ are the corresponding magnitudes, and $\mathrm{Sig}(\cdot)$ denotes the Sigmoid function; the Sigmoid-constrained results are written $g^h = [g_1^h, \ldots, g_C^h]$ and $g^w = [g_1^w, \ldots, g_C^w]$. In the rectangular coordinate system:

$$g_l^h = \mathrm{Sig}\!\left(\left|v_l^h\right|\right)\left(\cos\theta_l^h + \mathrm{j}\sin\theta_l^h\right)$$

$$g_l^w = \mathrm{Sig}\!\left(\left|v_l^w\right|\right)\left(\cos\theta_l^w + \mathrm{j}\sin\theta_l^w\right)$$
Preferably, in the present invention, $g^h$ and $g^w$ are expanded to generate the attention weight distributions in the horizontal and vertical spatial directions, and these are applied to the complex input feature map of the complex coordinate attention module to obtain the complex output feature map $y_l(i,j)$ of the complex coordinate attention module:

$$y_l(i,j) = x_l(i,j) \times g_l^h(i) \times g_l^w(j)$$

where $x_l(i,j)$ is the value in row i and column j of the complex input feature map of the l-th channel.
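A short sketch of this weighting step with PyTorch complex tensors and broadcasting; the shapes and names below are illustrative assumptions (C channels, H rows, W columns, with H = 1 as in the narrow-band case described later):

```python
import torch

C, H, W = 4, 1, 16                                  # H = 1 matches the narrow-band case
x   = torch.randn(C, H, W, dtype=torch.complex64)   # complex input feature map x_l(i, j)
g_h = torch.randn(C, H, 1, dtype=torch.complex64)   # horizontal weights g_l^h(i)
g_w = torch.randn(C, 1, W, dtype=torch.complex64)   # vertical weights g_l^w(j)

y = x * g_h * g_w        # broadcasting expands g_h along columns and g_w along rows
print(y.shape)           # torch.Size([4, 1, 16])
```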
Both the input and the output of the complex coordinate attention module are feature information in complex form, so the module can process complex feature information.

The complex coordinate attention module obtains the amplitude and phase features of the signal through associated learning of the real and imaginary parts of complex numbers by a complex convolutional neural network.

In the invention, the complex coordinate attention module attends to spatial information and channel information in the horizontal and vertical directions simultaneously through complex coordinate attention, thereby better modeling long-range dependencies in the feature information and enhancing the feature representation capability of the target object.
The invention also provides a target identification method, which comprises the following steps:
obtaining a target signal;
inputting the target signal into the complex coordinate attention module-based convolutional neural network;
and the convolutional neural network outputs a target recognition result.
Preferably, the target signal is radar echo data.
One or more technical schemes provided by the invention at least have the following technical effects or advantages:
the invention combines the advantages of CV-CNN and the attention mechanism, introduces real-domain coordinate attention into the complex domain, and constructs a convolutional neural network and a target identification method based on a complex coordinate attention module; radar echo complex data can be operated on directly as input, amplitude and phase information is fully utilized, and high-precision identification is realized for similar space cone targets with the same geometry and micro-motion form and only slightly different micro-motion parameters.
When the convolutional neural network based on the complex coordinate attention module is used for target identification, no preprocessing such as time-frequency analysis or range-slow-time imaging is needed, so no extra signal-processing time is required and efficiency is higher; the target also does not need to be observed continuously for a long time to obtain a complete periodic image, which further improves efficiency.
The convolutional neural network based on the complex coordinate attention module can realize high-precision identification on similar space cone targets with the same micromotion form and only slightly different micromotion parameters.
According to the end-to-end similar space cone target identification method, the radar echo complex data are input and the identification result is output, so that echo signal preprocessing and phase information loss are avoided, and the time required by identification is remarkably shortened.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention;
FIG. 1 is a schematic diagram of a convolutional neural network based on a complex coordinate attention module;
FIG. 2 is a schematic diagram of the complex coordinate attention module;
fig. 3 is a flowchart illustrating a complex input feature map processing method.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflicting with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described and thus the scope of the present invention is not limited by the specific embodiments disclosed below.
It should be understood that the terms "a" and "an" indicate that the number of an element may be one in one embodiment and plural in another embodiment; these terms should not be interpreted as limiting the number.
Example one
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the convolutional neural network based on the complex coordinate attention module. The convolutional neural network comprises:
an input layer, N basic units, a classification unit, and an output layer;
the processing unit is used for mapping complex numbers into corresponding real numbers through a modulus operation and performing classification and identification; the N basic units comprise first to N-th basic units, the first basic unit is connected with the input layer, the output of the first basic unit is the input of the second basic unit, the input of the N-th basic unit is the output of the (N-1)-th basic unit, N is an integer greater than 1, the output of the N-th basic unit is the input of the processing unit, the output of the processing unit is the input of the classifier, and the classifier is connected with the output layer; each of the N basic units comprises: a first complex convolution module, a first complex batch normalization module, a first complex activation module and a first complex pooling module; one of the N basic units further comprises a complex coordinate attention module; the complex coordinate attention module comprises: a complex coordinate attention embedding unit and a complex coordinate attention generating unit, wherein, for each channel, the complex coordinate attention embedding unit is used for encoding a first complex input feature map of the channel along the horizontal direction and the vertical direction respectively, and generating first output feature information of the first complex input feature map encoded along the horizontal direction and second output feature information of the first complex input feature map encoded along the vertical direction;
for each channel, the complex coordinate attention generating unit is to: splicing the first output characteristic information and the second output characteristic information to generate a characteristic information splicing result of the channel; performing feature dimension reduction on the feature information splicing result of the channel to obtain feature information after dimension reduction, and activating the feature information after dimension reduction to obtain a first complex output feature map of the channel; splitting the first complex output profile into a first tensor and a second tensor along a spatial dimension; adjusting the dimensions of the first tensor and the second tensor to be the same as the dimensions of the first complex input feature map to obtain a second complex output feature map of the channel in the horizontal direction and a third complex output feature map of the channel in the vertical direction; obtaining a third tensor and a fourth tensor, the third tensor being the set of the second complex output eigenmaps for all channels, the fourth tensor being the set of the third complex output eigenmaps for all channels;
expressing each element in the third tensor and the fourth tensor in a polar coordinate form, constraining the amplitude of the polar coordinate by using a constraint function, respectively obtaining a fourth complex output feature map and a fifth complex output feature map in the horizontal and vertical spatial directions, expanding the fourth complex output feature map and the fifth complex output feature map to generate attention weight distribution in the horizontal and vertical spatial directions, and applying the attention weight distribution to a complex input feature map of the complex coordinate attention module to obtain a complex output feature map of the complex coordinate attention module;
wherein, the complex input characteristic diagram and the complex output characteristic diagram are both complex characteristic diagrams.
The invention combines the advantages of CV-CNN and the attention mechanism, introduces real-domain coordinate attention into the complex domain, and constructs a complex attention network. An attention mechanism can capture long-range dependencies through a global information search, automatically focus on important information through weighted assignment, and ignore unimportant redundant information, which is useful for identifying similar space cone targets within a short observation time. The attention mechanism has gone through the development stages of spatial attention, channel attention, and spatial-channel attention. That information in both the spatial and channel dimensions improves network recognition capability was demonstrated again in the recently proposed Coordinate Attention (CA) module, which improves model performance by embedding spatial position information into channel attention.
A complex-valued convolutional neural network (CV-CNN) can process complex echo data directly and make full use of amplitude and phase information, avoiding echo preprocessing and reducing recognition time.
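For concreteness, a minimal sketch of one complex-valued convolution layer in the usual CV-CNN style, where the complex product (A + jB)(x + jy) = (Ax - By) + j(Ay + Bx) is realized with two real convolutions; the class name and interface are illustrative assumptions, not code from the patent:

```python
import torch.nn as nn

class ComplexConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0):
        super().__init__()
        self.conv_re = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)  # A
        self.conv_im = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)  # B

    def forward(self, x_re, x_im):
        # real part: A*x - B*y ; imaginary part: A*y + B*x
        y_re = self.conv_re(x_re) - self.conv_im(x_im)
        y_im = self.conv_re(x_im) + self.conv_im(x_re)
        return y_re, y_im
```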
The CV-CANet is built from the CV-CA module. CV-CANet is an end-to-end complex convolutional neural network whose architecture is shown in FIG. 1. Each basic unit of the network consists of four basic modules: complex convolution, complex batch normalization, complex activation, and complex pooling; the CV-CA module is embedded in the sixth unit. The numbers of convolution kernels of the first to sixth layers are 64, 128, 256 and 256, respectively; every convolution kernel has size 1 × 3, the sampling window of each pooling layer has size 1 × 2, the sliding stride of all convolutional layers is 1, and the padding is 1. The last two layers of the network replace the traditional fully connected layers with full convolution to reduce the model parameters. The output of the final full-convolution layer is complex, while the class label of the target is real; the complex outputs are therefore mapped to corresponding real numbers through a modulus operation and then sent to a Softmax classifier for classification and recognition. The loss function is the cross-entropy loss. Adaptive moment estimation (Adam) serves as the optimizer for updating the network weights and bias terms.
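The stacking described above can be sketched as follows. `ComplexConv2d` is the layer from the previous listing, while `ComplexBatchNorm2d`, `CReLU` (as a module on real/imaginary pairs), `ComplexMaxPool2d`, and `CVCA` stand for the complex batch-normalization, activation, pooling, and coordinate-attention modules and are assumed to exist with matching interfaces; the mapping of the quoted kernel counts "64, 128, 256, 256" onto six units is likewise an assumption:

```python
class BasicUnit(nn.Module):
    """One CV-CANet basic unit: complex conv -> complex BN -> (CV-CA) -> CReLU -> pool."""
    def __init__(self, in_ch, out_ch, with_ca=False):
        super().__init__()
        self.conv = ComplexConv2d(in_ch, out_ch, (1, 3), stride=1, padding=(0, 1))
        self.bn   = ComplexBatchNorm2d(out_ch)          # assumed complex BN module
        self.ca   = CVCA(out_ch) if with_ca else None   # assumed CV-CA module
        self.act  = CReLU()
        self.pool = ComplexMaxPool2d((1, 2))            # assumed complex pooling

    def forward(self, x_re, x_im):
        x_re, x_im = self.bn(*self.conv(x_re, x_im))
        if self.ca is not None:            # CV-CA sits between BN and activation
            x_re, x_im = self.ca(x_re, x_im)
        x_re, x_im = self.act(x_re, x_im)
        return self.pool(x_re, x_im)

# assumed channel progression over the six units; only the values 64/128/256
# and the CV-CA placement in unit 6 are stated in the text
channels = [64, 128, 128, 256, 256, 256]
units, in_ch = [], 1
for idx, out_ch in enumerate(channels):
    units.append(BasicUnit(in_ch, out_ch, with_ca=(idx == 5)))
    in_ch = out_ch

def backbone_forward(x_re, x_im):
    # thread the (real, imaginary) pair through the six basic units
    for unit in units:
        x_re, x_im = unit(x_re, x_im)
    return x_re, x_im
```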
The invention does not limit the number of layers of the complex convolutional neural network, the number of basic units, or which basic unit the CV-CA module is embedded in; these can be adjusted flexibly according to actual requirements.
The CV-CANet-based end-to-end identification method for similar space cone targets obtains the identification result directly from the input radar echo data, thereby avoiding complex echo-signal preprocessing and phase-information loss. To process radar echo complex signals directly, the invention proposes the CV-CA module and builds CV-CANet upon it. The invention introduces the coordinate attention mechanism into the complex domain and derives and establishes the basic structures of direction-related complex feature information aggregation, direction-related complex feature map splitting, and automatic complex coordinate attention allocation. Effective identification is achieved for similar space cone targets that have the same micro-motion form and only slightly different micro-motion parameters.
The method is usually carried out with an observation time of no more than half a micro-motion period: in practice a radar cannot observe a target for a long time, the data may be noisy, or data may be missing, so it is desirable to identify the target well from less data while ensuring real-time performance.
Example two
Referring to FIG. 2, FIG. 2 is a schematic diagram of the composition of the complex coordinate attention module. In this embodiment, the complex coordinate attention module includes: a complex coordinate attention embedding unit and a complex coordinate attention generating unit, wherein, for each channel, the complex coordinate attention embedding unit is used for encoding a first complex input feature map of the channel along the horizontal direction and the vertical direction respectively, and generating first output feature information of the first complex input feature map encoded along the horizontal direction and second output feature information of the first complex input feature map encoded along the vertical direction;
for each channel, the complex coordinate attention generating unit is to: splicing the first output characteristic information and the second output characteristic information to generate a characteristic information splicing result of the channel; performing feature dimensionality reduction on the feature information splicing result of the channel to obtain feature information after dimensionality reduction, and activating the feature information after dimensionality reduction to obtain a first complex output feature map of the channel; splitting the first complex output profile into a first tensor and a second tensor along a spatial dimension; adjusting the dimensions of the first tensor and the second tensor to be the same as the dimensions of the first complex input feature map, and obtaining a second complex output feature map of the channel in the horizontal direction and a third complex output feature map of the channel in the vertical direction; obtaining a third tensor and a fourth tensor, wherein the third tensor is the set of the second complex output characteristic maps of all the channels, and the fourth tensor is the set of the third complex output characteristic maps of all the channels;
expressing each element in the third tensor and the fourth tensor in a polar coordinate form, constraining the amplitude of the polar coordinate by using a constraint function, respectively obtaining a fourth complex output feature map and a fifth complex output feature map in the horizontal and vertical spatial directions, expanding the fourth complex output feature map and the fifth complex output feature map to generate attention weight distribution in the horizontal and vertical spatial directions, and applying the attention weight distribution to a complex input feature map of the complex coordinate attention module to obtain a complex output feature map of the complex coordinate attention module;
wherein, the complex input characteristic diagram and the complex output characteristic diagram are both complex characteristic diagrams.
Prior-art CV-CNNs that simply separate the real and imaginary parts of complex numbers, or that use real convolution kernels, do not exploit the advantages of complex convolution kernels. Therefore, following the rules of complex arithmetic, the invention carries out a detailed formula derivation and constructs a complex coordinate attention (CV-CA) module from the complex network basic unit and the real-valued coordinate attention (RV-CA) module.
The CV-CA module proposed by the present invention includes a Complex Coordinate Attention Information Embedding (CVCIE) unit and a Complex Coordinate Attention Generation (CVCAG) unit.
In practical applications, the input of the CV-CA module may be any feature information in complex form. The invention is described using a radar echo signal as an example, but the input feature information is not limited to radar echo signals. For narrow-band radar echo data H = 1 is used; in practical applications the value of H may be determined according to the actual situation and is not specifically limited by the invention.
The echo signal measured by the radar can be expressed as:

$$S_t(n) = S_{th}(n) + \nu(n)$$

where $S_{th}(n)$ is the theoretical radar echo signal, $\nu(n)$ denotes independent and identically distributed Gaussian white noise generated by the radar receiver, and n denotes the pulse index.
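As a toy illustration of this signal model, the sketch below adds i.i.d. complex Gaussian receiver noise to a synthetic echo at a chosen SNR; the placeholder waveform and the SNR value are assumptions, not the patent's scattering model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(512)                                   # pulse index n
s_th = np.exp(1j * 2 * np.pi * 0.01 * n)             # placeholder theoretical echo S_th(n)

snr_db = 10.0                                        # assumed signal-to-noise ratio
noise_power = np.mean(np.abs(s_th) ** 2) / 10 ** (snr_db / 10)
v = np.sqrt(noise_power / 2) * (rng.standard_normal(n.size)
                                + 1j * rng.standard_normal(n.size))
s_t = s_th + v                                       # measured echo S_t(n)
```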
Let $X = [x_1, x_2, \ldots, x_C] \in \mathbb{C}^{C \times W \times H}$ be the complex input feature map, where $x_p$ denotes the complex input feature map of the p-th channel, and let $Y = [y_1, y_2, \ldots, y_C]$ be the complex output feature map, where $y_p$ is the complex output feature map of the p-th channel and Y has the same dimensions as X.
Global pooling is usually used in channel attention to encode global spatial information, but it compresses the global spatial information into a single channel descriptor, making it difficult to preserve position information, which is particularly important for capturing spatial structure. Therefore, the coordinate-attention operation of decomposing global pooling into two one-dimensional feature encodings is extended to the complex domain: the complex feature map of each channel of X is encoded along the horizontal and vertical directions respectively (the direction-related encodings, abbreviated as the horizontal and vertical directions), generating direction-related complex feature maps so that features in the two spatial directions are aggregated separately. This operation is described mathematically as:

$$z_p^h(h) = \frac{1}{W}\sum_{0 \le j < W}\operatorname{Re}\big(x_p(h,j)\big) + \mathrm{j}\,\frac{1}{W}\sum_{0 \le j < W}\operatorname{Im}\big(x_p(h,j)\big) \tag{1}$$

$$z_p^w(w) = \frac{1}{H}\sum_{0 \le i < H}\operatorname{Re}\big(x_p(i,w)\big) + \mathrm{j}\,\frac{1}{H}\sum_{0 \le i < H}\operatorname{Im}\big(x_p(i,w)\big) \tag{2}$$

where $\mathrm{j}$ denotes the imaginary unit, $\operatorname{Re}(\cdot)$ the real part of a complex number, $\operatorname{Im}(\cdot)$ the imaginary part, h is the pixel index in the horizontal direction of the input feature map, $x_p(h,j)$ is the value in row h and column j of the p-th channel of the complex input feature map, i is the pixel index in the vertical direction, and $x_p(i,w)$ is the value in row i and column w of the p-th channel. Transforming the complex feature maps of all channels of X yields two complex tensors $z^h = [z_1^h, \ldots, z_C^h] \in \mathbb{C}^{C \times H \times 1}$ and $z^w = [z_1^w, \ldots, z_C^w] \in \mathbb{C}^{C \times 1 \times W}$.
The CVCIE above outputs accurate spatial position information aggregated under the global receptive field. Based on the CVCIE encoding result, the CV-CA module applies a second transformation, called the CVCAG. The CVCAG transformation comprises three steps: (1) direction-related feature information aggregation; (2) direction-related complex feature map splitting; (3) automatic complex coordinate attention allocation.
(1) Direction-related complex feature information aggregation
Complex concatenation. The results of equations (1) and (2) are concatenated. Let $M \in \mathbb{C}^{C \times 1 \times (W+H)}$ be the concatenated result; each tensor in M is expressed as:

$$m_q = \left[(z_q^h)^T,\; z_q^w\right] \tag{3}$$

where $[\cdot,\cdot]$ denotes the concatenation operation and T denotes transposition.
Feature dimension reduction. A 1 × 1 complex convolution kernel is used to reduce the feature channel dimension, lowering the parameter count while realizing cross-channel information interaction and integration. Let $U = [u^1, u^2, \ldots, u^{C/r}]$ be the 1 × 1 complex convolution kernels shared by this layer, where $u^k$ denotes the k-th complex convolution kernel, $k = 1, 2, \ldots, C/r$; $u_C^k$ denotes the C-th 1 × 1 complex kernel in $u^k$ and $u_q^k$ the q-th, $q = 1, 2, \ldots, C$; r denotes a scaling coefficient controlling the number of channels of the convolution output feature map (r = 18 in the present invention; r may take other values in practical applications, and the embodiments of the invention are not specifically limited); s denotes the stride of the convolution operation; and the k-th feature map of the convolution output is $v_k(i,j)$, where:

$$v_k(i,j) = \sum_{q=1}^{C} u_q^k\, m_q(i \cdot s,\, j \cdot s) \tag{4}$$

$$f_k(i,j) = \sigma\big(v_k(i,j)\big) \tag{5}$$

where $m_q$ is the q-th tensor in M, $m_q \in \mathbb{C}^{1 \times (W+H)}$, $m_q(i \cdot s, j \cdot s)$ is the value in row $i \cdot s$ and column $j \cdot s$ of the q-th tensor after feature-information concatenation, $v_k(i,j)$ denotes the unactivated complex output feature map of the k-th channel, $f_k$ denotes the complex output feature map of the k-th channel, and the set of complex feature maps of all channels is written $f = [f_1, \ldots, f_{C/r}] \in \mathbb{C}^{(C/r) \times 1 \times (W+H)}$, with $f_{C/r}$ the complex output feature map of the (C/r)-th channel; $\sigma(\cdot)$ denotes the complex activation function, the CReLU:

$$\sigma(z) = \mathrm{CReLU}(z) = \mathrm{ReLU}\big(\operatorname{Re}(z)\big) + \mathrm{j}\,\mathrm{ReLU}\big(\operatorname{Im}(z)\big) \tag{6}$$

where z is a complex variable.
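Under the same assumptions, a sketch of this aggregation step; since a 1 × 1 convolution with stride 1 is a per-position linear map across channels, the kernel bank is written here as a complex matrix product:

```python
import torch
import torch.nn.functional as F

def aggregate(z_h: torch.Tensor, z_w: torch.Tensor, u: torch.Tensor) -> torch.Tensor:
    """z_h: (C, H, 1), z_w: (C, 1, W) complex; u: (C//r, C) complex 1x1 kernel bank."""
    m = torch.cat([z_h.squeeze(2), z_w.squeeze(1)], dim=1)  # eq. (3): (C, H + W)
    v = u @ m                                               # eq. (4): (C//r, H + W)
    return torch.complex(F.relu(v.real), F.relu(v.imag))    # eq. (5)-(6): CReLU
```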
(2) Direction-related complex feature map splitting
Complex feature map splitting. f is split along the spatial dimension into two independent tensors $f^h \in \mathbb{C}^{(C/r) \times 1 \times H}$ and $f^w \in \mathbb{C}^{(C/r) \times 1 \times W}$, namely:

$$f^h = [f_1^h, \ldots, f_{C/r}^h] \tag{7}$$

$$f^w = [f_1^w, \ldots, f_{C/r}^w] \tag{8}$$

where $f_k^h$ and $f_k^w$ are the complex output feature maps of the k-th channel in the horizontal and vertical directions, respectively.
and (5) feature dimension increasing. Using a 1X 1 rewinding kernel to f h And f w Reverting to the same dimension as the input signature X. Setting the rewinding product kernel of 1 x 1 in the convolution operation in the horizontal direction as
Figure GDA0004042011540000149
Wherein
Figure GDA00040420115400001410
Represents the ith (l =1,2, \8230;, C) rewinding and accumulating nucleus, and/or the combination thereof>
Figure GDA00040420115400001411
Represents->
Figure GDA00040420115400001412
The (o =1, 2., C/r) 1 × 1 rewinding core. In the same way, is based on>
Figure GDA00040420115400001413
Is a rewinding multiplication kernel of 1 × 1 in the case of a convolution operation in the vertical direction, wherein->
Figure GDA00040420115400001415
Indicates the lth>
Figure GDA00040420115400001422
Complex convolution kernel->
Figure GDA00040420115400001417
Represents->
Figure GDA00040420115400001418
The qth (l =1, 2.., C) rewinding core of 1 × 1, then:
using a 1 x 1 rewinding kernel to wrap said f h And f is w Restoring to the same dimension as the X to obtain
Figure GDA00040420115400001419
And
Figure GDA00040420115400001420
wherein:
Figure GDA00040420115400001421
Figure GDA0004042011540000151
wherein the content of the first and second substances,
Figure GDA0004042011540000152
a complex output characteristic diagram for the ith channel in the horizontal direction>
Figure GDA0004042011540000153
Is a complex output characteristic diagram for the ith channel in the vertical direction>
Figure GDA0004042011540000154
Is->
Figure GDA0004042011540000155
O =1,2,. C/r,. X.r,. For the (o) th 1 x 1 rewinding nucleus in (1)>
Figure GDA00040420115400001527
For a complex output characteristic diagram in the horizontal direction for the mth channel>
Figure GDA0004042011540000157
Is->
Figure GDA0004042011540000158
The (o) th 1 x 1 rewinding and accumulating kernel of (4), (v), and (v)>
Figure GDA0004042011540000159
For a complex output characteristic map in the vertical direction for the ith channel>
Figure GDA00040420115400001510
Second complex output profile, v, representing the ith channel in the horizontal direction h For the set of the second complex output characteristic maps of all channels, a->
Figure GDA00040420115400001511
Third complex output characteristic diagram, v, representing the ith channel in the vertical direction w For the set of the third complex output profiles of all channels,
Figure GDA00040420115400001512
(3) Automatic complex coordinate attention allocation
Direction-related complex attention weight coefficients are calculated. Each element (a complex value) of the complex feature tensors $v^h$ and $v^w$ is written in polar form, and the magnitude of the polar form is then constrained with a Sigmoid function, limiting it to the range 0 to 1, namely:

$$v_l^h = \left|v_l^h\right| e^{\mathrm{j}\theta_l^h} \tag{11}$$

$$v_l^w = \left|v_l^w\right| e^{\mathrm{j}\theta_l^w} \tag{12}$$

$$g_l^h = \mathrm{Sig}\!\left(\left|v_l^h\right|\right) e^{\mathrm{j}\theta_l^h} \tag{13}$$

$$g_l^w = \mathrm{Sig}\!\left(\left|v_l^w\right|\right) e^{\mathrm{j}\theta_l^w} \tag{14}$$

where $g_l^h$ and $g_l^w$ are the results of constraining the magnitudes of the complex output feature maps of the l-th channel in the horizontal and vertical directions with the Sigmoid function, $\theta_l^h$ and $\theta_l^w$ are the phases of the complex output feature maps of the l-th channel in the horizontal and vertical directions, $|v_l^h|$ and $|v_l^w|$ are the corresponding magnitudes, and $\mathrm{Sig}(\cdot)$ denotes the Sigmoid function, used to convert the final magnitude into a value between 0 and 1; the Sigmoid-constrained results are written $g^h = [g_1^h, \ldots, g_C^h]$ and $g^w = [g_1^w, \ldots, g_C^w]$. Constraining only the magnitude of the polar form to the range 0 to 1 does not affect the phase; that is, the phase information is preserved.
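A one-function sketch of this constraint (torch.polar rebuilds a complex tensor from magnitude and phase); the function name is an assumption:

```python
import torch

def constrain_magnitude(v: torch.Tensor) -> torch.Tensor:
    """g = Sig(|v|) * exp(j*theta), keeping the phase theta = arg(v) untouched."""
    return torch.polar(torch.sigmoid(v.abs()), v.angle())
```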
Since the original feature maps are in the rectangular coordinate system, the expressions of equations (13) and (14) are converted to the rectangular coordinate system:

$$g_l^h = \mathrm{Sig}\!\left(\left|v_l^h\right|\right)\left(\cos\theta_l^h + \mathrm{j}\sin\theta_l^h\right) \tag{15}$$

$$g_l^w = \mathrm{Sig}\!\left(\left|v_l^w\right|\right)\left(\cos\theta_l^w + \mathrm{j}\sin\theta_l^w\right) \tag{16}$$
Complex coordinate attention is automatically assigned. The outputs $g^h$ and $g^w$ for the horizontal and vertical spatial directions are expanded to generate the attention weight distribution in each spatial direction, which is applied to the complex input feature map to realize automatic allocation of complex coordinate attention. The output of the complex coordinate attention module is obtained as:

$$y_l(i,j) = x_l(i,j) \times g_l^h(i) \times g_l^w(j) \tag{17}$$

where $x_l(i,j)$ is the value in row i and column j of the complex input feature map of the l-th channel.
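Putting the pieces together, an end-to-end sketch of the CV-CA forward pass, reusing cvcie, aggregate, and constrain_magnitude from the earlier sketches; the weight shapes are assumptions chosen to be consistent with equations (1) to (17):

```python
def cv_ca_forward(x, u_reduce, w_h, w_w):
    """x: (C, H, W) complex; u_reduce: (C//r, C); w_h, w_w: (C, C//r), all complex."""
    C, H, W = x.shape
    z_h, z_w = cvcie(x)                          # directional encoding, eq. (1)-(2)
    f = aggregate(z_h, z_w, u_reduce)            # concat + reduce + CReLU, eq. (3)-(6)
    f_h, f_w = f[:, :H], f[:, H:]                # split along the spatial axis, eq. (7)-(8)
    g_h = constrain_magnitude(w_h @ f_h).reshape(C, H, 1)   # eq. (9), (13)
    g_w = constrain_magnitude(w_w @ f_w).reshape(C, 1, W)   # eq. (10), (14)
    return x * g_h * g_w                         # attention allocation, eq. (17)
```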
From the CV-CA construction process above, on the one hand, CV-CA uses a complex convolutional neural network to obtain the amplitude and phase features of a target signal, such as a radar echo, through associated learning of the real and imaginary parts of complex numbers; on the other hand, through complex coordinate attention it attends to spatial information and channel information in the horizontal and vertical directions simultaneously, better models long-range dependencies in the feature information, and enhances the feature representation capability of the target object.
The CV-CA module provided by the invention comprises two parts, wherein the first part is complex coordinate attention embedding, and the second part is complex coordinate attention generation. The physical significance of each part is explained in detail below.
The first part is complex coordinate attention embedding. In the field of computer vision, the position information in a feature map has an important influence on capturing spatial structural features. Since the targets to be distinguished by the invention are very similar space cone targets, the invention considers that spatial structure information is beneficial to their discrimination and identification. Therefore, in order for the proposed complex coordinate attention module to retain position information and further use it to capture long-range spatial dependencies, the invention decomposes the global pooling in the CNN into a pooling operation in the horizontal direction and a pooling operation in the vertical direction.
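A minimal sketch of this decomposition (average pooling and the shapes below are illustrative assumptions):

import numpy as np

# x: a (C, H, W) complex feature map; C channels, H rows, W columns.
C, H, W = 4, 16, 16
rng = np.random.default_rng(1)
x = rng.standard_normal((C, H, W)) + 1j * rng.standard_normal((C, H, W))

# Global pooling would collapse all positions: x.mean(axis=(1, 2)) -> (C,).
# Decomposed directional pooling keeps one spatial axis each:
z_h = x.mean(axis=2)  # pool along the width  -> (C, H): one value per row
z_w = x.mean(axis=1)  # pool along the height -> (C, W): one value per column
print(z_h.shape, z_w.shape)  # (4, 16) (4, 16)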
The second part is complex coordinate attention generation, which is done in three sub-steps. For this section, the general design principle of the present invention has three points: 1) The modules should be as simple and light-weight as possible. 2) The module should make full use of the spatial location information obtained in the first part. 3) The module should take into account the interrelationship between the channels in order to take advantage of the channel attention.
Direction-related feature information aggregation. The first part has obtained spatial position information in both the horizontal and vertical directions. Under the principle that the designed module should be as simple as possible with as few parameters as possible, the invention first splices (concatenates) the spatial position information of the horizontal and vertical directions, so as to retain the information of both directions simultaneously. Then a 1 × 1 convolution kernel is used to convolve the splicing result for dimension reduction. With this design, the feature information among the channels is taken into account while the number of parameters is reduced; a sketch covering this sub-step together with the next one is given after the following paragraph.
The direction-related complex feature information is split. The weight in the horizontal direction and the weight in the vertical direction should be applied to the horizontal and vertical directions of the input feature map, respectively, and the number of channels of the weights should be consistent with the number of channels of the input feature map. The result of the first sub-step, which already fuses the spatial position information with the channel information, is therefore split into a horizontal-direction part and a vertical-direction part, and each part is then raised back to the original channel dimension with its own 1 × 1 convolution kernel.
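The following sketch illustrates both sub-steps under simplifying assumptions: the reduction ratio r, random complex matrices standing in for learned 1 × 1 complex convolution kernels, and the omission of batch normalization and activation are all illustrative choices. On the pooled features, a 1 × 1 convolution is simply a complex channel-mixing matrix multiplication.

import numpy as np

C, H, W, r = 8, 16, 16, 4
rng = np.random.default_rng(2)
z_h = rng.standard_normal((C, H)) + 1j * rng.standard_normal((C, H))  # horizontal encoding
z_w = rng.standard_normal((C, W)) + 1j * rng.standard_normal((C, W))  # vertical encoding

# Sub-step 1: splice the two directions along the spatial axis, then reduce
# the channel number C -> C/r with a shared 1x1 complex convolution.
z = np.concatenate([z_h, z_w], axis=1)                                # (C, H + W)
W_reduce = rng.standard_normal((C // r, C)) + 1j * rng.standard_normal((C // r, C))
f = W_reduce @ z                                                      # (C/r, H + W)

# Sub-step 2: split back into the two directions and raise each part to the
# original channel number with its own 1x1 complex convolution.
f_h, f_w = f[:, :H], f[:, H:]                                         # (C/r, H), (C/r, W)
W_h = rng.standard_normal((C, C // r)) + 1j * rng.standard_normal((C, C // r))
W_w = rng.standard_normal((C, C // r)) + 1j * rng.standard_normal((C, C // r))
v_h, v_w = W_h @ f_h, W_w @ f_w                                       # (C, H), (C, W)
print(v_h.shape, v_w.shape)  # (8, 16) (8, 16)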
Complex attention is automatically assigned. After the above operation steps, complex weights that take both spatial position information and channel information into account are obtained. On the one hand, the phase information of the complex weights is to be retained; on the other hand, the magnitude of the weights is limited to the interval 0-1. Finally, the attention weights are applied to each element and each channel of the input feature map, realizing the complex coordinate attention proposed by the invention. In this way each channel is weighted so that important channels receive attention, spatial information is also taken into account, and regions that facilitate target recognition are emphasized.
In addition, the CV-CA module in this embodiment obtains strong feature recognition capability with only a small increase in the number of parameters while maintaining the operating efficiency of the model, thereby improving the recognition capability of the model and reducing the probability of target misjudgment.
EXAMPLE III
A third embodiment of the present invention provides a complex input feature map processing method; referring to fig. 3, which is a schematic flow chart of the complex input feature map processing method, the method includes:
obtaining a complex input characteristic diagram to be processed;
encoding, for each channel, the first complex input feature map of the channel along the horizontal direction and the vertical direction respectively, and generating, within the channel, first output feature information of the first complex input feature map encoded along the horizontal direction and second output feature information encoded along the vertical direction;
for each channel, splicing the first output characteristic information and the second output characteristic information to generate a characteristic information splicing result of the channel; performing feature dimensionality reduction on the feature information splicing result of the channel to obtain feature information after dimensionality reduction, and activating the feature information after dimensionality reduction to obtain a first complex output feature map of the channel; splitting the first complex output feature map into a first tensor and a second tensor along a spatial dimension; adjusting the dimensions of the first tensor and the second tensor to be the same as the dimensions of the first complex input feature map to obtain a second complex output feature map of the channel in the horizontal direction and a third complex output feature map of the channel in the vertical direction; obtaining a third tensor and a fourth tensor, wherein the third tensor is the set of the second complex output characteristic maps of all the channels, and the fourth tensor is the set of the third complex output characteristic maps of all the channels;
expressing each element in the third tensor and the fourth tensor in a polar coordinate form, constraining the amplitude of the polar coordinate by using a constraint function, respectively obtaining a fourth complex output feature diagram and a fifth complex output feature diagram in the horizontal and vertical spatial directions, expanding the fourth complex output feature diagram and the fifth complex output feature diagram to generate attention weight distribution in the horizontal and vertical spatial directions, and applying the attention weight distribution to the to-be-processed complex input feature diagram to obtain a processed complex output feature diagram;
wherein, the complex input characteristic diagram and the complex output characteristic diagram are both complex characteristic diagrams.
The method can be used for processing a complex input feature map: the amplitude and phase characteristics of the signal are obtained through the complex convolutional neural network by joint learning of the complex real and imaginary parts, the complex coordinate attention attends to the spatial information and channel information in the horizontal and vertical directions, long-range dependencies in the feature information are better modeled, and the feature characterization capability for the target object is enhanced.
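Putting the steps of this embodiment together, one illustrative end-to-end forward pass might look as follows. This is a sketch under the same assumptions as the sketches above (random matrices in place of learned 1 × 1 complex convolutions, an assumed reduction ratio r, normalization and activation omitted), not the claimed implementation:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def process_complex_feature_map(x, W_reduce, W_h, W_w):
    """Illustrative forward pass: directional encoding -> splicing and
    channel reduction -> splitting and channel expansion -> amplitude-
    constrained complex attention applied to the input feature map."""
    C, H, W = x.shape
    z = np.concatenate([x.mean(axis=2), x.mean(axis=1)], axis=1)  # (C, H+W)
    f = W_reduce @ z                                              # (C/r, H+W)
    v_h, v_w = W_h @ f[:, :H], W_w @ f[:, H:]                     # (C, H), (C, W)
    g_h = sigmoid(np.abs(v_h)) * np.exp(1j * np.angle(v_h))      # phase kept
    g_w = sigmoid(np.abs(v_w)) * np.exp(1j * np.angle(v_w))
    return x * g_h[:, :, None] * g_w[:, None, :]

rng = np.random.default_rng(3)
C, H, W, r = 8, 16, 16, 4
x = rng.standard_normal((C, H, W)) + 1j * rng.standard_normal((C, H, W))
W_reduce = rng.standard_normal((C // r, C)) + 1j * rng.standard_normal((C // r, C))
W_h = rng.standard_normal((C, C // r)) + 1j * rng.standard_normal((C, C // r))
W_w = rng.standard_normal((C, C // r)) + 1j * rng.standard_normal((C, C // r))
print(process_complex_feature_map(x, W_reduce, W_h, W_w).shape)  # (8, 16, 16)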
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for identifying a target by a complex convolutional neural network based on complex coordinate attention, the method comprising:
obtaining complex radar echo data of the space cone target;
inputting the complex radar echo data into a complex convolution neural network based on a complex coordinate attention module;
the complex convolutional neural network outputs the identification result of the space cone target, identifying whether the space cone target is a warhead or a decoy;
the complex convolutional neural network includes:
an input layer, N basic units, a classification unit of the space cone target, and an output layer;
the input layer is used for inputting the complex radar echo data;
the N basic units comprise first to Nth basic units, the first basic unit is connected with the input layer, the output of the first basic unit is the input of the second basic unit, ..., the input of the Nth basic unit is the output of the (N-1)th basic unit, N is an integer larger than 1, and the output of the Nth basic unit is the input of the classification unit of the space cone target; each of the N basic units includes: a first complex convolution module, a first complex batch normalization module, a first complex activation module and a first complex pooling module; wherein one of the N basic units further comprises a complex coordinate attention module; the complex coordinate attention module includes: a complex coordinate attention embedding unit and a complex coordinate attention generating unit, wherein, for each channel, the complex coordinate attention embedding unit is used for encoding a first complex input feature map of the space cone target of the channel along the horizontal direction and the vertical direction respectively, and generating, within the channel, first output feature information of the first complex input feature map of the space cone target encoded along the horizontal direction and second output feature information encoded along the vertical direction;
for each channel, the complex coordinate attention generating unit is used for: splicing the first output characteristic information and the second output characteristic information to generate a characteristic information splicing result of the channel; performing feature dimensionality reduction on the feature information splicing result of the channel to obtain feature information after dimensionality reduction, and activating the feature information after dimensionality reduction to obtain a first complex output feature map of the channel; splitting the first complex output feature map along a spatial dimension into a first tensor along a horizontal direction and a second tensor along a vertical direction; adjusting the dimensions of the first tensor and the second tensor to be the same as the dimensions of the first complex input feature map to obtain a second complex output feature map of the channel in the horizontal direction and a third complex output feature map of the channel in the vertical direction; obtaining a third tensor and a fourth tensor, wherein the third tensor is the set of the second complex output characteristic maps of all the channels, and the fourth tensor is the set of the third complex output characteristic maps of all the channels; expressing each element in the third tensor and the fourth tensor in a polar coordinate form, constraining the amplitude of the polar coordinate by using a constraint function, respectively obtaining a fourth complex output feature map and a fifth complex output feature map in the horizontal and vertical spatial directions, expanding the fourth complex output feature map and the fifth complex output feature map to generate attention weight distribution in the horizontal and vertical spatial directions, and applying the attention weight distribution to a complex input feature map of the complex coordinate attention module to obtain a complex output feature map of the complex coordinate attention module;
wherein, the complex input characteristic diagram and the complex output characteristic diagram are both complex characteristic diagrams;
the classification unit of the space cone target is connected with the output layer; and the classification unit of the space cone target is used for mapping the complex output data of the Nth basic unit into corresponding real numbers through a modulus operation and performing classification and identification of the space cone target.
2. The method of claim 1, wherein in the basic unit not including the complex coordinate attention module, the output of the complex convolution module in the basic unit is the input of the complex batch normalization module, the output of the complex batch normalization module is the input of the complex activation module, and the output of the complex activation module is the input of the complex pooling module.
3. The method of claim 1, wherein in the basic unit comprising the complex coordinate attention module, the output of the complex convolution module in the basic unit is the input of the complex batch normalization module, the output of the complex batch normalization module is the input of the complex coordinate attention module, the output of the complex coordinate attention module is the input of the complex activation module, and the output of the complex activation module is the input of the complex pooling module.
4. The method of claim 1, wherein the classification unit comprises:
the second complex convolution module, the second complex batch normalization module, the second complex activation module, the third complex convolution module and the classifier; the output of the second complex convolution module is the input of the second complex batch normalization module, the output of the second complex batch normalization module is the input of the second complex activation module, the output of the second complex activation module is the input of the third complex convolution module, and the output of the third complex convolution module is the input of the classifier.
5. The complex coordinate attention-based object recognition method of a complex convolutional neural network as claimed in claim 1, wherein the complex convolutional neural network comprises first to sixth basic units.
6. The complex coordinate attention-based object recognition method of a complex convolutional neural network as claimed in claim 5, wherein a sixth basic unit comprises the complex coordinate attention module.
7. The complex coordinate attention-based target identification method of the complex convolutional neural network as claimed in claim 1, wherein an optimizer is provided in the complex convolutional neural network for updating the network weight and the bias term.
8. The method of claim 1, wherein the complex input feature map is a complex input feature map of complex radar echo data of the spatial cone target, and the complex output feature map is a complex output feature map of complex radar echo data of the spatial cone target.
CN202110858271.0A 2021-07-28 2021-07-28 Complex convolution neural network target identification method based on complex coordinate attention Active CN113537120B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110858271.0A CN113537120B (en) 2021-07-28 2021-07-28 Complex convolution neural network target identification method based on complex coordinate attention

Publications (2)

Publication Number Publication Date
CN113537120A CN113537120A (en) 2021-10-22
CN113537120B true CN113537120B (en) 2023-04-07

Family

ID=78121256

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972280B (en) * 2022-06-07 2023-11-17 重庆大学 Fine coordinate attention module and application thereof in surface defect detection

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9373059B1 (en) * 2014-05-05 2016-06-21 Atomwise Inc. Systems and methods for applying a convolutional network to spatial data
CN111340186A (en) * 2020-02-17 2020-06-26 之江实验室 Compressed representation learning method based on tensor decomposition
CN112329538A (en) * 2020-10-10 2021-02-05 杭州电子科技大学 Target classification method based on microwave vision
CN112965062A (en) * 2021-02-09 2021-06-15 西安电子科技大学 Radar range profile target identification method based on LSTM-DAM network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yaxin Li et al. Multi-mode Fusion and Classification Method for Space Targets Based on Convolutional Neural Network. 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR), 2020, full text. *
周剑. Target classification method for complex scenes. China Master's Theses Full-text Database, Information Science and Technology, 2021, full text. *
王光光. PolSAR classification and video behavior recognition based on deep learning. China Master's Theses Full-text Database, Information Science and Technology, 2021, full text. *

Similar Documents

Publication Publication Date Title
CN110119703B (en) Human body action recognition method fusing attention mechanism and spatio-temporal graph convolutional neural network in security scene
CN108509910B (en) Deep learning gesture recognition method based on FMCW radar signals
CN112364779A (en) Underwater sound target identification method based on signal processing and deep-shallow network multi-model fusion
CN108846323A (en) A kind of convolutional neural networks optimization method towards Underwater Targets Recognition
CN111784560A (en) SAR and optical image bidirectional translation method for generating countermeasure network based on cascade residual errors
CN113283298B (en) Real-time behavior identification method based on time attention mechanism and double-current network
Alnujaim et al. Generative adversarial networks to augment micro-Doppler signatures for the classification of human activity
CN113537120B (en) Complex convolution neural network target identification method based on complex coordinate attention
Qu et al. Human activity recognition based on WRGAN-GP-synthesized micro-Doppler spectrograms
Kamal et al. Generative adversarial learning for improved data efficiency in underwater target classification
Li et al. Supervised domain adaptation for few-shot radar-based human activity recognition
CN112884062B (en) Motor imagery classification method and system based on CNN classification model and generated countermeasure network
CN113569735B (en) Complex input feature graph processing method and system based on complex coordinate attention module
CN113435276A (en) Underwater sound target identification method based on antagonistic residual error network
CN116794608A (en) Radar active interference identification method based on improved MobileViT network
CN114550047B (en) Behavior rate guided video behavior recognition method
CN110599556A (en) Method for converting time sequence into image based on improved recursive graph
Ibrahim et al. Auto-encoder based deep learning for surface electromyography signal processing
Qiao et al. Gesture-ProxylessNAS: A lightweight network for mid-air gesture recognition based on UWB radar
CN114966587A (en) Radar target identification method and system based on convolutional neural network fusion characteristics
Bose et al. Fine-Grained Independent Approach for Workout Classification Using Integrated Metric Transfer Learning
CN114528918A (en) Hyperspectral image classification method and system based on two-dimensional convolution sum LSTM
El-Bana et al. Evaluating the Potential of Wavelet Pooling on Improving the Data Efficiency of Light-Weight CNNs
CN113421281A (en) Pedestrian micromotion part separation method based on segmentation theory
CN112241001A (en) Radar human body action recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant