CN114283326A - Underwater target re-identification method combining local perception and high-order feature reconstruction

Info

Publication number: CN114283326A
Application number: CN202111582065.8A
Authority: CN (China)
Prior art keywords: target, context, characteristic, network, feature
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: Fu Xianping (付先平), Jiang Guangqi (蒋广琪), Yao Mingze (姚铭泽), Peng Jinjia (彭锦佳), Wang Huibing (王辉兵)
Current and original assignee: Dalian Maritime University
Priority/filing date: 2021-12-22
Publication date: 2022-04-05

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an underwater target re-identification method combining local perception and high-order feature reconstruction. Underwater target feature processing is performed on acquired images with the cross-stage network of a target detection algorithm model to obtain an underwater target feature map. The feature map is mapped and scale-pooled to obtain a feature map carrying a feature matrix, which is fed into a path aggregation network and predicted on to generate target objects. The target objects are sampled and their image features extracted through a residual network to obtain a three-dimensional tensor and vertical stripes; the stripes are average-pooled to generate a predicted image; tensor reconstruction on the predicted image identifies context information segments, from which context features are obtained by synthesis, element-wise multiplication and weighted averaging; the context features are then average-pooled and optimized with minimized cross-entropy loss to obtain the prediction result. Tensor reconstruction extracts a feature map of higher-order feature information, which is used in prediction to obtain a re-identification result with stronger robustness and higher precision.

Description

Underwater target re-identification method combining local perception and high-order feature reconstruction
Technical Field
The invention relates to the field of underwater target identification, in particular to an underwater target re-identification method combining local perception and high-order feature reconstruction.
Background
Existing underwater target recognition methods mainly extract underwater target features for recognition and detection tasks using conventional target recognition algorithms. However, changes in an underwater robot's navigation angle, speed and track cause the photographed target object to be affected by deformation, viewing angle, contrast and the like. This poses many challenges for underwater target re-identification, in which a vision-guided underwater robot must recognize the same target object from different angles. Current target re-identification work mainly addresses recognizing the same target across different cameras on land. Existing underwater target recognition algorithms cannot perceive the robot's current position or repeatedly recognize the same target object when the robot's attitude and route change; the changing track during navigation alters the scale of the captured images, which ultimately degrades recognition accuracy.
Disclosure of Invention
The invention provides an underwater target re-identification method combining local perception and high-order feature reconstruction, aiming to solve the technical problem that existing underwater target identification methods produce inaccurate results.
In order to achieve the purpose, the technical scheme of the invention is as follows:
an underwater target re-identification method combining local perception and high-order feature reconstruction comprises the following steps:
step 1, collecting images of the area traversed by the underwater robot;
step 2, performing underwater target feature processing on the acquired images by using the cross-stage network of a YOLOv4 target detection algorithm model to obtain an underwater target feature map;
step 3, mapping the underwater target feature map with the Mish activation function to obtain a target feature result map;
step 4, performing spatial pyramid pooling on the target feature result map through the target detection algorithm and concatenating the pooled results, thereby separating out a feature map carrying a feature matrix;
step 5, inputting the feature map with the feature matrix into a path aggregation network for feature fusion to obtain a fused feature map;
step 6, inputting the fused feature map into a prediction network, setting anchor boxes, and predicting the anchor boxes with a clustering algorithm to generate prediction boxes, where the prediction boxes contain the network output images and are used to detect target objects at different scales;
step 7, after batch-sampling the network output images, extracting their features through a residual network to obtain a three-dimensional tensor T;
step 8, average-pooling the three-dimensional tensor T to divide it into p vertical stripes, i.e., block-wise re-identification processing, and obtaining the activation vectors of the p stripes along the matrix channel axis of T, defining them as column vectors;
step 9, inputting the column vectors into a classifier for prediction to obtain a predicted image;
step 10, performing tensor-reconstruction re-identification processing on the predicted image with a residual network, extracting the matrix features of the processed predicted image, and generating features from the three views of channel, width and height using the matrix features to obtain context information segments;
step 11, repeating step 10 and synthesizing the obtained context information segments to obtain a context attention map representing three-dimensional context features; activating and aggregating the context attention map with element-wise products and weighted averaging to obtain fine-grained context features in the spatial and channel dimensions;
step 12, performing global average pooling on the fine-grained context feature map through a global average pooling layer to obtain a target feature map; concatenating the target feature map with the column vectors, optimizing the concatenation with minimized cross-entropy loss to obtain a jointly optimized feature map, and predicting on the jointly optimized feature map to obtain the prediction result.
Further, in step 5, the feature pyramid structure in the path aggregation network laterally connects the top-down and bottom-up paths to partition the feature map carrying the feature matrix; an adaptive pooling operation performs pooled extraction on each such feature map to obtain target features, and a fully connected fusion operation on the extracted target features yields the fused feature map.
Further, in step 6, the anchor boxes are predicted by using the global intersection over union (CIOU) as the prediction-box regression function to generate prediction boxes, where the global intersection over union is given by:
CIOU = IOU − ρ²(b, b^gt)/c² − αv
v = (4/π²) · (arctan(w^gt/h^gt) − arctan(w/h))²
α = v / ((1 − IOU) + v)
where ρ²(b, b^gt) denotes the squared Euclidean distance between the center points of the prediction box and the ground-truth box, c denotes the diagonal length of the minimum enclosing rectangle of the prediction box and the ground-truth box, IOU denotes the intersection over union of the prediction box and the ground-truth box, α denotes the influence factor of the parameter v, v denotes a parameter measuring the consistency of the aspect ratios, w denotes the width of the corresponding box, h denotes the height of the corresponding box, the superscript gt marks the ground-truth box, and CIOU denotes the global intersection over union.
Further, step 8 includes modifying the residual network to extract the depth features of the network output image and obtain the three-dimensional tensor T; specifically, modifying the residual network includes removing the global average pooling layer, the fully connected layer and the output layer of the convolutional neural network.
Further, in step 11, the context attention map is activated and aggregated with element-wise products and weighted averaging to obtain fine-grained context features in the spatial and channel dimensions, specifically:
step 11.1, repeating step 10 to obtain context information segments in different directions;
step 11.2, performing tensor reconstruction on the different context information segments to obtain sub-attention maps A_i in different directions;
step 11.3, synthesizing the sub-attention maps A_i in different directions to obtain the context attention map;
step 11.4, activating and aggregating the context attention map with element-wise products and weighted averaging, thereby obtaining fine-grained context features in the spatial and channel dimensions.
Further, the specific method in step 11.4 is as follows:
Y = (1/CHW) · Σ_{i=1..CHW} a_i · s_i
S = {s_1, s_2, ..., s_CHW}
A = {a_1, a_2, ..., a_CHW}
where S denotes the input feature matrix, A denotes the context attention map, Y denotes the fine-grained context feature, i indexes the i-th feature, CHW denotes the total number of features, s_i denotes the i-th input feature, and a_i denotes the i-th context attention value.
Beneficial effects: the invention uses a target detection network to obtain annotation information of the underwater target object for constructing the underwater target feature map. A local perception branch network performs block-wise re-identification processing on the underwater target, improving feature extraction performance, while a high-order reconstruction branch network reconstructs high-order discriminative features of the image features by tensor reconstruction, extracting a feature map of higher-order feature information. The cross-stage network extracts a feature map with more detailed features, the high-order reconstruction branch network extracts a feature map with more high-order feature information, and the two feature maps are concatenated for prediction, yielding a re-identification result with stronger robustness and higher precision.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flowchart of the underwater target re-identification method according to the present invention;
FIG. 2 is a diagram of a model architecture of the object detection algorithm of the present invention;
FIG. 3 is a diagram of a cross-phase network architecture in accordance with the present invention;
FIG. 4 is a diagram illustrating the dual-branch re-identification network according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment provides an underwater target re-identification method combining local perception and high-order feature reconstruction, as shown in FIGS. 1 to 4, comprising the following steps:
step 1, collecting images of the area traversed by the underwater robot;
step 2, performing underwater target feature processing on the acquired images by using the cross-stage network of a YOLOv4 target detection algorithm model: the stacked residual blocks are split into two parts, which are then combined through a cross-stage hierarchical structure to obtain a more robust underwater target feature map F_g;
The cross-stage network comprises multiple standard 3 × 3 convolutions for feature extraction and uses the Mish activation function, which prevents the training network from overfitting and yields more accurate results. Its formula is:
Mish(x) = x · tanh(ln(1 + e^x))
specifically, feature blocks are extracted for marine product objects in an image of a visible area of the underwater robot.
First the cross-phase network (CSPDarknet53) consists of multiple standard 3 × 3 convolutions, the Mish function, and the CSPNet. The purpose of the multiple standard 3 x 3 convolutions is to better extract the features of the target, after the features are extracted, the CSPNet is used to divide the feature mapping of the basic layer into two parts, and then the two parts are combined through a cross-stage hierarchical structure to extract the target feature graph F with strong robustnessg. The global features are then used for input, training and final optimization of the next feature pyramid network.
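By way of illustration only, a minimal PyTorch sketch of this split-and-merge idea: a CSP-style stage with the Mish activation. The module layout, channel split and block count are assumptions for the sketch (plain conv-BN-Mish blocks stand in for the stacked residual blocks), not the patent's exact CSPDarknet53 configuration:

```python
import torch
import torch.nn as nn

class Mish(nn.Module):
    # Mish(x) = x * tanh(ln(1 + e^x)); softplus(x) = ln(1 + e^x).
    def forward(self, x):
        return x * torch.tanh(nn.functional.softplus(x))

class CSPStage(nn.Module):
    # Split the incoming feature map into two halves along the channel axis,
    # send one half through stacked conv blocks, then merge the two halves
    # through a transition convolution (cross-stage partial connection).
    def __init__(self, channels: int, num_blocks: int = 2):
        super().__init__()
        half = channels // 2
        self.blocks = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(half, half, 3, padding=1, bias=False),
                nn.BatchNorm2d(half),
                Mish(),
            )
            for _ in range(num_blocks)
        ])
        self.transition = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            Mish(),
        )

    def forward(self, x):
        part1, part2 = torch.chunk(x, 2, dim=1)   # cross-stage split
        part2 = self.blocks(part2)                # processed path
        return self.transition(torch.cat([part1, part2], dim=1))  # merge

out = CSPStage(64)(torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```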
Step 3, using Mish activation function to perform characteristic diagram F on underwater targetgMapping processing is carried out to obtain a more accurate and stable target characteristic result graph;
step 4, performing spatial pyramid pooling on the target feature result map through the target detection algorithm and concatenating the pooled results, thereby separating out a feature map F_f carrying a feature matrix;
Four pooling kernels of different sizes are used to max-pool the feature map separately, giving pooled results F_m1, F_m2, F_m3 and F_m4. The four pooled results are then concatenated to improve the discriminability and comprehensiveness of the cross-stage network features and to clearly separate the more important context features, where the concatenation is:
F_f = concat(F_m1, F_m2, F_m3, F_m4)
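A minimal PyTorch sketch of this spatial pyramid pooling step. The kernel sizes (identity plus 5/9/13) follow common YOLOv4 practice and are an assumption here; the patent only specifies four pooling kernels of different sizes:

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    # Max-pool the same feature map with kernels of several sizes (stride 1,
    # padded so the spatial size is preserved) and concatenate the results
    # along the channel axis: Ff = concat(Fm1, Fm2, Fm3, Fm4).
    def __init__(self, kernel_sizes=(1, 5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        return torch.cat([pool(x) for pool in self.pools], dim=1)

# Example: a 512-channel map becomes 4 * 512 = 2048 channels.
f = torch.randn(1, 512, 13, 13)
print(SPP()(f).shape)  # torch.Size([1, 2048, 13, 13])
```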
step 5, inputting the feature map F_f carrying the feature matrix into the path aggregation network for feature fusion; specifically, the feature map is segmented, extracted and fused, the weights of the more informative channel features are increased and the weights of unimportant features reduced, producing the fused feature map;
step 6, inputting the fused feature map into the prediction network, setting anchor boxes, and predicting the anchor boxes with a clustering algorithm to generate prediction boxes, where the prediction boxes contain the network output images and are used to detect target objects at different scales. Specifically, the anchor selection of the clustering algorithm is configured: three anchor sizes are set so that anchor boxes of corresponding sizes are generated for prediction, and the prediction network finally outputs network output images at three different scales;
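For illustration, a sketch of the IoU-based k-means clustering commonly used to choose anchor sizes from ground-truth box dimensions; the 1 − IoU distance follows YOLO practice, and the patent does not specify its exact clustering variant:

```python
import numpy as np

def iou_wh(boxes, anchors):
    # IoU between (w, h) pairs, assuming all boxes share the same corner.
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=3, iters=100, seed=0):
    # Cluster ground-truth (w, h) pairs with distance d = 1 - IoU.
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest anchor
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors

wh = np.random.default_rng(1).uniform(10, 200, size=(500, 2))
print(kmeans_anchors(wh, k=3))  # three (width, height) anchor sizes
```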
step 7, after batch-sampling the network output images, extracting their features through a residual network to obtain a three-dimensional tensor T;
step 8, average-pooling the three-dimensional tensor T to divide it into p vertical stripes, i.e., block-wise re-identification processing, and obtaining the activation vectors of the p stripes along the matrix channel axis of T, defining them as column vectors;
step 9, average-pooling each of the p vertical stripes to generate single part-level column vectors with more detailed features, and inputting the column vectors into a classifier for prediction to obtain a predicted image; to dynamically classify all column vectors e of the three-dimensional tensor T, the classifier consists of an FC layer and a Softmax function, with the Softmax activation used as the part classifier as follows:
P(p_i | e) = softmax(Wᵀe) = exp(W_iᵀe) / Σ_{j=1..p} exp(W_jᵀe)
where P(p_i | e) is the predicted probability that the column vector e belongs to part p_i, p is the number of previously defined partitions, and W is a trainable weight matrix;
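A compact sketch of this stripe pooling and part classification, assuming a shared FC layer over the column vectors; the stripe count p = 5 and the channel width are illustrative values, not the patent's fixed configuration:

```python
import torch
import torch.nn as nn

class PartClassifier(nn.Module):
    # Split T into p horizontal stripes, average-pool each stripe into a
    # column vector e, and classify it with an FC layer plus Softmax:
    # P(p_i | e) = softmax(W^T e).
    def __init__(self, channels: int = 2048, p: int = 5):
        super().__init__()
        self.p = p
        self.fc = nn.Linear(channels, p)

    def forward(self, t: torch.Tensor):  # t: (B, C, H, W), H divisible by p
        b, c, h, w = t.shape
        stripes = t.reshape(b, c, self.p, h // self.p, w)
        cols = stripes.mean(dim=(3, 4))          # (B, C, p) column vectors
        logits = self.fc(cols.transpose(1, 2))   # (B, p, p) part scores
        return torch.softmax(logits, dim=-1)

probs = PartClassifier()(torch.randn(2, 2048, 10, 4))
print(probs.shape)  # torch.Size([2, 5, 5]): p vectors x p part probabilities
```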
step 10, performing tensor-reconstruction re-identification processing on the predicted image with a convolutional neural network, extracting the matrix features of the processed predicted image, and generating features from the three views of channel, width and height using the matrix features to obtain context information segments in the channel, width and height directions;
step 11, repeating step 10 to obtain context information segments in different directions and synthesizing them into a context attention map representing three-dimensional context features; continuing to repeat so as to reconstruct the remaining context segments, then activating and aggregating the context attention map with element-wise products and weighted averaging, thereby obtaining fine-grained context features in the spatial and channel dimensions.
step 12, performing global average pooling on the fine-grained context feature map through a global average pooling layer to obtain a higher-order target feature map; concatenating the target feature map with the column vectors, optimizing the concatenation with minimized cross-entropy loss to obtain a jointly optimized feature map, and predicting on the jointly optimized feature map to obtain the prediction result.
In a specific embodiment, in step 5, the feature pyramid structure in the path aggregation network laterally connects the top-down and bottom-up paths to segment the feature map carrying the feature matrix. To better generate the target mask, an adaptive pooling operation performs pooled extraction on each representative feature map in the path to obtain target features, and a fully connected fusion operation on the extracted target features yields the fused feature map.
In a specific embodiment, in step 6, the anchor boxes are predicted by using the global intersection over union (CIOU) as the prediction-box regression function to generate prediction boxes, where the global intersection over union is given by:
CIOU = IOU − ρ²(b, b^gt)/c² − αv
v = (4/π²) · (arctan(w^gt/h^gt) − arctan(w/h))²
α = v / ((1 − IOU) + v)
where ρ²(b, b^gt) denotes the squared Euclidean distance between the center points of the prediction box and the ground-truth box, c denotes the diagonal length of the minimum enclosing rectangle of the prediction box and the ground-truth box, IOU denotes the intersection over union of the prediction box and the ground-truth box, α denotes the influence factor of the parameter v, v denotes a parameter measuring the consistency of the aspect ratios, w denotes the width of the corresponding box, h denotes the height of the corresponding box, the superscript gt marks the ground-truth box, and CIOU denotes the global intersection over union.
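A small self-contained sketch of the CIOU computation defined above, for axis-aligned boxes given as (x1, y1, x2, y2); a minimal illustration rather than the patent's implementation:

```python
import math

def ciou(pred, gt):
    # Boxes given as (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Plain IoU between the prediction box and the ground-truth box.
    ix1, iy1 = max(px1, gx1), max(py1, gy1)
    ix2, iy2 = min(px2, gx2), min(py2, gy2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter)

    # rho^2: squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + \
           ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # c: diagonal of the minimum rectangle enclosing both boxes.
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 + \
         (max(py2, gy2) - min(py1, gy1)) ** 2

    # v measures aspect-ratio consistency; alpha is its influence factor.
    w, h = px2 - px1, py2 - py1
    wgt, hgt = gx2 - gx1, gy2 - gy1
    v = (4 / math.pi ** 2) * (math.atan(wgt / hgt) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)

    return iou - rho2 / c2 - alpha * v

print(ciou((0, 0, 10, 10), (2, 2, 12, 12)))  # about 0.44
```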
In a specific embodiment, step 8 further includes modifying the residual network to extract the depth features of the network output image and obtain the three-dimensional tensor T; specifically, modifying the residual network includes removing the global average pooling layer, the fully connected layer and the output layer of the convolutional neural network.
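For illustration, such a truncation can be sketched with torchvision's ResNet-50; the choice of ResNet-50 is an assumption, since the patent only specifies a residual network with these layers removed:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# Keep everything up to the last residual stage; dropping avgpool and fc
# makes the backbone return the three-dimensional tensor T per image.
backbone = resnet50(weights=None)
feature_extractor = nn.Sequential(*list(backbone.children())[:-2])

images = torch.randn(4, 3, 384, 128)   # a batch of detected target crops
T = feature_extractor(images)
print(T.shape)                          # torch.Size([4, 2048, 12, 4])
```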
In a specific embodiment, in step 11, the context attention map is activated and aggregated with element-wise products and weighted averaging to obtain fine-grained context features in the spatial and channel dimensions, specifically:
step 11.1, repeating step 10 to obtain context information segments in different directions;
step 11.2, performing tensor reconstruction on the different context information segments to obtain sub-attention maps A_i in different directions;
step 11.3, synthesizing the sub-attention maps A_i in different directions to obtain the context attention map;
step 11.4, activating and aggregating the context attention map with element-wise products and weighted averaging, thereby obtaining fine-grained context features in the spatial and channel dimensions.
In a specific embodiment, the specific method in step 11.4 is:
Y = (1/CHW) · Σ_{i=1..CHW} a_i · s_i
S = {s_1, s_2, ..., s_CHW}
A = {a_1, a_2, ..., a_CHW}
where S denotes the input feature matrix, A denotes the context attention map, Y denotes the fine-grained context feature, i indexes the i-th feature, CHW denotes the total number of features, s_i denotes the i-th input feature, and a_i denotes the i-th context attention value.
Specifically, the dual-branch re-identification network proposed by the invention, as shown in FIG. 4, comprises a local perception branch network and a high-order reconstruction branch network;
(1) The local perception branch network performs the average pooling of step 8 on the three-dimensional tensor T to divide it into p vertical stripes, i.e., block-wise re-identification processing, and obtains the activation vectors of the p stripes along the matrix channel axis of T, defining them as column vectors.
Specifically, dividing the three-dimensional tensor T horizontally into p equally sized sub-feature tensors (five in this embodiment) can be described as T(i) = [T(i,1), T(i,2), ..., T(i,p)] s.t. i = 1, 2, ..., N, where N is the number of samples in the data set and p is the number of blocks in the first branch. Global average pooling is then applied to each partitioned sub-region, giving a column vector for each sub-feature tensor; these column vectors effectively capture the detailed features of the image. Each column vector is then passed through a convolutional layer, and the resulting block-wise sub-vectors are denoted H(i) = [H(i,1), H(i,2), ..., H(i,p)] s.t. i = 1, 2, ..., N. In this way the feature distribution of each local region is modeled, and the local perception features improve the network's feature extraction performance. During training, M is the number of classes in the training set. Finally, the identity of the input is predicted by feeding the column vectors into a classifier. The classifier consists of an FC layer and a Softmax function, with the Softmax activation used as the part classifier as follows:
L_softmax = −Σ_i log( exp(H(i,p)) / Σ_{j=1..M} exp(H(j,p)) )
where p is the number of previously defined partitions, M is the number of classes in the training set, L_softmax is the local perception branch loss, H(i,p) is the i-th sample obtained by convolving the p-th column vector, and H(j,p) is the j-th sample obtained by convolving the p-th column vector. By predicting each block's feature sub-vector, the prediction probability of the locally refined features of the underwater target is obtained, increasing the network's local perception performance.
(2) The high-order reconstruction branch network performs the tensor-reconstruction re-identification processing of step 10 on the predicted image with a residual network, extracts the matrix features of the processed predicted image, and generates features from the three views of channel, width and height to obtain context information segments in the three directions.
(3) The tensor reconstruction part of the high-order reconstruction branch network mainly generates features from the tensor T produced by the cross-stage network from the three views of channel, width and height, obtaining context information segments in the three directions. The context features generated in the three directions are processed into sub-attention maps, each representing part of the three-dimensional context features. The remaining context segments are reconstructed iteratively, and the sub-maps are activated and aggregated through element-wise multiplication and weighted averaging, thereby obtaining fine-grained context features in the spatial and channel dimensions.
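A rough sketch of this three-view idea: the tensor is reshaped so that a sub-attention map can be generated along each of the channel, height and width views, and the sub-maps are then averaged and applied to the features by element-wise multiplication. The single-layer projections and sigmoid activations are assumptions of the sketch, not the patent's exact reconstruction design:

```python
import torch
import torch.nn as nn

class HighOrderContext(nn.Module):
    # Build a sub-attention map from each of the three views (channel, height,
    # width) of the tensor T, average them into one context attention map A,
    # and apply A to the features S by element-wise multiplication.
    def __init__(self, c: int, h: int, w: int):
        super().__init__()
        self.proj_c = nn.Linear(c, c)   # view along the channel axis
        self.proj_h = nn.Linear(h, h)   # view along the height axis
        self.proj_w = nn.Linear(w, w)   # view along the width axis

    def forward(self, t: torch.Tensor) -> torch.Tensor:  # t: (B, C, H, W)
        # Tensor reconstruction: permute so the target axis is last, project,
        # then permute back; sigmoid turns each result into an attention map.
        a_c = torch.sigmoid(self.proj_c(t.permute(0, 2, 3, 1))).permute(0, 3, 1, 2)
        a_h = torch.sigmoid(self.proj_h(t.permute(0, 1, 3, 2))).permute(0, 1, 3, 2)
        a_w = torch.sigmoid(self.proj_w(t))
        a = (a_c + a_h + a_w) / 3.0     # weighted average of sub-attention maps
        return a * t                     # element-wise product: fine-grained context

y = HighOrderContext(2048, 12, 4)(torch.randn(2, 2048, 12, 4))
print(y.shape)  # torch.Size([2, 2048, 12, 4])
```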
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. An underwater target re-identification method combining local perception and high-order feature reconstruction is characterized by comprising the following steps:
step 1, collecting images of the area traversed by the underwater robot;
step 2, performing underwater target feature processing on the acquired images by using the cross-stage network of a YOLOv4 target detection algorithm model to obtain an underwater target feature map;
step 3, mapping the underwater target feature map with the Mish activation function to obtain a target feature result map;
step 4, performing spatial pyramid pooling on the target feature result map through the target detection algorithm and concatenating the pooled results, thereby separating out a feature map carrying a feature matrix;
step 5, inputting the feature map with the feature matrix into a path aggregation network for feature fusion to obtain a fused feature map;
step 6, inputting the fused feature map into a prediction network, setting anchor boxes, and predicting the anchor boxes with a clustering algorithm to generate prediction boxes, where the prediction boxes contain the network output images and are used to detect target objects at different scales;
step 7, after batch-sampling the network output images, extracting their features through a residual network to obtain a three-dimensional tensor T;
step 8, average-pooling the three-dimensional tensor T to divide it into p vertical stripes, i.e., block-wise re-identification processing, and obtaining the activation vectors of the p stripes along the matrix channel axis of T, defining them as column vectors;
step 9, inputting the column vectors into a classifier for prediction to obtain a predicted image;
step 10, performing tensor-reconstruction re-identification processing on the predicted image with a residual network, extracting the matrix features of the processed predicted image, and generating features from the three views of channel, width and height using the matrix features to obtain context information segments;
step 11, repeating step 10 and synthesizing the obtained context information segments to obtain a context attention map representing three-dimensional context features; activating and aggregating the context attention map with element-wise products and weighted averaging to obtain fine-grained context features in the spatial and channel dimensions;
step 12, performing global average pooling on the fine-grained context feature map through a global average pooling layer to obtain a target feature map; concatenating the target feature map with the column vectors, optimizing the concatenation with minimized cross-entropy loss to obtain a jointly optimized feature map, and predicting on the jointly optimized feature map to obtain the prediction result.
2. The underwater target re-identification method combining local perception and high-order feature reconstruction according to claim 1, wherein: in step 5, the feature pyramid structure in the path aggregation network laterally connects the top-down and bottom-up paths to partition the feature map carrying the feature matrix; an adaptive pooling operation performs pooled extraction on each such feature map to obtain target features, and a fully connected fusion operation on the extracted target features yields the fused feature map.
3. The underwater target re-identification method combining local perception and high-order feature reconstruction according to claim 2, wherein: in step 6, the anchor boxes are predicted by using the global intersection over union (CIOU) as the prediction-box regression function to generate prediction boxes, where the global intersection over union is given by:
CIOU = IOU − ρ²(b, b^gt)/c² − αv
v = (4/π²) · (arctan(w^gt/h^gt) − arctan(w/h))²
α = v / ((1 − IOU) + v)
where ρ²(b, b^gt) denotes the squared Euclidean distance between the center points of the prediction box and the ground-truth box, c denotes the diagonal length of the minimum enclosing rectangle of the prediction box and the ground-truth box, IOU denotes the intersection over union of the prediction box and the ground-truth box, α denotes the influence factor of the parameter v, v denotes a parameter measuring the consistency of the aspect ratios, w denotes the width of the corresponding box, h denotes the height of the corresponding box, the superscript gt marks the ground-truth box, and CIOU denotes the global intersection over union.
4. The underwater target re-identification method combining local perception and high-order feature reconstruction according to claim 3, wherein: step 8 includes modifying the residual network to extract the depth features of the network output image and obtain the three-dimensional tensor T, where modifying the residual network specifically includes removing the global average pooling layer, the fully connected layer and the output layer of the convolutional neural network.
5. The underwater target re-identification method combining local perception and high-order feature reconstruction according to claim 4, wherein: in step 11, the context attention map is activated and aggregated with element-wise products and weighted averaging to obtain fine-grained context features in the spatial and channel dimensions, specifically:
step 11.1, repeating step 10 to obtain context information segments in different directions;
step 11.2, performing tensor reconstruction on the different context information segments to obtain sub-attention maps A_i in different directions;
step 11.3, synthesizing the sub-attention maps A_i in different directions to obtain the context attention map;
step 11.4, activating and aggregating the context attention map with element-wise products and weighted averaging, thereby obtaining fine-grained context features in the spatial and channel dimensions.
6. The underwater target re-identification method combining local perception and high-order feature reconstruction according to claim 5, wherein step 11.4 is specifically:
Y = (1/CHW) · Σ_{i=1..CHW} a_i · s_i
S = {s_1, s_2, ..., s_CHW}
A = {a_1, a_2, ..., a_CHW}
where S denotes the input feature matrix, A denotes the context attention map, Y denotes the fine-grained context feature, i indexes the i-th feature, CHW denotes the total number of features, s_i denotes the i-th input feature, and a_i denotes the i-th context attention value.
Application CN202111582065.8A, priority date 2021-12-22, filing date 2021-12-22: Underwater target re-identification method combining local perception and high-order feature reconstruction (Pending; published as CN114283326A (en))

Priority Applications (1)

Application Number: CN202111582065.8A | Priority Date: 2021-12-22 | Filing Date: 2021-12-22 | Title: Underwater target re-identification method combining local perception and high-order feature reconstruction

Publications (1)

Publication Number: CN114283326A | Publication Date: 2022-04-05

Family ID: 80873838

Family Applications (1)

Application Number: CN202111582065.8A | Priority Date: 2021-12-22 | Filing Date: 2021-12-22 | Title: Underwater target re-identification method combining local perception and high-order feature reconstruction

Country Status (1)

CN | CN114283326A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116596904A * 2023-04-26 2023-08-15 State Grid Jiangsu Electric Power Co., Ltd. Taizhou Power Supply Branch Power transmission detection model construction method and device based on adaptive scale sensing
CN116596904B * 2023-04-26 2024-03-26 State Grid Jiangsu Electric Power Co., Ltd. Taizhou Power Supply Branch Power transmission detection model construction method and device based on adaptive scale sensing
CN116503914A * 2023-06-27 2023-07-28 East China Jiaotong University Pedestrian re-recognition method, system, readable storage medium and computer equipment
CN116503914B * 2023-06-27 2023-09-01 East China Jiaotong University Pedestrian re-recognition method, system, readable storage medium and computer equipment

Similar Documents

Publication Title
CN110135267B (en) Large-scene SAR image fine target detection method
CN111931684B (en) Weak and small target detection method based on video satellite data identification features
CN111612008B (en) Image segmentation method based on convolution network
CN114202672A (en) Small target detection method based on attention mechanism
CN109165540B (en) Pedestrian searching method and device based on prior candidate box selection strategy
CN110852182B (en) Depth video human body behavior recognition method based on three-dimensional space time sequence modeling
CN114359851A (en) Unmanned target detection method, device, equipment and medium
CN110826462A (en) Human body behavior identification method of non-local double-current convolutional neural network model
CN114283326A (en) Underwater target re-identification method combining local perception and high-order feature reconstruction
CN105574545B (en) The semantic cutting method of street environment image various visual angles and device
CN112507861A (en) Pedestrian detection method based on multilayer convolution feature fusion
CN115861619A (en) Airborne LiDAR (light detection and ranging) urban point cloud semantic segmentation method and system of recursive residual double-attention kernel point convolution network
CN112634369A (en) Space and or graph model generation method and device, electronic equipment and storage medium
CN107610224B (en) 3D automobile object class representation algorithm based on weak supervision and definite block modeling
CN116342894A (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN116091946A (en) Yolov 5-based unmanned aerial vehicle aerial image target detection method
CN115019201A (en) Weak and small target detection method based on feature refined depth network
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN115937736A (en) Small target detection method based on attention and context awareness
CN113723558A (en) Remote sensing image small sample ship detection method based on attention mechanism
CN114119669A (en) Image matching target tracking method and system based on Shuffle attention
CN113496260A (en) Grain depot worker non-standard operation detection method based on improved YOLOv3 algorithm
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN117437691A (en) Real-time multi-person abnormal behavior identification method and system based on lightweight network
CN116934820A (en) Cross-attention-based multi-size window Transformer network cloth image registration method and system

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination