CN116740418A - Target detection method based on graph reconstruction network


Info

Publication number
CN116740418A
CN116740418A
Authority
CN
China
Prior art keywords
graph, time, space, dimension, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310575816.6A
Other languages
Chinese (zh)
Inventor
邸江磊
江文隽
秦智坚
吴计
王萍
任振波
秦玉文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202310575816.6A
Publication of CN116740418A
Legal status: Pending

Classifications

    • G06V 10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects, using rules for classification or partitioning the feature space
    • G06N 3/042 Knowledge-based neural networks; logical representations of neural networks
    • G06N 3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N 3/0455 Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/0475 Generative networks
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06V 10/40 Extraction of image or video features
    • G06V 10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; pattern tracking
    • G06V 10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using clustering, e.g. of similar faces in social networks
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of target detection and discloses a target detection method based on a graph reconstruction network. The method first collects multispectral images over a period of time and performs dimension reduction and feature extraction on them; a physical feature graph, a spatial feature graph, and a spectral feature graph of the multispectral images are then extracted by graph embedding. The three feature graphs, which have different node types, are connected through link edges and nodes, and a self-attention-based graph pooling method yields a heterogeneous graph fusing the multi-source features. The fused graph data are ordered along the time dimension as input, their temporal and spatial dimension features are acquired, and multi-layer spatio-temporal graph convolution extracts information at different temporal and spatial scales. CNN convolution and fully connected operations then align the spatio-temporal convolution result with the predicted target dimensions, and a weight-sharing fully connected layer classifies and locates targets, completing the detection task. The spatio-temporal features obtained in this way make detection more accurate.

Description

Target detection method based on graph reconstruction network
Technical Field
The invention relates to the field of target detection, in particular to a target detection method based on a graph reconstruction network.
Background
Object detection is an important task in the field of computer vision; its goal is to accurately detect objects of interest in images or videos and to mark their positions. A multispectral image contains information from multiple bands: it carries not only the spatial information of the target but also its spectral information, overcoming the limited information content of single-modality images. For a weak, moving target, the multimodal information of a multispectral image allows the target region and position to be determined more accurately. Applying multispectral images to the target detection task, combining their characteristics, can therefore improve detection accuracy and reliability: with the multiple band information, an algorithm can better separate the target from the background and extract richer feature information from it, yielding more accurate detection results.
Early multispectral target recognition relied mainly on manual band selection, for example separating the detection target from a complex field background with a specific characteristic band, or fusing polarization and multispectral images to detect camouflaged targets according to their spectral characteristics. In recent years, convolutional neural networks have gradually replaced these traditional pipelines of hand-crafted feature selection and fusion. Zhang Shaoting of the University of North Carolina verified the impact of feature fusion at different CNN stages on multispectral target detection performance; Hangil et al. combined a CNN with support vector regression for joint feature extraction from visible and far-infrared images; and He Ming et al. of Northwestern Polytechnical University used deep residual networks to extract features of multispectral remote sensing images at different levels, achieving salient target detection end to end.
However, a CNN as the underlying network model only excels at processing regular spatial grid data and establishing local spatial neighborhood relationships between pixels; it readily ignores implicit relationships between visual information and the irregular structure of the data itself. The down-sampling in a CNN also reduces the spatial resolution of feature maps, inevitably losing small-target information and making it difficult for the detection network to learn representations from limited and distorted structural information; moreover, a CNN cannot extract temporal feature information between frames. A solution to these problems is therefore needed.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a target detection method based on a graph reconstruction network.
The technical scheme for solving the technical problems is as follows: the target detection method based on the graph reconstruction network comprises the following steps:
(S1) collecting multispectral images for a period of time, and carrying out dimension reduction and feature extraction on the multispectral images;
(S2) respectively extracting a physical characteristic diagram, a spatial characteristic diagram and a spectral characteristic diagram of the multispectral image in a diagram embedding mode;
(S3) connecting the obtained three feature graphs, which have different node types, through link edges and nodes, and obtaining a heterogeneous graph fusing the multi-source features with a self-attention-based graph pooling method;
(S4) ordering the fused graph data along the time dimension as input, and acquiring the temporal and spatial dimension features of the data and their spatio-temporal correlation with multi-layer spatio-temporal graph convolution;
(S5) aligning the spatio-temporal convolution result with the predicted target dimensions through CNN convolution and fully connected operations, and classifying and locating targets through a weight-sharing fully connected layer.
Preferably, in step (S1), the multispectral image is captured by a multispectral camera that can collect 3 or more spectral bands simultaneously.
Preferably, in step (S1), the dimension reduction and feature extraction use spatial-spectral embedding to assign weights according to the spectral-feature similarity of different pixels, and manifold learning performs similarity classification and feature dimension reduction on the spatial and spectral information of the local neighborhood.
Preferably, in step (S2), the three feature graphs are obtained as follows: the physical feature graph is extracted from the dimension-reduced spectral data combined with infrared spectral features; superpixel neighbor-node information is determined with a simple linear iterative clustering method, edge connections between nodes are built from the spatial connectivity of the superpixels, and the spatial feature graph is extracted; and, combining the spectral-feature similarity of the target, sampling and recombination across different spectral band dimensions yield the target's spectral feature distribution, with a graph neural network effectively representing the spectral data residing on a smooth manifold.
Preferably, in step (S3), the linking network model is a graph autoencoder, including but not limited to a graph convolutional autoencoder, a variational graph convolutional autoencoder, and an adversarially regularized graph autoencoder.
Preferably, in step (S3), the pooling method includes but is not limited to DiffPool, SAGPool, and ASAP.
Preferably, in step (S4), the spatio-temporal graph convolution extracts features in the time dimension and the space dimension by different methods: the network extracting the time dimension includes but is not limited to an RNN, GRU, LSTM, TC (temporal convolution) module, or Transformer, and the network extracting the space dimension includes but is not limited to a GCN, a GAT, or a GCN combined with a GAT.
Preferably, in step (S5), after four levels of spatio-temporal graph convolution, the extracted features are passed through a convolution module and finally fed to a target detection module, which classifies and locates the target to complete the recognition task.
Compared with the prior art, the invention has the following beneficial effects:
Dimension reduction, feature extraction, and graph embedding of the multispectral image yield multi-dimensional feature information covering spatial, physical, and spectral characteristics. Combining the resulting graph structures produces a heterogeneous graph of multi-source information, whose heterogeneous content better supplements the semantic associations and relative position information of targets.
Meanwhile, the node and edge features in graph data represent latent relationships among the data. Nodes are unordered and variable in number, and a node's state can be updated from neighbor nodes at arbitrary depth, thereby expressing attribute-feature relationships; these properties make nodes suitable for representing long-range spatial and spectral relationships in multispectral images.
In addition, since a CNN cannot extract temporal feature information between frames, spatio-temporal graph convolution is used to extract time-dimension features, which allows the associated features of a moving target and the global background information of inter-frame changes to be determined more accurately.
Drawings
Fig. 1 is a flowchart of a target detection method based on a graph reconstruction network according to the present invention.
Fig. 2 is a frame diagram of a target detection method based on a graph reconstruction network according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Referring to fig. 1, the target detection method based on the graph reconstruction network of the present invention includes the following steps:
(S1) collecting multispectral images for a period of time, and carrying out dimension reduction and feature extraction on the multispectral images;
(S2) respectively extracting a physical characteristic diagram, a spatial characteristic diagram and a spectral characteristic diagram of the multispectral image in a diagram embedding mode;
(S3) connecting the obtained three feature graphs, which have different node types, through link edges and nodes, and obtaining a heterogeneous graph fusing the multi-source features with a self-attention-based graph pooling method;
(S4) ordering the fused graph data along the time dimension as input, and acquiring the temporal and spatial dimension features of the data and their spatio-temporal correlation with multi-layer spatio-temporal graph convolution;
(S5) aligning the spatio-temporal convolution result with the predicted target dimensions through CNN convolution and fully connected operations, and classifying and locating targets through a weight-sharing fully connected layer.
Referring to fig. 1, in step (S1), the acquired multispectral data consist of four-band multispectral images, and 1000 such images from different time periods are taken.
Referring to fig. 1, in step (S1), dimension reduction and feature extraction proceed as follows. Spectral and spatial information are fused in an augmented vector:

x = (u, v, b_1, b_2, ..., b_B) = (x_1, x_2, ..., x_{B+2})^T   (1)

where (u, v) is the position of pixel h(u, v) on the image and (b_1, b_2, ..., b_B) is its vector of band values.
In this embodiment, images of 4 bands are acquired, so B = 4.
The augmented vectors are used as training data: each x_i is normalized and same-class samples are grouped under supervision, a pixel local neighborhood is built with the k-nearest-neighbor algorithm, and manifold learning performs similarity classification and feature dimension reduction on the spatial and spectral information of the local neighborhood. Spatial-spectral polynomial local-area (or neighborhood) embedding assigns weights to the spectral-feature similarities of different pixels within the local neighborhood, and element-wise matrix multiplication finally establishes a low-dimensional nonlinear explicit mapping of the multispectral data.
In a specific embodiment, the number of marked elements in the augmentation vector is 6.
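The construction of the augmented vectors and the manifold-learning dimension reduction can be sketched as follows; scikit-learn's LocallyLinearEmbedding stands in for the manifold learner, and the image size, neighborhood size, and output dimension are illustrative assumptions, since the embodiment fixes only B = 4.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

# Sketch of step (S1): build augmented pixel vectors
# x = (u, v, b_1, ..., b_B)^T for a B = 4 band cube, then reduce
# dimension with manifold learning over k-nearest-neighbor
# neighborhoods. Image size, n_neighbors, and n_components are
# illustrative assumptions; LLE stands in for the manifold learner.

H, W, B = 64, 64, 4                       # height, width, number of bands
cube = np.random.rand(H, W, B)            # stand-in for a real multispectral image

# Augmented vectors: pixel coordinates concatenated with band values.
uu, vv = np.meshgrid(np.arange(W), np.arange(H))
x = np.concatenate([uu[..., None], vv[..., None], cube], axis=-1)
x = x.reshape(-1, B + 2)                  # (H*W, B+2) training vectors

# Normalize, then embed; the k-NN graph inside LLE plays the role of
# the pixel local neighborhood described in the text.
x = (x - x.mean(0)) / (x.std(0) + 1e-8)
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=3)
x_low = lle.fit_transform(x)              # (H*W, 3) low-dimensional features
print(x_low.shape)
```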
Referring to fig. 1, in step (S2), the physical feature graph, covering physical characteristics such as equivalent temperature and equivalent area, is represented as a graph by a random-walk graph embedding method.
Referring to fig. 1, in step (S2), the spatial feature graph is built by superpixel-segmenting the multispectral image with the SLIC algorithm: the spatial and spectral distances between pixels are computed and their weights balanced, the cluster centers and range boundaries of the superpixels are updated iteratively to obtain multispectral image data composed of superpixels, and edge connections between nodes are constructed from the spatial connectivity of the superpixels.
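A minimal sketch of this superpixel graph construction, assuming scikit-image's slic as the SLIC implementation; n_segments and compactness are illustrative assumptions not fixed by the embodiment.

```python
import numpy as np
from skimage.segmentation import slic

# Sketch of the spatial graph in step (S2): SLIC superpixels over a
# multispectral cube, then an edge list from the spatial adjacency of
# superpixel labels.

H, W, B = 64, 64, 4
cube = np.random.rand(H, W, B)

# SLIC balances spatial distance against spectral distance through
# `compactness`, iteratively updating cluster centers and boundaries.
labels = slic(cube, n_segments=100, compactness=10.0,
              channel_axis=-1, start_label=0)

# Node features: mean spectrum of each superpixel.
n_nodes = labels.max() + 1
feats = np.stack([cube[labels == i].mean(axis=0) for i in range(n_nodes)])

# Edges: pairs of superpixels that touch horizontally or vertically.
edges = set()
for a, b in [(labels[:, :-1], labels[:, 1:]),
             (labels[:-1, :], labels[1:, :])]:
    for i, j in zip(a.ravel(), b.ravel()):
        if i != j:
            edges.add((min(i, j), max(i, j)))
print(n_nodes, len(edges))
```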
Referring to fig. 1, the spectral feature graph in step (S2) is constructed via a semi-supervised adjacency matrix. Specifically, it is built from the information provided by a limited amount of labeled data and a large amount of unlabeled data: pseudo labels are constructed with a variational Dirichlet process mixture model, and the spatial-spectral adjacency matrix is realized with a clustering algorithm inherent to the data samples.
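The pseudo-label step can be sketched with scikit-learn's BayesianGaussianMixture, whose dirichlet_process prior gives a variational Dirichlet process mixture; connecting samples that share a pseudo label is one simple reading of the adjacency construction, used here as an assumption.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Sketch of the pseudo-label step: a variational Dirichlet process
# mixture clusters (mostly unlabeled) spectra, and the cluster ids act
# as pseudo labels for a spectral adjacency matrix. The component cap
# and the shared-label adjacency rule are illustrative assumptions.

spectra = np.random.rand(500, 4)            # stand-in spectra (N pixels, B bands)
dpmm = BayesianGaussianMixture(
    n_components=10,                        # upper bound; DP prior prunes extras
    weight_concentration_prior_type="dirichlet_process",
)
pseudo = dpmm.fit_predict(spectra)          # pseudo labels from clustering

# Semi-supervised adjacency: connect samples sharing a pseudo label.
A = (pseudo[:, None] == pseudo[None, :]).astype(float)
np.fill_diagonal(A, 0.0)
print(int(A.sum()))                         # number of (directed) edges
```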
Referring to fig. 1, in step (S3), the nodes and edges of the three obtained feature graphs are analyzed, and the three feature graphs with different node and edge types are connected using a network structure based on a graph autoencoder.
Specifically, each given graph is analyzed: node feature vectors across the different graphs are compared by cosine similarity, and the nodes with high similarity among the three graphs are retained. The three processed graphs are then computed with a graph convolutional network, yielding a node representation z_i for each node. The following formula is then used:

p_ij = σ(z_i^T z_j)   (2)

where p_ij is the predicted probability of a link between nodes (i, j) and σ is the sigmoid activation function. Node pairs with probability greater than 0.8 are linked and pairs with probability less than 0.2 are left unconnected, yielding the new graph that links the three graphs.
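A minimal sketch of this link decision, assuming node embeddings z from a graph convolutional encoder; the inner-product decoder and the 0.8/0.2 thresholds follow the text, while the node count and embedding width are illustrative assumptions.

```python
import torch

# Sketch of the link decision in step (S3): embeddings z are scored
# pairwise with p_ij = sigmoid(z_i^T z_j); links are kept above 0.8
# and rejected below 0.2, as in the text.

z = torch.randn(30, 16)                     # 30 nodes, 16-dim embeddings
prob = torch.sigmoid(z @ z.t())             # pairwise link probabilities

link_mask = prob > 0.8                      # confident links
no_link_mask = prob < 0.2                   # confident non-links
# Here the confident links alone form the merged graph's edge set.
edge_index = link_mask.nonzero(as_tuple=False).t()
print(edge_index.shape)                     # (2, number of links)
```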
Referring to fig. 1, in step (S3), the nodes of the new graph are extracted and aggregated with the SAGPool method.
Specifically, the new graph is processed by graph neural network convolution: the GCN learns a feature representation for each node v ∈ V by aggregating the features of its neighbor nodes. For each node v, a self-attention mechanism computes an attention score z. Top-k selection then retains the most important nodes, with the number kept determined by the pooling ratio k, here set to k = 0.5. The resulting attention-based mask is multiplied with the corresponding nodes of the originally input graph structure of fused heterogeneous information, giving the final output graph, i.e. the heterogeneous graph fusing the multi-source features.
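A minimal sketch of this pooling step, assuming PyTorch Geometric's SAGPooling; the graph size and feature width are illustrative assumptions, while the pooling ratio k = 0.5 follows the embodiment.

```python
import torch
from torch_geometric.nn import GCNConv, SAGPooling

# Sketch of step (S3)'s pooling: a GCN layer aggregates neighbor
# features, then SAGPooling scores nodes by self-attention and keeps
# the top-k fraction (ratio k = 0.5, as in the embodiment).

x = torch.randn(30, 16)                         # 30 nodes, 16 features
edge_index = torch.randint(0, 30, (2, 80))      # random stand-in edges

conv = GCNConv(16, 16)
pool = SAGPooling(16, ratio=0.5)                # keep half the nodes

h = conv(x, edge_index).relu()
h, edge_index, _, batch, perm, score = pool(h, edge_index)
print(h.shape)                                  # roughly (15, 16)
```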
Referring to fig. 1, in step (S4), the network extracting the time dimension is the TC (temporal convolution) module, which consists of two dilated inception layers.
Specifically, the output of the whole temporal convolution module comes from two branches: the module input is filtered by dilated inception layers built from groups of one-dimensional convolution filters, the two branches differing only in the activation function that follows. Each dilated inception layer combines filter sizes of 1x2, 1x3, 1x6, and 1x7, so that the time spans of interest can be covered by combinations of these filter sizes.
In this example, 10 graphs of fused heterogeneous features are input at a time, i.e. the forward and backward temporal feature relationships are extracted from 10 consecutive multispectral frames.
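The temporal module can be sketched as below; the kernel widths 2, 3, 6, 7 and the 10-frame window follow the text, while the tanh/sigmoid gating between the two branches and the channel counts are assumptions modeled on common dilated-inception temporal convolutions.

```python
import torch
import torch.nn as nn

# Sketch of the temporal module in step (S4): a dilated inception layer
# runs 1-D convolutions of widths 2, 3, 6, 7 over the time axis, and two
# such layers gate each other with tanh and sigmoid (the "different
# activation functions" of the two branches).

class DilatedInception(nn.Module):
    def __init__(self, c_in, c_out, dilation=1):
        super().__init__()
        assert c_out % 4 == 0
        self.branches = nn.ModuleList([
            nn.Conv2d(c_in, c_out // 4, kernel_size=(1, k),
                      dilation=(1, dilation))
            for k in (2, 3, 6, 7)
        ])

    def forward(self, x):                   # x: (batch, C, nodes, time)
        outs = [b(x) for b in self.branches]
        t = min(o.size(-1) for o in outs)   # align lengths after convolution
        return torch.cat([o[..., -t:] for o in outs], dim=1)

filter_conv = DilatedInception(16, 32)
gate_conv = DilatedInception(16, 32)
x = torch.randn(8, 16, 15, 10)              # batch 8, 15 nodes, 10 frames
out = torch.tanh(filter_conv(x)) * torch.sigmoid(gate_conv(x))
print(out.shape)                            # (8, 32, 15, 4)
```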
Referring to fig. 1, in step (S4), the network extracting the spatial dimension is a GCN combined with a GAT.
Specifically, after the temporal module, spatial features are extracted by a GCN layer, while a GAT graph attention layer passes information between nodes and captures the dependencies among them. The features then pass through the TC module, GCN, and GAT again, and features are extracted from the output of each such pass.
In this example, four levels of features are extracted and concatenated (concat) to obtain multi-scale spatio-temporal features.
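A minimal sketch of the spatial half of the block, assuming PyTorch Geometric's GCNConv and GATConv; the four repetitions and the final concatenation follow the text, while all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, GATConv

# Sketch of the spatial half of step (S4): after each temporal block, a
# GCN layer aggregates neighborhood features and a GAT layer re-weights
# message passing with attention; four levels are concatenated into
# multi-scale features.

x = torch.randn(15, 32)                     # 15 nodes, 32 features
edge_index = torch.randint(0, 15, (2, 40))  # random stand-in edges

gcns = nn.ModuleList([GCNConv(32, 32) for _ in range(4)])
gats = nn.ModuleList([GATConv(32, 32, heads=1) for _ in range(4)])

levels = []
h = x
for gcn, gat in zip(gcns, gats):            # four space-time levels
    h = gcn(h, edge_index).relu()
    h = gat(h, edge_index).relu()
    levels.append(h)

multi_scale = torch.cat(levels, dim=-1)     # concat, as in the text
print(multi_scale.shape)                    # (15, 128)
```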
Referring to fig. 1, in step (S5), the extracted spatio-temporal features are passed through five CNN convolution blocks so that the spatio-temporal convolution result matches the predicted target dimensions; finally an MLP classifies the acquired features, yielding the position and category information of targets in the image and completing the detection task.
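The detection head can be sketched as follows; the five convolution blocks and the MLP classifier follow the text, while every channel width, the feature-map size, and the 4-coordinate-plus-score output layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch of step (S5): five stacked convolution blocks reshape the
# spatio-temporal features toward the prediction dimensions, and an MLP
# head emits box coordinates and a class score.

head = nn.Sequential(*[
    nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())
    for c_in, c_out in [(128, 96), (96, 64), (64, 48), (48, 32), (32, 16)]
])
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(16 * 8 * 8, 256), nn.ReLU(),
    nn.Linear(256, 4 + 1),                  # 4 box coordinates + 1 class score
)

feat = torch.randn(2, 128, 8, 8)            # batch of fused feature maps
pred = mlp(head(feat))
print(pred.shape)                           # (2, 5)
```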
In this example, training was performed on Ubuntu 18.04.3 Linux, in a PyCharm environment with the Python 3.9 language and the pytorch-cuda11.7 deep learning library, on a GeForce RTX 3090 graphics card.
Also in this example, the IoU threshold is set to 0.5, and the spatio-temporal graph convolutional network is built with PyTorch. The loss function is set to the binary cross-entropy loss:

L = -[y log(ŷ) + (1 - y) log(1 - ŷ)]   (3)

where y is the ground truth (1 for a positive sample and 0 for a negative sample) and ŷ is the probability predicted by the model. The learning rate is set to 0.01, the number of epochs to 300, and the batch size to 8, with 1000 images input.
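A minimal sketch of the training objective and quoted settings in PyTorch; BCEWithLogitsLoss realizes the binary cross-entropy above, the learning rate 0.01 and batch size 8 follow the text, and the placeholder linear model and 3-epoch loop stand in for the full network and 300-epoch schedule.

```python
import torch
import torch.nn as nn

# Sketch of the training loop: binary cross-entropy
# L = -[y*log(p) + (1-y)*log(1-p)], learning rate 0.01, batch size 8.

model = nn.Linear(10, 1)                    # placeholder for the detector
criterion = nn.BCEWithLogitsLoss()          # numerically stable BCE
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

inputs = torch.randn(8, 10)                 # batch size 8, as in the text
targets = torch.randint(0, 2, (8, 1)).float()  # ground-truth 1/0 labels

for epoch in range(3):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
print(float(loss))
```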
The foregoing is only a preferred embodiment of the present invention, and the scope of the invention is not limited to the above example: all technical solutions within the spirit and principle of the present invention fall within its scope. Modifications and adaptations that do not depart from those principles are likewise intended to be within the scope of the present invention.

Claims (8)

1. The target detection method based on the graph reconstruction network is characterized by comprising the following steps of:
(S1) collecting multispectral images for a period of time, and carrying out dimension reduction and feature extraction on the multispectral images;
(S2) respectively extracting a physical characteristic diagram, a spatial characteristic diagram and a spectral characteristic diagram of the multispectral image in a diagram embedding mode;
(S3) connecting the obtained three feature graphs, which have different node types, through link edges and nodes, and obtaining a heterogeneous graph fusing the multi-source features with a self-attention-based graph pooling method;
(S4) ordering the fused graph data along the time dimension as input, and acquiring the temporal and spatial dimension features of the data and their spatio-temporal correlation with multi-layer spatio-temporal graph convolution;
(S5) aligning the spatio-temporal convolution result with the predicted target dimensions through CNN convolution and fully connected operations, and classifying and locating targets through a weight-sharing fully connected layer.
2. The method of claim 1, wherein in step (S1), the multispectral image is captured by a multispectral camera capable of capturing 3 or more spectral bands simultaneously.
3. The method for detecting a target based on a graph reconstruction network according to claim 1, wherein in step (S1), the dimension reduction and feature extraction use spatial-spectral embedding to assign weights according to the spectral-feature similarity of different pixels, and manifold learning performs similarity classification and feature dimension reduction on the spatial and spectral information of the local neighborhood.
4. The method for detecting an object based on a graph reconstruction network according to claim 1, wherein in step (S2), the three feature graphs are obtained as follows: the physical feature graph is extracted from the dimension-reduced spectral data combined with infrared spectral features; superpixel neighbor-node information is determined with a simple linear iterative clustering method, edge connections between nodes are built from the spatial connectivity of the superpixels, and the spatial feature graph is extracted; and, combining the spectral-feature similarity of the target, sampling and recombination across different spectral band dimensions yield the target's spectral feature distribution, with a graph neural network effectively representing the spectral data residing on a smooth manifold.
5. The method of claim 1, wherein in step (S3), the linking network model is a graph autoencoder, including but not limited to a graph convolutional autoencoder, a variational graph convolutional autoencoder, and an adversarially regularized graph autoencoder.
6. The method of claim 1, wherein in step (S3), the pooling method includes but is not limited to DiffPool, SAGPool, and ASAP.
7. The method of claim 1, wherein in step (S4), the spatio-temporal graph convolution extracts features in the time dimension and the space dimension respectively, wherein the network extracting the time dimension includes but is not limited to an RNN, GRU, LSTM, TCN, or Transformer, and the network extracting the space dimension includes but is not limited to a GCN, a GAT, or a GCN combined with a GAT.
8. The method for detecting a target based on a graph reconstruction network according to claim 1, wherein in step (S5), after four levels of spatio-temporal graph convolution, the extracted features are passed through a convolution module and finally fed to a target detection module, which classifies and locates the target to complete the recognition task.
CN202310575816.6A 2023-05-22 2023-05-22 Target detection method based on graph reconstruction network Pending CN116740418A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310575816.6A CN116740418A (en) 2023-05-22 2023-05-22 Target detection method based on graph reconstruction network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310575816.6A CN116740418A (en) 2023-05-22 2023-05-22 Target detection method based on graph reconstruction network

Publications (1)

Publication Number Publication Date
CN116740418A true CN116740418A (en) 2023-09-12

Family

ID=87914282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310575816.6A Pending CN116740418A (en) 2023-05-22 2023-05-22 Target detection method based on graph reconstruction network

Country Status (1)

Country Link
CN (1) CN116740418A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934754A (en) * 2023-09-18 2023-10-24 四川大学华西第二医院 Liver image identification method and device based on graph neural network
CN116934754B (en) * 2023-09-18 2023-12-01 四川大学华西第二医院 Liver image identification method and device based on graph neural network
CN117830752A (en) * 2024-03-06 2024-04-05 昆明理工大学 Self-adaptive space-spectrum mask graph convolution method for multi-spectrum point cloud classification
CN117830752B (en) * 2024-03-06 2024-05-07 昆明理工大学 Self-adaptive space-spectrum mask graph convolution method for multi-spectrum point cloud classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination