CN111340189B - Space pyramid graph convolution network implementation method - Google Patents

Space pyramid graph convolution network implementation method

Info

Publication number
CN111340189B
CN111340189B (application CN202010108770.3A)
Authority
CN
China
Prior art keywords
network
convolution
graph
graph convolution
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010108770.3A
Other languages
Chinese (zh)
Other versions
CN111340189A (en)
Inventor
林宙辰 (Zhouchen Lin)
李夏 (Xia Li)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202010108770.3A priority Critical patent/CN111340189B/en
Publication of CN111340189A publication Critical patent/CN111340189A/en
Application granted granted Critical
Publication of CN111340189B publication Critical patent/CN111340189B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems


Abstract

The invention discloses a spatial pyramid graph convolution network implementation method. An association matrix is first generated for the deep-network feature map, and an efficient computation is derived by decomposing the association matrix and applying the associative law of matrix multiplication, so that graph convolution can be performed directly in the original feature space. As a lightweight network, the invention breaks the limitation of previous methods, in which graph convolution had to be performed on semantic nodes, and further improves the expressive power of the network by performing graph reasoning at multiple scales. The invention effectively alleviates the insufficient receptive field of fully convolutional networks and significantly improves their performance. The proposed graph convolution scheme achieves remarkable performance on dense prediction tasks in computer vision; the examples and experiments fully verify its effectiveness and potential application value.

Description

Space pyramid graph convolution network implementation method
Technical Field
The invention belongs to the field of graph convolutional networks and deep network architecture design, and particularly relates to a method for implementing a spatial pyramid graph convolution network.
Background
In recent years, architectures based on fully convolutional networks have achieved tremendous success in computer vision tasks. A fully convolutional network consists only of convolutional layers and pooling layers; by stacking convolutional layers, the theoretical receptive field of the network grows with its depth. However, the effective receptive field is limited, so each position can only capture local information, and fully convolutional networks therefore have difficulty capturing complex context. For dense prediction tasks such as semantic segmentation and depth estimation, context information is critical; this problem limits the performance of fully convolutional networks.
Many approaches have attempted to address this problem. Spatial pyramids based on atrous (dilated) convolution capture context at different distances by using different dilation rates. Deformable convolution adaptively determines the final receptive field by learning offsets of the sampling locations. Non-local neural networks and dual-attention networks introduce interaction modules that let every pixel perceive the entire space. Recurrent neural networks have also been used for long-range reasoning. These methods enlarge the receptive field and help the network capture long-range information, but they are computationally expensive.
Extending standard convolution to unstructured data yields graph convolution. Subsequent studies approximated the graph convolution formulation to reduce computation and training cost. Building on this work, graph convolution has achieved a series of results on graph-structured data, such as semi-supervised learning, node or graph classification, and molecular prediction. Because graph convolution can capture global information, it has been introduced into fully convolutional networks as a complement to standard convolution. These methods map pixels to a space of semantic nodes, perform graph convolution in that semantic space, and then map back to pixel space.
Disclosure of Invention
The invention aims to provide a spatial pyramid graph convolution network implementation method that addresses the shortcomings of the prior art. By performing a compact graph convolution operation in the original feature space, the invention effectively enlarges the effective receptive field of the network and alleviates the insufficient receptive field of fully convolutional networks.
The aim of the invention is realized by the following technical solution: a method for implementing a spatial pyramid graph convolution network, comprising the following steps:
(1) The image is passed through a convolutional network to obtain the original-scale feature X^0, and X^0 is then downsampled to obtain multi-scale features, as follows:

X^{s+1} = Π_down(X^s)

where Π_down denotes downsampling; the superscript denotes the scale level, ranging from 0 to S; X^s and X^{s+1} denote the features at scale levels s and s+1.
(2) Graph convolution is performed on each of the multi-scale features obtained in step (1); the result at each scale is then upsampled and added to the graph-convolution output at the next finer scale, finally yielding the network output Y^0:

Y^s = GR(X^s) + Π_up(Y^{s+1})

where Y^s denotes the network output for the scale-level-s feature, with Y^S = GR(X^S) and X^S the coarsest-scale feature; GR denotes graph convolution; Π_up denotes upsampling.
Further, the graph convolution is implemented by the following steps:
(2.1) Input a feature map X ∈ R^{H×W×C}, where H, W, C denote the height, width, and number of channels of the feature map X.
(2.2) Obtain the association matrix A through the following sub-steps:
(2.1.1) Transform the X input in step (2.1) by a 1×1 convolution to obtain φ(X; W_φ) with M channels, where W_φ denotes the 1×1 convolution parameters.
(2.1.2) Spatially pool the X input in step (2.1) to 1×1×C, obtain a vector of dimension M by a 1×1 convolution, and finally apply a sigmoid to obtain p(X; W_p), where W_p denotes the 1×1 convolution parameters.
(2.1.3) Compute the association matrix A from steps (2.1.1) to (2.1.2):

A = φ(X; W_φ) diag(p(X; W_p)) φ(X; W_φ)^T

(2.3) Compute the degree matrix D from the association matrix A obtained in step (2.2):

D = diag(A·1) = diag(φ(X; W_φ) Λ(X) (φ(X; W_φ)^T · 1)), where Λ(X) = diag(p(X; W_p))

and 1 ∈ R^{HW} is the all-ones vector of length HW.
(2.4) Compute the normalized Laplacian matrix L from the association matrix A of step (2.2) and the degree matrix D of step (2.3):

L = I - D^{-1/2} A D^{-1/2}

where I is the identity matrix.
(2.5) The output GR(X) of the graph convolution is obtained by:

LX = X - P(Λ(X)(P^T X)), with P = D^{-1/2} φ(X; W_φ)

GR(X) = σ(LXΘ)

where Θ denotes the parameters of the graph convolution and σ is a nonlinear activation function.
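Steps (2.1) to (2.5) can be sketched in NumPy, assuming the feature map is flattened to an HW×C matrix and the learned 1×1 convolutions are replaced by plain matrix multiplies; `W_phi`, `W_p`, and `Theta` are hypothetical stand-ins for the learned parameters, and σ is taken as ReLU:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def graph_reasoning(X, W_phi, W_p, Theta):
    """Sketch of GR(X) = sigma(L X Theta) without materializing any HW x HW matrix."""
    HW = X.shape[0]
    phi = X @ W_phi                            # (HW, M): stands in for the 1x1 conv phi(X; W_phi)
    p = sigmoid(X.mean(axis=0) @ W_p)          # (M,): global pooling, 1x1 conv, then sigmoid
    Lam = np.diag(p)                           # Lambda(X) = diag(p(X; W_p))

    # Degree vector d = A @ 1, computed right-to-left so A is never formed explicitly
    d = phi @ (Lam @ (phi.T @ np.ones(HW)))
    P = (1.0 / np.sqrt(d))[:, None] * phi      # P = D^{-1/2} phi

    LX = X - P @ (Lam @ (P.T @ X))             # L X = X - P Lambda (P^T X)
    return np.maximum(LX @ Theta, 0.0)         # sigma chosen as ReLU here
```

In this toy sketch nonnegative features and weights keep the degree vector positive; in a trained network the learned parameters play that role.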
Further, the number of channels M satisfies M ≪ HW.
Further, the upsampling is implemented by bilinear interpolation.
Further, the downsampling is achieved by a max pooling layer.
Compared with the prior art, the invention has the following beneficial effects. In the proposed spatial pyramid graph convolution network implementation method, the generation of the association matrix A is designed around the characteristics of deep-network feature maps, and the computational complexity is significantly reduced by decomposing A and applying the associative law of matrix multiplication. As a lightweight network, the invention allows the graph convolution layer, previously applied only in a compressed semantic-node space, to be applied directly in the original feature space. In addition, the invention proposes to fully exploit the potential of graph convolution with a spatial pyramid structure, further increasing the expressive power of the network by performing graph reasoning at multiple scales. The invention performs remarkably on dense prediction tasks in computer vision; the examples and experiments fully verify its effectiveness and potential application value.
Drawings
FIG. 1 is a schematic diagram of the association-matrix computation used in the invention;
FIG. 2 is a schematic diagram of the spatial pyramid graph convolution network of the invention;
FIG. 3 is a schematic diagram of the overall network used with the invention.
Detailed Description
The invention is further described below by way of examples with reference to the accompanying drawings, which in no way limit the scope of the invention.
The invention designs the generation of the Laplacian matrix according to the characteristics of fully convolutional networks, and significantly reduces computation by decomposing it. This change makes it possible to operate directly in the original feature space, avoiding the information loss caused by the mapping and inverse-mapping procedure. The disclosed spatial pyramid graph convolution network comprises the graph convolution module shown in Fig. 1, which is implemented by the following steps:
(1) Input the feature map X ∈ R^{H×W×C} extracted by the backbone network, where H, W, C denote the height, width, and number of channels of the feature map X; H×W is the number of nodes of X and C is the feature dimension of each node.
(2) The association matrix A is obtained as shown in Fig. 1, through the following sub-steps:
(2.1) Transform the X input in step (1) by a 1×1 convolution to obtain φ(X; W_φ), reducing the number of channels from C to M, with M ≪ HW; W_φ are the 1×1 convolution parameters.
(2.2) Spatially pool the X input in step (1) to 1×1×C, obtain a vector of dimension M by a 1×1 convolution, and finally apply a sigmoid to obtain p(X; W_p); W_p are the 1×1 convolution parameters.
(2.3) Compute the association matrix A:

A = φ(X; W_φ) diag(p(X; W_p)) φ(X; W_φ)^T

where diag(·) forms a diagonal matrix from a vector.
(3) Compute the degree matrix D from the association matrix A obtained in step (2):

D = diag(A·1) = diag(φ(X; W_φ) Λ(X) (φ(X; W_φ)^T · 1)), where Λ(X) = diag(p(X; W_p))

and 1 ∈ R^{HW} is the all-ones vector of length HW. Because Λ(X) depends on the input data, a better metric can be learned for the association matrix A. By the associative law of matrix multiplication, the computational complexity of computing D is reduced from O(HWM^2 + (HW)^2·M + (HW)^2) to O(HWM^2 + M^2).
(4) Compute the normalized Laplacian matrix L from the association matrix A of step (2) and the degree matrix D of step (3):

L = I - D^{-1/2} A D^{-1/2}

where I is the identity matrix; L has dimension HW×HW.
(5) The output GR(X) of the graph convolution operation is obtained by:

GR(X) = σ(LXΘ)

where GR denotes the graph convolution proposed by the invention, Θ the parameters of the graph convolution, and σ a nonlinear activation function. With the same decomposition, the computational complexity of computing LX is reduced from O(HWM^2 + (HW)^2·M + (HW)^2·C + (HW)^2) to O(HWM^2 + HWCM).
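The complexity claim can be checked numerically: the factored computation X - P Λ(X)(P^T X) gives exactly the same result as materializing A, D, and L and multiplying L by X directly. A small NumPy check, with random positive φ and p standing in for the learned quantities:

```python
import numpy as np

rng = np.random.default_rng(0)
HW, M, C = 64, 4, 8
phi = rng.uniform(0.1, 1.0, (HW, M))   # stands in for phi(X; W_phi); positive so degrees stay positive
p = rng.uniform(0.1, 0.9, M)           # stands in for p(X; W_p) after the sigmoid
X = rng.normal(size=(HW, C))

# Naive route: materialize A (HW x HW), then D and L, then multiply L @ X.
A = phi @ np.diag(p) @ phi.T
d = A @ np.ones(HW)                    # degree vector
L = np.eye(HW) - np.diag(d ** -0.5) @ A @ np.diag(d ** -0.5)
LX_naive = L @ X

# Factored route: P = D^{-1/2} phi and L X = X - P (Lambda (P^T X)); no HW x HW matrix is formed.
P = (d ** -0.5)[:, None] * phi
LX_fast = X - P @ (np.diag(p) @ (P.T @ X))

print(np.allclose(LX_naive, LX_fast))  # True
```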
Although graph reasoning captures global context, the same image contains multiple long-range context patterns. For example, a finer-grained representation may carry more detailed long-range context, while a coarser-grained representation may provide more global dependencies. Since our graph reasoning module operates directly in the original feature space, we organize the input features into a spatial pyramid to extend the long-range context patterns our method can capture.
This has a form similar to a feature pyramid network; however, our method is applied to the original features rather than to the multi-scale features inside a convolutional neural network (CNN) backbone. Graph reasoning is performed at each scale obtained by downsampling, and the outputs are then merged by upsampling, as shown in Fig. 2.
In the invention, graph reasoning over the spatial pyramid is expressed as follows:
(1) The image is passed through the convolutional network to obtain the original-scale feature map X^0, which is then downsampled to obtain multi-scale features:

X^{s+1} = Π_down(X^s)

where Π_down denotes downsampling; the superscript denotes the scale level, ranging from 0 to S; X^s and X^{s+1} denote the features at scale levels s and s+1.
(2) Graph convolution is performed on the multi-scale features obtained in step (1); the outputs are then upsampled and merged, finally producing the network output Y^0:

Y^s = GR(X^s) + Π_up(Y^{s+1})

where Y^s denotes the network output for the scale-level-s feature, with Y^S = GR(X^S) and X^S the coarsest-scale feature; Π_up and Π_down denote upsampling and downsampling respectively. Specifically, upsampling is implemented by bilinear interpolation and downsampling by a max pooling layer.
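The recursion Y^s = GR(X^s) + Π_up(Y^{s+1}) can be sketched as follows, using a 2× max pool for Π_down and, for brevity, nearest-neighbour upsampling for Π_up (the invention specifies bilinear interpolation); `gr` is a placeholder for the graph reasoning module:

```python
import numpy as np

def pool_down(x):
    """Pi_down: 2x max pooling over the spatial dims of an (H, W, C) array."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

def up(y):
    """Pi_up: 2x nearest-neighbour upsampling (bilinear in the invention)."""
    return y.repeat(2, axis=0).repeat(2, axis=1)

def pyramid_graph_reasoning(x0, gr, S):
    """Y^s = GR(X^s) + Pi_up(Y^{s+1}), with Y^S = GR(X^S) at the coarsest level."""
    xs = [x0]
    for _ in range(S):                  # X^{s+1} = Pi_down(X^s)
        xs.append(pool_down(xs[-1]))
    y = gr(xs[-1])                      # coarsest level: Y^S = GR(X^S)
    for s in range(S - 1, -1, -1):      # merge back up to the original scale
        y = gr(xs[s]) + up(y)
    return y                            # Y^0
```

The sketch assumes the spatial dimensions are divisible by 2 at every level.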
The invention can be applied wherever deep neural networks are used, and can cooperate directly with a deep neural network through a residual connection. The effectiveness of the proposed spatial pyramid graph convolution module is demonstrated on the classical task of semantic segmentation: the invention significantly improves the baseline model and outperforms attention mechanisms widely used in computer vision.
For semantic segmentation, the embodiment specifically includes the following steps:
step 1, collecting images and labeling correct segmentation results:
natural images of different scenes and under the illumination condition are acquired through the camera lens. Semantic information and categories of objects in the image are annotated at the pixel level. And eliminating errors in the marked data by a mode of marking and taking by multiple persons.
Step 2, establishing an objective function of the semantic segmentation problem:
in a specific implementation, cross entropy is typically used as the loss function. Taking the characteristics of semantic segmentation into consideration, cross entropy can be added to different layers of the depth network as an additional loss function.
Step 3, selecting a network structure serving the semantic segmentation task and adding the spatial pyramid graph convolution module (SpyGR), with the overall flow shown in Fig. 3:
The classical ResNet-101 is typically chosen as the backbone network for the semantic segmentation task. Through pre-training on the ImageNet classification task, ResNet acquires strong generalization ability. To meet the needs of the semantic segmentation task, the strides of ResNet stages c4 and c5 can be set to 1 and their dilation rates changed to 2 and 4 respectively, so that the overall spatial downsampling rate of the network is 8. A 3×3 convolution placed on top of the ImageNet-pretrained ResNet-101 reduces the number of channels from 2048 to 512.
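The claim that setting the strides of c4 and c5 to 1 yields an overall downsampling rate of 8 can be checked with a toy stride product; the stage layout below follows the standard ResNet design (stem conv 2, max pool 2, then four residual stages) and is an assumption, not taken from the patent:

```python
def output_stride(stage_strides):
    """Overall spatial downsampling rate: the product of the per-stage strides."""
    rate = 1
    for s in stage_strides:
        rate *= s
    return rate

# Standard ResNet-101: stem(2), pool(2), c2(1), c3(2), c4(2), c5(2) -> 32x downsampling
standard = output_stride([2, 2, 1, 2, 2, 2])
# Segmentation variant: strides of c4 and c5 set to 1 (dilation compensates) -> 8x
dilated = output_stride([2, 2, 1, 2, 1, 1])
```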
On top of the features extracted by the network, the spatial pyramid graph convolution module is placed to obtain better features, complementing deficiencies in the backbone representation and removing redundant information irrelevant to the targets. The representation processed by the module is again reduced to 512 dimensions, and finally the fully convolutional network (FCN) head performs pixel-wise classification and interpolation to recover the original size.
Step 4, preprocessing input data:
for training data sets, the image needs to be transformed to standard size and cropped. Common data enhancements include flipping and multi-scale transforms. In addition, the input data is normalized.
Step 5, determining super parameters of network training:
prior to training, super parameters of the network training, including batch size, learning rate, iteration number, etc., are determined. In the problem of semantic segmentation, different datasets possess different hyper-parameters. For the Cityscapes dataset, the optional superparameter is batch size 8, initial learning rate 0.01, learning rate decay strategy Poly decay, and index 0.9.
Step 6, performing network training:
after the network structure is obtained, the semantic segmentation data for training can be utilized to train the network, and training is stopped after the iteration times are completed. In the implementation example of the invention, the above steps are completed, and the trained deep neural network can be used for executing the semantic segmentation task.
Table 1: network module complexity contrast
Method Floating point calculation amount (GB) Video memory occupation (MB)
Nonlocal 14.60 1072
A 2 -Net 3.11 110
GloRe 3.11 103
SGR 6.24 118
DANet 19.54 1114
SpyGR without pyramid 3.11 120
SpyGR using pyramids 4.12 164
Table 2: cityscapes dataset test set result comparison
Table 1 compares the computational and memory complexity of the spatial pyramid graph convolution module (SpyGR) of the invention with other modules. Table 2 lists the performance of SpyGR on the Cityscapes dataset using mIoU, the standard metric for semantic segmentation tasks; the best results are bolded and the second best underlined. Together, Tables 1 and 2 show that the proposed SpyGR module achieves the highest performance at lower computation and storage cost.
Table 3: PASCAL VOC dataset comparison results
Table 3 lists the comparison results on the PASCAL VOC dataset. The results show that, under a variety of testing protocols, SpyGR is significantly better than DeepLabv3 and DeepLabv3+ on both the Cityscapes and PASCAL VOC datasets. The invention provides a spatial-pyramid-based graph convolution algorithm implemented as a lightweight neural network module, achieving better performance while reducing computation and parameter count.
It should be noted that the disclosed embodiments are intended to aid understanding of the invention, and those skilled in the art will appreciate that various alternatives and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; rather, its scope is defined by the appended claims.

Claims (5)

1. A method for implementing a spatial pyramid graph convolution network, characterized by comprising the following steps:
(1) passing the image through a convolutional network to obtain the original-scale feature X^0, then downsampling X^0 to obtain multi-scale features, as follows:

X^{s+1} = Π_down(X^s)

where Π_down denotes downsampling; the superscript denotes the scale level, ranging from 0 to S; X^s and X^{s+1} denote the features at scale levels s and s+1;
(2) performing graph convolution on the multi-scale features obtained in step (1), then upsampling the result at each scale and adding it to the graph-convolution output at the next finer scale, finally obtaining the network output Y^0:

Y^s = GR(X^s) + Π_up(Y^{s+1})

where Y^s denotes the network output for the scale-level-s feature, with Y^S = GR(X^S) and X^S the coarsest-scale feature; GR denotes graph convolution; Π_up denotes upsampling;
the spatial pyramid graph convolution network realized by the step (1) and the step (2) is applied to a deep neural network for semantic segmentation, and comprises the following steps:
s1: collecting images and labeling correct segmentation results;
s2: establishing an objective function of the semantic segmentation problem;
s3: selecting a network structure serving a semantic segmentation task, and adding a spatial pyramid graph convolution network;
s4: preprocessing input data;
s5: determining super parameters of network training;
s6: performing network training;
s7: the trained deep neural network can be used for executing semantic segmentation tasks.
2. The method according to claim 1, characterized in that the graph convolution is implemented by the following steps:
(2.1) inputting a feature map X ∈ R^{H×W×C}, where H, W, C denote the height, width, and number of channels of the feature map X;
(2.2) obtaining the association matrix A through the following sub-steps:
(2.1.1) transforming the X input in step (2.1) by a 1×1 convolution to obtain φ(X; W_φ) with M channels, where W_φ denotes the 1×1 convolution parameters;
(2.1.2) spatially pooling the X input in step (2.1) to 1×1×C, obtaining a vector of dimension M by a 1×1 convolution, and finally applying a sigmoid to obtain p(X; W_p), where W_p denotes the 1×1 convolution parameters;
(2.1.3) computing the association matrix A from steps (2.1.1) to (2.1.2):

A = φ(X; W_φ) diag(p(X; W_p)) φ(X; W_φ)^T

(2.3) computing the degree matrix D from the association matrix A obtained in step (2.2):

D = diag(A·1) = diag(φ(X; W_φ) Λ(X) (φ(X; W_φ)^T · 1)), where Λ(X) = diag(p(X; W_p))

and 1 ∈ R^{HW} is the all-ones vector of length HW;
(2.4) computing the normalized Laplacian matrix L from the association matrix A of step (2.2) and the degree matrix D of step (2.3):

L = I - D^{-1/2} A D^{-1/2}

where I is the identity matrix;
(2.5) obtaining the output GR(X) of the graph convolution by:

LX = X - P(Λ(X)(P^T X)), with P = D^{-1/2} φ(X; W_φ)

GR(X) = σ(LXΘ)

where Θ denotes the parameters of the graph convolution and σ is a nonlinear activation function.
3. The method of claim 2, wherein the number of channels M satisfies M ≪ HW.
4. The method of claim 1, wherein the upsampling is implemented by bilinear interpolation.
5. The spatial pyramid graph convolution network implementation method according to claim 1, wherein said downsampling is implemented by a max pooling layer.
CN202010108770.3A 2020-02-21 2020-02-21 Space pyramid graph convolution network implementation method Active CN111340189B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010108770.3A CN111340189B (en) 2020-02-21 2020-02-21 Space pyramid graph convolution network implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010108770.3A CN111340189B (en) 2020-02-21 2020-02-21 Space pyramid graph convolution network implementation method

Publications (2)

Publication Number Publication Date
CN111340189A CN111340189A (en) 2020-06-26
CN111340189B (en) 2023-11-24

Family

ID=71185318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010108770.3A Active CN111340189B (en) 2020-02-21 2020-02-21 Space pyramid graph convolution network implementation method

Country Status (1)

Country Link
CN (1) CN111340189B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101251B (en) * 2020-09-18 2022-06-10 电子科技大学 SAR automatic target recognition method based on variable convolutional neural network
CN113554156B (en) * 2021-09-22 2022-01-11 中国海洋大学 Multitask image processing method based on attention mechanism and deformable convolution
CN116523888B (en) * 2023-05-08 2023-11-03 北京天鼎殊同科技有限公司 Pavement crack detection method, device, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105956532A (en) * 2016-04-25 2016-09-21 大连理工大学 Traffic scene classification method based on multi-scale convolution neural network
CN107392901A (en) * 2017-07-24 2017-11-24 国网山东省电力公司信息通信公司 A kind of method for transmission line part intelligence automatic identification
CN109509192A (en) * 2018-10-18 2019-03-22 天津大学 Merge the semantic segmentation network in Analysis On Multi-scale Features space and semantic space
CN109727249A (en) * 2018-12-10 2019-05-07 南京邮电大学 One of convolutional neural networks semantic image dividing method
CN110533105A (en) * 2019-08-30 2019-12-03 北京市商汤科技开发有限公司 A kind of object detection method and device, electronic equipment and storage medium
CN110674829A (en) * 2019-09-26 2020-01-10 哈尔滨工程大学 Three-dimensional target detection method based on graph convolution attention network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Maize tassel detection based on a multi-scale feature map convolution method; Wu Jia; Xu Libing; Sun Lixin; Xing Hongyan; Science Technology and Engineering (27); full text *

Also Published As

Publication number Publication date
CN111340189A (en) 2020-06-26

Similar Documents

Publication Publication Date Title
CN111340189B (en) Space pyramid graph convolution network implementation method
CN112750082B (en) Human face super-resolution method and system based on fusion attention mechanism
Lin et al. Hyperspectral image denoising via matrix factorization and deep prior regularization
CN110570351B (en) Image super-resolution reconstruction method based on convolution sparse coding
CN115345866B (en) Building extraction method in remote sensing image, electronic equipment and storage medium
CN113421187B (en) Super-resolution reconstruction method, system, storage medium and equipment
Huang et al. Two-step approach for the restoration of images corrupted by multiplicative noise
CN111899203B (en) Real image generation method based on label graph under unsupervised training and storage medium
CN115019143A (en) Text detection method based on CNN and Transformer mixed model
CN114037888A (en) Joint attention and adaptive NMS (network management System) -based target detection method and system
Peng et al. Building super-resolution image generator for OCR accuracy improvement
Zhuo et al. Ridnet: Recursive information distillation network for color image denoising
CN116246138A (en) Infrared-visible light image target level fusion method based on full convolution neural network
CN115936992A (en) Garbage image super-resolution method and system of lightweight transform
ABAWATEW et al. Attention augmented residual network for tomato disease detection andclassification
Ren et al. Robust low-rank convolution network for image denoising
CN116030495A (en) Low-resolution pedestrian re-identification algorithm based on multiplying power learning
CN116188272B (en) Two-stage depth network image super-resolution reconstruction method suitable for multiple fuzzy cores
CN111275076B (en) Image significance detection method based on feature selection and feature fusion
Xiao et al. Effective PRNU extraction via densely connected hierarchical network
Dong et al. Remote sensing image super-resolution via enhanced back-projection networks
Yu et al. MagConv: Mask-guided convolution for image inpainting
CN116758092A (en) Image segmentation method, device, electronic equipment and storage medium
Jeevan et al. WaveMixSR: Resource-efficient neural network for image super-resolution
CN116309429A (en) Chip defect detection method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant