CN111402405A - Attention mechanism-based multi-view image three-dimensional reconstruction method - Google Patents

Attention mechanism-based multi-view image three-dimensional reconstruction method Download PDF

Info

Publication number
CN111402405A
Authority
CN
China
Prior art keywords
attention
layer
reconstruction
feature
view image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010205875.0A
Other languages
Chinese (zh)
Inventor
孔德慧
虞义兰
王少帆
李敬华
王立春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202010205875.0A priority Critical patent/CN111402405A/en
Publication of CN111402405A publication Critical patent/CN111402405A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a multi-view image three-dimensional reconstruction method based on an attention mechanism, which addresses two problems of multi-view reconstruction: feature sampling being restricted to regular grid points, and the failure to effectively fuse complementary features across views during fusion. The method comprises an encoding layer, a barycentric-constrained distance attention aggregation module and a decoding layer, and proceeds as follows: an image set containing N pictures is passed through the encoding layer to obtain a depth feature set P of N elements; the feature set is input into the barycentric-constrained distance attention aggregation module, which outputs the fused feature y'; y' is passed through the deconvolution operations of the decoding layer to generate the predicted three-dimensional model Y'; and the prediction closest to the ground-truth three-dimensional model (GT) is obtained by minimizing the total reconstruction loss. The method adopts the idea of deformable convolution and performs the convolution operation with offset convolution kernels, so that the receptive field of the convolution is adjusted dynamically and adaptively, which improves the quality of feature extraction. Meanwhile, a barycentric constraint term is introduced into the attention aggregation module so that, through a barycentric-distance constraint, the aggregated feature retains the weight-dependent influence of the differently weighted features in the input feature set; this balances the deviation between the fused feature and the input features, yields better multi-view fusion features, and further improves the model reconstruction result.

Description

Attention mechanism-based multi-view image three-dimensional reconstruction method
Technical Field
The invention belongs to the field of computer vision and relates to a novel feature-learning and feature-fusion method for deep-learning-based three-dimensional reconstruction from multi-view images.
Background
Conventional three-dimensional reconstruction methods, represented by Structure from Motion (SfM) and visual simultaneous localization and mapping (vSLAM), typically rely on hand-crafted features and multi-view feature matching to reconstruct three-dimensional models. However, if the baseline between viewpoints is too long, the images exhibit significant appearance changes or self-occlusion, which poses a major challenge to feature matching and thus reduces the quality of the multi-view reconstructed model.
In recent years, several deep learning methods have been used to estimate three-dimensional shapes from multiple images and have achieved encouraging results, such as 3D-R2N2, LSM, DeepMVS, RayNet and AttSets. 3D-R2N2 and LSM both formulate multi-view reconstruction as a sequence learning problem and use RNNs to fuse the multiple depth features extracted from the input images by a shared encoder.
Although DeepMVS and RayNet achieve permutation invariance, they only capture first-order (first-moment) information from the large set of depth features and completely neglect other features that may be valuable for accurate three-dimensional shape estimation, so their reconstruction results are unsatisfactory. AttSets instead performs attentional aggregation over the depth features of an arbitrary number of multi-view images, replacing the RNN module or the max/average pooling operation, and thereby achieves multi-view information fusion and reconstruction that is independent of the input order.
However, AttSets still has two major shortcomings. First, it extracts features with conventional convolution operations: the convolution kernel is rectangular and the sampling points involved in the operation are restricted to regular grid points, which limits the algorithm's ability to extract features targeted at the object to be reconstructed and degrades the quality of the reconstruction features. Second, although the attention-aggregation-based fusion strengthens view-independent complementary features, it correspondingly suppresses some view-dependent features; the fused feature therefore deviates strongly from the low-weight features in the input feature set, much of the useful information contained in those features is not reflected in the reconstruction result, and the reconstruction quality of the model decreases.
To address these two problems, a feed-forward neural network is designed. On the one hand, it adopts the idea of deformable convolution and performs the convolution operation with offset convolution kernels, so that the receptive field of the convolution is adjusted dynamically and adaptively and the quality of feature extraction is improved. On the other hand, a barycentric constraint term is introduced into the attention aggregation module so that, through a barycentric-distance constraint, the aggregated feature retains the weight-dependent influence of the differently weighted features in the input feature set; this balances the deviation between the fused feature and the input features, yields better multi-view fusion features, and further improves the model reconstruction result.
Disclosure of Invention
To address the multi-view model reconstruction problem, the invention proposes a strategy that uses deformable convolution for adaptive information extraction from the single-view input features and a barycentric-constrained distance attention aggregation module for multi-view feature fusion, so that the quality of both the input-image features and the fused features is improved while permutation invariance of the input images is preserved, leading to better model reconstruction results.
The technical scheme of the invention is shown in FIG. 1. Overall, it can be divided into an encoding layer, a barycentric-constrained distance attention aggregation module and a decoding layer. We obtain a depth feature set of N elements, P = {p_1, p_2, …, p_N}, p_n ∈ R^{1×D}, by passing the image set containing N pictures through the encoding layer, where N is arbitrary and D is the feature dimension fixed by the chosen encoder. The feature set is input into the barycentric-constrained distance attention aggregation module, which outputs the fused feature y' ∈ R^{1×D}; y' is then passed through the deconvolution operations of the decoding layer to generate the predicted three-dimensional model Y'. We obtain the prediction closest to the ground-truth three-dimensional model (GT) by minimizing equation (1):
Loss = L_ae + L_w    (1)
where L_ae is the coding loss and L_w is the barycentric constraint term.
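For illustration only, the overall three-stage flow can be sketched in PyTorch as follows. The toy encoder and decoder, the feature dimension D and the voxel resolution are placeholder assumptions and do not reflect the network architecture actually used by the invention; only the attention-based fusion follows the aggregation described below, and the losses of equations (9) and (10) would be applied to P, y_fused and Y_pred during training.

import torch
import torch.nn as nn

class ToyPipeline(nn.Module):
    # Stand-in encoding and decoding layers; only the fusion step reflects the scheme above.
    def __init__(self, D=1024, vox=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, D))  # stand-in encoding layer
        self.W = nn.Parameter(torch.randn(D, D) * 0.01)                        # attention weight of h
        self.decoder = nn.Sequential(nn.Linear(D, vox ** 3), nn.Sigmoid())     # stand-in decoding layer

    def forward(self, views):                              # views: (N, 3, 32, 32)
        P = self.encoder(views)                            # depth feature set P, shape (N, D)
        S = torch.softmax(torch.tanh(P @ self.W), dim=0)   # attention scores over the N views
        y_fused = (P * S).sum(dim=0)                       # fused feature y'
        Y_pred = self.decoder(y_fused)                     # predicted model Y' (flattened voxel grid)
        return P, y_fused, Y_pred

views = torch.rand(4, 3, 32, 32)                           # N = 4 input views of one object
P, y_fused, Y_pred = ToyPipeline()(views)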
The encoding operation is as follows: the image set containing N pictures is passed through the encoding layer to obtain the depth feature set P of N elements. One of our innovations lies in the encoding layer, which is shown in FIG. 2 together with the encoding layer of AttSets. Adopting the idea of deformable convolution, the convolution operation is performed with offset convolution kernels, so that the receptive field of the convolution is adjusted dynamically and adaptively, the quality of feature extraction is improved, and features with stronger expressive power are extracted. Since replacing a conventional convolutional layer with a deformable convolution increases the amount of computation, and our experiments show that replacing every other layer gives almost the same effect, only half of the conventional convolutional layers are replaced here. Introducing deformable convolution into the encoding is one of the innovations of the invention.
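As a non-limiting sketch of this alternating design, a deformable 3 × 3 convolution can be built from torchvision's deform_conv2d, with the offsets Δa_m predicted from the input by an auxiliary convolution. The channel widths, the depth of the toy encoder and the offset-prediction layer below are illustrative assumptions, not the architecture of FIG. 2(b).

import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableConvBlock(nn.Module):
    # 3x3 deformable convolution whose sampling offsets (the Δa_m of formula (4))
    # are predicted from the input by an auxiliary 3x3 convolution.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.offset_conv = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        nn.init.zeros_(self.offset_conv.weight)   # start from the regular grid V
        nn.init.zeros_(self.offset_conv.bias)

    def forward(self, x):
        offset = self.offset_conv(x)              # (B, 18, H, W): one (dy, dx) per sample point
        return deform_conv2d(x, offset, self.weight, self.bias, padding=1)

class Encoder(nn.Module):
    # Toy encoder alternating conventional and deformable 3x3 convolutions.
    def __init__(self, feat_dim=1024):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.dconv2 = DeformableConvBlock(32, 64)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.dconv4 = DeformableConvBlock(128, 128)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(128, feat_dim)        # each view becomes p_n ∈ R^{1×D}

    def forward(self, img):
        h = torch.relu(self.conv1(img))
        h = torch.relu(self.dconv2(h))
        h = torch.relu(self.conv3(h))
        h = torch.relu(self.dconv4(h))
        return self.fc(self.pool(h).flatten(1))

imgs = torch.rand(5, 3, 127, 127)                 # N = 5 views
P = Encoder()(imgs)                               # depth feature set P, shape (5, 1024)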
The barycentric-constrained distance attention aggregation module fuses the depth feature set P of the N elements. The loss is computed in this stage, and the coding loss is computed with the cross entropy:
L_ae = Y log Y' + (1 - Y) log(1 - Y')    (9)
the coding loss in the present invention is obtained by means of attention aggregation, so that they can also be called attention module. Because the attention aggregation module ignores the characteristics of the view angles with smaller weight, the gravity center constraint distance module is added to ensure that the sum of the distances between the input characteristics and the fusion output characteristics of each view angle is smaller, which is a second innovation point of us, and the calculation formula is as follows:
L_w = Σ_{i=1}^{N} λ_i · D(p_i, y')    (10)
where D(p_i, y') measures the distance between p_i and y'; the Euclidean distance is chosen here as the metric, and λ_i = 1. Adding equations (9) and (10) gives the total reconstruction loss of the model.
Advantageous effects
By using deformable convolution in part of the encoding layer and performing the convolution operation with offset convolution kernels, the method dynamically and adaptively adjusts the receptive field of the convolution and improves the quality of feature extraction. Meanwhile, a barycentric constraint term is introduced into the attention aggregation module so that, through a barycentric-distance constraint, the aggregated feature retains the weight-dependent influence of the differently weighted features in the input feature set; this balances the deviation between the fused feature and the input features, yields better multi-view fusion features, and further improves the model reconstruction result.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2(a) a coding layer representation of AttSets;
FIG. 2(b) is a representation of the coding layers of the present invention;
FIG. 3(a) is a representation of a conventional convolution rule sample;
FIG. 3(b) is a representation of a deformable convolution sample;
FIG. 4 is a decoding layer representation;
FIG. 5 is a representation of the attention-aggregation module of AttSets;
FIG. 6 is a representation of the barycentric-constrained distance attention aggregation module;
FIG. 7 is an exemplary graph of single-view reconstruction results on a ShapeNet test set;
FIG. 8 is an exemplary graph of multi-view reconstruction results on a ShapeNet test set.
Detailed Description
The attention-mechanism-based multi-view image three-dimensional reconstruction method is described in detail below. Its structure comprises an encoding layer, a barycentric-constrained distance attention aggregation module and a decoding layer, which are implemented as follows:
Step 1: encoding the input images
Step 1.1: the image set containing N pictures is input into the encoding layer, whose specific structure is shown in FIG. 2(b).
A conventional two-dimensional convolution comprises two steps. First, the input feature map x is sampled on a regular grid V, which defines the size and stride of the convolution kernel. Second, the sampled values weighted by w are summed. For example, with stride 1 and kernel size 3 × 3, V = {(-1, -1), (-1, 0), …, (0, 1), (1, 1)}. Each position a_0 in the output feature map p of the conventional two-dimensional convolution is computed as:
p(a_0) = Σ_{a_m ∈ V} w(a_m) · x(a_0 + a_m)    (3)
where a_m enumerates the sampling positions in V.
Introducing deformable convolution for feature extraction is one of our innovations. Deformable convolution is an improvement over the conventional rectangular convolution and is essentially an improvement of the sampling: it augments the regular grid V with sampling-point offsets {Δa_m | m = 1, …, M}, where M = |V|. Formula (3) then becomes:
p(a_0) = Σ_{a_m ∈ V} w(a_m) · x(a_0 + a_m + Δa_m)    (4)
after the deformable convolution coding outputs the feature map p, we deform it into 1 × D feature piI represents the ith viewing angleFigure (a).
Encoding all views yields the depth feature set P = {p_1, p_2, …, p_N}, p_n ∈ R^{1×D}, where N is arbitrary and D is the feature dimension fixed by the chosen encoder.
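As a quick sanity check of formulas (3) and (4), assuming PyTorch and torchvision are available, setting all offsets Δa_m to zero makes the deformable convolution of formula (4) coincide with the conventional convolution of formula (3):

import torch
import torch.nn.functional as F
from torchvision.ops import deform_conv2d

x = torch.rand(1, 4, 16, 16)                  # input feature map
w = torch.rand(8, 4, 3, 3)                    # 3x3 kernel, |V| = 9
zero_offset = torch.zeros(1, 2 * 9, 16, 16)   # Δa_m = 0 for every sample point

regular = F.conv2d(x, w, padding=1)                        # formula (3)
deformable = deform_conv2d(x, zero_offset, w, padding=1)   # formula (4) with Δa_m = 0
print(torch.allclose(regular, deformable, atol=1e-5))      # expected: True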
Step 2: barycentric-constrained distance attention aggregation module
Step 2.1: each element of the feature set P is input into an activation function h, which can be a standard neural layer, that is, a linear transformation layer followed by a nonlinear activation function. Here we use one fully connected layer and one tanh layer as an example, and the bias term is omitted for simplicity. The output of h is a set of learned attention activations Z = {z_1, z_2, …, z_N}, where
z_n = h(p_n, W) = tanh(p_n W),    (p_n ∈ R^{1×D}, W ∈ R^{D×D}, z_n ∈ R^{1×D})    (5)
Step 2.2: the N learned attention activations are normalized to calculate a set of attention scores S = {s_1, s_2, …, s_N}. We choose softmax as the normalization operation, so the attention score of the n-th element is:
s_n^d = exp(z_n^d) / Σ_{j=1}^{N} exp(z_j^d)    (6)

where z_n^d is the d-th entry of z_n.
Step 2.3: the calculated attention scores S are multiplied by their corresponding original features in P to generate a new set of depth features, called the weighted features C = {c_1, c_2, …, c_N}, where
c_n = p_n × s_n,    (p_n ∈ R^{1×D}, s_n ∈ R^{1×1}, c_n ∈ R^{1×D})    (7)
Step 2.4: the weighted features of all N elements are summed to obtain a fixed-size feature vector, denoted y', where
y'^d = Σ_{n=1}^{N} c_n^d    (8)

where c_n^d is the d-th entry of c_n.
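Steps 2.1 to 2.4 (formulas (5)-(8)) can be sketched in a few lines, assuming the fully connected + tanh choice of h; N, D and the random initialization of W below are illustrative only:

import torch

N, D = 5, 1024
P = torch.rand(N, D)                    # depth feature set, p_n ∈ R^{1×D}
W = torch.randn(D, D) * 0.01            # learnable weight of h

Z = torch.tanh(P @ W)                   # formula (5): attention activations z_n
S = torch.softmax(Z, dim=0)             # formula (6): normalize over the N views, per dimension d
C = P * S                               # formula (7): weighted features c_n
y_fused = C.sum(dim=0)                  # formula (8): fused feature y' ∈ R^{1×D}
print(y_fused.shape)                    # torch.Size([1024])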
Step 2.5: we introduce the barycentric constraint distance, which is our second innovation. With D(p_i, y') denoting the distance between p_i and y', the barycentric constraint distance is computed as:

L_w = Σ_{i=1}^{N} λ_i · D(p_i, y')    (10)

where D(p_i, y') measures the distance between p_i and y'; the Euclidean distance is chosen here as the metric, and λ_i = 1.
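A minimal sketch of formula (10), with the Euclidean distance as D and λ_i = 1 as stated above; the feature values and the stand-in for y' are placeholders:

import torch

N, D = 5, 1024
P = torch.rand(N, D)                                      # input view features p_i
y_fused = P.mean(dim=0)                                   # placeholder for the fused feature y'
L_w = torch.norm(P - y_fused.unsqueeze(0), dim=1).sum()   # Σ_i λ_i ||p_i - y'||_2 with λ_i = 1
print(L_w)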
Step 3: decoding the fused feature to obtain the predicted three-dimensional model
Step 3.1: the fused feature y' is decoded by the decoding layer to obtain the predicted three-dimensional model Y'.
Step 3.2: the coding loss L_ae is computed as the cross entropy between the ground-truth voxel model Y and the predicted three-dimensional model Y':
L_ae = Y log Y' + (1 - Y) log(1 - Y')    (9)
Step 3.3: adding equations (9) and (10) gives the total reconstruction loss of the model:
Loss = L_ae + L_w    (1)
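A minimal sketch of the total loss of equation (1) on a toy voxel grid; the 32^3 resolution, the clamping for numerical stability and the placeholder features are assumptions, and the coding loss is implemented here as the standard binary cross entropy, i.e. the negative mean of the expression in formula (9):

import torch

Y_true = (torch.rand(32, 32, 32) > 0.5).float()          # ground-truth voxel occupancy Y
Y_pred = torch.rand(32, 32, 32).clamp(1e-6, 1 - 1e-6)    # predicted occupancy Y'

# Coding loss: standard binary cross entropy (negative mean of formula (9)).
L_ae = -(Y_true * torch.log(Y_pred) + (1 - Y_true) * torch.log(1 - Y_pred)).mean()

P = torch.rand(5, 1024)                                   # placeholder view features p_i
y_fused = P.mean(dim=0)                                   # placeholder fused feature y'
L_w = torch.norm(P - y_fused, dim=1).sum()                # formula (10) with λ_i = 1, as in the previous sketch

loss = L_ae + L_w                                         # equation (1)
print(loss)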
The single-view experimental results are shown in Table 1, where Ours-de denotes the model that only introduces deformable convolution in the encoding layer and does not use the barycentric-constrained feature aggregation; Ours-ba denotes the model that still uses conventional convolution features but introduces the barycentric constraint term; and Ours-com denotes the model that introduces both the deformable convolution features and the barycentric-constrained feature aggregation. Table 2 shows the average IoU of multi-view reconstruction for different numbers of views. The experimental results show that both improvements of the proposed scheme bring gains: the deformable convolution extracts more expressive features and yields a quality gain, the barycentric constraint term contributes clearly to the gain in reconstruction quality, and combining the two improvements gives even better results.
Table 1. Per-category average IoU for single-view reconstruction on ShapeNet. The best number for each category is highlighted in bold.
(Table 1 is provided as an image in the original publication and is not reproduced here.)
Table 2. Average IoU over all 13 categories for multi-view reconstruction on ShapeNet; the best result for each number of views is highlighted in bold.
(Table 2 is provided as an image in the original publication and is not reproduced here.)

Claims (6)

1. A multi-view image three-dimensional reconstruction method based on an attention mechanism is characterized by comprising the following steps:
(1) obtaining a depth feature set of N elements, P = {p_1, p_2, ..., p_N}, p_n ∈ R^{1×D}, by passing an image set containing N pictures through an encoding layer, where N is arbitrary and D is the feature dimension fixed by the chosen encoder;
(2) inputting the feature set P into a barycentric-constrained distance attention aggregation module and outputting the fused feature y' ∈ R^{1×D};
(3) passing the fused feature y' through the deconvolution operations of a decoding layer to generate the predicted three-dimensional model Y', and obtaining the prediction closest to the ground-truth three-dimensional model GT by minimizing the total reconstruction loss.
2. The method for three-dimensional reconstruction of multi-view image based on attention mechanism as claimed in claim 1, wherein: the coding layer employs an AttSets coding layer and is improved by replacing the conventional convolutional layers of all even layers with deformable convolutional layers.
3. The method for three-dimensional reconstruction of multi-view image based on attention mechanism as claimed in claim 1, wherein the barycentric-constrained distance attention aggregation module is specifically as follows:
(1) each element of the feature set P is input into an activation function h, whose output is a set of learned attention activations Z = {z_1, z_2, ..., z_N}, where z_n = h(p_n, W), W is the learnable weight to be trained, and p_n ∈ R^{1×D}, W ∈ R^{D×D}, z_n ∈ R^{1×D};
(2) the N learned attention activations z_n are normalized to calculate a set of attention scores S = {s_1, s_2, ..., s_N}; softmax is selected as the normalization operation, so the attention score of the n-th element is:

s_n^d = exp(z_n^d) / Σ_{j=1}^{N} exp(z_j^d)

where z_n^d is the d-th entry of z_n;
(3) the calculated attention scores S are multiplied by their corresponding original features in P to generate a new set of depth features, called the weighted features C = {c_1, c_2, ..., c_N}, where

c_n = p_n × s_n,    p_n ∈ R^{1×D}, s_n ∈ R^{1×1}, c_n ∈ R^{1×D}    (7)
(4) summing the weighted features of all N elements to obtain a fixed-size feature vector, denoted y', as follows:

y'^d = Σ_{n=1}^{N} c_n^d

where c_n^d is the d-th entry of c_n.
4. The method for three-dimensional reconstruction of multi-view image based on attention mechanism as claimed in claim 3, wherein: the activation function h is a standard neural layer, that is, a linear transformation layer followed by a nonlinear activation function.
5. The method for three-dimensional reconstruction of multi-view image based on attention mechanism as claimed in claim 4, wherein: the activation function h is preferably one fully connected layer and one tanh layer.
6. The method for three-dimensional reconstruction of multi-view image based on attention mechanism as claimed in claim 1, wherein: the total reconstruction loss is as follows:
Loss = L_ae + L_w
where:
the coding loss L_ae is computed as the cross entropy between the ground-truth voxel model Y and the predicted three-dimensional model Y', as follows:
L_ae = Y log Y' + (1 - Y) log(1 - Y'),
the loss term L_w of the barycentric constraint distance is as follows:
L_w = Σ_{i=1}^{N} λ_i · D(p_i, y')
where D(p_i, y') denotes the distance between p_i and y', preferably the Euclidean distance.
CN202010205875.0A 2020-03-23 2020-03-23 Attention mechanism-based multi-view image three-dimensional reconstruction method Pending CN111402405A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010205875.0A CN111402405A (en) 2020-03-23 2020-03-23 Attention mechanism-based multi-view image three-dimensional reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010205875.0A CN111402405A (en) 2020-03-23 2020-03-23 Attention mechanism-based multi-view image three-dimensional reconstruction method

Publications (1)

Publication Number Publication Date
CN111402405A true CN111402405A (en) 2020-07-10

Family

ID=71432775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010205875.0A Pending CN111402405A (en) 2020-03-23 2020-03-23 Attention mechanism-based multi-view image three-dimensional reconstruction method

Country Status (1)

Country Link
CN (1) CN111402405A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785684A (en) * 2020-11-13 2021-05-11 北京航空航天大学 Three-dimensional model reconstruction method based on local information weighting mechanism
CN113095172A (en) * 2021-03-29 2021-07-09 天津大学 Point cloud three-dimensional object detection method based on deep learning
CN113129310A (en) * 2021-03-04 2021-07-16 同济大学 Medical image segmentation system based on attention routing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109285217A (en) * 2018-09-10 2019-01-29 中国科学院自动化研究所 Process type plant model method for reconstructing based on multi-view image
CN110570522A (en) * 2019-08-22 2019-12-13 天津大学 Multi-view three-dimensional reconstruction method
CN110689008A (en) * 2019-09-17 2020-01-14 大连理工大学 Monocular image-oriented three-dimensional object detection method based on three-dimensional reconstruction
CN110853130A (en) * 2019-09-25 2020-02-28 咪咕视讯科技有限公司 Three-dimensional image generation method, electronic device, and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109285217A (en) * 2018-09-10 2019-01-29 中国科学院自动化研究所 Process type plant model method for reconstructing based on multi-view image
CN110570522A (en) * 2019-08-22 2019-12-13 天津大学 Multi-view three-dimensional reconstruction method
CN110689008A (en) * 2019-09-17 2020-01-14 大连理工大学 Monocular image-oriented three-dimensional object detection method based on three-dimensional reconstruction
CN110853130A (en) * 2019-09-25 2020-02-28 咪咕视讯科技有限公司 Three-dimensional image generation method, electronic device, and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BO YANG et al.: "Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction", International Journal of Computer Vision *
LIU Wei et al.: "Accurate reconstruction algorithm for three-dimensional spatial points based on multiple views", Computer Engineering and Design *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785684A (en) * 2020-11-13 2021-05-11 北京航空航天大学 Three-dimensional model reconstruction method based on local information weighting mechanism
CN112785684B (en) * 2020-11-13 2022-06-14 北京航空航天大学 Three-dimensional model reconstruction method based on local information weighting mechanism
CN113129310A (en) * 2021-03-04 2021-07-16 同济大学 Medical image segmentation system based on attention routing
CN113129310B (en) * 2021-03-04 2023-03-31 同济大学 Medical image segmentation system based on attention routing
CN113095172A (en) * 2021-03-29 2021-07-09 天津大学 Point cloud three-dimensional object detection method based on deep learning

Similar Documents

Publication Publication Date Title
CN110503598B (en) Font style migration method for generating countermeasure network based on conditional cycle consistency
CN108717568B (en) A kind of image characteristics extraction and training method based on Three dimensional convolution neural network
CN111091045B (en) Sign language identification method based on space-time attention mechanism
CN108921893B (en) Image cloud computing method and system based on online deep learning SLAM
CN111798369B (en) Face aging image synthesis method for generating confrontation network based on circulation condition
CN111145116B (en) Sea surface rainy day image sample augmentation method based on generation of countermeasure network
CN111402405A (en) Attention mechanism-based multi-view image three-dimensional reconstruction method
Ma et al. Facial expression recognition using constructive feedforward neural networks
CN109166144B (en) Image depth estimation method based on generation countermeasure network
WO2019227479A1 (en) Method and apparatus for generating face rotation image
CN110503680A (en) It is a kind of based on non-supervisory convolutional neural networks monocular scene depth estimation method
CN112818764B (en) Low-resolution image facial expression recognition method based on feature reconstruction model
CN110288697A (en) 3D face representation and method for reconstructing based on multiple dimensioned figure convolutional neural networks
EP4377898A1 (en) Neural radiance field generative modeling of object classes from single two-dimensional views
CN107316004A (en) Space Target Recognition based on deep learning
CN114359292A (en) Medical image segmentation method based on multi-scale and attention
CN113538608A (en) Controllable character image generation method based on generation countermeasure network
CN112634438A (en) Single-frame depth image three-dimensional model reconstruction method and device based on countermeasure network
CN113688765A (en) Attention mechanism-based action recognition method for adaptive graph convolution network
CN110263203B (en) Text-to-image generation method combined with Pearson reconstruction
Jiang et al. Multi-level memory compensation network for rain removal via divide-and-conquer strategy
Fang et al. One is all: Bridging the gap between neural radiance fields architectures with progressive volume distillation
CN108805844B (en) Lightweight regression network construction method based on prior filtering
CN113222808A (en) Face mask removing method based on generative confrontation network
CN115439849B (en) Instrument digital identification method and system based on dynamic multi-strategy GAN network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination