CN114550162A - Three-dimensional object identification method combining view importance network and self-attention mechanism - Google Patents
- Publication number: CN114550162A
- Application number: CN202210143670.3A
- Authority: CN (China)
- Prior art keywords: view, importance, dimensional object, views, network
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06N3/045: Neural networks; architecture; combinations of networks
- G06N3/047: Neural networks; architecture; probabilistic or stochastic networks
- G06N3/08: Neural networks; learning methods
Abstract
The invention discloses a three-dimensional object recognition method combining a view importance network and a self-attention mechanism. The method comprises the following steps: projecting the three-dimensional object to be recognized from n different viewing angles to obtain n different two-dimensional views, where n is greater than or equal to two; extracting features from the n views with a base CNN model to obtain a feature map for each view; using the view importance network to score how important each of the n views is for three-dimensional object recognition, and enhancing each view's features according to its importance to obtain view-enhanced feature maps; processing the view-enhanced feature maps with a self-attention mechanism to obtain a three-dimensional shape descriptor; and feeding the three-dimensional shape descriptor into a fully connected network for multi-view object recognition, thereby recognizing the three-dimensional object. The method highlights the views that benefit three-dimensional object recognition, suppresses interference from unimportant views, and improves recognition accuracy.
Description
Technical Field
The invention belongs to the technical field of computer vision, and relates to a three-dimensional object identification method combining a view importance network and a self-attention mechanism.
Background
With the development of indoor robots and computer vision in recent years, it has become practical for indoor robots to actively find and fetch objects for people, and accurately recognizing three-dimensional objects is one of the basic problems in this field. The open-source ModelNet project from Princeton University gave researchers a comprehensive, clean collection of three-dimensional object models, and many recognition methods have been developed since. Three-dimensional object recognition methods fall into three categories according to the type of input data: point-cloud-based, voxel-based, and multi-view-based.
Point-cloud-based methods generally apply convolution directly to the unordered point cloud collected by the data acquisition equipment to obtain the category of the three-dimensional object. Voxel-based methods generally partition the unordered point cloud into blocks to form voxel data and then apply convolution to obtain the category. Both approaches suffer from expensive acquisition equipment, high data dimensionality, and high processing cost, which makes them hard to deploy widely in daily life. Multi-view-based methods have attracted more attention because the data are easy to obtain and convenient to process. Because they can use CNN models pre-trained on large-scale datasets such as ImageNet, they achieve the best recognition results and have become the mainstream approach.
Multi-view-based methods generally render the three-dimensional object model from multiple viewing angles to obtain multiple views of the object to be recognized, and then classify those views with a convolutional network. For example, Su et al. proposed a multi-view-based three-dimensional object recognition method named MVCNN, which outperforms most point-cloud-based and voxel-based methods. However, MVCNN uses max pooling, which discards most of the view information of the three-dimensional object, so multi-view-based three-dimensional object recognition needs further research.
Disclosure of Invention
To overcome the defects of existing multi-view-based three-dimensional object recognition methods, the invention provides a three-dimensional object recognition method combining a view importance network and a self-attention mechanism. The method first computes an importance score for each of the multiple views through the view importance network and enhances each view's features according to its score, which strengthens the representation of the views that benefit three-dimensional object recognition. It then fuses non-local information among the different views through the self-attention mechanism to further enhance the multi-view feature representation. Experimental results show that recognition and classification with the enhanced multi-view features effectively improve accuracy.
To achieve this aim, the technical scheme of the invention is as follows: step 1, projecting the three-dimensional object to be recognized from n different viewing angles to obtain n different two-dimensional views, where n is greater than or equal to two; step 2, extracting features from the n views with a base CNN model to obtain a feature map for each view; step 3, outputting importance scores of the n views for three-dimensional object recognition through the view importance network, where a higher score indicates that the view contains richer key information for object recognition, and enhancing the features according to the importance scores to obtain view-enhanced feature maps; step 4, processing the view-enhanced feature maps with the self-attention mechanism to obtain cross-view enhanced feature maps; step 5, inputting the three-dimensional shape descriptor obtained in step 4 into a fully connected network for multi-view object recognition, thereby recognizing the three-dimensional object.
The three-dimensional object recognition method combining the view importance network and the self-attention mechanism provided by the invention may further have the following feature: step 1 comprises projecting the three-dimensional object model from n viewing angles to obtain n rendered views $V = \{v_1, v_2, \ldots, v_n\}$ of the object, where $v_i$ is the i-th view of the object.
The three-dimensional object recognition method combining the view importance network and the self-attention mechanism provided by the invention may further have the following feature: step 2 comprises passing the rendered views $V = \{v_1, v_2, \ldots, v_n\}$ through a base CNN model to extract the initial visual feature maps $Z = \{z_1, z_2, \ldots, z_n\}$ of the n views, where $z_i$ is the initial visual feature map of the i-th view of the object, $z_i \in \mathbb{R}^{C \times H \times W}$ and $Z \in \mathbb{R}^{n \times C \times H \times W}$. Here n is the number of views, C is the number of channels of each visual feature map, H is its height, and W is its width.
The three-dimensional object recognition method combining the view importance network and the self-attention mechanism provided by the invention may further have the following feature: step 3 comprises inputting the initial visual feature maps $Z = \{z_1, z_2, \ldots, z_n\}$ of the n views into the view importance network, which scores each view as in formula (1),
$$\mathrm{Score} = \mathrm{Softmax}\{f(z_1), f(z_2), \ldots, f(z_n)\}, \tag{1}$$
In formula (1), f denotes the network layer that scores view importance. Through training, this layer learns to score each view according to the information richness of its features, so that features rich in visual information are highlighted. The Softmax function ensures that the importance scores of all views sum to 1 and avoids large disparities between the scores. The initial feature map of each view is multiplied by its importance score and added to the initial feature map, as in formula (2),
$$p_i = z_i + \mathrm{Score}_i \cdot z_i, \tag{2}$$
In formula (2), $z_i$ is the initial visual feature map of the i-th view of the object, and $\mathrm{Score}_i$ is the score the view importance network assigns to the i-th view. Applying formula (2) to every view yields the n view-enhanced feature maps $P = \{p_1, p_2, \ldots, p_n\}$ of the three-dimensional object, with $p_i \in \mathbb{R}^{C \times H \times W}$ and $P \in \mathbb{R}^{n \times C \times H \times W}$.
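Formulas (1) and (2) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patented network: the scoring layer f is not specified in the text, so `score_view` is a hypothetical stand-in (global average pooling followed by a scalar affine map), and feature maps are nested lists shaped C × H × W.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def score_view(feature_map, weight=1.0, bias=0.0):
    """Hypothetical stand-in for f: average all activations, then apply an affine map."""
    flat = [v for channel in feature_map for row in channel for v in row]
    return weight * (sum(flat) / len(flat)) + bias


def enhance_views(feature_maps):
    """Apply formula (1) to score the views, then formula (2): p_i = z_i + Score_i * z_i."""
    scores = softmax([score_view(z) for z in feature_maps])
    enhanced = [
        [[[v * (1.0 + s) for v in row] for row in channel] for channel in z]
        for z, s in zip(feature_maps, scores)
    ]
    return scores, enhanced


# Three toy views, each with C=1, H=1, W=2.
views = [[[[1.0, 2.0]]], [[[3.0, 4.0]]], [[[0.0, 0.0]]]]
scores, enhanced = enhance_views(views)
```

Because the scores sum to 1, every view keeps its original features (the residual term $z_i$) and gains an extra contribution proportional to its score.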
The three-dimensional object recognition method combining the view importance network and the self-attention mechanism provided by the invention may further have the following feature: step 4 comprises the following substeps:
Step 4-1: the view-enhanced feature maps $P = \{p_1, p_2, \ldots, p_n\}$ are input into three separate convolutional networks to generate new feature mappings $P_q$, $P_k$ and $P_v$, with $P_q, P_k, P_v \in \mathbb{R}^{n \times C \times H \times W}$. $P_k$ is transposed and matrix-multiplied with $P_q$ to obtain the spatial association between the feature maps, as in formula (3).
In formula (3), $S$ denotes the similarity; $i$ and $m$ are view indices with $i, m \in [1, n]$, where n is the number of views; and $L$ denotes all spatial positions in a single view feature map. In summary, $S_{im}$ captures the relationship between the view-enhanced feature at any spatial position of any view and the features at every spatial position of all views; the stronger the association, the larger the corresponding weight in the matrix.
Step 4-2: $S_{im}$ is matrix-multiplied with $P_v$ to obtain the cross-view enhanced feature maps $A = \{a_1, a_2, \ldots, a_n\}$, with $a_i \in \mathbb{R}^{C \times H \times W}$ and $A \in \mathbb{R}^{n \times C \times H \times W}$. The self-attention mechanism breaks the locality of the features and achieves non-local feature enhancement across views, so that the feature representation at any spatial position of any view becomes richer and the view features are expressed more effectively.
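Steps 4-1 and 4-2 can be illustrated with a small pure-Python sketch. Several details here are assumptions rather than statements from the text: the three convolutions are modelled as 1 × 1 convolutions (a per-position linear map over channels, given by the hypothetical matrices `Wq`, `Wk`, `Wv`), the similarity is normalized with a softmax over all positions of all views, and the n × H × W positions are flattened into one list of C-dimensional tokens.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(tokens, Wq, Wk, Wv):
    """tokens: one C-dim vector per spatial position, over all views.

    Returns the similarity matrix and the attended features.
    """
    q = [matvec(Wq, t) for t in tokens]
    k = [matvec(Wk, t) for t in tokens]
    v = [matvec(Wv, t) for t in tokens]
    channels = len(v[0])
    sim, out = [], []
    for qi in q:
        # Similarity of this position to every position in every view.
        row = softmax([sum(a * b for a, b in zip(qi, kj)) for kj in k])
        sim.append(row)
        # Weighted sum of value vectors: the cross-view enhanced feature.
        out.append([sum(w * vj[c] for w, vj in zip(row, v)) for c in range(channels)])
    return sim, out


identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # e.g. 3 views, 1 position each, C=2
sim, out = cross_view_attention(tokens, identity, identity, identity)
```

Each output token mixes information from every position of every view, which is what allows a feature in one view to be reinforced by matching features in the others.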
The three-dimensional object identification method combining the view importance network and the self-attention mechanism provided by the invention can also have the following characteristics: wherein, step 5 includes:
reducing the dimension of the cross-view enhanced feature maps $A = \{a_1, a_2, \ldots, a_n\}$ with a 1 × 1 convolution, which extracts features across views and avoids the information loss caused by max pooling; and inputting the reduced features into the fully connected layer for classification, thereby recognizing the three-dimensional object.
Advantageous effects
1) The view importance network weights each view, highlighting the views that benefit three-dimensional object recognition while suppressing unimportant ones; 2) the self-attention mechanism fuses non-local information among different views and achieves cross-view spatial feature enhancement, further strengthening the multi-view feature representation; 3) a 1 × 1 convolution replaces max pooling for feature dimension reduction, avoiding the drop in recognition accuracy caused by information loss.
Drawings
FIG. 1 is a schematic diagram of a network framework for the method of the present invention;
FIG. 2 is an example experimental result of a view importance network proposed by the present invention;
FIG. 3 is a schematic diagram of a self-attention mechanism in an embodiment of the present invention;
Detailed Description
The method is implemented with the open-source deep learning framework PyTorch, and the network model is trained on an NVIDIA GeForce RTX 3090 GPU.
The modules of the method are further described below with reference to the accompanying drawings and the detailed description. The detailed description is provided for illustration only and is not intended to limit the scope of the invention, which is defined by the appended claims.
The composition and flow of the network framework of the invention are shown in fig. 1, and the invention specifically comprises the following steps:
Step 1: the three-dimensional object model is projected from n viewing angles to obtain n rendered views $V = \{v_1, v_2, \ldots, v_n\}$ of the object, where $v_i$ is the i-th view of the object. In this experiment n is set to 12, i.e., 12 views are used for three-dimensional object recognition.
Step 2: the rendered views $V = \{v_1, v_2, \ldots, v_n\}$ are passed through a base CNN model to extract the initial visual feature maps $Z = \{z_1, z_2, \ldots, z_n\}$ of the n views, where $z_i$ is the initial visual feature map of the i-th view of the object. Specifically, a VGG network pre-trained for single-image object recognition is used; its last fully connected layer is removed and the remaining layers are kept to extract the initial visual feature maps.
Step 3: the initial visual feature maps $Z = \{z_1, z_2, \ldots, z_n\}$ of the n views are input into the view importance network, which scores each view as in formula (1),
$$\mathrm{Score} = \mathrm{Softmax}\{f(z_1), f(z_2), \ldots, f(z_n)\}, \tag{1}$$
In formula (1), f denotes the network layer that scores view importance. Through training, this layer learns to score each view according to the information richness of its features, so that features rich in visual information are highlighted. The Softmax function ensures that the importance scores of all views sum to 1 and avoids large disparities between the scores. The initial feature map of each view is multiplied by its importance score and added to the initial feature map, as in formula (2),
$$p_i = z_i + \mathrm{Score}_i \cdot z_i, \tag{2}$$
In formula (2), $z_i$ is the initial visual feature map of the i-th view of the object, and $\mathrm{Score}_i$ is the score the view importance network assigns to the i-th view. Applying formula (2) to every view yields the n view-enhanced feature maps $P = \{p_1, p_2, \ldots, p_n\}$ of the three-dimensional object, with $p_i \in \mathbb{R}^{C \times H \times W}$ and $P \in \mathbb{R}^{n \times C \times H \times W}$.
Fig. 2 shows a sample of the importance scores that the view importance network assigns to the twelve views rendered from a three-dimensional airplane model. The view importance network gives more attention to views that contain rich information about the object and therefore benefit three-dimensional object recognition, and gives lower weight to views that lack salient features of the object, thereby reducing interference.
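As a quick check on formula (2), the following worked example shows the range of the per-view scaling factor. It is an illustration only, not a result from the experiments; the uniform-score case is a hypothetical reference point, with n = 12 as in this embodiment.

```latex
p_i = z_i + \mathrm{Score}_i \cdot z_i = (1 + \mathrm{Score}_i)\, z_i,
\qquad
\sum_{i=1}^{n} \mathrm{Score}_i = 1,\quad 0 < \mathrm{Score}_i < 1
\;\Longrightarrow\;
1 < 1 + \mathrm{Score}_i < 2 .

% Uniform scores over n = 12 views (no view preferred):
\mathrm{Score}_i = \tfrac{1}{12}
\;\Longrightarrow\;
p_i = \tfrac{13}{12}\, z_i \approx 1.083\, z_i .
```

A view scored above 1/12 is therefore amplified more than this baseline, and a view scored below it is amplified less, but no view is ever scaled below its original features.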
Step 4 comprises the following substeps:
Step 4-1: the view-enhanced feature maps $P = \{p_1, p_2, \ldots, p_n\}$ are input into three separate convolutional networks to generate new feature mappings $P_q$, $P_k$ and $P_v$, with $P_q, P_k, P_v \in \mathbb{R}^{n \times C \times H \times W}$. $P_k$ is transposed and matrix-multiplied with $P_q$ to obtain the spatial association between the feature maps, as in formula (3).
In formula (3), $S$ denotes the similarity; $i$ and $m$ are view indices with $i, m \in [1, n]$, where n is the number of views; and $L$ denotes all spatial positions in a single view feature map. In summary, $S_{im}$ captures the relationship between the view-enhanced feature at any spatial position of any view and the features at every spatial position of all views; the stronger the association, the larger the corresponding weight in the matrix.
Step 4-2: $S_{im}$ is matrix-multiplied with $P_v$ to obtain the cross-view enhanced feature maps $A = \{a_1, a_2, \ldots, a_n\}$, with $a_i \in \mathbb{R}^{C \times H \times W}$ and $A \in \mathbb{R}^{n \times C \times H \times W}$. The self-attention mechanism breaks the locality of the features and achieves non-local feature enhancement across views, so that the feature representation at any spatial position of any view becomes richer and the view features are expressed more effectively.
Fig. 3 shows a sample of the cross-view non-local feature enhancement performed by the self-attention mechanism. The features of the n input views are fed into the three convolutional layers θ, φ and g, which perform feature mapping to obtain $P_q$, $P_k$ and $P_v$ respectively. $P_q$ is matrix-multiplied with the transposed $P_k$ to obtain a similarity matrix that contains the relationship between the feature at each spatial position and the features at the other spatial positions. Multiplying the similarity matrix by $P_v$ realizes the cross-view non-local feature enhancement and outputs the features of the n views.
Step 5: the cross-view enhanced feature maps $A = \{a_1, a_2, \ldots, a_n\}$ are reduced in dimension by a 1 × 1 convolution, which extracts features across views and avoids the information loss caused by max pooling. The reduced features are then input into the fully connected layer for classification, thereby recognizing the three-dimensional object.
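The difference between max pooling over views and a cross-view 1 × 1 convolution can be seen in a toy example. The sketch below assumes that the n views are treated as the input channels of the 1 × 1 convolution with a single output channel, which the text does not state explicitly; the weights are hypothetical values standing in for learned parameters, and each view is flattened to a list of activations.

```python
def max_pool_views(views):
    """Element-wise maximum over views: only one view survives per position."""
    return [max(vals) for vals in zip(*views)]


def conv1x1_views(views, weights, bias=0.0):
    """1x1 convolution with views as input channels: a weighted sum per position."""
    return [
        sum(w * x for w, x in zip(weights, vals)) + bias
        for vals in zip(*views)
    ]


# Three views, two positions each (flattened for brevity).
views = [[1.0, 5.0], [4.0, 2.0], [3.0, 3.0]]
pooled = max_pool_views(views)                      # [4.0, 5.0]
fused = conv1x1_views(views, [0.5, 0.25, 0.25])     # [2.25, 3.75]
```

With max pooling, two of the three views contribute nothing at each position; with the weighted sum, every view contributes according to its weight, which is the sense in which the 1 × 1 convolution avoids the information loss.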
In this embodiment, a comparison experiment is also performed to evaluate the classification performance of the three-dimensional object recognition method combining the view importance network and the self-attention mechanism. We use the Princeton ModelNet40 dataset, which is commonly used for three-dimensional object recognition, for experiment and evaluation. ModelNet40 contains 12311 three-dimensional object models in 40 categories, of which 9843 form the training set and 2468 form the test set. Because the number of samples differs between classes, we follow other works and report two metrics: average instance accuracy (Inst Acc), the percentage of correct predictions over all samples, and average class accuracy (Class Acc), the mean of the per-class accuracies.
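The two metrics differ only in how they average, which matters when classes are imbalanced. The following sketch computes both from label lists; the function name and the toy labels are illustrative and not taken from the experiments.

```python
from collections import defaultdict


def instance_and_class_accuracy(y_true, y_pred):
    """Return (Inst Acc, Class Acc) for two equal-length label sequences."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    # Inst Acc: fraction of all samples predicted correctly.
    inst_acc = sum(correct.values()) / len(y_true)
    # Class Acc: mean of the per-class accuracies, each class weighted equally.
    class_acc = sum(correct[c] / total[c] for c in total) / len(total)
    return inst_acc, class_acc


# Imbalanced toy data: four chairs (all correct) and one lamp (misclassified).
y_true = ["chair", "chair", "chair", "chair", "lamp"]
y_pred = ["chair", "chair", "chair", "chair", "chair"]
inst_acc, class_acc = instance_and_class_accuracy(y_true, y_pred)
```

In this toy case the instance accuracy is 4/5 while the class accuracy is (1 + 0)/2, showing how a rare class that is always misclassified is hidden by the instance metric but penalized by the class metric.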
Claims (1)
1. A three-dimensional object recognition method combining a view importance network and a self-attention mechanism, characterized in that:
step 1 comprises: projecting a three-dimensional object model from n viewing angles to obtain n rendered views $V = \{v_1, v_2, \ldots, v_n\}$ of the object, where $v_i$ is the i-th view of the object;
step 2 comprises: passing the rendered views $V = \{v_1, v_2, \ldots, v_n\}$ through a base CNN model to extract the initial visual feature maps $Z = \{z_1, z_2, \ldots, z_n\}$ of the n views, where $z_i$ is the initial visual feature map of the i-th view of the object, $z_i \in \mathbb{R}^{C \times H \times W}$, $Z \in \mathbb{R}^{n \times C \times H \times W}$, n is the number of views, C is the number of channels of each visual feature map, H is the height of each visual feature map, and W is the width of each visual feature map;
step 3 comprises: inputting the initial visual feature maps $Z = \{z_1, z_2, \ldots, z_n\}$ of the n views into the view importance network, which scores each view as in formula (1),
$$\mathrm{Score} = \mathrm{Softmax}\{f(z_1), f(z_2), \ldots, f(z_n)\}, \tag{1}$$
in formula (1), f denotes the network layer that scores view importance; the Softmax function ensures that the importance scores of all views sum to 1 and avoids large disparities between the scores; the initial feature map of each view is multiplied by its importance score and added to the initial feature map, as in formula (2),
$$p_i = z_i + \mathrm{Score}_i \cdot z_i, \tag{2}$$
in formula (2), $z_i$ is the initial visual feature map of the i-th view of the object, and $\mathrm{Score}_i$ is the score the view importance network assigns to the i-th view; applying formula (2) to every view yields the n view-enhanced feature maps $P = \{p_1, p_2, \ldots, p_n\}$ of the three-dimensional object, with $p_i \in \mathbb{R}^{C \times H \times W}$ and $P \in \mathbb{R}^{n \times C \times H \times W}$;
step 4 comprises the following substeps:
step 4-1: inputting the view-enhanced feature maps $P = \{p_1, p_2, \ldots, p_n\}$ into three separate convolutional networks to generate new feature mappings $P_q$, $P_k$ and $P_v$, with $P_q, P_k, P_v \in \mathbb{R}^{n \times C \times H \times W}$; transposing $P_k$ and matrix-multiplying it with $P_q$ to obtain the spatial association between the feature maps, as in formula (3),
in formula (3), $S$ denotes the similarity, $i$ and $m$ are view indices with $i, m \in [1, n]$, n is the number of views, and $L$ denotes all spatial positions in a single view feature map;
step 4-2: matrix-multiplying $S_{im}$ with $P_v$ to obtain the cross-view enhanced feature maps $A = \{a_1, a_2, \ldots, a_n\}$, with $a_i \in \mathbb{R}^{C \times H \times W}$ and $A \in \mathbb{R}^{n \times C \times H \times W}$; the self-attention mechanism breaks the locality of the features and achieves non-local feature enhancement across views;
step 5 comprises:
reducing the dimension of the cross-view enhanced feature maps $A = \{a_1, a_2, \ldots, a_n\}$ with a 1 × 1 convolution, which extracts features across views, and inputting the reduced features into a fully connected layer for classification, thereby recognizing the three-dimensional object.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210143670.3A CN114550162B (en) | 2022-02-16 | 2022-02-16 | Three-dimensional object recognition method combining view importance network and self-attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114550162A | 2022-05-27 |
CN114550162B | 2024-04-02 |
Family
ID=81675286
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210143670.3A Active CN114550162B (en) | 2022-02-16 | 2022-02-16 | Three-dimensional object recognition method combining view importance network and self-attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114550162B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN114550162B (en) | 2024-04-02 |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |