CN112487927A - Indoor scene recognition implementation method and system based on object associated attention - Google Patents
- Publication number
- CN112487927A (application CN202011344887.8A)
- Authority
- CN
- China
- Prior art keywords
- feature
- objects
- expression
- module
- input image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/35—Categorising the entire scene, e.g. birthday party or wedding scene
- G06V20/36—Indoor scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
The invention discloses a method and system for indoor scene recognition based on object-associated attention. The method comprises the following steps: extracting a semantic feature vector for each spatial position in an input image through a backbone network; forming the semantic feature vectors of all spatial positions into a feature map according to their spatial positions, and passing the feature map to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects; and calculating the feature vector of each object through an object feature aggregation module, which multiplies the feature vectors at all spatial positions by the probability that each position belongs to the object and takes a weighted average, thereby obtaining a feature vector expression for each object. Because different scenes contain different objects, the object feature aggregation module detects the features of all objects in the input image, so that the information contained in the image is expressed more fully.
Description
Technical Field
The invention relates to an intelligent recognition method and software system, and in particular to an improved method and system for indoor scene recognition based on object-associated attention features.
Background
In the prior art, the ability to perceive environmental information is indispensable for a robot: accurate perception of the surrounding scene helps the robot make correct judgments and take correct actions.
As technology and computing power have advanced, many deep-learning-based scene recognition algorithms have been proposed. Herranz et al. found that feature extraction needs to adapt to images at different scales, and fused multi-scale features obtained from models trained on different datasets to recognize scenes; see CVPR 2016, pages 571-579, "Scene Recognition with CNNs: Objects, Scales and Dataset Bias" (CVPR: IEEE Conference on Computer Vision and Pattern Recognition).
However, improvements to scene recognition based only on global picture information are limited, because such methods are not only hard to interpret semantically but are also easily disturbed by objects that appear across many scenes.
Therefore, some researchers attempt to implement scene recognition by combining contextual information with local object associations. López-Cifuentes et al. obtain context information through semantic segmentation to help resolve the ambiguity of common objects appearing in different scenes; see Pattern Recognition, vol. 102, article 107256, "Semantic-Aware Scene Recognition".
Wang et al. train PatchNet in a weakly supervised manner, use it to guide local feature extraction, and finally aggregate the local features based on semantic probabilities to realize scene recognition; see IEEE Transactions on Image Processing, vol. 26, pages 2028-2041, "Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition".
Meanwhile, many studies improve a model's scene understanding by combining multi-modal features. However, most prior-art indoor scene recognition methods combine manually designed features with global features, which not only requires a large amount of computation but also cannot effectively learn the relationships between objects, and therefore cannot recognize scenes accurately.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
The invention aims to provide a method and system for indoor scene recognition based on object-associated attention, offering fast and accurate recognition of object associations in view of the inaccurate overall recognition and redundant network structures of the prior art.
The technical scheme of the invention is as follows:
an indoor scene recognition implementation method based on object associated attention comprises the following steps:
A. extracting semantic feature vectors of each spatial position in an input image through a backbone network;
B. forming a feature map by the semantic feature vectors of all the spatial positions according to the spatial positions of the semantic feature vectors, and transmitting the feature map to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects;
C. calculating a feature vector of each object through an object feature aggregation module, multiplying the feature vectors of all spatial positions of each object by the probability that the spatial positions belong to the object, and performing weighted average to obtain feature vector expression of each object;
D. and splicing the feature vectors of all the objects to form the object feature expression of the input image.
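The aggregation in steps C and D can be sketched as follows. This is a minimal NumPy illustration, assuming spatial positions are flattened into one axis; the function and variable names are hypothetical, not taken from the patent:

```python
import numpy as np

def aggregate_object_features(F, S):
    """Steps C-D sketch: weighted-average spatial features into per-object vectors.

    F: (N, D) semantic feature vectors for N spatial positions.
    S: (N, J) probability that each position belongs to each of J objects.
    Returns (J, D): one feature vector per object; objects claiming no
    position (by argmax) remain all-zero vectors, as in the description.
    """
    N, D = F.shape
    J = S.shape[1]
    B = np.zeros_like(S)
    B[np.arange(N), S.argmax(axis=1)] = 1.0   # assign each position to its most likely object
    W = B * S                                  # masked probabilities used as weights
    denom = W.sum(axis=0)                      # (J,) total weight per object
    O = W.T @ F                                # (J, D) weighted sums of position features
    nonzero = denom > 0
    O[nonzero] /= denom[nonzero, None]         # weighted average where defined
    return O
```

Concatenating the rows of the returned matrix (step D) then gives the object feature expression of the whole image.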
The method for realizing indoor scene identification based on object associated attention, wherein the backbone network and the object feature aggregation module calculate the feature expressions of different objects from the hidden feature vectors at different spatial positions.
The method for realizing indoor scene recognition based on object-related attention further comprises, after the step D:
E. and inputting the object feature expression into a light weight object association attention module, wherein the light weight object association attention module is realized by adopting a neural network and is used for calculating the relation between the objects.
The method for realizing indoor scene recognition based on object-related attention, wherein the step E further comprises:
e1, calculating the relation characteristic vector expression of each object and all other objects based on the neural network and cosine similarity, and splicing the relation characteristic vector expression into the characteristic vector expression of the object.
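Step E1 can be sketched with cosine similarity as follows — a minimal NumPy illustration in which each object's pairwise similarities to all objects serve as its relation vector and are concatenated onto its own features; the function name and the exact projection are assumptions, not the patent's implementation:

```python
import numpy as np

def relation_augment(O, eps=1e-8):
    """Step E1 sketch: cosine-similarity relation vectors concatenated
    onto each object's own feature vector.

    O: (J, D) object feature vectors.
    Returns (J, D + J): [own features | cosine similarity to every object].
    """
    norms = np.linalg.norm(O, axis=1, keepdims=True) + eps
    U = O / norms                  # unit-normalized object features
    R = U @ U.T                    # (J, J) pairwise cosine similarities
    return np.concatenate([O, R], axis=1)
```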
The method for realizing indoor scene recognition based on object-related attention, further comprising, after the step E:
F. and inputting the feature vector expression and the relation feature vector expression of the objects into a global association aggregation module so as to aggregate the relation among all the objects and form a common feature expression vector of all the objects.
The method for realizing indoor scene recognition based on object-related attention, further comprising, after the step F:
G. and inputting the common characteristic expression vector to a classification identification module of a neural network full-connection layer for identifying the scene to which the input image belongs.
An indoor scene recognition implementation system based on object-associated attention, comprising:
a backbone network for extracting semantic feature vectors of each spatial position from the input image;
the segmentation module is used for forming the semantic feature vectors of all the spatial positions into a feature map according to their spatial positions and calculating the probability that each spatial position in the input image belongs to different objects;
an object feature aggregation module, configured to calculate a feature vector of each object, and multiply the feature vectors of all spatial positions of each object by a probability that the spatial position belongs to the object and perform weighted average, thereby obtaining a feature vector expression of each object;
and the object feature aggregation module is also used for splicing the feature vectors of all the objects to form the object feature expression of the input image.
The system for realizing indoor scene recognition based on object-related attention further comprises: and the light weight object association attention module calculates the relation characteristic vector expression of each object and all other objects based on the neural network and cosine similarity, and splices the relation characteristic vector expression into the characteristic vector expression of the object.
The system for realizing indoor scene recognition based on object-related attention further comprises: and the global association aggregation module is used for taking the feature vector expression of the object and the relation feature vector expression as input, and aggregating the relations among all the objects to form a common feature expression vector of all the objects.
The system for realizing indoor scene recognition based on object-related attention further comprises: and the classification identification module is used for inputting the common characteristic expression vector and identifying the scene to which the input image belongs.
According to the method and system for indoor scene recognition based on object-associated attention provided by the invention, since different scenes contain different objects, the object feature aggregation module detects the features of all objects in the input image, so that the information contained in the image is expressed more fully. Meanwhile, since coexisting objects are distributed differently in different scenes, the lightweight object-association attention module and the global association aggregation module learn and aggregate the relationships between objects, finally generating a common feature expression vector that allows the subsequent classification module to distinguish different scenes. The method is efficient and accurate, and is suitable for recognizing and judging different indoor scenes.
Drawings
Fig. 1 is a module-and-flow diagram of the method and system for indoor scene recognition based on object-associated attention according to a preferred embodiment of the present invention.
Fig. 2 is a schematic view illustrating an object feature aggregation module according to a preferred embodiment of the method and system for identifying an indoor scene based on object-related attention.
Fig. 3 is a schematic diagram illustrating an example of a lightweight object-related attention module according to a preferred embodiment of the method and system for identifying an indoor scene based on object-related attention.
Fig. 4 is a schematic diagram illustrating an exemplary global association aggregation module according to a preferred embodiment of the method and system for identifying an indoor scene based on object association attention.
Detailed Description
The following describes in detail preferred embodiments of the present invention.
In the method and system for indoor scene recognition based on object-associated attention of the invention, analysis during neural network recognition shows that coexisting objects are distributed differently in different scenes, so indoor scene recognition performance can be improved by learning object relationships. The invention therefore provides an object feature aggregation module that detects and extracts the features of all objects in a picture, learns the relationships between objects through the proposed lightweight object-association attention module, aggregates object features and object relationships through a global association aggregation module, and realizes indoor scene recognition through a fully connected layer. This approaches scene recognition from a new angle and is more effective than prior-art methods.
As shown in fig. 1, in the preferred embodiment of the method and system for recognizing an indoor scene based on object-related attention, semantic feature vectors for each spatial position are first extracted from an input image through a backbone network; the input image may be a still image obtained by a camera or one frame captured from a video. Then the semantic feature vectors of all spatial positions are formed into a feature map according to their spatial positions, and the feature map is passed to a segmentation module to calculate the probability that each spatial position of the input image belongs to different objects.
Based on the feature map calculated by the backbone network and the object attribution probability map calculated by the segmentation module, the newly proposed object feature aggregation module then calculates the feature vector of each object: the feature vectors at all spatial positions are multiplied by the probability that each position belongs to the object and weighted-averaged, giving the feature vector expression of each object. Finally, the feature vectors of all objects are concatenated to form the object feature expression of the picture.
The object feature expression is then input into the lightweight object-association attention module newly proposed by the invention to calculate the relationships between objects: the module computes a relation feature vector expression between each object and all other objects based on a neural network and cosine similarity, and concatenates it onto the object's feature vector expression, thereby enriching the object features.
The invention further inputs the object feature vector expressions and the object relation feature vector expressions into the newly proposed global association aggregation module, which aggregates the relations among all objects to form a common feature expression vector for all objects in the input image. Finally, this feature expression vector is input into a classification recognition module formed by a neural network fully connected layer to identify which scene the picture belongs to.
Specifically, after the camera acquires an input image of an indoor scene, object features are analyzed at all spatial positions of the image, so that the scene is judged according to all object features it contains. The judgment does not simply depend on local object features; it also considers the relationships among all objects in the input image, so indoor scenes such as a kitchen, bedroom, living room or dining room can be judged more accurately and effectively. Recognizing the relational features among objects prevents interference from object features that commonly appear across different scenes, so scenes are recognized more accurately.
Fig. 2 shows a preferred implementation of the object feature aggregation module in the method and system for indoor scene recognition based on object-related attention according to the invention. In order to effectively extract the object features in the input image, the invention proposes the implementation scheme of fig. 2. First, a spatial position feature map F and an object attribution probability map S are calculated from the input image by the scene-segmentation backbone network; then, for each object, the feature vectors of all spatial positions are weighted and summed with the object's attribution probabilities at the corresponding positions, yielding the feature vector expression O of the object.
Finally, the feature vector expressions of all objects are concatenated to obtain the object feature expression of the input image; objects not present in the image are represented by all-zero vectors, while objects that are present have distinct feature vectors, as in the example of fig. 2, where the final feature dimension is 1024×150×1.
The method for calculating the feature vector of each object in the object feature aggregation module is given by the formula below, wherein O_j represents the feature vector of object j, B_ij indicates whether the i-th pixel position belongs to object j with maximum probability, S_ij represents the probability that the i-th pixel position belongs to object j, and F_i represents the feature vector at the i-th pixel position. The feature vector expression of each object is determined by the following calculation formula:
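The formula itself appears only as an image in the original publication and did not survive extraction. The following is a reconstruction from the textual description (a weighted average of position features, masked by maximum-probability assignment), not the patent's own typesetting:

```latex
O_j = \frac{\sum_i B_{ij}\, S_{ij}\, F_i}{\sum_i B_{ij}\, S_{ij}},
\qquad
B_{ij} =
\begin{cases}
1, & j = \arg\max_k S_{ik} \\
0, & \text{otherwise}
\end{cases}
```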
therefore, the calculation of the expression of the object feature vector of each divided region of the input image can realize the judgment of different objects, but the coexistence relationship of the objects is difficult to express only through the object features.
Thus, the present invention further provides a lightweight object association attention module for calculating the coexistence relationship between objects, as shown in FIG. 3.
In order to effectively transfer object features from scene segmentation into scene recognition and to learn the latent relationships between objects, the invention further provides a lightweight object-association attention module. As shown in fig. 3, it is composed of one or more cascaded lightweight object-association attention blocks implemented as a neural network. Compared with existing methods, computing K and V from Q reduces the amount of computation by 50%, and the dimensions of Q, K, V and of the output features can all be controlled simultaneously by adjusting only the value of α. The relation expression of each object is obtained from K and V through matrix multiplication, and finally the object relations are concatenated with the original object features and output to the next module, so that both object relations and object features are expressed as feature vectors.
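The block structure described above can be sketched as follows. This is a hedged NumPy illustration: Q is projected from the input while K and V are projected from Q (the source of the claimed ~50% saving in projection cost), and α scales the hidden width. The weight names, the softmax form, and the scaling are assumptions, not details taken from the patent:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class LightweightAttentionBlock:
    """Sketch of a lightweight object-association attention block.

    K and V are derived from Q rather than from the input X, halving
    the projection cost; alpha controls the Q/K/V width.
    """
    def __init__(self, d_in, alpha=0.5, seed=0):
        rng = np.random.default_rng(seed)
        d = max(1, int(alpha * d_in))
        self.Wq = rng.standard_normal((d_in, d)) / np.sqrt(d_in)
        self.Wk = rng.standard_normal((d, d)) / np.sqrt(d)
        self.Wv = rng.standard_normal((d, d)) / np.sqrt(d)

    def __call__(self, X):                       # X: (J, d_in) object features
        Q = X @ self.Wq                          # (J, d) one query per object
        K = Q @ self.Wk                          # derived from Q, not from X
        V = Q @ self.Wv
        A = softmax(Q @ K.T / np.sqrt(K.shape[1]))  # (J, J) object relations
        R = A @ V                                # relation expression per object
        return np.concatenate([X, R], axis=1)    # splice relations onto features
```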
In order to aggregate object features and relations into a hidden vector expression with as few parameters and as little computation as possible (excessive parameters and computation lower processing efficiency and make it hard to extract key information), the invention provides a global association aggregation module, shown in fig. 4. The module adopts strip-shaped depthwise convolution; compared with the block convolution of traditional depthwise convolution, strip depthwise convolution can model object features that have no positional relationship. The module first aggregates the information of all objects in each channel with a 150×1 strip depthwise convolution applied per channel.
At this point, however, no information has flowed between channels, so a 1×1 point convolution is used to aggregate information across all channels and generate a high-level semantic feature expression vector representing the scene. Finally, this scene expression vector is passed to a standard fully connected layer, i.e., the classification recognition module, to obtain the final scene recognition result for the input image.
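The two convolutions above reduce to simple matrix operations when each strip kernel spans the whole object axis. A minimal NumPy sketch, with all shapes and names assumed for illustration:

```python
import numpy as np

def global_association_aggregate(X, w_depth, w_point):
    """Sketch of the global association aggregation module.

    X:       (C, J) feature map (C channels, J object slots, e.g. J=150).
    w_depth: (C, J) one J-long strip kernel per channel; a depthwise Jx1
             conv covering the whole strip is a per-channel weighted sum.
    w_point: (C_out, C) 1x1 pointwise conv mixing information across channels.
    Returns  (C_out,) scene expression vector fed to the classifier.
    """
    per_channel = (X * w_depth).sum(axis=1)   # strip depthwise conv over the object axis
    return w_point @ per_channel              # 1x1 pointwise conv across channels
```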
In the method and the system for realizing the indoor scene recognition based on the object associated attention, a brand-new object feature aggregation method is adopted in an object feature aggregation module, namely, a space position feature map F and an object attribution probability map S are calculated based on a scene segmentation algorithm, and then the feature vectors of all space positions of each object and the object attribution probabilities of the corresponding space positions are weighted and summed to obtain a feature vector expression O of the object. Finally, the feature vectors of all the objects are spliced to obtain the object feature expression of the image.
Secondly, the invention further provides an object-association attention module with a brand-new lightweight network structure. Compared with a traditional attention module, the lightweight object-association attention module learns object relationships with less computation, and the number of output feature channels can be controlled arbitrarily.
The method and system for indoor scene recognition based on object-associated attention further provide a global association aggregation module that adopts strip-shaped depthwise convolution. Compared with the block convolution of traditional depthwise convolution, the strip depthwise convolution aggregates the features and relations of all objects: even though its input carries no spatial position information, it can still be aggregated into the final feature vector expression of all objects.
By adopting the object feature aggregation and lightweight association attention modules, the method and system for indoor scene recognition based on object-associated attention achieve computational efficiency and accuracy that meet practical requirements, and facilitate the recognition and judgment of indoor scenes from input images.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.
Claims (10)
1. An indoor scene recognition implementation method based on object associated attention comprises the following steps:
A. extracting semantic feature vectors of each spatial position in an input image through a backbone network;
B. forming a feature map by the semantic feature vectors of all the spatial positions according to the spatial positions of the semantic feature vectors, and transmitting the feature map to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects;
C. calculating a feature vector of each object through an object feature aggregation module, multiplying the feature vectors of all spatial positions of each object by the probability that the spatial positions belong to the object, and performing weighted average to obtain feature vector expression of each object;
D. and splicing the feature vectors of all the objects to form the object feature expression of the input image.
2. The method of claim 1, wherein the backbone network and the object feature aggregation module are configured to compute the feature expressions of different objects based on the hidden feature vectors at different spatial positions.
3. The method for realizing object-related attention-based indoor scene recognition according to claim 2, further comprising, after the step D:
E. and inputting the object feature expression into a light weight object association attention module, wherein the light weight object association attention module is realized by adopting a neural network and is used for calculating the relation between the objects.
4. The method for realizing indoor scene recognition based on object-related attention according to claim 3, wherein the step E further comprises:
e1, calculating the relation characteristic vector expression of each object and all other objects based on the neural network and cosine similarity, and splicing the relation characteristic vector expression into the characteristic vector expression of the object.
5. The method for realizing indoor scene recognition based on object-related attention according to claim 4, wherein the step E is further followed by:
F. and inputting the feature vector expression and the relation feature vector expression of the objects into a global association aggregation module so as to aggregate the relation among all the objects and form a common feature expression vector of all the objects.
6. The method for realizing object-related attention-based indoor scene recognition according to claim 5, wherein the step F is further followed by:
G. and inputting the common characteristic expression vector to a classification identification module of a neural network full-connection layer for identifying the scene to which the input image belongs.
7. An indoor scene recognition implementation system based on object-associated attention, comprising:
a backbone network for extracting semantic feature vectors of each spatial position from the input image;
the segmentation module is used for forming the semantic feature vectors of all the spatial positions into a feature map according to their spatial positions and calculating the probability that each spatial position in the input image belongs to different objects;
an object feature aggregation module, configured to calculate a feature vector of each object, and multiply the feature vectors of all spatial positions of each object by the probability that the spatial positions belong to the object and perform weighted average, thereby obtaining a feature vector expression of each object;
and the object feature aggregation module is also used for splicing the feature vectors of all the objects to form the object feature expression of the input image.
8. The system of claim 7, further comprising: and the light weight object association attention module calculates the relation characteristic vector expression of each object and all other objects based on the neural network and cosine similarity, and splices the relation characteristic vector expression into the characteristic vector expression of the object.
9. The system of claim 8, further comprising: and the global association aggregation module takes the feature vector expression of the objects and the relation feature vector expression as input, and aggregates the relation among all the objects to form a common feature expression vector of all the objects.
10. The system of claim 9, further comprising: and the classification identification module is used for inputting the common characteristic expression vector and identifying the scene to which the input image belongs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011344887.8A CN112487927B (en) | 2020-11-26 | 2020-11-26 | Method and system for realizing indoor scene recognition based on object associated attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112487927A true CN112487927A (en) | 2021-03-12 |
CN112487927B CN112487927B (en) | 2024-02-13 |
Family
ID=74934952
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011344887.8A Active CN112487927B (en) | 2020-11-26 | 2020-11-26 | Method and system for realizing indoor scene recognition based on object associated attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112487927B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113470048A (en) * | 2021-07-06 | 2021-10-01 | 北京深睿博联科技有限责任公司 | Scene segmentation method, device, equipment and computer readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110084128A (en) * | 2019-03-29 | 2019-08-02 | 安徽艾睿思智能科技有限公司 | Scene chart generation method based on semantic space constraint and attention mechanism |
CN110245665A (en) * | 2019-05-13 | 2019-09-17 | 天津大学 | Image, semantic dividing method based on attention mechanism |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
Similar Documents
Publication | Title |
---|---|
CN108133188B (en) | Behavior identification method based on motion history image and convolutional neural network |
CN107563372B (en) | License plate positioning method based on deep learning SSD frame |
WO2020228446A1 (en) | Model training method and apparatus, and terminal and storage medium |
CN108960141B (en) | Pedestrian re-identification method based on enhanced deep convolutional neural network |
CN109558823B (en) | Vehicle identification method and system for searching images by images |
CN111709311B (en) | Pedestrian re-identification method based on multi-scale convolution feature fusion |
US20210342643A1 (en) | Method, apparatus, and electronic device for training place recognition model |
CN108537147B (en) | Gesture recognition method based on deep learning |
CN110717411A (en) | Pedestrian re-identification method based on deep layer feature fusion |
CN111639564B (en) | Video pedestrian re-identification method based on multi-attention heterogeneous network |
Xia et al. | Loop closure detection for visual SLAM using PCANet features |
Kang et al. | Deep learning-based weather image recognition |
CN110390308B (en) | Video behavior identification method based on space-time confrontation generation network |
CN114463677B (en) | Safety helmet wearing detection method based on global attention |
WO2021243947A1 (en) | Object re-identification method and apparatus, and terminal and storage medium |
CN117252904B (en) | Target tracking method and system based on long-range space perception and channel enhancement |
CN113312973A (en) | Method and system for extracting features of gesture recognition key points |
CN114333062B (en) | Pedestrian re-recognition model training method based on heterogeneous dual networks and feature consistency |
Alsanad et al. | Real-time fuel truck detection algorithm based on deep convolutional neural network |
CN114519863A (en) | Human body re-identification method, human body re-identification apparatus, computer device, and medium |
CN106650814B (en) | Outdoor road self-adaptive classifier generation method based on vehicle-mounted monocular vision |
CN112487927B (en) | Method and system for realizing indoor scene recognition based on object associated attention |
CN116824641A (en) | Gesture classification method, device, equipment and computer storage medium |
CN116310128A (en) | Dynamic environment monocular multi-object SLAM method based on instance segmentation and three-dimensional reconstruction |
CN111428567A (en) | Pedestrian tracking system and method based on affine multi-task regression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||