CN112487927A - Indoor scene recognition implementation method and system based on object associated attention


Info

Publication number
CN112487927A
Authority
CN
China
Prior art keywords
feature
objects
expression
module
input image
Prior art date
Legal status
Granted
Application number
CN202011344887.8A
Other languages
Chinese (zh)
Other versions
CN112487927B (en)
Inventor
苗博
周立广
林天麟
徐扬生
Current Assignee
Shenzhen Institute of Artificial Intelligence and Robotics
Chinese University of Hong Kong CUHK
Original Assignee
Shenzhen Institute of Artificial Intelligence and Robotics
Chinese University of Hong Kong CUHK
Priority date
Filing date
Publication date
Application filed by Shenzhen Institute of Artificial Intelligence and Robotics and The Chinese University of Hong Kong (CUHK)
Priority to CN202011344887.8A
Publication of CN112487927A
Application granted
Publication of CN112487927B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/35Categorising the entire scene, e.g. birthday party or wedding scene
    • G06V20/36Indoor scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and system for indoor scene recognition based on object-associated attention. The method comprises the following steps: extracting a semantic feature vector for each spatial position in an input image through a backbone network; assembling the semantic feature vectors of all spatial positions into a feature map according to their spatial positions, and transmitting the feature map to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects; and calculating a feature vector for each object through an object feature aggregation module, which multiplies the feature vectors at all spatial positions of each object by the probability that those positions belong to the object and performs a weighted average, thereby obtaining the feature vector expression of each object. Because different scenes contain different objects, the method and system use the object feature aggregation module to detect the features of all objects in the input image, thereby better expressing the information contained in the image.

Description

Indoor scene recognition implementation method and system based on object associated attention
Technical Field
The invention relates to an intelligent recognition method and software system, and in particular to an improved method and system for indoor scene recognition based on object-associated attention features.
Background
The ability to perceive environmental information is indispensable for a robot: accurate perception of the surrounding scene helps the robot make correct judgments and take correct actions.
As technology and computing power have advanced, a number of deep-learning-based scene recognition algorithms have been proposed. Herranz et al. found that feature extraction needs to adapt to images at different scales, and performed multi-scale fusion of features obtained from models trained on different datasets to recognize scenes; see Scene Recognition with CNNs: Objects, Scales and Dataset Bias, CVPR 2016, pages 571-579 (CVPR: IEEE Conference on Computer Vision and Pattern Recognition).
However, the improvement achievable from global image information alone is limited: such methods are not only hard to interpret semantically, but are also easily disturbed by common objects that appear across scenes.
Therefore, some researchers have attempted to realize scene recognition by combining contextual information with local object associations. López-Cifuentes et al. obtain context information through semantic segmentation to help eliminate the ambiguity caused by objects common to different scenes; see Semantic-Aware Scene Recognition, Pattern Recognition, vol. 102, article 107256.
Wang et al. train PatchNets in a weakly supervised manner, use them to guide local feature extraction, and finally aggregate the local features according to semantic probabilities to realize scene recognition; see Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition, IEEE Transactions on Image Processing, vol. 26, pages 2028-2041.
Meanwhile, many studies improve a model's scene understanding by combining multi-modal features. However, most prior-art indoor scene recognition methods combine manually designed features with global features; this is not only computationally expensive, it also fails to learn the relationships between objects effectively, and so cannot recognize scenes accurately.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
The invention aims to provide a method and system for indoor scene recognition based on object-associated attention: a fast and accurate object-association recognition scheme that addresses the inaccurate overall recognition and redundant network structures of the prior art.
The technical scheme of the invention is as follows:
an indoor scene recognition implementation method based on object associated attention comprises the following steps:
A. extracting a semantic feature vector for each spatial position in an input image through a backbone network;
B. assembling the semantic feature vectors of all spatial positions into a feature map according to their spatial positions, and transmitting the feature map to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects;
C. calculating a feature vector for each object through an object feature aggregation module, which multiplies the feature vectors at all spatial positions of each object by the probability that those positions belong to the object and performs a weighted average, thereby obtaining the feature vector expression of each object;
D. splicing the feature vectors of all the objects to form the object feature expression of the input image.
In the method for realizing indoor scene recognition based on object-related attention, the backbone network and the object feature aggregation module calculate the feature expressions of different objects from the hidden feature vectors at different spatial positions.
The method for realizing indoor scene recognition based on object-related attention further comprises, after the step D:
E. inputting the object feature expression into a lightweight object-associated attention module, wherein the lightweight object-associated attention module is implemented as a neural network and is used for calculating the relations between the objects.
In the method for realizing indoor scene recognition based on object-related attention, the step E comprises:
E1. calculating the relation feature vector expression between each object and all other objects based on the neural network and cosine similarity, and splicing the relation feature vector expression into the feature vector expression of the object.
The method for realizing indoor scene recognition based on object-related attention further comprises, after the step E:
F. inputting the feature vector expressions and the relation feature vector expressions of the objects into a global association aggregation module, so as to aggregate the relations among all the objects and form a common feature expression vector of all the objects.
The method for realizing indoor scene recognition based on object-related attention further comprises, after the step F:
G. inputting the common feature expression vector into a classification module formed by a neural network fully connected layer to identify the scene to which the input image belongs.
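For orientation, the following condensed, self-contained PyTorch sketch chains steps A-G end to end. Everything in it is an illustrative stand-in: the modules are single convolutions and linear layers, all dimensions (1024-dimensional features, 150 object classes, 10 scene classes) are assumed rather than fixed by the invention, and the aggregation in steps C-D uses the membership probabilities alone; the exact weighting is given in the detailed description below.

```python
# Condensed stand-in pipeline for steps A-G (all dimensions assumed).
import torch
import torch.nn as nn

C, K, SCENES = 1024, 150, 10          # feature dim, object classes, scene classes

backbone = nn.Conv2d(3, C, 3, stride=4, padding=1)   # step A: per-position semantic features
seg_head = nn.Conv2d(C, K, 1)                        # step B: segmentation module
relation = nn.Linear(C, C)                           # step E: stand-in attention module
agg_fc   = nn.Linear(2 * C, C)                       # step F: stand-in global aggregation
classify = nn.Linear(C, SCENES)                      # step G: classification module

img = torch.randn(1, 3, 224, 224)
feat = backbone(img)                                 # (1, C, 56, 56) feature map
prob = seg_head(feat).softmax(dim=1)                 # (1, K, 56, 56) object probabilities
# steps C-D: probability-weighted average of position features, one vector per object
w = prob.flatten(2)                                  # (1, K, N)
obj = torch.einsum('bkn,bcn->bkc', w, feat.flatten(2)) / (w.sum(-1, keepdim=True) + 1e-6)
rel = relation(obj)                                  # step E: relation features
common = agg_fc(torch.cat([obj, rel], -1)).mean(1)   # step F: common feature vector
logits = classify(common)                            # step G: scene scores
print(logits.shape)                                  # torch.Size([1, 10])
```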
An indoor scene recognition implementation system based on object-associated attention, comprising:
a backbone network for extracting semantic feature vectors of each spatial position from the input image;
a segmentation module for assembling the semantic feature vectors of all the spatial positions into a feature map according to their spatial positions, and for calculating the probability that each spatial position in the input image belongs to different objects;
an object feature aggregation module, configured to calculate a feature vector for each object by multiplying the feature vectors at all spatial positions of each object by the probability that those positions belong to the object and performing a weighted average, thereby obtaining the feature vector expression of each object;
and the object feature aggregation module is also used for splicing the feature vectors of all the objects to form the object feature expression of the input image.
The system for realizing indoor scene recognition based on object-related attention further comprises: a lightweight object-associated attention module, which calculates the relation feature vector expression between each object and all other objects based on a neural network and cosine similarity, and splices the relation feature vector expression into the feature vector expression of the object.
The system for realizing indoor scene recognition based on object-related attention further comprises: a global association aggregation module, which takes the feature vector expressions and relation feature vector expressions of the objects as input, and aggregates the relations among all the objects to form a common feature expression vector of all the objects.
The system for realizing indoor scene recognition based on object-related attention further comprises: a classification module, which takes the common feature expression vector as input and identifies the scene to which the input image belongs.
According to the method and system for indoor scene recognition based on object-associated attention provided by the invention, because different scenes contain different objects, the object feature aggregation module detects the features of all objects in the input image so as to better express the information the image contains. Meanwhile, because coexisting objects are distributed differently in different scenes, the lightweight object-associated attention module and the global association aggregation module learn and aggregate the relations between objects, finally generating a common feature expression vector that allows the subsequent classification module to distinguish different scenes. The method is efficient, recognizes accurately, and is suitable for recognizing and judging different indoor scenes.
Drawings
Fig. 1 is a module and flow diagram of the method and system for indoor scene recognition based on object-associated attention according to a preferred embodiment of the present invention.
Fig. 2 is a schematic diagram of the object feature aggregation module in a preferred embodiment of the method and system for indoor scene recognition based on object-associated attention.
Fig. 3 is a schematic diagram of the lightweight object-associated attention module in a preferred embodiment of the method and system for indoor scene recognition based on object-associated attention.
Fig. 4 is a schematic diagram of the global association aggregation module in a preferred embodiment of the method and system for indoor scene recognition based on object-associated attention.
Detailed Description
The following describes in detail preferred embodiments of the present invention.
In the method and system for indoor scene recognition based on object-associated attention, analysis of the neural-network recognition process shows that the distribution of coexisting objects differs between scenes, so indoor scene recognition performance can be improved by learning object relationships. The invention therefore provides an object feature aggregation module that detects and extracts the features of all objects in a picture, learns the relationships between objects through the proposed lightweight object-associated attention module, aggregates the object features and object relationships through a global association aggregation module, and realizes indoor scene recognition through a fully connected layer. This realizes scene recognition from a brand-new angle and is more effective than prior-art methods.
As shown in fig. 1, in the preferred embodiment of the method and system for indoor scene recognition based on object-associated attention, an input image is first obtained; it may be a still image captured by a camera or a single frame extracted from a video. The backbone network extracts a semantic feature vector for each spatial position of the input image, the feature vectors of all spatial positions are assembled into a feature map according to their positions, and the feature map is transmitted to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects.
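By way of illustration, a minimal PyTorch sketch of this first stage might look as follows; the truncated ResNet-50 backbone, 1024-channel features and 150 object classes are assumptions chosen to match the dimensions mentioned later in this description, not details fixed by the invention.

```python
# Minimal sketch: backbone feature map F plus segmentation probability map S.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class FeatureAndSegmentation(nn.Module):
    def __init__(self, feat_dim=1024, num_objects=150):
        super().__init__()
        base = resnet50(weights=None)
        # keep ResNet-50 up to its 1024-channel stage as the backbone (assumption)
        self.backbone = nn.Sequential(*list(base.children())[:-3])
        self.seg_head = nn.Conv2d(feat_dim, num_objects, kernel_size=1)

    def forward(self, img):
        feat = self.backbone(img)              # (B, 1024, H/16, W/16): per-position features
        prob = self.seg_head(feat).softmax(1)  # (B, 150, H/16, W/16): object probabilities
        return feat, prob

F, S = FeatureAndSegmentation()(torch.randn(1, 3, 224, 224))
print(F.shape, S.shape)  # torch.Size([1, 1024, 14, 14]) torch.Size([1, 150, 14, 14])
```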
Based on the feature map computed by the backbone network and the object-membership probability map computed by the segmentation module, the newly proposed object feature aggregation module then calculates the feature vector of each object: the feature vectors at all spatial positions of each object are multiplied by the probability that those positions belong to the object and averaged with those probabilities as weights, yielding the feature vector expression of each object. Finally, the feature vectors of all the objects are spliced to form the object feature expression of the picture.
The object feature expression is then input into the lightweight object-associated attention module newly proposed by the invention to calculate the relationships between objects: based on a neural network and cosine similarity, the module computes the relation feature vector expression between each object and all other objects and splices it onto the object's feature vector expression, thereby enriching the object features.
The invention further inputs the object feature vector expressions and the object relation feature vector expressions into a newly proposed global association aggregation module, which aggregates the relations among all objects into a common feature expression vector for all objects in the input image. Finally, this feature expression vector is input into a classification module formed by a neural network fully connected layer to identify which scene the picture belongs to.
Specifically, after a camera acquires an input image of an indoor scene, object features are analyzed at all spatial positions of the input image so that the scene can be judged from all object features the image contains. The judgment does not rely simply on local object features; it also considers the relationships among all objects in the input image, so that indoor scenes such as a kitchen, bedroom, living room or dining room can be distinguished more accurately and effectively. Recognizing the relation features among objects prevents interference from object features that commonly appear across different scenes, making scene recognition more accurate.
Fig. 2 shows a preferred implementation of the object feature aggregation module in the method and system for indoor scene recognition based on object-associated attention. To extract the object features in the input image effectively, the invention proposes the scheme of fig. 2: first, a spatial-position feature map F and an object-membership probability map S are computed from the input image by the scene-segmentation backbone network; then, for each object, the feature vectors at all spatial positions are summed, weighted by the membership probability of that object at the corresponding positions, to obtain the feature vector expression O of the object.
Finally, the feature vector expressions of all objects are spliced to obtain the object feature expression of the input image: an absent object is represented by an all-zero vector, while each present object has its own feature vector, as illustrated in fig. 2; the final feature dimension is 1024×150×1.
The object feature aggregation module computes the feature vector of each object by the formula below, where O_j denotes the feature vector of object j, B_ij indicates whether object j has the maximum probability at the i-th pixel position, S_ij is the probability that the i-th pixel position belongs to object j, and F_i is the feature vector at the i-th pixel position. The feature vector expression of each object is determined as:
$$O_j = \frac{\sum_i B_{ij}\, S_{ij}\, F_i}{\sum_i B_{ij}\, S_{ij}}$$
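As a concrete illustration of this formula, the following PyTorch sketch computes O from the feature map F and probability map S (shapes as in the backbone sketch above); the small epsilon is an added assumption so that an absent object yields an all-zero vector instead of a division by zero.

```python
# Sketch of the aggregation: O_j = sum_i(B_ij*S_ij*F_i) / sum_i(B_ij*S_ij).
import torch

def aggregate_object_features(F, S, eps=1e-6):
    """F: (B, C, H, W) position features; S: (B, K, H, W) object probabilities."""
    K = S.shape[1]
    Fi = F.flatten(2)                                   # (B, C, N) with N = H*W
    Si = S.flatten(2)                                   # (B, K, N)
    # B_ij = 1 where object j has the maximum probability at position i
    Bij = (Si.argmax(1, keepdim=True) ==
           torch.arange(K, device=S.device).view(1, K, 1)).float()
    w = Bij * Si                                        # weights B_ij * S_ij
    num = torch.einsum('bkn,bcn->bkc', w, Fi)           # sum_i B_ij S_ij F_i
    den = w.sum(2, keepdim=True)                        # sum_i B_ij S_ij
    return num / (den + eps)                            # (B, K, C); absent objects stay zero

O = aggregate_object_features(torch.randn(1, 1024, 14, 14),
                              torch.rand(1, 150, 14, 14).softmax(1))
print(O.shape)  # torch.Size([1, 150, 1024])
```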
therefore, the calculation of the expression of the object feature vector of each divided region of the input image can realize the judgment of different objects, but the coexistence relationship of the objects is difficult to express only through the object features.
Thus, the present invention further provides a lightweight object-associated attention module for calculating the coexistence relationships between objects, as shown in fig. 3.
To effectively transfer the object features from scene segmentation into scene recognition and to learn the latent relationships between objects, the invention further provides a lightweight object-associated attention module. As shown in fig. 3, the module is a cascade of one or more lightweight object-associated attention blocks implemented as a neural network. Compared with existing methods, computing K and V from Q reduces the amount of computation by 50%, and the dimensions of Q, K, V and the output features can all be controlled simply by adjusting the value of α. The relation expression of each object is obtained from K and V through matrix multiplication; finally, the object relations are spliced with the original object features and output to the next module, so that both the object relations and the object features are expressed as feature vectors.
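The following is a speculative sketch of one such block. The description fixes only that K and V are derived from Q, that object relations use cosine similarity combined by matrix multiplication, that α scales the Q, K, V and output widths, and that the relation features are spliced onto the original features; the single shared projection and the softmax normalization are illustrative choices of this sketch, not the patented structure.

```python
# Speculative sketch of a lightweight object-association attention block.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightweightObjectAttention(nn.Module):
    def __init__(self, feat_dim=1024, alpha=0.25):
        super().__init__()
        d = int(feat_dim * alpha)             # alpha controls Q/K/V and output width
        self.to_q = nn.Linear(feat_dim, d)
        self.q_to_kv = nn.Linear(d, 2 * d)    # K and V are computed from Q, not the input

    def forward(self, O):
        """O: (B, K_objects, C) object features -> (B, K_objects, C + d)."""
        q = self.to_q(O)
        k, v = self.q_to_kv(q).chunk(2, dim=-1)
        # cosine similarity between every pair of objects
        sim = F.normalize(q, dim=-1) @ F.normalize(k, dim=-1).transpose(1, 2)
        rel = sim.softmax(dim=-1) @ v         # relation expression of each object
        return torch.cat([O, rel], dim=-1)    # splice relations onto original features

out = LightweightObjectAttention()(torch.randn(1, 150, 1024))
print(out.shape)  # torch.Size([1, 150, 1280])
```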
In order to aggregate the object features and relations into a hidden vector expression with as few parameters and as little computation as possible (excessive parameters and computation reduce processing efficiency and make key information hard to extract), the invention provides a global association aggregation module, shown in fig. 4. The module uses strip-shaped depthwise convolution; unlike the block convolution of conventional depthwise convolution, strip depthwise convolution can model object features that have no positional relationship. The module first aggregates the information of all objects in each channel with a 150×1 strip depthwise convolution applied to each channel of the feature.
However, at this point no information flows between channels, so a 1×1 pointwise convolution is used to aggregate information across all channels and generate a high-level semantic feature expression vector describing the scene. Finally, this scene expression vector, i.e. the aggregated object feature vector expression, is passed to a standard fully connected layer, i.e. the classification module, to obtain the final scene recognition result for the input image or picture.
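A minimal sketch of this module, assuming the object features arrive as a (batch, objects, channels) tensor matching the attention sketch above; the channel width and the scene count here are placeholder values.

```python
# Sketch: 150x1 strip depthwise conv -> 1x1 pointwise conv -> fully connected classifier.
import torch
import torch.nn as nn

class GlobalAssociationAggregation(nn.Module):
    def __init__(self, num_objects=150, channels=1280, num_scenes=67):
        super().__init__()
        # one (num_objects x 1) kernel per channel: aggregates all objects in a channel
        # without assuming any spatial relationship between them
        self.strip_dw = nn.Conv2d(channels, channels,
                                  kernel_size=(num_objects, 1), groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)  # mix channels
        self.classifier = nn.Linear(channels, num_scenes)              # scene logits

    def forward(self, O):
        x = O.transpose(1, 2).unsqueeze(-1)   # (B, C, K, 1): objects laid out as a strip
        x = self.strip_dw(x)                  # (B, C, 1, 1): per-channel aggregation
        x = self.pointwise(x)                 # (B, C, 1, 1): cross-channel aggregation
        return self.classifier(x.flatten(1))  # (B, num_scenes)

logits = GlobalAssociationAggregation()(torch.randn(1, 150, 1280))
print(logits.shape)  # torch.Size([1, 67])
```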
In the method and system for indoor scene recognition based on object-associated attention, the object feature aggregation module adopts a brand-new aggregation scheme: a spatial-position feature map F and an object-membership probability map S are computed by a scene segmentation algorithm, and the feature vectors at all spatial positions of each object are summed, weighted by the object-membership probabilities at the corresponding positions, to obtain the feature vector expression O of the object. Finally, the feature vectors of all the objects are spliced to obtain the object feature expression of the image.
Secondly, the invention provides an object-associated attention module with a brand-new lightweight network structure. Compared with a conventional attention module, the lightweight object-associated attention module learns the object relationships with less computation, and the number of output feature channels can be controlled freely.
The method and system for indoor scene recognition based on object-associated attention further provide the global association aggregation module, which adopts strip depthwise convolution. Unlike the block convolution of conventional depthwise convolution, strip depthwise convolution aggregates the features and relations of all objects; even when its input carries no spatial position information, it can aggregate the input into the final feature vector expression of all objects.
By adopting the object feature aggregation and lightweight associated-attention modules described above, the method and system for indoor scene recognition based on object-associated attention achieve computational efficiency and accuracy that meet practical requirements and simplify the recognition and judgment of indoor scenes from an input image.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (10)

1. An indoor scene recognition implementation method based on object associated attention comprises the following steps:
A. extracting a semantic feature vector for each spatial position in an input image through a backbone network;
B. assembling the semantic feature vectors of all spatial positions into a feature map according to their spatial positions, and transmitting the feature map to a segmentation module to calculate the probability that each spatial position in the input image belongs to different objects;
C. calculating a feature vector for each object through an object feature aggregation module, which multiplies the feature vectors at all spatial positions of each object by the probability that those positions belong to the object and performs a weighted average, thereby obtaining the feature vector expression of each object;
D. splicing the feature vectors of all the objects to form the object feature expression of the input image.
2. The method of claim 1, wherein the backbone network and the object feature aggregation module are configured to compute the feature expressions of different objects based on the hidden feature vectors at different spatial positions.
3. The method for realizing object-related attention-based indoor scene recognition according to claim 2, further comprising, after the step D:
E. inputting the object feature expression into a lightweight object-associated attention module, wherein the lightweight object-associated attention module is implemented as a neural network and is used for calculating the relations between the objects.
4. The method for realizing indoor scene recognition based on object-related attention according to claim 3, wherein the step E further comprises:
E1. calculating the relation feature vector expression between each object and all other objects based on the neural network and cosine similarity, and splicing the relation feature vector expression into the feature vector expression of the object.
5. The method for realizing indoor scene recognition based on object-related attention according to claim 4, wherein the step E is further followed by:
F. inputting the feature vector expressions and the relation feature vector expressions of the objects into a global association aggregation module, so as to aggregate the relations among all the objects and form a common feature expression vector of all the objects.
6. The method for realizing object-related attention-based indoor scene recognition according to claim 5, wherein the step F is further followed by:
G. inputting the common feature expression vector into a classification module formed by a neural network fully connected layer to identify the scene to which the input image belongs.
7. An indoor scene recognition implementation system based on object-associated attention, comprising:
a backbone network for extracting semantic feature vectors of each spatial position from the input image;
a segmentation module for assembling the semantic feature vectors of all the spatial positions into a feature map according to their spatial positions, and for calculating the probability that each spatial position in the input image belongs to different objects;
an object feature aggregation module, configured to calculate a feature vector for each object by multiplying the feature vectors at all spatial positions of each object by the probability that those positions belong to the object and performing a weighted average, thereby obtaining the feature vector expression of each object;
and the object feature aggregation module is also used for splicing the feature vectors of all the objects to form the object feature expression of the input image.
8. The system of claim 7, further comprising: a lightweight object-associated attention module, which calculates the relation feature vector expression between each object and all other objects based on a neural network and cosine similarity, and splices the relation feature vector expression into the feature vector expression of the object.
9. The system of claim 8, further comprising: a global association aggregation module, which takes the feature vector expressions and relation feature vector expressions of the objects as input, and aggregates the relations among all the objects to form a common feature expression vector of all the objects.
10. The system of claim 9, further comprising: a classification module, which takes the common feature expression vector as input and identifies the scene to which the input image belongs.
CN202011344887.8A 2020-11-26 2020-11-26 Method and system for realizing indoor scene recognition based on object associated attention Active CN112487927B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011344887.8A CN112487927B (en) 2020-11-26 2020-11-26 Method and system for realizing indoor scene recognition based on object associated attention


Publications (2)

Publication Number Publication Date
CN112487927A (en) 2021-03-12
CN112487927B (en) 2024-02-13

Family

ID=74934952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011344887.8A Active CN112487927B (en) 2020-11-26 2020-11-26 Method and system for realizing indoor scene recognition based on object associated attention

Country Status (1)

Country Link
CN (1) CN112487927B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084128A (en) * 2019-03-29 2019-08-02 安徽艾睿思智能科技有限公司 Scene chart generation method based on semantic space constraint and attention mechanism
CN110245665A (en) * 2019-05-13 2019-09-17 天津大学 Image, semantic dividing method based on attention mechanism
CN111932553A (en) * 2020-07-27 2020-11-13 北京航空航天大学 Remote sensing image semantic segmentation method based on area description self-attention mechanism

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113470048A (en) * 2021-07-06 2021-10-01 北京深睿博联科技有限责任公司 Scene segmentation method, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN112487927B (en) 2024-02-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant