CN108921051B - Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model - Google Patents
Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model
- Publication number
- CN108921051B (application CN201810616398.XA)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- attribute
- neural network
- attributes
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Abstract
Description
Technical Field
The present invention belongs to the technical field of neural networks and image recognition, and in particular relates to a pedestrian attribute recognition network and technology based on a recurrent neural network attention model.
Background Art
Pedestrian attribute recognition technology can automate the task of searching for specific people in massive image and video collections. However, the low image quality of surveillance video, together with the small size and scarcity of annotated pedestrian-attribute datasets, makes attribute recognition from surveillance images considerably harder. Existing deep-neural-network approaches to pedestrian attribute recognition fall into two categories: pure convolutional neural network (CNN) methods, and combined convolutional and recurrent methods (CNN-RNN). Existing CNN methods such as DeepMAR try to recognize each pedestrian attribute in isolation from the features of the whole image; although this achieves reasonable results, it ignores both the spatial locality of pedestrian attributes and the correlations between attributes, which limits recognition accuracy. Existing CNN-RNN methods such as JRL use a recurrent neural network to progressively mine the semantic correlations between attributes (for example, a person wearing a skirt is usually a woman) and improve accuracy over pure CNN methods. However, these methods consider only the semantic links between attributes and still ignore their spatial locality.
Many pedestrian attributes are determined by a single region of the image: whether a person wears glasses or has long hair depends only on the visual features of the head region, and other regions contribute little. If this spatial locality is built into the construction of the recognition model, so that the head region is highlighted and background noise is suppressed when head attributes are predicted, pedestrian attribute recognition accuracy can be greatly improved.
Summary of the Invention
To solve the above technical problems, the present invention provides a pedestrian attribute recognition network based on a recurrent neural network attention model, comprising:
a first convolutional neural network that takes the original full-body pedestrian image as input and extracts the full-body pedestrian image features N(x);
a recurrent neural network that takes the full-body image features N(x) as a first input and the attention heatmap A_{t-1}(x) of the attribute group attended to at the previous step as a second input, and outputs the attention heatmap A_t(x) of the currently attended attribute group together with the locally highlighted pedestrian features H_t(x);
a second convolutional neural network that takes the locally highlighted pedestrian features H_t(x) as input and outputs the predicted probabilities of the attributes in the currently attended group.
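The three components above can be sketched end to end. The following is a minimal, framework-free illustration: the real network uses an Inception CNN and a convolutional recurrent network, which are stubbed here with a random projection and a spatial softmax, and all dimensions and the per-group attribute counts are assumptions rather than values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

C, H, W = 8, 14, 14          # assumed feature-map shape of N(x)
T = 10                       # number of attribute groups (as in the RAP setup)
GROUP_SIZES = [5] * T        # hypothetical: 5 attributes per group

def extract_features(image):
    """First CNN (Inception in the patent): image -> feature map N(x).
    Stubbed with a random projection for illustration."""
    return rng.standard_normal((C, H, W))

def attention_step(features, prev_attention):
    """Recurrent attention: from N(x) and A_{t-1}(x), produce A_t(x)
    and the locally highlighted features H_t(x)."""
    scores = features.mean(axis=0) + prev_attention   # fold in history
    att = np.exp(scores) / np.exp(scores).sum()       # spatial softmax -> A_t(x)
    highlighted = features * att                      # H_t(x): heatmap applied to N(x)
    return att, highlighted

def predict_group(highlighted, n_attrs):
    """Second CNN: H_t(x) -> per-attribute probabilities (toy sigmoid readout)."""
    logits = highlighted.mean(axis=(1, 2))[:n_attrs]
    return 1.0 / (1.0 + np.exp(-logits))

features = extract_features(rng.standard_normal((3, 224, 224)))
attention = np.zeros((H, W))                          # A_0(x)
probs = []
for t in range(T):
    attention, highlighted = attention_step(features, attention)
    probs.append(predict_group(highlighted, GROUP_SIZES[t]))
all_probs = np.concatenate(probs)                     # one probability per attribute
```

Note how the loop realizes the weight sharing described below: the same feature extractor and the same readout are reused at every step, while only the attention state evolves.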
Further, the locally highlighted pedestrian features H_t(x) are obtained by applying the attention heatmap A_{t-1}(x) of the previously attended attribute group to the full-body pedestrian image features N(x), i.e. the element-wise product H_t(x) = A_{t-1}(x) ⊙ N(x) (the original gives the formula as an image).
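As a concrete illustration of this highlighting step, with assumed shapes of a C x H x W feature map and an H x W heatmap that broadcasts across channels:

```python
import numpy as np

N = np.arange(24, dtype=float).reshape(2, 3, 4)  # feature map N(x): C=2, H=3, W=4
A = np.zeros((3, 4))
A[1, 2] = 1.0                                    # all attention mass on one location

H_t = A * N   # element-wise product, broadcast over the channel axis
```

Only the features at the attended spatial location survive; everything else is zeroed out, which is exactly the "local highlighting" effect described above.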
Further, a batch normalization operation is applied to the attribute prediction probability outputs to counteract the recognition error caused by the imbalance between positive and negative examples of each attribute.
Further, in the pedestrian attribute recognition network:
for each attribute group of the same original full-body pedestrian image, the memory cell state of the recurrent neural network is jointly determined by the locally highlighted pedestrian features of all previously predicted attribute groups;
the first convolutional neural network shares its weights across prediction steps;
the second convolutional neural network shares its weights across prediction steps.
Further, the pedestrian attribute recognition network is trained with a weighted sigmoid cross-entropy loss function. The formula is given as an image in the original; written out from the symbol definitions that follow, it takes the form

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

In the above, p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized.
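A sketch of this weighted loss in code; the mean reduction over samples and the epsilon guard against log(0) are assumptions, since the original formula is reconstructed from its symbol definitions:

```python
import numpy as np

def weighted_sigmoid_xent(p_hat, y, p_pos):
    """Weighted sigmoid cross-entropy: the positive term of attribute j is
    scaled by w_j = exp(p_j).  p_hat, y: (N, K) arrays; p_pos: (K,) array."""
    w = np.exp(p_pos)                     # w_j = exp(p_j)
    eps = 1e-12                           # guard against log(0)
    terms = w * y * np.log(p_hat + eps) + (1 - y) * np.log(1 - p_hat + eps)
    return -terms.sum() / len(y)          # average over the N samples

p_pos = np.array([0.5, 0.1])
good = weighted_sigmoid_xent(np.array([[0.9, 0.1]]), np.array([[1, 0]]), p_pos)
bad  = weighted_sigmoid_xent(np.array([[0.1, 0.9]]), np.array([[1, 0]]), p_pos)
```

A confident correct prediction (`good`) yields a much smaller loss than a confident wrong one (`bad`), and the exp(p_j) factor scales only the positive-label terms.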
The present invention further provides a pedestrian attribute recognition technology based on a recurrent neural network attention model, comprising:
S1. Acquire a number of pedestrian images bearing the attributes to be recognized, annotate whether each image possesses the given attribute or attributes, and thereby obtain a dataset usable for training pedestrian attribute recognition; group all annotated attributes by semantic and spatial proximity;
S2. Combine an Inception network with a convolutional recurrent neural network to construct a pedestrian attribute recognition network based on a convolutional-recurrent attention model;
S3. Define the loss function required to train the pedestrian attribute recognition network, and train the network constructed in step S2 with the training dataset obtained in step S1;
S4. Use the pedestrian attribute recognition network trained in step S3 to recognize the attributes in pedestrian images to be recognized.
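The grouping in step S1 might look like the following; the group and attribute names here are purely illustrative, and the patent's actual 10 groups over the 51 RAP attributes are given in Table 1 below.

```python
# Illustrative semantic/spatial attribute groups (hypothetical names).
ATTRIBUTE_GROUPS = {
    "head":  ["glasses", "long_hair", "hat"],
    "upper": ["jacket", "shirt", "logo"],
    "lower": ["skirt", "trousers", "shoes"],
}

# Reverse index: attribute -> group, useful when routing predictions
# back from per-group outputs to individual attribute labels.
group_of = {a: g for g, attrs in ATTRIBUTE_GROUPS.items() for a in attrs}
```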
Further, step S2 comprises:
S2-1. Use the Inception network to extract the full-body pedestrian image features N(x) from the original full-body pedestrian image;
S2-2. At step t, use the convolutional recurrent neural network with the features N(x) to compute the attention heatmap A_t(x) of the currently attended attribute group, storing the history in the memory cells of the convolutional recurrent network;
S2-3. Apply the attention heatmap A_t(x) to the features N(x) to obtain the locally highlighted pedestrian features H_t(x), i.e. the element-wise product H_t(x) = A_t(x) ⊙ N(x) (the original gives the formula as an image);
S2-4. Use the locally highlighted features H_t(x) to recognize the t-th group of attributes, and output the predicted probabilities for this group.
Further, the loss function defined in step S3 is as follows (the original gives the formula as an image; it is written out here from the symbol definitions):

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

In the above, p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized.
Compared with the prior art, the beneficial effects of the present invention are:
The present invention uses a convolutional recurrent attention model to mine the correlations between the spatial locations of pedestrian attribute regions, highlights the image regions corresponding to each attribute more accurately, and thereby achieves higher pedestrian attribute recognition accuracy.
Brief Description of the Drawings
Figure 1 is a structural diagram of the pedestrian attribute recognition network based on the recurrent neural network attention model.
Detailed Description of the Embodiments
Embodiment 1
A pedestrian attribute recognition network based on a recurrent neural network attention model, as shown in Figure 1, comprising:
a first convolutional neural network that takes the original full-body pedestrian image as input and extracts the full-body pedestrian image features N(x);
a recurrent neural network that takes the full-body image features N(x) as a first input and the attention heatmap A_{t-1}(x) of the attribute group attended to at the previous step as a second input, and outputs the attention heatmap A_t(x) of the currently attended attribute group together with the locally highlighted pedestrian features H_t(x);
a second convolutional neural network that takes the locally highlighted pedestrian features H_t(x) as input and outputs the predicted probabilities of the attributes in the currently attended group.
In the pedestrian attribute recognition network provided by this embodiment, the locally highlighted pedestrian features H_t(x) are obtained by applying the attention heatmap A_{t-1}(x) of the previously attended attribute group to the full-body pedestrian image features N(x), i.e. the element-wise product H_t(x) = A_{t-1}(x) ⊙ N(x) (the original gives the formula as an image).
In the pedestrian attribute recognition network provided by this embodiment, a batch normalization operation is applied to the attribute prediction probability outputs to counteract the recognition error caused by the imbalance between positive and negative examples of each attribute.
The pedestrian attribute recognition network provided by this embodiment further satisfies:
for each attribute group of the same original full-body pedestrian image, the memory cell state of the recurrent neural network is jointly determined by the locally highlighted pedestrian features of all previously predicted attribute groups;
the first convolutional neural network shares its weights across prediction steps;
the second convolutional neural network shares its weights across prediction steps.
The pedestrian attribute recognition network provided by this embodiment is trained with a weighted sigmoid cross-entropy loss function. The formula is given as an image in the original; written out from the symbol definitions, it is

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

In the above, p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized.
Embodiment 2
A pedestrian attribute recognition technology based on a recurrent neural network attention model, comprising:
S1. Acquire a number of pedestrian images bearing the attributes to be recognized, annotate whether each image possesses the given attribute or attributes, and thereby obtain a dataset usable for training pedestrian attribute recognition; then screen all annotated attributes, and group the retained attributes by semantic and spatial proximity;
S2. Combine an Inception network with a convolutional recurrent neural network to construct a pedestrian attribute recognition network based on a convolutional-recurrent attention model, specifically comprising:
S2-1. Use the Inception network to extract the full-body pedestrian image features N(x) from the original full-body pedestrian image;
S2-2. At step t, use the convolutional recurrent neural network with the features N(x) to compute the attention heatmap A_t(x) of the currently attended attribute group, storing the history in the memory cells of the convolutional recurrent network;
S2-3. Apply the attention heatmap A_t(x) to the features N(x) to obtain the locally highlighted pedestrian features H_t(x), i.e. the element-wise product H_t(x) = A_t(x) ⊙ N(x) (the original gives the formula as an image);
S2-4. Use the locally highlighted features H_t(x) to recognize the t-th group of attributes, and output the predicted probabilities for this group;
S3. Define the loss function required to train the pedestrian attribute recognition network. The formula is given as an image in the original; written out from the symbol definitions, it is

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

where p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized;
Use the training dataset obtained in step S1 to train the pedestrian attribute recognition network constructed in step S2, and use the test set to evaluate the trained network;
S4. Use the pedestrian attribute recognition network trained in step S3 to recognize attributes in pedestrian images to be recognized in a practical application scenario.
The pedestrian attribute recognition technology provided by the present invention is described in detail below using the RAP pedestrian attribute recognition dataset.
(1) The RAP pedestrian attribute recognition dataset is used as the dataset for training and testing. RAP is a pedestrian attribute dataset compiled by a team at the Institute of Automation, Chinese Academy of Sciences. It was collected from pedestrian surveillance video captured by 26 cameras in a shopping mall; after analysis of the contextual information of pedestrian attributes and of environmental factors, 41,585 pedestrian images were finally selected and added to the dataset. Each image is annotated with 72 attributes, including viewpoint information, the presence of occlusion, and body-part information.
(2) The 72 attributes in RAP are screened down to the 51 attributes to be used, which are divided into 10 groups by semantic and spatial proximity, as shown in Table 1.
Table 1. The 51 attributes in the RAP dataset and their corresponding groups
(3) Construct the pedestrian attribute recognition network shown in Figure 1. The network uses a convolutional recurrent neural network to train the pedestrian-attribute attention model for the different groups, and combines the attention model with the Inception convolutional neural network to perform pedestrian attribute recognition.
(4) On the training set, compute for each attribute label the proportion p_j of positive examples among all samples.
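Step (4) is a column-wise mean over the binary label matrix; for example:

```python
import numpy as np

# Binary training labels: rows = samples, columns = attributes.
labels = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1],
                   [1, 0, 0]])

p = labels.mean(axis=0)   # p_j: fraction of positive examples of attribute j
```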
(5) Define the loss function required for training the pedestrian attribute recognition network and substitute the p_j computed in (4). The formula is given as an image in the original; written out from the symbol definitions, it is

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)
(6) Train the pedestrian attribute recognition network with stochastic gradient descent. The hyperparameters of the training process are set as follows:
initial learning rate 0.1; batch size 64; the learning rate is divided by 10 every 10,000 iterations; and a deep model pre-trained on the ImageNet image classification task is used as the initial value of the pedestrian attribute recognition model.
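The step-decay schedule described above can be written as a small, framework-agnostic function:

```python
def learning_rate(step, base_lr=0.1, drop_every=10_000, factor=0.1):
    """Learning rate at a given training iteration: start at base_lr and
    multiply by `factor` every `drop_every` iterations."""
    return base_lr * factor ** (step // drop_every)
```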
(7) In the actual test scenario, the image to be recognized is input to the pedestrian attribute recognition network trained in step (6). Over 10 steps the network outputs the predicted probability vectors for the attribute groups of step (2), 51 probabilities in total. For each attribute's probability output, if the value is greater than 0.5 the pedestrian is judged to have the attribute, otherwise not. Judging each attribute's output in turn finally yields a recognition result for all 51 pedestrian attributes.
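The decision rule in step (7) is a simple 0.5 threshold on each output probability; for example:

```python
import numpy as np

probs = np.array([0.91, 0.40, 0.62, 0.05])  # per-attribute output probabilities
present = probs > 0.5                        # attribute judged present iff p > 0.5
```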
Compared with existing pedestrian attribute recognition methods, the recurrent-attention technology provided by the present invention achieves higher recognition accuracy: evaluated on the two current mainstream public pedestrian-attribute datasets, it scores higher than both the existing CNN methods and the CNN-RNN methods.
Pedestrian attribute recognition accuracy is generally measured by mA (mean accuracy). Because attribute distributions are imbalanced, to keep the accuracy computation reasonable mA computes, for each attribute, the accuracy on positive examples and on negative examples separately, takes their average as that attribute's accuracy, and then averages over all attributes to obtain the final mA value. The formula is given as an image in the original; written out from the symbol definitions, it is

mA = (1 / (2L)) Σ_{i=1}^{L} ( TP_i / P_i + TN_i / N_i )

where L is the number of attributes; P_i is the number of positive examples of attribute i and TP_i the number of correctly predicted positives; N_i is the number of negative examples and TN_i the number of correctly predicted negatives.
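The mA metric can be computed directly from binary predictions; a minimal sketch:

```python
import numpy as np

def mean_accuracy(y_true, y_pred):
    """mA: for each attribute j, average the accuracy on positives
    (TP_j / P_j) and on negatives (TN_j / N_j), then average over the
    L attributes. Assumes every attribute column has at least one
    positive and one negative example."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accs = []
    for j in range(y_true.shape[1]):
        t, p = y_true[:, j], y_pred[:, j]
        tp_rate = (p[t == 1] == 1).mean()   # TP_j / P_j
        tn_rate = (p[t == 0] == 0).mean()   # TN_j / N_j
        accs.append((tp_rate + tn_rate) / 2)
    return float(np.mean(accs))

y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [0, 0]]
ma = mean_accuracy(y_true, y_pred)
```

Averaging positive and negative accuracy per attribute keeps a rare attribute from being scored well by a classifier that always predicts "absent".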
The mA value of the proposed pedestrian attribute recognition technology is 8.76% higher than that of the DeepMAR method described in the background art, and 3.35% higher than that of the JRL method. In addition, the proposed technology is trained and makes predictions end to end, which keeps model training and attribute prediction simple, easy to use, and efficient, an advantage the JRL method lacks.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art will understand that the technical solution may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications and substitutions fall within the scope of the claims of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810616398.XA CN108921051B (en) | 2018-06-15 | 2018-06-15 | Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108921051A CN108921051A (en) | 2018-11-30 |
CN108921051B true CN108921051B (en) | 2022-05-20 |
Family
ID=64421633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810616398.XA Active CN108921051B (en) | 2018-06-15 | 2018-06-15 | Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108921051B (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109711386B (en) * | 2019-01-10 | 2020-10-09 | 北京达佳互联信息技术有限公司 | Method and device for obtaining recognition model, electronic equipment and storage medium |
CN109815902B (en) * | 2019-01-24 | 2021-04-27 | 北京邮电大学 | A kind of pedestrian attribute area information acquisition method, device and equipment |
CN109886154A (en) * | 2019-01-30 | 2019-06-14 | 电子科技大学 | Pedestrian appearance attribute recognition method based on multi-dataset joint training based on Inception V3 |
CN109886241A (en) * | 2019-03-05 | 2019-06-14 | 天津工业大学 | Driver fatigue detection based on long short-term memory network |
CN110032952B (en) * | 2019-03-26 | 2020-11-10 | 西安交通大学 | Road boundary point detection method based on deep learning |
CN110110601B (en) * | 2019-04-04 | 2023-04-25 | 深圳久凌软件技术有限公司 | Video pedestrian re-recognition method and device based on multi-time space attention model |
CN109978077B (en) * | 2019-04-08 | 2021-03-12 | 南京旷云科技有限公司 | Visual recognition method, device and system and storage medium |
CN110163296B (en) * | 2019-05-29 | 2020-12-18 | 北京达佳互联信息技术有限公司 | Image recognition method, device, equipment and storage medium |
CN110287836B (en) * | 2019-06-14 | 2021-10-15 | 北京迈格威科技有限公司 | Image classification method and device, computer equipment and storage medium |
CN110458215B (en) * | 2019-07-30 | 2023-03-24 | 天津大学 | Pedestrian attribute identification method based on multi-temporal attention model |
CN110688888B (en) * | 2019-08-02 | 2022-08-05 | 杭州未名信科科技有限公司 | Pedestrian attribute identification method and system based on deep learning |
CN110569779B (en) * | 2019-08-28 | 2022-10-04 | 西北工业大学 | Pedestrian attribute identification method based on pedestrian local and overall attribute joint learning |
CN110633421B (en) * | 2019-09-09 | 2020-08-11 | 北京瑞莱智慧科技有限公司 | Feature extraction, recommendation, and prediction methods, devices, media, and apparatuses |
CN110598631B (en) * | 2019-09-12 | 2021-04-02 | 合肥工业大学 | Pedestrian attribute identification method and system based on sequence context learning |
CN110705474B (en) * | 2019-09-30 | 2022-05-03 | 清华大学 | Pedestrian attribute identification method and device |
CN111539341B (en) * | 2020-04-26 | 2023-09-22 | 香港中文大学(深圳) | Targeting methods, devices, electronic devices and media |
CN113706437B (en) * | 2020-05-21 | 2024-03-15 | 国网智能科技股份有限公司 | Method and system for diagnosing defects of fine-granularity bolts of power transmission line |
CN112580494A (en) * | 2020-12-16 | 2021-03-30 | 北京影谱科技股份有限公司 | Method and device for identifying and tracking personnel in monitoring video based on deep learning |
CN114067261A (en) * | 2021-10-25 | 2022-02-18 | 神思电子技术股份有限公司 | Pedestrian attribute recognition method and system based on spatialized structural relations |
CN114694177B (en) * | 2022-03-10 | 2023-04-28 | 电子科技大学 | Fine-grained person attribute recognition method based on multi-scale feature and attribute association mining |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104050685A (en) * | 2014-06-10 | 2014-09-17 | 西安理工大学 | Moving target detection method based on particle filtering visual attention model |
CN106971154A (en) * | 2017-03-16 | 2017-07-21 | 天津大学 | Pedestrian attribute prediction method based on long short-term memory recurrent neural network |
CN107341462A (en) * | 2017-06-28 | 2017-11-10 | 电子科技大学 | Video classification method based on attention mechanism |
CN107704838A (en) * | 2017-10-19 | 2018-02-16 | 北京旷视科技有限公司 | Attribute recognition method and device for target object |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016065534A1 (en) * | 2014-10-28 | 2016-05-06 | 中国科学院自动化研究所 | Deep learning-based gait recognition method |
US9830529B2 (en) * | 2016-04-26 | 2017-11-28 | Xerox Corporation | End-to-end saliency mapping via probability distribution prediction |
- 2018-06-15: Application CN201810616398.XA filed (CN); granted as patent CN108921051B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN108921051A (en) | 2018-11-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108921051B (en) | Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model | |
CN109993072B (en) | Low-resolution pedestrian re-identification system and method based on super-resolution image generation | |
Xia et al. | A deep Siamese postclassification fusion network for semantic change detection | |
CN106022229B (en) | Abnormal Behavior Recognition Method Based on Video Motion Information Feature Extraction and Adaptive Enhancement Algorithm Error Backpropagation Network | |
CN109858390A (en) | Human skeleton activity recognition method based on end-to-end spatio-temporal graph learning neural network | |
CN110879982B (en) | A crowd counting system and method | |
CN112884742A (en) | Multi-algorithm fusion-based multi-target real-time detection, identification and tracking method | |
CN111382686A (en) | A lane line detection method based on semi-supervised generative adversarial network | |
CN111797814A (en) | Unsupervised cross-domain action recognition method based on channel fusion and classifier adversarial learning | |
CN114037056A (en) | Method and device for generating neural network, computer equipment and storage medium | |
CN112598165A (en) | Private car data-based urban functional area transfer flow prediction method and device | |
CN107301376A (en) | Pedestrian detection method based on deep learning multilayer stimulation | |
CN112818849A (en) | Crowd density detection algorithm based on contextual attention convolutional neural network with adversarial learning | |
CN112052795B (en) | Video behavior identification method based on multi-scale space-time feature aggregation | |
CN113642596A (en) | Brain network classification method based on community detection and bidirectional autoencoder | |
CN113783715A (en) | Opportunistic network topology prediction method adopting causal convolutional neural network | |
CN116310812B (en) | Semantic change detection method for high-resolution remote sensing images based on semi-supervised semantic segmentation and contrastive learning | |
CN116434010A (en) | Multi-view pedestrian attribute identification method | |
Sun et al. | Automatic building age prediction from street view images | |
CN110163130B (en) | A feature pre-aligned random forest classification system and method for gesture recognition | |
CN116052254A (en) | Visual continuous emotion recognition method based on extended Kalman filtering neural network | |
CN112529025A (en) | Data processing method and device | |
CN108154199B (en) | High-precision rapid single-class target detection method based on deep learning | |
Zhang et al. | Semisupervised change detection based on bihierarchical feature aggregation and extraction network | |
Xu et al. | An improved multi-scale and knowledge distillation method for efficient pedestrian detection in dense scenes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 2018-11-30
Assignee: CSIC PRIDE (Nanjing) Intelligent Equipment System Co., Ltd.
Assignor: TSINGHUA University
Contract record no.: X2023320000119
Denomination of invention: Pedestrian attribute recognition network and technology based on recurrent neural network attention model
Granted publication date: 2022-05-20
License type: Common License
Record date: 2023-03-23