CN108921051B - Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model - Google Patents
Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model
- Publication number
- CN108921051B (application CN201810616398.XA)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- attribute
- neural network
- attributes
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Abstract
Description
Technical Field
The present invention belongs to the technical field of neural networks and image recognition, and in particular relates to a pedestrian attribute recognition network and technology based on a recurrent neural network attention model.
Background Art
Pedestrian attribute recognition technology can automate the task of searching for specific people in massive image and video collections. However, the low image quality of surveillance video, together with the small size and scarcity of annotated pedestrian-attribute datasets, makes attribute recognition from surveillance images considerably harder. Existing deep-neural-network approaches to pedestrian attribute recognition fall into two categories: pure convolutional neural network (CNN) methods, and combined convolutional and recurrent methods (CNN-RNN). Existing CNN methods such as DeepMAR try to recognize each pedestrian attribute in isolation from the features of the whole image; although this achieves reasonable results, it ignores both the spatial locality of pedestrian attributes and the correlations between attributes, which limits recognition accuracy. Existing CNN-RNN methods such as JRL use a recurrent neural network to progressively mine the semantic correlations between attributes (for example, a person wearing a skirt is usually a woman) and improve accuracy over pure CNN methods. However, these methods consider only the semantic links between attributes and still ignore their spatial locality.
Many pedestrian attributes are determined by a single region of the image: whether a person wears glasses or has long hair depends only on the visual features of the head region, and other regions contribute little. If this spatial locality is built into the construction of the recognition model, so that the head region is highlighted and background noise is suppressed when head attributes are predicted, pedestrian attribute recognition accuracy can be greatly improved.
Summary of the Invention
To solve the above technical problems, the present invention provides a pedestrian attribute recognition network based on a recurrent neural network attention model, comprising:
a first convolutional neural network that takes the original full-body pedestrian image as input and extracts the full-body pedestrian image features N(x);
a recurrent neural network that takes the full-body image features N(x) as a first input and the attention heatmap A_{t-1}(x) of the attribute group attended to at the previous step as a second input, and outputs the attention heatmap A_t(x) of the currently attended attribute group together with the locally highlighted pedestrian features H_t(x);
a second convolutional neural network that takes the locally highlighted pedestrian features H_t(x) as input and outputs the predicted probabilities of the attributes in the currently attended group.
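The three components above can be sketched end to end. The following is a minimal, framework-free illustration: the real network uses an Inception CNN and a convolutional recurrent network, which are stubbed here with a random projection and a spatial softmax, and all dimensions and the per-group attribute counts are assumptions rather than values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

C, H, W = 8, 14, 14          # assumed feature-map shape of N(x)
T = 10                       # number of attribute groups (as in the RAP setup)
GROUP_SIZES = [5] * T        # hypothetical: 5 attributes per group

def extract_features(image):
    """First CNN (Inception in the patent): image -> feature map N(x).
    Stubbed with a random projection for illustration."""
    return rng.standard_normal((C, H, W))

def attention_step(features, prev_attention):
    """Recurrent attention: from N(x) and A_{t-1}(x), produce A_t(x)
    and the locally highlighted features H_t(x)."""
    scores = features.mean(axis=0) + prev_attention   # fold in history
    att = np.exp(scores) / np.exp(scores).sum()       # spatial softmax -> A_t(x)
    highlighted = features * att                      # H_t(x): heatmap applied to N(x)
    return att, highlighted

def predict_group(highlighted, n_attrs):
    """Second CNN: H_t(x) -> per-attribute probabilities (toy sigmoid readout)."""
    logits = highlighted.mean(axis=(1, 2))[:n_attrs]
    return 1.0 / (1.0 + np.exp(-logits))

features = extract_features(rng.standard_normal((3, 224, 224)))
attention = np.zeros((H, W))                          # A_0(x)
probs = []
for t in range(T):
    attention, highlighted = attention_step(features, attention)
    probs.append(predict_group(highlighted, GROUP_SIZES[t]))
all_probs = np.concatenate(probs)                     # one probability per attribute
```

Note how the loop realizes the weight sharing described below: the same feature extractor and the same readout are reused at every step, while only the attention state evolves.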
Further, the locally highlighted pedestrian features H_t(x) are obtained by applying the attention heatmap A_{t-1}(x) of the previously attended attribute group to the full-body pedestrian image features N(x), i.e. the element-wise product H_t(x) = A_{t-1}(x) ⊙ N(x) (the original gives the formula as an image).
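As a concrete illustration of this highlighting step, with assumed shapes of a C x H x W feature map and an H x W heatmap that broadcasts across channels:

```python
import numpy as np

N = np.arange(24, dtype=float).reshape(2, 3, 4)  # feature map N(x): C=2, H=3, W=4
A = np.zeros((3, 4))
A[1, 2] = 1.0                                    # all attention mass on one location

H_t = A * N   # element-wise product, broadcast over the channel axis
```

Only the features at the attended spatial location survive; everything else is zeroed out, which is exactly the "local highlighting" effect described above.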
Further, a batch normalization operation is applied to the attribute prediction probability outputs to counteract the recognition error caused by the imbalance between positive and negative examples of each attribute.
Further, in the pedestrian attribute recognition network:
for each attribute group of the same original full-body pedestrian image, the memory cell state of the recurrent neural network is jointly determined by the locally highlighted pedestrian features of all previously predicted attribute groups;
the first convolutional neural network shares its weights across prediction steps;
the second convolutional neural network shares its weights across prediction steps.
Further, the pedestrian attribute recognition network is trained with a weighted sigmoid cross-entropy loss function. The formula is given as an image in the original; written out from the symbol definitions that follow, it takes the form

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

In the above, p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized.
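A sketch of this weighted loss in code; the mean reduction over samples and the epsilon guard against log(0) are assumptions, since the original formula is reconstructed from its symbol definitions:

```python
import numpy as np

def weighted_sigmoid_xent(p_hat, y, p_pos):
    """Weighted sigmoid cross-entropy: the positive term of attribute j is
    scaled by w_j = exp(p_j).  p_hat, y: (N, K) arrays; p_pos: (K,) array."""
    w = np.exp(p_pos)                     # w_j = exp(p_j)
    eps = 1e-12                           # guard against log(0)
    terms = w * y * np.log(p_hat + eps) + (1 - y) * np.log(1 - p_hat + eps)
    return -terms.sum() / len(y)          # average over the N samples

p_pos = np.array([0.5, 0.1])
good = weighted_sigmoid_xent(np.array([[0.9, 0.1]]), np.array([[1, 0]]), p_pos)
bad  = weighted_sigmoid_xent(np.array([[0.1, 0.9]]), np.array([[1, 0]]), p_pos)
```

A confident correct prediction (`good`) yields a much smaller loss than a confident wrong one (`bad`), and the exp(p_j) factor scales only the positive-label terms.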
The present invention further provides a pedestrian attribute recognition technology based on a recurrent neural network attention model, comprising:
S1. Acquire a number of pedestrian images bearing the attributes to be recognized, annotate whether each image possesses the given attribute or attributes, and thereby obtain a dataset usable for training pedestrian attribute recognition; group all annotated attributes by semantic and spatial proximity;
S2. Combine an Inception network with a convolutional recurrent neural network to construct a pedestrian attribute recognition network based on a convolutional-recurrent attention model;
S3. Define the loss function required to train the pedestrian attribute recognition network, and train the network constructed in step S2 with the training dataset obtained in step S1;
S4. Use the pedestrian attribute recognition network trained in step S3 to recognize the attributes in pedestrian images to be recognized.
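The grouping in step S1 might look like the following; the group and attribute names here are purely illustrative, and the patent's actual 10 groups over the 51 RAP attributes are given in Table 1 below.

```python
# Illustrative semantic/spatial attribute groups (hypothetical names).
ATTRIBUTE_GROUPS = {
    "head":  ["glasses", "long_hair", "hat"],
    "upper": ["jacket", "shirt", "logo"],
    "lower": ["skirt", "trousers", "shoes"],
}

# Reverse index: attribute -> group, useful when routing predictions
# back from per-group outputs to individual attribute labels.
group_of = {a: g for g, attrs in ATTRIBUTE_GROUPS.items() for a in attrs}
```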
Further, step S2 comprises:
S2-1. Use the Inception network to extract the full-body pedestrian image features N(x) from the original full-body pedestrian image;
S2-2. At step t, use the convolutional recurrent neural network with the features N(x) to compute the attention heatmap A_t(x) of the currently attended attribute group, storing the history in the memory cells of the convolutional recurrent network;
S2-3. Apply the attention heatmap A_t(x) to the features N(x) to obtain the locally highlighted pedestrian features H_t(x), i.e. the element-wise product H_t(x) = A_t(x) ⊙ N(x) (the original gives the formula as an image);
S2-4. Use the locally highlighted features H_t(x) to recognize the t-th group of attributes, and output the predicted probabilities for this group.
Further, the loss function defined in step S3 is as follows (the original gives the formula as an image; it is written out here from the symbol definitions):

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

In the above, p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized.
Compared with the prior art, the beneficial effects of the present invention are:
The present invention uses a convolutional recurrent attention model to mine the correlations between the spatial locations of pedestrian attribute regions, highlights the image regions corresponding to each attribute more accurately, and thereby achieves higher pedestrian attribute recognition accuracy.
Brief Description of the Drawings
Figure 1 is a structural diagram of the pedestrian attribute recognition network based on the recurrent neural network attention model.
Detailed Description of the Embodiments
Embodiment 1
A pedestrian attribute recognition network based on a recurrent neural network attention model, as shown in Figure 1, comprising:
a first convolutional neural network that takes the original full-body pedestrian image as input and extracts the full-body pedestrian image features N(x);
a recurrent neural network that takes the full-body image features N(x) as a first input and the attention heatmap A_{t-1}(x) of the attribute group attended to at the previous step as a second input, and outputs the attention heatmap A_t(x) of the currently attended attribute group together with the locally highlighted pedestrian features H_t(x);
a second convolutional neural network that takes the locally highlighted pedestrian features H_t(x) as input and outputs the predicted probabilities of the attributes in the currently attended group.
In the pedestrian attribute recognition network provided by this embodiment, the locally highlighted pedestrian features H_t(x) are obtained by applying the attention heatmap A_{t-1}(x) of the previously attended attribute group to the full-body pedestrian image features N(x), i.e. the element-wise product H_t(x) = A_{t-1}(x) ⊙ N(x) (the original gives the formula as an image).
In the pedestrian attribute recognition network provided by this embodiment, a batch normalization operation is applied to the attribute prediction probability outputs to counteract the recognition error caused by the imbalance between positive and negative examples of each attribute.
The pedestrian attribute recognition network provided by this embodiment further satisfies:
for each attribute group of the same original full-body pedestrian image, the memory cell state of the recurrent neural network is jointly determined by the locally highlighted pedestrian features of all previously predicted attribute groups;
the first convolutional neural network shares its weights across prediction steps;
the second convolutional neural network shares its weights across prediction steps.
The pedestrian attribute recognition network provided by this embodiment is trained with a weighted sigmoid cross-entropy loss function. The formula is given as an image in the original; written out from the symbol definitions, it is

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

In the above, p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized.
Embodiment 2
A pedestrian attribute recognition technology based on a recurrent neural network attention model, comprising:
S1. Acquire a number of pedestrian images bearing the attributes to be recognized, annotate whether each image possesses the given attribute or attributes, and thereby obtain a dataset usable for training pedestrian attribute recognition; then screen all annotated attributes, and group the retained attributes by semantic and spatial proximity;
S2. Combine an Inception network with a convolutional recurrent neural network to construct a pedestrian attribute recognition network based on a convolutional-recurrent attention model, specifically comprising:
S2-1. Use the Inception network to extract the full-body pedestrian image features N(x) from the original full-body pedestrian image;
S2-2. At step t, use the convolutional recurrent neural network with the features N(x) to compute the attention heatmap A_t(x) of the currently attended attribute group, storing the history in the memory cells of the convolutional recurrent network;
S2-3. Apply the attention heatmap A_t(x) to the features N(x) to obtain the locally highlighted pedestrian features H_t(x), i.e. the element-wise product H_t(x) = A_t(x) ⊙ N(x) (the original gives the formula as an image);
S2-4. Use the locally highlighted features H_t(x) to recognize the t-th group of attributes, and output the predicted probabilities for this group;
S3. Define the loss function required to train the pedestrian attribute recognition network. The formula is given as an image in the original; written out from the symbol definitions, it is

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)

where p_j is the proportion of positive examples of attribute j in the training set, w_j is the learning weight applied to positive examples, p̂_{ij} is the model's predicted probability that the i-th sample possesses the j-th attribute, y_{ij} is the label of the j-th attribute of the i-th sample, N is the total number of training samples, and K is the total number of attributes to be recognized;
Use the training dataset obtained in step S1 to train the pedestrian attribute recognition network constructed in step S2, and use the test set to evaluate the trained network;
S4. Use the pedestrian attribute recognition network trained in step S3 to recognize attributes in pedestrian images to be recognized in a practical application scenario.
The pedestrian attribute recognition technology provided by the present invention is described in detail below using the RAP pedestrian attribute recognition dataset.
(1) The RAP pedestrian attribute recognition dataset is used as the dataset for training and testing. RAP is a pedestrian attribute dataset compiled by a team at the Institute of Automation, Chinese Academy of Sciences. It was collected from pedestrian surveillance video captured by 26 cameras in a shopping mall; after analysis of the contextual information of pedestrian attributes and of environmental factors, 41,585 pedestrian images were finally selected and added to the dataset. Each image is annotated with 72 attributes, including viewpoint information, the presence of occlusion, and body-part information.
(2) The 72 attributes in RAP are screened down to the 51 attributes to be used, which are divided into 10 groups by semantic and spatial proximity, as shown in Table 1.
Table 1. The 51 attributes in the RAP dataset and their corresponding groups
(3) Construct the pedestrian attribute recognition network shown in Figure 1. The network uses a convolutional recurrent neural network to train the pedestrian-attribute attention model for the different groups, and combines the attention model with the Inception convolutional neural network to perform pedestrian attribute recognition.
(4) On the training set, compute for each attribute label the proportion p_j of positive examples among all samples.
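Step (4) is a column-wise mean over the binary label matrix; for example:

```python
import numpy as np

# Binary training labels: rows = samples, columns = attributes.
labels = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1],
                   [1, 0, 0]])

p = labels.mean(axis=0)   # p_j: fraction of positive examples of attribute j
```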
(5) Define the loss function required for training the pedestrian attribute recognition network and substitute the p_j computed in (4). The formula is given as an image in the original; written out from the symbol definitions, it is

Loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} [ w_j y_{ij} log p̂_{ij} + (1 - y_{ij}) log(1 - p̂_{ij}) ]

w_j = exp(p_j)
(6) Train the pedestrian attribute recognition network with stochastic gradient descent. The hyperparameters of the training process are set as follows:
initial learning rate 0.1; batch size 64; the learning rate is divided by 10 every 10,000 iterations; and a deep model pre-trained on the ImageNet image classification task is used as the initial value of the pedestrian attribute recognition model.
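The step-decay schedule described above can be written as a small, framework-agnostic function:

```python
def learning_rate(step, base_lr=0.1, drop_every=10_000, factor=0.1):
    """Learning rate at a given training iteration: start at base_lr and
    multiply by `factor` every `drop_every` iterations."""
    return base_lr * factor ** (step // drop_every)
```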
(7) In the actual test scenario, the image to be recognized is input to the pedestrian attribute recognition network trained in step (6). Over 10 steps the network outputs the predicted probability vectors for the attribute groups of step (2), 51 probabilities in total. For each attribute's probability output, if the value is greater than 0.5 the pedestrian is judged to have the attribute, otherwise not. Judging each attribute's output in turn finally yields a recognition result for all 51 pedestrian attributes.
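The decision rule in step (7) is a simple 0.5 threshold on each output probability; for example:

```python
import numpy as np

probs = np.array([0.91, 0.40, 0.62, 0.05])  # per-attribute output probabilities
present = probs > 0.5                        # attribute judged present iff p > 0.5
```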
Compared with existing pedestrian attribute recognition methods, the recurrent-attention technology provided by the present invention achieves higher recognition accuracy: evaluated on the two current mainstream public pedestrian-attribute datasets, it scores higher than both the existing CNN methods and the CNN-RNN methods.
Pedestrian attribute recognition accuracy is generally measured by mA (mean accuracy). Because attribute distributions are imbalanced, to keep the accuracy computation reasonable mA computes, for each attribute, the accuracy on positive examples and on negative examples separately, takes their average as that attribute's accuracy, and then averages over all attributes to obtain the final mA value. The formula is given as an image in the original; written out from the symbol definitions, it is

mA = (1 / (2L)) Σ_{i=1}^{L} ( TP_i / P_i + TN_i / N_i )

where L is the number of attributes; P_i is the number of positive examples of attribute i and TP_i the number of correctly predicted positives; N_i is the number of negative examples and TN_i the number of correctly predicted negatives.
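The mA metric can be computed directly from binary predictions; a minimal sketch:

```python
import numpy as np

def mean_accuracy(y_true, y_pred):
    """mA: for each attribute j, average the accuracy on positives
    (TP_j / P_j) and on negatives (TN_j / N_j), then average over the
    L attributes. Assumes every attribute column has at least one
    positive and one negative example."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accs = []
    for j in range(y_true.shape[1]):
        t, p = y_true[:, j], y_pred[:, j]
        tp_rate = (p[t == 1] == 1).mean()   # TP_j / P_j
        tn_rate = (p[t == 0] == 0).mean()   # TN_j / N_j
        accs.append((tp_rate + tn_rate) / 2)
    return float(np.mean(accs))

y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [0, 0]]
ma = mean_accuracy(y_true, y_pred)
```

Averaging positive and negative accuracy per attribute keeps a rare attribute from being scored well by a classifier that always predicts "absent".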
The mA value of the proposed pedestrian attribute recognition technology is 8.76% higher than that of the DeepMAR method described in the background art, and 3.35% higher than that of the JRL method. In addition, the proposed technology is trained and makes predictions end to end, which keeps model training and attribute prediction simple, easy to use, and efficient, an advantage the JRL method lacks.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art will understand that the technical solution may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications and substitutions fall within the scope of the claims of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810616398.XA CN108921051B (en) | 2018-06-15 | 2018-06-15 | Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108921051A CN108921051A (en) | 2018-11-30 |
CN108921051B true CN108921051B (en) | 2022-05-20 |
Family
ID=64421633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810616398.XA Active CN108921051B (en) | 2018-06-15 | 2018-06-15 | Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108921051B (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109711386B (en) * | 2019-01-10 | 2020-10-09 | 北京达佳互联信息技术有限公司 | Method and device for obtaining recognition model, electronic equipment and storage medium |
CN109815902B (en) * | 2019-01-24 | 2021-04-27 | 北京邮电大学 | A kind of pedestrian attribute area information acquisition method, device and equipment |
CN109886154A (en) * | 2019-01-30 | 2019-06-14 | 电子科技大学 | Pedestrian appearance attribute recognition method based on multi-dataset joint training based on Inception V3 |
CN109886241A (en) * | 2019-03-05 | 2019-06-14 | 天津工业大学 | Driver fatigue detection based on long short-term memory network |
CN110032952B (en) * | 2019-03-26 | 2020-11-10 | 西安交通大学 | Road boundary point detection method based on deep learning |
CN110110601B (en) * | 2019-04-04 | 2023-04-25 | 深圳久凌软件技术有限公司 | Video pedestrian re-recognition method and device based on multi-time space attention model |
CN109978077B (en) * | 2019-04-08 | 2021-03-12 | 南京旷云科技有限公司 | Visual recognition method, device and system and storage medium |
CN110163296B (en) * | 2019-05-29 | 2020-12-18 | 北京达佳互联信息技术有限公司 | Image recognition method, device, equipment and storage medium |
CN110287836B (en) * | 2019-06-14 | 2021-10-15 | 北京迈格威科技有限公司 | Image classification method and device, computer equipment and storage medium |
CN110458215B (en) * | 2019-07-30 | 2023-03-24 | 天津大学 | Pedestrian attribute identification method based on multi-temporal attention model |
CN110688888B (en) * | 2019-08-02 | 2022-08-05 | 杭州未名信科科技有限公司 | Pedestrian attribute identification method and system based on deep learning |
CN110569779B (en) * | 2019-08-28 | 2022-10-04 | 西北工业大学 | Pedestrian attribute identification method based on pedestrian local and overall attribute joint learning |
CN110633421B (en) * | 2019-09-09 | 2020-08-11 | 北京瑞莱智慧科技有限公司 | Feature extraction, recommendation, and prediction methods, devices, media, and apparatuses |
CN110598631B (en) * | 2019-09-12 | 2021-04-02 | 合肥工业大学 | Pedestrian attribute identification method and system based on sequence context learning |
CN110705474B (en) * | 2019-09-30 | 2022-05-03 | 清华大学 | Pedestrian attribute identification method and device |
CN111539341B (en) * | 2020-04-26 | 2023-09-22 | 香港中文大学(深圳) | Targeting methods, devices, electronic devices and media |
CN113706437B (en) * | 2020-05-21 | 2024-03-15 | 国网智能科技股份有限公司 | Method and system for diagnosing defects of fine-granularity bolts of power transmission line |
CN112580494A (en) * | 2020-12-16 | 2021-03-30 | 北京影谱科技股份有限公司 | Method and device for identifying and tracking personnel in monitoring video based on deep learning |
CN114067261A (en) * | 2021-10-25 | 2022-02-18 | 神思电子技术股份有限公司 | Pedestrian attribute recognition method and system based on spatialized structural relations |
CN114694177B (en) * | 2022-03-10 | 2023-04-28 | 电子科技大学 | Fine-grained person attribute recognition method based on multi-scale feature and attribute association mining |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104050685A (en) * | 2014-06-10 | 2014-09-17 | 西安理工大学 | Moving target detection method based on particle filtering visual attention model |
CN106971154A (en) * | 2017-03-16 | 2017-07-21 | 天津大学 | Pedestrian attribute prediction method based on long short-term memory recurrent neural network |
CN107341462A (en) * | 2017-06-28 | 2017-11-10 | 电子科技大学 | Video classification method based on attention mechanism |
CN107704838A (en) * | 2017-10-19 | 2018-02-16 | 北京旷视科技有限公司 | Attribute recognition method and device for target object |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016065534A1 (en) * | 2014-10-28 | 2016-05-06 | 中国科学院自动化研究所 | Deep learning-based gait recognition method |
US9830529B2 (en) * | 2016-04-26 | 2017-11-28 | Xerox Corporation | End-to-end saliency mapping via probability distribution prediction |
- 2018-06-15: Application CN201810616398.XA filed (CN); granted as patent CN108921051B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN108921051A (en) | 2018-11-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108921051B (en) | Pedestrian Attribute Recognition Network and Technology Based on Recurrent Neural Network Attention Model | |
CN109993072B (en) | Low-resolution pedestrian re-identification system and method based on super-resolution image generation | |
Xia et al. | A deep Siamese postclassification fusion network for semantic change detection | |
CN106022229B (en) | Abnormal Behavior Recognition Method Based on Video Motion Information Feature Extraction and Adaptive Enhancement Algorithm Error Backpropagation Network | |
CN109858390A (en) | Human skeleton activity recognition method based on end-to-end spatio-temporal graph learning neural network | |
CN110879982B (en) | A crowd counting system and method | |
CN112884742A (en) | Multi-algorithm fusion-based multi-target real-time detection, identification and tracking method | |
CN111382686A (en) | A lane line detection method based on semi-supervised generative adversarial network | |
CN111797814A (en) | Unsupervised cross-domain action recognition method based on channel fusion and classifier adversarial learning | |
CN114037056A (en) | Method and device for generating neural network, computer equipment and storage medium | |
CN112598165A (en) | Private car data-based urban functional area transfer flow prediction method and device | |
CN107301376A (en) | Pedestrian detection method based on deep learning multilayer stimulation | |
CN112818849A (en) | Crowd density detection algorithm based on contextual attention convolutional neural network with adversarial learning | |
CN112052795B (en) | Video behavior identification method based on multi-scale space-time feature aggregation | |
CN113642596A (en) | Brain network classification method based on community detection and bidirectional autoencoder | |
CN113783715A (en) | Opportunistic network topology prediction method adopting causal convolutional neural network | |
CN116310812B (en) | Semantic change detection method for high-resolution remote sensing images based on semi-supervised semantic segmentation and contrastive learning | |
CN116434010A (en) | Multi-view pedestrian attribute identification method | |
Sun et al. | Automatic building age prediction from street view images | |
CN110163130B (en) | A feature pre-aligned random forest classification system and method for gesture recognition | |
CN116052254A (en) | Visual continuous emotion recognition method based on extended Kalman filtering neural network | |
CN112529025A (en) | Data processing method and device | |
CN108154199B (en) | High-precision rapid single-class target detection method based on deep learning | |
Zhang et al. | Semisupervised change detection based on bihierarchical feature aggregation and extraction network | |
Xu et al. | An improved multi-scale and knowledge distillation method for efficient pedestrian detection in dense scenes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 2018-11-30
Assignee: CSIC PRIDE (Nanjing) Intelligent Equipment System Co., Ltd.
Assignor: TSINGHUA University
Contract record no.: X2023320000119
Denomination of invention: Pedestrian attribute recognition network and technology based on recurrent neural network attention model
Granted publication date: 2022-05-20
License type: Common License
Record date: 2023-03-23