CN111507272A

CN111507272A - Method and system for identifying pedestrian attributes in monitoring scene

Info

Publication number: CN111507272A
Application number: CN202010310527.XA
Authority: CN
Inventors: 黄凯奇; 陈晓棠; 贾健
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Institute of Automation of Chinese Academy of Science
Priority date: 2020-04-20
Filing date: 2020-04-20
Publication date: 2020-08-07
Anticipated expiration: 2040-04-20
Also published as: CN111507272B

Abstract

The invention relates to a method and a system for identifying pedestrian attributes in a monitoring scene. The method comprises the following steps: acquiring an image of a pedestrian to be detected in a monitoring scene; preprocessing the pedestrian image to be detected to obtain a processed image; obtaining convolutional image features of the pedestrian image to be detected from the processed image through a deep neural network; determining the weight parameters of each attribute classifier according to a fully connected layer and the convolutional image features; determining network attribute values of the pedestrian image to be detected under the different attribute classifiers based on the convolutional image features and the weight parameters; determining a predicted value of the corresponding attribute based on each network attribute value; and determining the attribute type of the pedestrian image to be detected according to the predicted values. By extracting the convolutional image features through a deep neural network, determining the weight parameters of each attribute classifier, and obtaining the network attribute values under the different attribute classifiers, the method obtains a predicted value for each attribute and can thereby accurately determine the attribute type of the pedestrian image to be detected.

Description

Method and system for identifying pedestrian attributes in monitoring scene

Technical Field

The invention relates to the technical field of visual scene processing and analysis, in particular to a pedestrian attribute identification method and system in a monitoring scene.

Background

In recent years, fields such as computer vision, artificial intelligence and machine perception have developed rapidly. With the wide deployment of cameras, how to perform efficient pedestrian attribute identification in a monitoring scene has attracted wide attention.

Pedestrian attribute identification in a monitoring scene uses a computer algorithm to process and analyze the pedestrian pictures in a video and automatically obtain the attribute categories of a given pedestrian, such as age, gender, backpack and clothing, thereby providing support for downstream pedestrian picture retrieval and pedestrian re-identification.

In traditional methods, the feature expression of a pedestrian picture is obtained by constructing manually designed picture features, and their performance is not sufficient to meet the application requirements of actual scenes. With the wide use of deep learning in recent years, many pedestrian attribute algorithms have approached the problem from two directions, better feature expression and attribute relationship modeling, and have continuously improved pedestrian attribute identification in monitoring scenes and promoted the development of the field.

Although a great deal of previous work has significantly improved the performance of pedestrian attribute identification by learning more discriminative visual feature expressions and by better modeling the relationships between attributes, each of these methods inevitably increases the parameter count and the computational complexity of the model, which increases the difficulty of pedestrian attribute identification.

Disclosure of Invention

In order to solve the above problems in the prior art, that is, to improve the identification of pedestrian attributes, the present invention provides a method and a system for identifying pedestrian attributes in a monitoring scene.

In order to solve the technical problems, the invention provides the following scheme:

a method for identifying attributes of pedestrians in a monitoring scene comprises the following steps:

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

obtaining the convolution image characteristics of the pedestrian image to be detected through a deep neural network according to the processed image;

determining the weight parameters of each attribute classifier according to the full connection layer and the characteristics of the convolution image;

determining network attribute values of the pedestrian image to be detected under different attribute classifiers based on the convolution image features and the weight parameters;

determining a predicted value of a corresponding attribute based on the network attribute value of each to-be-detected pedestrian image;

and determining the attribute type of the pedestrian image to be detected according to the predicted value of each attribute.

Optionally, preprocessing the pedestrian image to be detected to obtain a processed image specifically includes:

scaling the pedestrian image to be detected to obtain a scaled image;

randomly flipping the scaled image horizontally to obtain a flipped image;

and zero-padding the flipped image to obtain the preprocessed image.

Optionally, the convolutional image feature $X_{img}$ of the pedestrian image to be detected is obtained according to the following formula:

$X_{img} = f_{cnn}(I_{img}; \theta_{cnn})$;

wherein $X_{img} \in \mathbb{R}^{C_{feat}}$, $\mathbb{R}^{C_{feat}}$ is the real number space in which the pedestrian features are located, $C_{feat}$ is the number of layers (channels) of the convolution features, i.e. the dimension of the pedestrian feature space, $f_{cnn}$ is the deep neural network, $I_{img}$ is the processed image, $I_{img} \in \mathbb{R}^{H \times W \times C}$, H is the height of the convolution feature map, W is the width of the convolution feature map, C is the number of layers (channels) of the processed image input to the deep neural network, and $\theta_{cnn}$ is the learnable parameter of the deep neural network.

Optionally, determining the weight parameters of each attribute classifier according to the fully connected layer and the convolutional image features specifically includes:

establishing, according to the fully connected layer and the convolutional image features, an attribute classifier $Cls(X_{img}; \theta_{cls})$ for pedestrian image attribute identification, wherein $X_{img}$ is the convolutional image feature, $\theta_{cls}$ is the weight parameter of the attribute classifier, $\theta_{cls} \in \mathbb{R}^{C_{feat} \times N_{attr}}$, $\mathbb{R}^{C_{feat} \times N_{attr}}$ is the real number space in which the weight parameters of the attribute classifier are located, $C_{feat}$ is the number of layers (channels) of the convolution features, and $N_{attr}$ is the number of labeled attributes pre-stored in a database;

determining, based on the attribute classifier for pedestrian image attribute identification, the weight parameter $\theta_{cls}^{i}$ of each attribute classifier, wherein $\theta_{cls}^{i} \in \mathbb{R}^{C_{feat}}$, i is the serial number of the current attribute classifier, the number of attribute classifiers is equal to the number of attribute classes, and $i = 1, 2, \ldots, N_{attr}$.

Optionally, the determining, based on the convolution image features and the weight parameters, a network attribute value of the pedestrian image to be detected under different attribute classifiers includes:

respectively carrying out normalization processing on the convolution image characteristics and the weight parameters to obtain corresponding normalization characteristics and normalization weight parameters;

and determining the network attribute values of the pedestrian images to be detected under different attribute classifiers according to the normalization features and the normalization weight parameters.

Optionally, the network attribute values of the pedestrian image to be detected under the different attribute classifiers are determined according to the following formula:

$z_{i} = \alpha \cdot \hat{X}_{img}^{\top} \hat{\theta}_{cls}^{i}, \quad i = 1, 2, \ldots, N_{attr}$;

wherein $N_{attr}$ is the number of labeled attributes pre-stored in the database, i is the serial number of the current attribute classifier, $z_{i}$ is the network attribute value of the pedestrian image to be detected under the i-th attribute classifier, $\alpha$ is a scaling factor, $\hat{X}_{img}$ is the normalized feature, and $\hat{\theta}_{cls}^{i}$ is the normalized weight parameter of the i-th attribute classifier.

Optionally, the predicted value of the corresponding attribute is determined according to the following formula:

$p_{i} = \mathrm{Sigmoid}(\mathrm{BN}(z_{i}))$;

wherein $p_{i}$ is the predicted value of the i-th attribute of the pedestrian image to be detected, BN(·) is the batch normalization layer processing function, Sigmoid(·) is the neural network activation function, and $z_{i}$ is the network attribute value of the pedestrian image to be detected under the i-th attribute classifier.
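The prediction step, a batch normalization layer followed by a Sigmoid activation applied to each network attribute value, can be illustrated with the following minimal Python sketch. It is an illustration only and not the implementation of the invention: the function names, the use of inference-time batch normalization with stored per-attribute mean and variance, and the numeric values are all assumptions introduced for the example.

```python
import math

def batch_norm_inference(z, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Inference-time batch normalization of one scalar.

    mean/var stand for stored running statistics and gamma/beta for the
    learned affine parameters; all of them are assumed values here.
    """
    return gamma * (z - mean) / math.sqrt(var + eps) + beta

def sigmoid(t):
    """Logistic activation mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def predict_attributes(values, means, variances):
    """Apply Sigmoid(BN(value)) independently to each attribute value."""
    return [sigmoid(batch_norm_inference(z, m, v))
            for z, m, v in zip(values, means, variances)]

# Hypothetical network attribute values for three attribute classifiers.
values = [20.0, 0.0, -20.0]
preds = predict_attributes(values, means=[0.0] * 3, variances=[100.0] * 3)
print(preds)
```

Each predicted value lies strictly between 0 and 1, so it can be read as a per-attribute confidence; a value of 0.5 corresponds to a network attribute value equal to the stored mean when beta is zero.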

In order to solve the technical problems, the invention also provides the following scheme:

a system for identifying attributes of pedestrians in a monitored scene, the system comprising:

the acquiring unit is used for acquiring an image of a pedestrian to be detected in a monitoring scene;

the preprocessing unit is used for preprocessing the pedestrian image to be detected to obtain a processed image;

the characteristic determining unit is used for obtaining the convolution image characteristic of the pedestrian image to be detected according to the processed image through the deep neural network;

the parameter determining unit is used for determining the weight parameters of the attribute classifiers according to the full-connection layer and the convolution image characteristics;

the calculation unit is used for determining the network attribute values of the pedestrian images to be detected under different attribute classifiers based on the convolution image characteristics and the weight parameters;

the prediction unit is used for determining a prediction value of the corresponding attribute based on the network attribute value of each pedestrian image to be detected;

and the attribute determining unit is used for determining the attribute type of the pedestrian image to be detected according to the predicted value of each attribute.

a system for identifying pedestrian attributes in a monitored scene, comprising:

a processor; and

a memory arranged to store computer executable instructions that, when executed, cause the processor to:

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

According to the embodiment of the invention, the invention discloses the following technical effects:

extracting the convolution image characteristics of the pedestrian image to be detected from the preprocessed processed image through a deep neural network, and further obtaining the weight parameters of each attribute classifier; and determining the network attribute values of the pedestrian images to be detected under different attribute classifiers based on the convolution image characteristics and the weight parameters, and further obtaining the predicted values of the corresponding attributes, so that the attribute types of the pedestrian images to be detected can be accurately determined.

Drawings

FIG. 1 is a flow chart of a method for pedestrian attribute identification in a surveillance scene according to the present invention;

FIG. 2 is a schematic block diagram of a pedestrian attribute identification system in a monitoring scene according to the present invention.

Description of the symbols:

1, acquiring unit; 2, preprocessing unit; 3, feature determining unit; 4, parameter determining unit; 5, calculating unit; 6, prediction unit; 7, attribute determining unit.

Detailed Description

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.

The invention aims to provide a pedestrian attribute identification method in a monitoring scene, which extracts the convolution image characteristics of a pedestrian image to be detected from a preprocessed processed image through a deep neural network, and further obtains the weight parameters of each attribute classifier; and determining the network attribute values of the pedestrian image to be detected under different attribute classifiers based on the convolution image characteristics and the weight parameters, and further obtaining the predicted values of the corresponding attributes, so that the attribute type of the pedestrian image to be detected can be accurately determined.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

As shown in fig. 1, the method for identifying the pedestrian attributes in the monitoring scene of the present invention includes:

step 100: acquiring an image of a pedestrian to be detected in a monitoring scene;

step 200: preprocessing the pedestrian image to be detected to obtain a processed image;

step 300: obtaining the convolution image characteristics of the pedestrian image to be detected through a deep neural network according to the processed image;

step 400: determining the weight parameters of each attribute classifier according to the full connection layer and the characteristics of the convolution image;

step 500: determining network attribute values of the pedestrian image to be detected under different attribute classifiers based on the convolution image features and the weight parameters;

step 600: determining a predicted value of a corresponding attribute based on the network attribute value of each to-be-detected pedestrian image;

step 700: and determining the attribute type of the pedestrian image to be detected according to the predicted value of each attribute.

Further, in step 200, the preprocessing the image of the pedestrian to be detected to obtain a processed image specifically includes:

Step 201: scaling the pedestrian image to be detected to obtain a scaled image. For example, the scaling is performed so that the image $I_{pedes}$ of the pedestrian to be detected is 0.75, but the invention is not limited thereto.

Step 202: randomly flipping the scaled image horizontally to obtain a flipped image.

Step 203: zero-padding the flipped image to obtain the preprocessed image.
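The three preprocessing operations described above (scaling, random horizontal flipping and zero-padding) can be sketched in plain Python as follows. This is a hedged illustration rather than the preprocessing code of the invention: the image is assumed to be a single-channel list of pixel rows, and the nearest-neighbour interpolation, the flip probability of 0.5, the target size and the padding width are all assumptions that the text does not specify.

```python
import random

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an image stored as a list of rows."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def random_hflip(img, rng, p=0.5):
    """Mirror every row left-to-right with probability p (p is assumed)."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return [row[:] for row in img]

def zero_pad(img, pad):
    """Surround the image with a border of `pad` zero-valued pixels."""
    width = len(img[0]) + 2 * pad
    border = [[0] * width for _ in range(pad)]
    body = [[0] * pad + row + [0] * pad for row in img]
    return border + body + [[0] * width for _ in range(pad)]

def preprocess(img, out_h, out_w, pad, rng):
    """Scale, randomly flip, then zero-pad, in the order given in the text."""
    return zero_pad(random_hflip(resize_nearest(img, out_h, out_w), rng), pad)

rng = random.Random(0)              # fixed seed so the sketch is repeatable
image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]              # toy 2x4 single-channel image
processed = preprocess(image, out_h=4, out_w=3, pad=1, rng=rng)
print(len(processed), len(processed[0]))
```

Random flipping is a training-time augmentation in common practice; whether it is also applied at inference time is not stated in the text, so the sketch leaves that choice to the caller through the probability p.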

Optionally, in step 300, the convolutional image feature $X_{img}$ of the pedestrian image to be detected is obtained according to the following formula:

$X_{img} = f_{cnn}(I_{img}; \theta_{cnn})$;

wherein $X_{img} \in \mathbb{R}^{C_{feat}}$, $\mathbb{R}^{C_{feat}}$ is the real number space in which the pedestrian features are located, $C_{feat}$ is the number of layers (channels) of the convolution features, $f_{cnn}$ is the deep neural network, $I_{img}$ is the processed image, $I_{img} \in \mathbb{R}^{H \times W \times C}$, H is the height of the convolution feature map, W is the width of the convolution feature map, C is the number of layers (channels) of the processed image input to the deep neural network, and $\theta_{cnn}$ is the learnable parameter of the deep neural network.

Preferably, in step 400, determining the weight parameters of each attribute classifier according to the fully connected layer and the convolutional image features specifically includes:

Step 401: establishing, according to the fully connected layer and the convolutional image features, an attribute classifier $Cls(X_{img}; \theta_{cls})$ for pedestrian image attribute identification,

wherein $X_{img}$ is the convolutional image feature, $\theta_{cls}$ is the weight parameter of the attribute classifier, $\theta_{cls} \in \mathbb{R}^{C_{feat} \times N_{attr}}$, $\mathbb{R}^{C_{feat} \times N_{attr}}$ is the real number space in which the weight parameters of the attribute classifier are located, $C_{feat}$ is the number of layers (channels) of the convolution features, and $N_{attr}$ is the number of labeled attributes pre-stored in the database.

Step 402: determining, based on the attribute classifier for pedestrian image attribute identification, the weight parameter $\theta_{cls}^{i}$ of each attribute classifier,

wherein $\theta_{cls}^{i} \in \mathbb{R}^{C_{feat}}$, i is the serial number of the current attribute classifier, the number of attribute classifiers is equal to the number of attribute classes, and $i = 1, 2, \ldots, N_{attr}$.

Further, in step 500, the determining, based on the convolution image features and the weight parameters, the network attribute values of the pedestrian image to be detected under different attribute classifiers specifically includes:

Step 501: normalizing the convolutional image features and the weight parameters respectively to obtain the corresponding normalized features and normalized weight parameters:

$\hat{X}_{img} = X_{img} / \lVert X_{img} \rVert_{2}$, $\hat{\theta}_{cls}^{i} = \theta_{cls}^{i} / \lVert \theta_{cls}^{i} \rVert_{2}$;

wherein $\hat{X}_{img}$ is the normalized feature and $\hat{\theta}_{cls}^{i}$ is the normalized weight parameter of the i-th attribute classifier.

After normalization, the modulus (length) of the normalized feature and of each normalized weight parameter is 1.

Step 502: and determining the network attribute values of the pedestrian images to be detected under different attribute classifiers according to the normalization features and the normalization weight parameters.

The network attribute values of the pedestrian image to be detected under the different attribute classifiers are determined according to the following formula:

$z_{i} = \alpha \cdot \hat{X}_{img}^{\top} \hat{\theta}_{cls}^{i}, \quad i = 1, 2, \ldots, N_{attr}$;

wherein $z_{i}$ is the network attribute value of the pedestrian image to be detected under the i-th attribute classifier, $\alpha$ is a scaling factor, $\hat{X}_{img}$ is the normalized feature, and $\hat{\theta}_{cls}^{i}$ is the normalized weight parameter of the i-th attribute classifier.

Introducing the scaling factor $\alpha$ balances the loss distribution of the positive and negative samples of the pedestrian attributes. In this embodiment $\alpha$ is 20, but the value is not limited thereto and can be adjusted according to actual needs.
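The normalization of step 501 and the scaled computation of step 502 can be sketched in plain Python as follows. The sketch assumes that the normalization is an L2 normalization (consistent with the statement that the modulus is 1 after normalization) and that the network attribute value is the scaling factor multiplied by the inner product of the normalized feature and the normalized classifier weight; the function names and the toy vectors are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def network_attribute_values(feature, classifier_weights, alpha=20.0):
    """Return alpha * cosine(feature, w_i) for every classifier weight w_i.

    alpha = 20 is the value given for the embodiment; the inner-product
    form of the computation is an assumption of this sketch.
    """
    x_hat = l2_normalize(feature)
    values = []
    for w in classifier_weights:
        w_hat = l2_normalize(w)
        cosine = sum(a * b for a, b in zip(x_hat, w_hat))
        values.append(alpha * cosine)
    return values

# Toy 2-dimensional feature and three hypothetical classifier weights:
# one parallel, one orthogonal and one opposite to the feature.
feature = [3.0, 4.0]
weights = [[3.0, 4.0], [-4.0, 3.0], [-6.0, -8.0]]
values = network_attribute_values(feature, weights)
print(values)
```

Because both vectors have unit length, every value is bounded by the scaling factor in absolute value regardless of the magnitude of the learned weights, which is the property the text relies on to remove the dependence on the prior distribution of the attributes.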

The predicted value obtained by the invention for a given attribute effectively improves the identification performance for that attribute, and the overall performance across all attributes is obtained by averaging the predicted values of all the attributes.

Specifically, the prediction threshold is the average of the predicted values of all the attributes; the predicted value of each attribute is compared with the prediction threshold, and the attributes whose predicted values are larger than the prediction threshold are selected as the attribute types of the pedestrian image to be detected. The attribute types include age, gender, backpack, clothing, and the like.
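The threshold rule described above, in which the mean of all predicted values serves as the prediction threshold, can be illustrated with a short Python sketch. The attribute names and the predicted values are hypothetical and chosen only to make the example concrete.

```python
def select_attributes(predictions, names):
    """Keep the attributes whose predicted value exceeds the mean prediction."""
    threshold = sum(predictions) / len(predictions)
    selected = [name for name, p in zip(names, predictions) if p > threshold]
    return threshold, selected

# Hypothetical attribute names and predicted values for one pedestrian image.
names = ["female", "backpack", "long_sleeves", "hat"]
preds = [0.9, 0.2, 0.7, 0.1]
threshold, selected = select_attributes(preds, names)
print(threshold, selected)
```

With this rule the threshold adapts to each image, so at least one attribute is selected whenever the predicted values are not all equal.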

1) The invention realizes pedestrian attribute identification in the monitoring scene by a simple and efficient method, making it more suitable for deployment on hardware facilities in the monitoring scene; 2) the pedestrian picture features and the weights of the different attribute classifiers are normalized before the computation, which solves the problem that the weights of the pedestrian attribute classifiers depend on the prior distribution of pedestrian attributes in the scene; 3) introducing the scaling factor further addresses the unbalanced distribution of positive and negative samples, so that the network model is easier to optimize and its performance is further improved.

In particular, the present invention has several distinct advantages over the prior art:

1) The calculation amount and the model parameter amount of all existing algorithms are significantly higher than those of the invention; the invention achieves pedestrian attribute identification performance equivalent to that of the current best method while using only 63.18% of the parameter amount and 46.18% of the calculation amount.

2) The method solves the problem of unbalance of positive and negative samples in pedestrian attribute identification, and realizes that the prediction performance of the pedestrian attributes does not depend on the distribution prior of the pedestrian attributes in a scene by normalizing the pedestrian picture characteristics and the weight of each attribute classifier.

3) The more efficient algorithm and the more lightweight model enable the algorithm to be better applied to monitoring hardware facilities in a scene than other algorithms.

In addition, the invention also provides a pedestrian attribute identification system in the monitoring scene, which can improve the identification of the pedestrian attribute.

As shown in fig. 2, the system for identifying the attributes of pedestrians in the monitoring scene of the present invention includes an acquiring unit 1, a preprocessing unit 2, a feature determining unit 3, a parameter determining unit 4, a calculating unit 5, a prediction unit 6, and an attribute determining unit 7.

The acquiring unit 1 is used for acquiring an image of a pedestrian to be detected in a monitoring scene; the preprocessing unit 2 is used for preprocessing the pedestrian image to be detected to obtain a processed image; the characteristic determining unit 3 is used for obtaining the convolution image characteristic of the pedestrian image to be detected through the deep neural network according to the processed image; the parameter determining unit 4 is configured to determine a weight parameter of each attribute classifier according to the full connection layer and the convolution image feature; the calculating unit 5 is configured to determine network attribute values of the pedestrian image to be detected under different attribute classifiers based on the convolution image features and the weight parameters; the prediction unit 6 is configured to determine a prediction value of a corresponding attribute based on the network attribute value of each to-be-detected pedestrian image; the attribute determining unit 7 is configured to determine the attribute type of the pedestrian image to be detected according to the predicted value of each attribute.

The invention further provides the following scheme:

a processor; and

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

The invention further provides the following scheme:

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

Compared with the prior art, the beneficial effects of the computer-readable storage medium, the pedestrian attribute identification system in the monitoring scene and the pedestrian attribute identification method in the monitoring scene are the same, and are not repeated herein.

Compared with the prior art, the image retrieval system and the computer readable storage medium have the same beneficial effects as the image retrieval method, and are not repeated herein.

So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims

1. A method for identifying attributes of pedestrians in a monitoring scene is characterized by comprising the following steps:

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

2. The method for identifying the attributes of the pedestrians in the monitored scene according to claim 1, wherein the preprocessing the image of the pedestrian to be detected to obtain a processed image specifically comprises:

scaling the pedestrian image to be detected to obtain a scaled image;

and zero-padding the flipped image to obtain the preprocessed image.

3. The method for identifying the attributes of pedestrians in the monitored scene according to claim 1, wherein the convolutional image feature $X_{img}$ of the pedestrian image to be detected is obtained according to the following formula:

$X_{img} = f_{cnn}(I_{img}; \theta_{cnn})$;

wherein $X_{img} \in \mathbb{R}^{C_{feat}}$, $\mathbb{R}^{C_{feat}}$ is the real number space in which the pedestrian features are located, $C_{feat}$ is the number of layers (channels) of the convolution features, $f_{cnn}$ is the deep neural network, $I_{img}$ is the processed image, $I_{img} \in \mathbb{R}^{H \times W \times C}$, H is the height of the convolution feature map, W is the width of the convolution feature map, C is the number of layers (channels) of the processed image input to the deep neural network, and $\theta_{cnn}$ is the learnable parameter of the deep neural network.

4. The method for identifying the attributes of the pedestrian in the monitored scene according to claim 1, wherein the determining the weight parameters of each attribute classifier according to the full-connected layer and the convolution image features specifically comprises:

wherein ,

5. The method for identifying the attributes of the pedestrians in the monitored scene according to claim 1, wherein the determining the network attribute values of the images of the pedestrians to be detected under different attribute classifiers based on the features of the convolution images and the weight parameters specifically comprises:

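6. The method for identifying the attributes of pedestrians in the monitored scene according to claim 5, wherein the network attribute values of the pedestrian image to be detected under the different attribute classifiers are determined according to the following formula:

$z_{i} = \alpha \cdot \hat{X}_{img}^{\top} \hat{\theta}_{cls}^{i}, \quad i = 1, 2, \ldots, N_{attr}$;

wherein $z_{i}$ is the network attribute value of the pedestrian image to be detected under the i-th attribute classifier, $\alpha$ is a scaling factor, $\hat{X}_{img}$ is the normalized feature, and $\hat{\theta}_{cls}^{i}$ is the normalized weight parameter of the i-th attribute classifier.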

7. The method of claim 1, wherein the predicted value of the corresponding attribute is determined according to the following formula:

8. A system for identifying attributes of pedestrians in a monitored scene, the system comprising:

9. A system for identifying pedestrian attributes in a monitored scene, comprising:

a processor; and

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;

10. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:

acquiring an image of a pedestrian to be detected in a monitoring scene;

preprocessing the pedestrian image to be detected to obtain a processed image;