CN110866426A

CN110866426A - Pedestrian identification method based on light field camera and deep learning

Info

Publication number: CN110866426A
Application number: CN201810985730.XA
Authority: CN
Inventors: 石凡; 赵宇峰; 赵萌; 贾晨; 栾昊; 陈胜勇; 冯洋博
Original assignee: Tianjin University of Technology
Current assignee: Tianjin University of Technology
Priority date: 2018-08-28
Filing date: 2018-08-28
Publication date: 2020-03-06

Abstract

A pedestrian recognition method based on a light field camera and deep learning comprises the following steps: ① acquiring a plurality of pedestrian images with the light field camera; ② processing the original pedestrian images obtained in step ① with the Lytro desktop software to obtain color pedestrian images and depth pedestrian images; ③ preprocessing the color images and depth images obtained in step ②, normalizing them to a uniform size, and dividing them into positive and negative samples to obtain a light field image data set; ④ initializing a model based on deep learning; ⑤ constructing a joint convolution feature of the color images and the depth images; ⑥ extracting the convolution features of the training samples by the method in ④, using an existing ResNet50 image classification model trained on the ImageNet data set; and ⑦ repeatedly training the neural network on the features obtained in step ⑥ to obtain a new classification model.

Description

Pedestrian identification method based on light field camera and deep learning

Technical Field

The invention belongs to the field of computer vision, and particularly relates to a pedestrian identification method based on a light field camera and deep learning.

Background

Pedestrian identification is an important part of computer vision research and plays an important role in fields such as intelligent transportation, video surveillance, artificial intelligence and automatic driving. In recent years, with the rapid development of computer hardware and new photographic techniques, the industry has placed more stringent demands on the performance and accuracy of pedestrian recognition.

With the explosive development of automatic driving technology in recent years, accurate pedestrian identification has become particularly important. Pedestrians exhibit the characteristics of both rigid and non-rigid objects; the shooting angle varies, illumination and occlusion interfere, and a large number of human figures appear on traffic signs and street advertising signboards. As a result, false detection has always been a key problem affecting pedestrian detection performance. Researchers have therefore done a great deal of work in recent years on acquiring pedestrian features and optimizing detection methods, and have extracted pedestrian features by combining multiple sensors, so as to reduce the false detection rate and improve the pedestrian detection rate.

Ren Ng and other researchers created the "light field camera" in the laboratory of Professor Pat Hanrahan at Stanford University, USA. The body of a light field camera resembles that of a typical digital camera, but the internal structure is quite different. An ordinary camera captures light through a main lens and focuses it on a film or photosensor behind the lens, and the sum of all the light rays forms the small points on the photograph that make up the image. The light field camera places a microlens array of 90,000 microlenses between the main lens and the photosensor. Each microlens receives the light coming from the main lens and passes it on to the photosensor, so that the focused light is separated out and the light data is converted and recorded digitally. The camera's built-in software operates on this "expanded light field", traces where each ray would land on images at different distances, and produces a sharp photo after digital refocusing.

In addition, contrary to the traditional approach of reducing the lens aperture to gain depth of field, the light field camera uses the microlens array to control the extra light and reveal the depth of field of each image. The tiny sub-images are then projected onto the photosensor, and the blurred halos around the focused image become sharp. The camera thus keeps the benefits of a large aperture (more light, shorter exposure time and less grain) without sacrificing depth of field or image sharpness.

Compared with digital cameras, light field cameras have several remarkable characteristics.

1. Shoot first, focus later: a digital camera captures only one focal plane, so the in-focus region is sharp and everything outside it is blurred; a light field camera records the light rays from all directions, the focus is chosen on a computer afterwards as needed, and the final image is produced on the computer.

2. Small size and high speed: because its imaging technology differs from that of a digital camera, the light field camera has no complex focusing system, so it is smaller overall and simpler to operate; and because no focusing is needed before shooting, it also shoots faster.

The common target identification method at present is to capture samples with a color camera and identify the target with a machine learning or deep learning method. When such methods encounter pedestrians depicted on a two-dimensional plane (for example, on signs or advertising boards), their accuracy and robustness are poor.

A machine-learning-based method learns the patterns of the human body from training samples to obtain a model, which is then evaluated on a test set. If the data and features are chosen sensibly and trained with a suitable algorithm, the false detection of pedestrians on two-dimensional planes can be handled well.

Machine-learning-based methods generally comprise three components: feature extraction, classifier training, and detection. The most common feature in pedestrian identification is the Histogram of Oriented Gradients (HOG). Combining HOG features with a Support Vector Machine (SVM) achieves good results in pedestrian recognition. However, HOG is a typical hand-crafted feature, and its performance on image classification and on detecting objects in arbitrary poses, such as pedestrians, animals and buildings, is unsatisfactory. Designing hand-crafted features like HOG also requires the designer to have strong vision-research skills and rich research experience. Looking back on a decade of object recognition research, the proposed models and algorithms were all built on hand-crafted features, and progress was slow.
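
As a rough illustration of the feature-extraction component, the following Python sketch computes a simplified orientation histogram for a single image cell, which is the basic building block of HOG. It is not part of the invention: the function name `cell_orientation_histogram`, the 9-bin unsigned layout and the use of NumPy are assumptions made for illustration, and a full HOG descriptor would add bin interpolation and block normalization before the result is passed to an SVM.

```python
import numpy as np


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    cell = np.asarray(cell, dtype=np.float64)
    gy, gx = np.gradient(cell)                 # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    # Clip guards against rounding that lands exactly on 180 degrees.
    bins = np.minimum((angle / bin_width).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
```

A cell whose brightness increases only from left to right puts all of its gradient energy into the first bin, while a top-to-bottom ramp fills the bin around 90 degrees.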

In recent years, with the development of science and technology, deep learning has become one of the hottest research directions in computer vision. Research teams have since achieved very good performance on image recognition as well as on other tasks such as detection and segmentation. Applying deep learning to pedestrian recognition is therefore the current trend, with broad research significance and application prospects.

Disclosure of Invention

The invention provides a pedestrian identification method based on a light field camera and deep learning, aiming at overcoming the defects of the existing pedestrian identification method, and the accuracy and the robustness of the pedestrian detection method can be effectively improved by using the method.

As conceived above, the technical scheme of the invention is as follows: a pedestrian identification method based on a light field camera and deep learning is characterized in that: the method comprises the following steps:

① acquiring a plurality of pedestrian images with a light field camera;

② processing the original pedestrian images obtained in step ① with the Lytro desktop software to obtain a color pedestrian image and a depth pedestrian image;

③ preprocessing the color image and the depth image obtained in step ②, normalizing the preprocessed color image and depth image into a uniform size, and dividing the image into positive and negative samples to obtain a light field image data set;

④ initializing the model based on deep learning, namely freezing the model parameters of all convolutional layers, passing the original image through all the convolutional layers to obtain convolution features, and continuing training on that basis;

⑤ constructing a joint convolution feature of the color image and the depth image, namely processing the color pedestrian image and the depth pedestrian image obtained in ② with the neural network in ④ respectively to obtain a fused convolution feature;

⑥ extracting the convolution features of the training samples by the method in ④, using the existing ResNet50 image classification model trained on the ImageNet data set;

⑦ repeatedly training the neural network on the features obtained in ⑥ to obtain a new classification model.
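
Step ③ can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the 224 × 224 target size, nearest-neighbour resizing, the label convention (1 = pedestrian, 0 = non-pedestrian) and the helper names `resize_nearest` and `build_dataset` are all hypothetical.

```python
import numpy as np


def resize_nearest(image, size):
    """Resize an H x W or H x W x C array to (height, width) by nearest-neighbour sampling."""
    height, width = size
    rows = np.arange(height) * image.shape[0] // height   # source row for each output row
    cols = np.arange(width) * image.shape[1] // width     # source column for each output column
    return image[rows[:, None], cols]


def build_dataset(positives, negatives, size=(224, 224)):
    """Bring (color, depth) pairs to a uniform size and attach labels.

    positives / negatives are lists of (color_image, depth_image) arrays.
    Labels are an assumed convention: 1 for pedestrian, 0 for non-pedestrian.
    """
    colors, depths, labels = [], [], []
    for label, pairs in ((1, positives), (0, negatives)):
        for color, depth in pairs:
            colors.append(resize_nearest(color, size))
            depths.append(resize_nearest(depth, size))
            labels.append(label)
    return np.stack(colors), np.stack(depths), np.array(labels)
```

The three returned arrays share the same first dimension, so sample `i` of the color stack, the depth stack and the label vector always belong together.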

The deep learning employs a Keras framework.

The invention has the following advantages and positive effects:

1. The invention is more effective against the false identification of pedestrians depicted on a two-dimensional plane. As a supplement to existing pedestrian identification methods, it requires little computation and little data, places low demands on machine hardware, has a low experimental cost, and can be applied in real industrial environments.

2. The method effectively improves the accuracy and robustness of pedestrian detection.

Drawings

FIG. 1 is a flow chart of a pedestrian recognition method based on a light field camera and deep learning according to the invention.

FIG. 2 is a schematic diagram of the structure of the deep convolutional network used in the present invention.

Detailed Description

The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.

The light field camera used in the present invention is produced by Lytro. The experimental platform is a PC.

As shown in FIG. 1, a pedestrian identification method based on a light field camera and deep learning comprises the following steps:

① obtaining multiple pedestrian pictures with a light field camera;

② processing the original light field pedestrian images obtained in step ① with the Lytro desktop software to obtain a color pedestrian image and a depth pedestrian image;

④ initializing the model based on transfer learning, namely freezing the model parameters of all convolutional layers, passing the original image through all the convolutional layers to obtain convolution features, and continuing training on that basis; the aim is to extract effective and useful convolution features by reusing existing neural network parameters obtained by training on a large-scale data set;
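
A minimal sketch of this freezing step, assuming the Keras API bundled with TensorFlow (`tensorflow.keras.applications.ResNet50`). The helper name `build_frozen_backbone`, the 224 × 224 × 3 input shape and the global average pooling are assumptions for illustration and are not specified by the invention.

```python
def build_frozen_backbone(weights="imagenet", input_shape=(224, 224, 3)):
    """Return ResNet50 without its classification head, with all layers frozen."""
    # Imported inside the function so the sketch can be read without TensorFlow installed.
    from tensorflow.keras.applications import ResNet50

    backbone = ResNet50(
        weights=weights,          # "imagenet" loads the pre-trained parameters
        include_top=False,        # drop the original 1000-class classifier
        input_shape=input_shape,  # assumed input size
        pooling="avg",            # assumed: one feature vector per image
    )
    backbone.trainable = False    # freeze every layer; training will not update them
    return backbone
```

Passing `weights=None` builds the same architecture without downloading the ImageNet parameters, which is convenient for checking shapes.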

This completes the pedestrian identification method based on a light field camera and deep learning.

As shown in FIG. 1 and FIG. 2, the method specifically comprises the following steps:

1. First, a plurality of pictures of scenes containing pedestrians are shot with a Lytro light field camera, and the pictures are then processed with the official Lytro software to obtain color images and depth images of the pedestrians.

2. The color images and depth images obtained in step 1 are preprocessed and normalized to a uniform size, and divided into 600 positive samples and 600 negative samples to obtain a light field image data set.

3. The existing ResNet50 image classification model trained on the ImageNet data set is downloaded and converted to the format of the Keras framework used by the invention.

4. Using the model processed in step 3, the model parameters of all convolutional layers of the pre-trained ResNet50 are frozen.

5. Before the neural network is trained, data augmentation is applied to the data set, mainly color channel shift, image rotation, mirroring, random cropping and the like. With the Keras deep learning framework, these image processing operations can be carried out conveniently through built-in functions.

6. Using the model parameters processed in step 4, the color image is passed through all the convolutional layers of ResNet50 to obtain a feature vector X1.

7. The depth image is input into one convolutional layer with a 3 × 3 convolution kernel, and a feature vector X2 is obtained after the convolution operation.

8. The convolution features from step 6 and step 7 are weighted and added to form a new convolution feature X = W1·X1 + W2·X2, where W1 and W2 are the weights.

9. The convolution feature X obtained in step 8 is input into a fully connected layer to obtain the final classification result.

10. Based on the classification result, the false recognition rate of pedestrian recognition can be reduced, particularly when the pedestrian data set contains pedestrians depicted on a two-dimensional plane.
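
Steps 7 to 9 can be sketched in NumPy as follows. This is an illustration under assumptions, not the actual network: X1 is taken as given (the ResNet50 output of step 6), X2 is assumed to have been brought to the same shape as X1 so that the two can be added, and the default weight values, the sigmoid output and the function names `conv3x3`, `fuse` and `classify` are hypothetical.

```python
import numpy as np


def conv3x3(depth, kernel):
    """3 x 3 'valid' convolution of a 2-D depth map (cross-correlation, as in CNN layers)."""
    h, w = depth.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            # Add the shifted image weighted by the matching kernel entry.
            out += kernel[i, j] * depth[i:i + h - 2, j:j + w - 2]
    return out


def fuse(x1, x2, w1=0.5, w2=0.5):
    """Weighted sum X = W1*X1 + W2*X2 of two equally shaped feature arrays."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.shape != x2.shape:
        raise ValueError("X1 and X2 must have the same shape to be added")
    return w1 * x1 + w2 * x2


def classify(x, fc_weights, fc_bias):
    """Fully connected layer with an assumed sigmoid output: P(pedestrian)."""
    logit = float(np.dot(fc_weights, x) + fc_bias)
    return 1.0 / (1.0 + np.exp(-logit))
```

For example, `fuse([1, 2], [3, 4], w1=0.25, w2=0.75)` gives `[2.5, 3.5]`, which `classify` then maps to a probability through the fully connected weights.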

In the invention, the number of training iterations of the neural network is set to 80, and the optimizer is SGD.

As described above, the method proposed by the invention for detecting the misrecognition of pedestrians on a two-dimensional plane has been described clearly and in detail.
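
The training setting can be illustrated by a small NumPy sketch of stochastic gradient descent (SGD) on a single sigmoid output layer. It assumes that the 80 iterations are passes over the training set (epochs); the learning rate, the log-loss and the name `train_head_sgd` are likewise assumptions, and in practice the Keras SGD optimizer would be applied to the trainable layers of the network.

```python
import numpy as np


def train_head_sgd(features, labels, epochs=80, lr=0.1, seed=0):
    """Train a one-layer sigmoid classifier with per-sample SGD on the log-loss."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels, dtype=float)
    rng = np.random.default_rng(seed)
    n_samples, n_features = features.shape
    weights, bias = np.zeros(n_features), 0.0
    for _ in range(epochs):                       # assumed: 80 passes over the data
        for i in rng.permutation(n_samples):      # visit the samples in random order
            z = features[i] @ weights + bias
            p = 1.0 / (1.0 + np.exp(-z))          # predicted probability of "pedestrian"
            error = p - labels[i]                 # gradient of the log-loss w.r.t. z
            weights -= lr * error * features[i]
            bias -= lr * error
    return weights, bias
```

Because the convolutional layers are frozen, only a small head like this one has to be fitted, which is consistent with the low data and hardware requirements claimed for the method.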

While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A pedestrian identification method based on a light field camera and a deep learning technology is characterized in that: the method comprises the following steps:

① acquiring a plurality of pedestrian images with a light field camera;

2. The pedestrian recognition method based on the light field camera and the deep learning technology according to claim 1, wherein: the deep learning employs a Keras framework.