CN110866425A - Pedestrian identification method based on light field camera and depth migration learning - Google Patents
- Publication number
- CN110866425A (application number CN201810985726.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/192—Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
- G06V30/194—References adjustable by an adaptive method, e.g. learning
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
A pedestrian identification method based on a light field camera and deep transfer learning (rendered in the title as "depth migration learning") comprises the following steps: ① acquiring a plurality of pedestrian images with the light field camera; ② processing them with the Lytro desktop software to obtain color pedestrian images and depth pedestrian images; ③ preprocessing the color images and depth images obtained in step ②, normalizing them to a uniform size, and dividing them into positive and negative samples to obtain a light field image data set; ④ initializing the model; ⑤ taking an existing VGG16 image classification model trained on the ImageNet data set, freezing the earlier convolutional blocks, and retaining the parameters of the last convolutional block to obtain the initial values of the neural network; ⑥ processing the color pedestrian images and the depth pedestrian images of step ① through the neural network of step ④ to obtain mixed convolutional features; and ⑦ repeatedly training the neural network on the convolutional features obtained in step ⑥ and fine-tuning the model to obtain a new classification model.
Description
Technical Field
The invention belongs to the field of computer vision and relates in particular to a pedestrian identification method based on a light field camera and deep transfer learning.
Background
Pedestrian identification is an important topic in computer vision research and plays a major role in intelligent transportation, video surveillance, artificial intelligence, autonomous driving, and related fields. In recent years, with the rapid development of computer hardware and new imaging techniques, industry has placed increasingly stringent demands on the performance and accuracy of pedestrian identification.
With the explosive development of autonomous driving technology in recent years, the accuracy of pedestrian identification has become particularly important. Pedestrians combine the characteristics of rigid and non-rigid objects: the shooting angle varies, illumination and occlusion interfere, and a large number of human figures appear on traffic signs and street advertising billboards. As a result, false detection has long been a key problem limiting pedestrian detection performance. Researchers have therefore done a great deal of work in recent years on acquiring pedestrian features and optimizing detection methods, including extracting pedestrian features with multiple sensors, in order to reduce the false detection rate and improve the detection rate.
Ren Ng, together with other researchers in Professor Pat Hanrahan's laboratory at Stanford University in the United States, created the "light field camera". The body of a light field camera is comparable to that of an ordinary digital camera, but its internal structure is quite different. An ordinary camera collects light through the main lens and focuses it on the film or photosensor behind the lens; the sum of all the rays reaching a point forms a small spot on the photograph, and these spots make up the image. The light field camera places a microlens array of 90,000 microlenses between the main lens and the photosensor. Each microlens receives the light coming from the main lens and passes it on to the photosensor, separating the focused rays so that the light data can be converted and recorded digitally. The camera's built-in software operates on this "expanded light field", tracks where each ray would land on image planes at different distances, and produces a sharp photograph after digital refocusing.
In addition, the light field camera reverses the conventional approach of reducing the lens aperture to increase depth of field. The microlens array controls the extra light and reveals the depth of field of each sub-image; the tiny sub-images are then projected onto the photosensor, so that the blurred halos around the focused image become sharp. The benefits of a large aperture (more light, shorter exposure time, and less grain) are thus retained without sacrificing the depth of field or sharpness of the image.
Compared with digital cameras, light field cameras have several notable characteristics.
1. Shoot first, focus later: a digital camera captures a single focal plane, so the image is sharp in focus and blurred out of focus; a light field camera records the light rays in all directions, the focus is chosen on a computer afterwards as needed, and the final image is rendered on the computer.
2. Small size and high speed: because its imaging technique differs from that of a digital camera, the light field camera needs no complex focusing system, so it is smaller overall and simpler to operate; and because no focus has to be selected at capture time, shooting is faster.
The common approach to target identification today is to capture samples with a color camera and identify the target with a machine learning or deep learning method. When faced with pedestrians depicted on a two-dimensional plane (for example, human figures printed on signs or billboards), these methods show poor accuracy and robustness.
A machine learning-based method learns the regularities of the human body from training samples to obtain a model, which is then evaluated on a test set. If the data and features are chosen well and trained with a suitable algorithm, the problem of falsely detecting pedestrians on a two-dimensional plane can be largely solved.
Machine learning-based methods generally comprise three parts: feature extraction, classifier training, and detection. The most common feature in pedestrian identification is the Histogram of Oriented Gradients (HOG). HOG features combined with a Support Vector Machine (SVM) achieve good results in pedestrian identification. However, HOG is a typical hand-crafted feature, and its performance is unsatisfactory for image classification and for detecting objects such as pedestrians, animals, and buildings in arbitrary poses. Designing hand-crafted features like HOG also requires excellent insight into vision research and extensive research experience. Looking back over ten years of object recognition research, the proposed models and algorithms were all based on hand-crafted features, and progress was slow.
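To make the hand-crafted nature of HOG concrete, the following minimal Python sketch computes an unsigned gradient-orientation histogram for a single grayscale cell. It is an illustration only and is not taken from the patent: the function name `cell_orientation_histogram`, the 9-bin layout, and the hard (non-interpolated) voting are assumptions, and a full HOG descriptor would also apply block normalization before the features are passed to an SVM.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Unsigned (0-180 degree) gradient-orientation histogram of one cell.

    `cell` is a list of rows of grayscale values. Border pixels are skipped
    because central differences need both neighbours (a simplification).
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for one bin, weighted by gradient magnitude.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist
```

Every choice in this sketch (bin count, voting scheme, border handling) is a manual design decision, which is the limitation the paragraph above attributes to hand-crafted features.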
In recent years, deep learning has become one of the most active research directions in computer vision. Research teams have achieved very good performance with it on image recognition and on other tasks such as detection and segmentation. Applying deep learning to pedestrian identification is therefore a clear trend, with broad research significance and application prospects.
Disclosure of Invention
To overcome the shortcomings of existing pedestrian identification methods, the invention provides a pedestrian identification method based on a light field camera and deep transfer learning, which can effectively improve the accuracy and robustness of pedestrian detection.
Based on this concept, the technical scheme of the invention is a pedestrian identification method based on a light field camera and deep transfer learning, characterized in that the method comprises the following steps:
① acquiring a plurality of pedestrian images with a light field camera;
② processing the original pedestrian images obtained in step ① with the Lytro desktop software to obtain color pedestrian images and depth pedestrian images;
③ preprocessing the color images and depth images obtained in step ②, normalizing them to a uniform size, and dividing the images into positive and negative samples to obtain a light field image data set;
④ initializing the model: fine-tuning is implemented with a "gradual migration" strategy based on transfer learning;
⑤ using the existing VGG16 image classification model trained on the ImageNet data set, freezing the earlier convolutional blocks and retaining the parameters of the last convolutional block to obtain the initial values of the neural network;
⑥ processing the color pedestrian images and the depth pedestrian images of step ① separately through the neural network of step ④ to obtain mixed convolutional features;
⑦ repeatedly training the neural network on the convolutional features obtained in step ⑥ and fine-tuning the model to obtain a new classification model.
The transfer learning is implemented with the Keras framework.
The invention has the following advantages and positive effects:
1. The invention performs well against the false identification of pedestrians depicted on a two-dimensional plane. As a supplement to existing pedestrian identification methods, it requires little computation, little data, and modest hardware, and can be applied in real industrial environments.
2. The invention can effectively improve the accuracy and robustness of pedestrian detection.
Drawings
Fig. 1 is a flow chart of the pedestrian identification method based on a light field camera and deep transfer learning according to the present invention.
Fig. 2 is a schematic diagram of the structure of the deep convolutional network used in the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.
The light field camera used in the present invention is produced by Lytro. The experimental platform is a PC.
As shown in Fig. 1, a pedestrian identification method based on a light field camera and deep transfer learning comprises the following steps:
① acquiring a plurality of pedestrian images with a light field camera;
② processing the original pedestrian images obtained in step ① with the Lytro desktop software to obtain color pedestrian images and depth pedestrian images;
③ preprocessing the color images and depth images obtained in step ②, normalizing them to a uniform size, and dividing the images into positive and negative samples to obtain a light field image data set;
④ initializing the model: fine-tuning is implemented with a "gradual migration" strategy based on transfer learning, where "fine-tuning" means initializing the parameters of the target network from an already trained model and continuing training from there, the aim being to obtain good initial values for the neural network;
⑤ using the existing VGG16 image classification model trained on the ImageNet data set, freezing the earlier convolutional blocks and retaining the parameters of the last convolutional block to obtain the initial values of the neural network;
⑥ processing the color pedestrian images and the depth pedestrian images of step ① separately through the neural network of step ④ to obtain mixed convolutional features;
⑦ repeatedly training the neural network on the convolutional features obtained in step ⑥ and fine-tuning the model to obtain a new classification model.
This completes the pedestrian identification method based on a light field camera and deep learning.
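Step ③ (normalizing the paired color and depth images to a uniform size and labelling them as positive or negative samples) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names `resize_nearest` and `build_dataset`, the nearest-neighbour resampling, the record layout, and the tiny default target size are all hypothetical, since the patent does not specify the resizing method or the image size.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image (list of rows) by nearest-neighbour sampling.

    Pixels may be scalars (depth) or tuples such as (R, G, B) (color).
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def build_dataset(samples, size=(4, 4)):
    """Turn (color, depth, is_pedestrian) triples into uniform-size records.

    `size` is a hypothetical target (height, width); label 1 marks a positive
    (real pedestrian) sample and label 0 a negative sample.
    """
    out_h, out_w = size
    return [
        {
            "color": resize_nearest(color, out_h, out_w),
            "depth": resize_nearest(depth, out_h, out_w),
            "label": 1 if is_pedestrian else 0,
        }
        for color, depth, is_pedestrian in samples
    ]
```

Keeping the color and depth images of one shot in the same record preserves their pairing, which the later feature-fusion step relies on.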
As shown in Figs. 1 and 2, the invention is carried out in the following specific steps:
1. A number of pictures of scenes containing pedestrians are first taken with a Lytro light field camera and then processed with the official Lytro software to obtain color images and depth images of the pedestrians.
2. The color images and depth images obtained in step 1 are preprocessed and normalized to a uniform size, and the images are divided into 500 positive and 500 negative samples to obtain the light field image data set.
3. The existing VGG16 image classification model trained on the ImageNet data set is downloaded and converted to the format of the Keras framework used in the invention.
4. Using the model parameters prepared in step 3, the parameters of the first 4 convolutional blocks are frozen and the parameters of the 5th convolutional block are unfrozen so that they take part in neural network training and are updated.
5. Before the neural network is trained, the data set must be augmented, mainly by color channel shift, image rotation, mirroring, random cropping, and similar operations. With the Keras deep learning framework, these image processing operations can be carried out conveniently through built-in functions.
6. Using the parameters from step 4, the color image is passed through all the convolutional layers of VGG16 to obtain the feature vector $X_1$.
7. The depth image is fed into 1 convolutional layer with a 3 × 3 convolution kernel, and the feature vector $X_2$ is obtained after the convolution operation.
8. The convolutional features from steps 6 and 7 are combined by weighted addition to form a new convolutional feature, $X = W_1 X_1 + W_2 X_2$, where $X$ is the new convolutional feature.
9. The convolutional feature $X$ obtained in step 8 is fed into a fully connected layer to obtain the final classification result.
10. Based on the classification result, the false identification rate of pedestrian identification can be reduced, particularly when the pedestrian data set contains pedestrians depicted on a two-dimensional plane.
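The freezing rule of step 4 and the weighted fusion of step 8 can be sketched in framework-free Python as follows. This is a minimal sketch, not the patent's implementation: the layer names follow the naming used by the Keras VGG16 model, the helpers `trainable_flags` and `fuse` are hypothetical, and treating the weights as fixed scalars of 0.5 is an assumption, because the patent does not say how the two weights are chosen or how the shapes of the two feature vectors are matched.

```python
# Convolutional layers of VGG16, grouped into 5 blocks (Keras naming).
VGG16_CONV_LAYERS = [
    "block1_conv1", "block1_conv2",
    "block2_conv1", "block2_conv2",
    "block3_conv1", "block3_conv2", "block3_conv3",
    "block4_conv1", "block4_conv2", "block4_conv3",
    "block5_conv1", "block5_conv2", "block5_conv3",
]


def trainable_flags(layer_names, trainable_block="block5"):
    """Freeze every layer except those in the last convolutional block."""
    return {name: name.startswith(trainable_block) for name in layer_names}


def fuse(x1, x2, w1=0.5, w2=0.5):
    """Weighted addition X = W1*X1 + W2*X2 of two equal-length features.

    Scalar weights are an assumption made for this illustration.
    """
    if len(x1) != len(x2):
        raise ValueError("feature vectors must have the same length")
    return [w1 * a + w2 * b for a, b in zip(x1, x2)]
```

In Keras, the flags would typically be applied by setting each layer's `trainable` attribute before compiling the model, and the fused feature would then be flattened and passed to the fully connected layer of step 9.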
The number of training iterations of the neural network is set to 50, and the optimizer is set to Momentum.
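For readers unfamiliar with the Momentum optimizer, the sketch below shows the classical momentum update on a one-dimensional toy problem, run for the 50 iterations mentioned above. The function name `sgd_momentum`, the learning rate of 0.1, and the momentum coefficient of 0.9 are illustrative assumptions; the patent gives neither hyperparameter, and a real training run would apply the same rule to every trainable network parameter.

```python
def sgd_momentum(grad_fn, w0, lr=0.1, momentum=0.9, iterations=50):
    """Gradient descent with classical momentum on a scalar parameter.

    grad_fn returns the gradient of the loss at w. lr and momentum are
    hypothetical values chosen only for this illustration.
    """
    w, velocity = w0, 0.0
    for _ in range(iterations):
        # The velocity accumulates a decaying sum of past gradients.
        velocity = momentum * velocity - lr * grad_fn(w)
        w = w + velocity
    return w


def toy_gradient(w):
    """Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


w_final = sgd_momentum(toy_gradient, w0=0.0)
```

With `momentum=0.0` the same function reduces to plain gradient descent, which makes it easy to compare the two update rules on the same loss.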
The method proposed by the invention for detecting falsely identified pedestrians on a two-dimensional plane has thus been described clearly and in detail.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims (2)
1. A pedestrian identification method based on a light field camera and deep transfer learning, characterized in that the method comprises the following steps:
① acquiring a plurality of pedestrian images with a light field camera;
② processing the original pedestrian images obtained in step ① with the Lytro desktop software to obtain color pedestrian images and depth pedestrian images;
③ preprocessing the color images and depth images obtained in step ②, normalizing them to a uniform size, and dividing the images into positive and negative samples to obtain a light field image data set;
④ initializing the model: fine-tuning is implemented with a "gradual migration" strategy based on transfer learning;
⑤ using the existing VGG16 image classification model trained on the ImageNet data set, freezing the earlier convolutional blocks and retaining the parameters of the last convolutional block to obtain the initial values of the neural network;
⑥ processing the color pedestrian images and the depth pedestrian images of step ① separately through the neural network of step ④ to obtain mixed convolutional features;
⑦ repeatedly training the neural network on the convolutional features obtained in step ⑥ and fine-tuning the model to obtain a new classification model.
2. The pedestrian identification method based on a light field camera and deep transfer learning of claim 1, characterized in that the transfer learning is implemented with the Keras framework.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810985726.3A CN110866425A (en) | 2018-08-28 | 2018-08-28 | Pedestrian identification method based on light field camera and depth migration learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810985726.3A CN110866425A (en) | 2018-08-28 | 2018-08-28 | Pedestrian identification method based on light field camera and depth migration learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110866425A (en) | 2020-03-06 |
Family
ID=69651695
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810985726.3A Pending CN110866425A (en) | 2018-08-28 | 2018-08-28 | Pedestrian identification method based on light field camera and depth migration learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110866425A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102122390A (en) * | 2011-01-25 | 2011-07-13 | 于仕琪 | Method for detecting human body based on range image |
CN106203506A (en) * | 2016-07-11 | 2016-12-07 | 上海凌科智能科技有限公司 | A kind of pedestrian detection method based on degree of depth learning art |
CN106909924A (en) * | 2017-02-18 | 2017-06-30 | 北京工业大学 | A kind of remote sensing image method for quickly retrieving based on depth conspicuousness |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
CN107742099A (en) * | 2017-09-30 | 2018-02-27 | 四川云图睿视科技有限公司 | A kind of crowd density estimation based on full convolutional network, the method for demographics |
US20180129906A1 (en) * | 2016-11-07 | 2018-05-10 | Qualcomm Incorporated | Deep cross-correlation learning for object tracking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200306 |