CN108549852B - Automatic learning method of pedestrian detector in specific scene based on deep network enhancement - Google Patents

Automatic learning method of pedestrian detector in specific scene based on deep network enhancement

Info

Publication number
CN108549852B
Authority
CN
China
Prior art keywords
neural network
pedestrian
sample
server side
under
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810264330.XA
Other languages
Chinese (zh)
Other versions
CN108549852A (en
Inventor
郑慧诚
何炜雄
谢晓华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN201810264330.XA priority Critical patent/CN108549852B/en
Publication of CN108549852A publication Critical patent/CN108549852A/en
Application granted granted Critical
Publication of CN108549852B publication Critical patent/CN108549852B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Abstract

The invention discloses an automatic learning method of a pedestrian detector in a specific scene based on deep network enhancement, which comprises the following steps: training a first neural network and a second neural network with a universal data set at the server side, wherein the second neural network is deployed on the embedded device; capturing images of the current scene with the embedded device to obtain newly added image samples and transmitting them to the server side; testing the newly added image samples at the server side with the previously trained first neural network and labeling the samples according to the test scores; estimating the size of the pedestrian detection frame at the current camera height, removing positive samples whose detection frames differ obviously from the estimated size, and keeping the remaining samples; tuning the second neural network at the server side; and redeploying the tuned second neural network model from the server side to the embedded device. The method can quickly obtain an accurate pedestrian detection model in a specific scene.

Description

Automatic learning method of pedestrian detector in specific scene based on deep network enhancement
Technical Field
The invention relates to the field of pedestrian detection in video surveillance, and in particular to a method for automatically learning a pedestrian detector in a specific scene based on deep network enhancement.
Background
As camera coverage continues to expand, analyzing the behaviors, actions and trajectories of pedestrians from camera data has become an urgent need, and the technical basis of these needs is pedestrian detection. Pedestrian detection is performed by a pedestrian detector, whose task is to estimate the positions of pedestrians in the current scene; it plays a very important role in camera-based surveillance tasks such as pedestrian tracking and pedestrian recognition. Owing to factors such as illumination changes, camera angle changes and pedestrian pose changes, pedestrian detection remains a very challenging problem.
In recent years great progress has been made in this area: the combination of traditional HOG features with an SVM classifier achieved good results in pedestrian detection, and recent work based on convolutional neural networks, owing to their strong ability to learn the sample distribution, has pushed detector performance to a new level.
However, although these approaches achieve very good results on pedestrian detection benchmarks, the performance of a detector trained with a learning-based method depends on the distribution of its training set. When the detector works in another specific scene, occlusion, image quality and other scene factors can make the test distribution differ greatly from the training distribution, and detection performance degrades sharply. On the other hand, collecting and manually labeling data in every specific scene to train a dedicated model is very labor-intensive and becomes impractical when the number of deployed detectors is large. Therefore, how to use automatic learning to improve the adaptability of a pedestrian detector to a specific scene is a critical issue.
The existing methods mainly comprise the following methods:
(1) Methods based on context information and pedestrian size. See Xiaogang Wang, Meng Wang, and Wei Li, Scene-Specific Pedestrian Detection for Static Video Surveillance, IEEE TPAMI 36 (2014) 361-374. In this method, the current scene and the pedestrian size are modeled to obtain the probability that a detection frame is a positive or a negative sample, and the positive and negative samples obtained in this way are used to train an SVM classifier.
(2) Methods based on semi-supervision and an auxiliary detector. See Si Wu, Shufeng Wang, Robert Laganiere, Cheng Liu, Hau-San Wong, and Yong Xu, Exploiting Target Data to Learn Deep Convolutional Networks for Scene-Adapted Human Detection, IEEE TIP (2017). In this method, when a small number of positive and negative samples are available in the specific scene, an auxiliary detector is trained on these samples, further unlabeled samples are labeled through the output of the auxiliary detector, and these samples are finally used to train a model for the scene.
The above methods have several disadvantages. First, positive and negative samples in the current scene are obtained from context information such as pedestrian size and background modeling; because such information is not very reliable, the samples obtained in this way are relatively noisy. Meanwhile, the semi-supervised method requires a certain number of manually labeled samples, which is undoubtedly time-consuming and labor-intensive.
Disclosure of Invention
Aiming at the situation that current pedestrian detectors cannot locate pedestrians well in a specific scene, the invention provides an automatic learning method of a pedestrian detector in a specific scene based on deep network enhancement.
The purpose of the invention is achieved by the following technical scheme: an automatic learning method of a pedestrian detector in a specific scene based on deep network enhancement comprises the following steps:
(1) training a first neural network and a second neural network with a universal data set at the server side, wherein the second neural network is to be deployed on the embedded device;
(2) capturing images of the current scene with the embedded device during pedestrian detection to obtain newly added image samples, and transmitting the newly added image samples to the server side;
(3) testing the newly added image samples at the server side with the previously trained first neural network, and labeling the samples with the test scores of the first neural network;
(4) estimating the size of the pedestrian detection frame at the current height of the embedded device, calculating the difference between each detection frame in the positive samples and the estimated pedestrian detection frame, removing the sample if the difference exceeds a threshold, and keeping the remaining samples;
(5) tuning the second neural network at the server side with the remaining samples;
(6) redeploying the tuned second neural network model from the server side to the embedded device.
In the invention, the first neural network is deployed at the server side, so its structure can be designed to be complex and its training accuracy high. The second neural network is deployed on the embedded device, so its structure can be kept simple and the embedded device can meet the speed requirement. Newly added image samples are tested and labeled by the complex first neural network, high-scoring samples are screened out, and the second neural network is then tuned, so that accurate recognition results can be obtained quickly in the specific scene.
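For illustration, the loop formed by steps (1)-(6) can be outlined in the following Python sketch; every name in it (auto_learning_cycle, collect_scene_images, finetune_second_network and so on) is a hypothetical placeholder, not an interface defined by the invention.

```python
# Illustrative outline of the server/device loop implied by steps (1)-(6).
# All class, method and function names are hypothetical placeholders.

def auto_learning_cycle(server, device, score_threshold):
    # (2) the embedded device captures images of the current scene
    new_images = device.collect_scene_images()
    server.receive(new_images)                                # e.g. via FTP

    # (3) the complex first network scores the new images and labels the samples
    positives, negatives = [], []
    for image in new_images:
        for frame, score in server.first_network.detect(image):
            (positives if score >= score_threshold else negatives).append((frame, score))

    # (4) drop positives whose frame size deviates from the estimated pedestrian size
    positives, rejected = server.filter_by_size(positives)
    negatives.extend(rejected)

    # (5) tune the simple second network on the screened samples
    server.finetune_second_network(positives, negatives)

    # (6) redeploy the tuned model to the embedded device
    device.load_model(server.export_second_network())
```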
Preferably, in step (1), the step of training the first neural network and the second neural network using the common data set at the server side includes:
manually labeled data from scenes other than the current scene are used as the universal data set, a Faster R-CNN (faster region-based convolutional neural network) based on ResNet-101 (a 101-layer residual network) is used as the first neural network, and an AlexNet-based SSD (Single Shot MultiBox Detector) is used as the second neural network.
Furthermore, the network parameters of the pre-trained networks used by the first and second neural networks during training are obtained as follows: network parameters for classification are obtained by training on ImageNet, the layers after the last convolutional layer are removed, and the parameters of the remaining convolutional layers are used as initialization parameters for the current training.
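The patent does not name a deep learning framework (the solver settings given later suggest a Caffe-style implementation), but as a hedged illustration of this parameter-reuse step, the torchvision sketch below keeps only the ImageNet-pretrained convolutional layers of a classifier and discards everything after the last convolutional layer.

```python
# Hedged sketch of the parameter-reuse step: keep only the ImageNet-pretrained
# convolutional layers so they can initialise the detector backbones.
# The framework choice and the helper name are assumptions of this sketch.
import torch.nn as nn
import torchvision.models as models

def imagenet_conv_backbone(arch: str) -> nn.Module:
    if arch == "alexnet":          # backbone of the second network (SSD)
        return models.alexnet(weights="IMAGENET1K_V1").features
    if arch == "resnet101":        # backbone of the first network (Faster R-CNN)
        resnet = models.resnet101(weights="IMAGENET1K_V1")
        return nn.Sequential(*list(resnet.children())[:-2])   # drop avgpool and fc
    raise ValueError(f"unsupported architecture: {arch}")
```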
Preferably, in step (2), the embedded device transmits the newly added image samples to the server side using FTP (File Transfer Protocol).
Furthermore, during pedestrian detection on the embedded device, the collected image samples are screened as follows:
let the number of pedestrians detected by the current device be N_p. If N_p ≥ T_p, where T_p is a preset threshold, the captured image is used as a newly added image sample and transmitted to the server side; otherwise, the current image is discarded. In this way the embedded device collects samples that are likely to be effective, which effectively shortens the time required for the subsequent tuning process and at the same time improves the performance of the tuned result.
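A minimal sketch of this screening-and-upload step is given below, using Python's standard ftplib for the FTP transfer mentioned above; the host, credentials and file naming are illustrative placeholders, and T_P = 3 follows the value used in the embodiment described later.

```python
# Hedged sketch of on-device screening plus FTP upload; host, credentials and
# the way the detection count is obtained are placeholders of this sketch.
import os
from ftplib import FTP

T_P = 3  # preset threshold on the number of detected pedestrians (embodiment value)

def maybe_upload(image_path: str, n_detected: int,
                 host: str = "server.example", user: str = "edge", passwd: str = "secret") -> bool:
    """Send the captured frame to the server only if enough pedestrians were detected."""
    if n_detected < T_P:
        return False                              # N_p < T_p: discard the current image
    with FTP(host) as ftp:                        # N_p >= T_p: transmit as a new sample
        ftp.login(user=user, passwd=passwd)
        with open(image_path, "rb") as f:
            ftp.storbinary(f"STOR {os.path.basename(image_path)}", f)
    return True
```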
preferably, in the step (3), the step of testing and labeling the newly added image sample by the first neural network is:
for each image I, the result after the test by the first neural network is recorded as
Figure BDA0001610986970000031
Wherein n is the total number of detection frames, liIs the position vector of the ith detection frame, li=[xl,yl,xr,yr],(xl,yl)、(xr,yr) The coordinates of the upper left corner and the lower right corner of the detection frame at the image position, siS is 0 or more to the probability that the ith detection frame is discriminated as a pedestriani≤1;
For each detection box, a set threshold T is used to determine whether the sample is positive, i.e., for sample { l }i,siAnd (4) the following steps:
if siGreater than or equal to T, then { l-i,si-is the positive sample;
if siIf < T, then { li,si-is the negative sample;
all positive sample sets obtained after all images are subjected to the above operations are set as P, and all negative sample sets are set as N.
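A short sketch of this labeling rule follows; `detect` is a placeholder standing in for the trained first neural network and is assumed to return (frame, score) pairs, with each frame given as (x_l, y_l, x_r, y_r) and each score s_i in [0, 1].

```python
# Hedged sketch of the score-threshold labeling step above.
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]   # (x_l, y_l, x_r, y_r)

def label_samples(images: Sequence, detect: Callable[[object], List[Tuple[Box, float]]],
                  T: float) -> Tuple[list, list]:
    """Split all detections over all images into the positive set P and negative set N."""
    P, N = [], []
    for image in images:
        for box, score in detect(image):
            (P if score >= T else N).append((box, score))    # s_i >= T -> positive
    return P, N
```

In the embodiment below the threshold T is set to 0.3.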
Preferably, in step (4), the method for estimating the size of the pedestrian detection frame at the current height of the embedded device is as follows:
a person stands under the camera, the person's target frame is taken as the pedestrian size, and the length and width of the target frame are taken as the person's height and width, respectively;
let the area of the pedestrian when standing at the i-th position under the camera be S_i, the height h_i, and the width w_i; data are collected multiple times, and the area S, height h and width w of a pedestrian at the current camera height are obtained by averaging.
Preferably, in step (4), the steps of judging whether to reject a sample are:
for each sample {l_i, s_i} in the positive sample set P, whether to remove it from the positive sample set is judged by the following criteria:
if |x_l - x_r| > γ·w, then {l_i, s_i} is removed from P and added to N;
if |x_l - x_r|·γ < w, then {l_i, s_i} is removed from P and added to N;
if |y_l - y_r| > γ·h, then {l_i, s_i} is removed from P and added to N;
if |y_l - y_r|·γ < h, then {l_i, s_i} is removed from P and added to N;
if |x_l - x_r|·|y_l - y_r| > γ_S·S, then {l_i, s_i} is removed from P and added to N;
if |x_l - x_r|·|y_l - y_r|·γ_S < S, then {l_i, s_i} is removed from P and added to N;
the positive sample set obtained after the above operations is denoted P_1, and the negative sample set is denoted N_1, where γ and γ_S are preset parameters greater than 1.
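The size estimation and the rejection criteria above can be sketched as follows; gamma corresponds to γ (set to 1.3 in the embodiment), while gamma_s stands for the second preset parameter γ_S (also greater than 1), whose value is not reproduced in this text and is therefore left as an explicit argument.

```python
# Hedged sketch of the size estimation and the size-based rejection criteria.
from statistics import mean

def estimate_pedestrian_size(measured_boxes):
    """Average height h, width w and area S of the measured standing-person boxes."""
    heights = [abs(yl - yr) for _, yl, _, yr in measured_boxes]
    widths = [abs(xl - xr) for xl, _, xr, _ in measured_boxes]
    areas = [bh * bw for bh, bw in zip(heights, widths)]
    return mean(areas), mean(heights), mean(widths)           # S, h, w

def filter_positive_samples(P, N, S, h, w, gamma, gamma_s):
    """Move positives whose frame deviates too much from (S, h, w) into the negatives."""
    P1, N1 = [], list(N)
    for box, score in P:
        xl, yl, xr, yr = box
        bw, bh = abs(xl - xr), abs(yl - yr)
        reject = (bw > gamma * w or bw * gamma < w or
                  bh > gamma * h or bh * gamma < h or
                  bw * bh > gamma_s * S or bw * bh * gamma_s < S)
        (N1 if reject else P1).append((box, score))
    return P1, N1
```

The returned P_1 and N_1 then form the data set D = {P_1, N_1} used to tune the second neural network in step (5).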
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method of the invention tunes the pedestrian detector on effective samples collected by the camera, so the tuning is faster and gives better results; meanwhile, the embedded device can be tuned during its working process, without the tuning process having to be completed while the embedded device is online.
2. The invention provides an automatic learning method for a pedestrian detector in a specific scene based on deep network enhancement, which can accurately locate pedestrians and their corresponding regions in the specific scene.
3. In the invention, the first neural network (a large neural network with a complex structure) has a strong ability to learn the training data, so it still predicts well on a test set with unknown distribution.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 shows example detection results of a prior-art detector and of the scene-specific pedestrian detector of the invention, where (a)-(d) are pedestrian detection results obtained in a specific scene with a detector trained on a universal data set, and (e)-(h) are the detection results of the pedestrian detector obtained with the method of the invention in the same specific scene.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Example 1
Referring to FIG. 1, the method of this embodiment for automatic learning of a pedestrian detector in a specific scene based on deep network enhancement comprises the steps of:
(1) A first neural network and a second neural network are trained with a universal data set at the server side.
In the present embodiment, manually labeled data from scenes other than the current scene are used as the universal data set, a Faster R-CNN (faster region-based convolutional neural network) based on ResNet-101 (a 101-layer residual network) is used as the first neural network, and an AlexNet-based SSD (Single Shot MultiBox Detector) is used as the second neural network.
The pre-trained parameters used by the two networks during training are network parameters for classification obtained by training on ImageNet; the layers after the last convolutional layer are removed, and the parameters of the remaining convolutional layers are used as initialization parameters for the current training. The initial learning rate of the small network (second neural network) is 0.005, the learning-rate adjustment strategy is multistep with parameter gamma set to 0.5 and stepvalue set to 30000, and the number of training iterations is 100,000. The initial learning rate of the large network (first neural network) is 0.01, the learning-rate adjustment strategy is multistep with parameter gamma set to 0.98 and stepvalue set to 8500, and the number of training iterations is 200,000. A GeForce GTX 1080 Ti graphics card is used for network training.
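For reference, the "multistep" policy named above (assuming the usual Caffe-style solver semantics) multiplies the base learning rate by gamma each time the iteration count passes a stepvalue; a minimal sketch under that assumption:

```python
# Sketch of a Caffe-style "multistep" learning-rate schedule.
def multistep_lr(iteration: int, base_lr: float, gamma: float, stepvalues) -> float:
    drops = sum(1 for s in stepvalues if iteration >= s)
    return base_lr * (gamma ** drops)

# Second (small) network as configured above: base_lr 0.005, gamma 0.5, stepvalue 30000.
assert multistep_lr(10000, 0.005, 0.5, [30000]) == 0.005
assert multistep_lr(50000, 0.005, 0.5, [30000]) == 0.0025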
(2) During pedestrian detection, images of the current scene are captured by the embedded device to obtain newly added image samples, which are transmitted to the server side.
The camera used in the invention is an MT9M001C12STM. For the pedestrian activity in the real scene, images are acquired at 20 frames per second, each image is 640×480 pixels, and the images are transmitted to the embedded device used in the invention, a Raspberry Pi 3, as they are acquired. The embedded device in turn transmits the images to the server side using FTP (File Transfer Protocol).
Meanwhile, to ensure as far as possible that the collected image samples are effective, the number of pedestrians N_p detected by the current device during pedestrian detection is used as the basis for deciding whether an image is collected and transmitted to the server. The specific operation is:
N_p ≥ T_p → transmit the image to the server
N_p < T_p → discard the image
In this experiment T_p = 3. Through this operation the embedded device collects samples that are likely to be effective, which effectively shortens the time required for the subsequent tuning process and improves the performance of the tuned result.
(3) The newly added image samples are tested at the server side with the previously trained first neural network, and the samples are labeled with the test scores of the first neural network.
After an image I is obtained from the camera and transmitted to the server, it is tested on the server with the previously trained large neural network, and the samples are labeled with the test scores of the large network.
The large network used here is the Faster R-CNN (faster region-based convolutional neural network) based on ResNet-101 (a 101-layer residual network).
For each image I, the result after testing with the large network is recorded as {(l_i, s_i)}, i = 1, ..., n, wherein n is the total number of detection frames, l_i is the position vector of the i-th detection frame, i.e. l_i = [x_l, y_l, x_r, y_r], where (x_l, y_l) and (x_r, y_r) are the coordinates of the upper-left and lower-right corners of the detection frame in the image, and s_i is the probability that the i-th detection frame is judged to be a pedestrian, with 0 ≤ s_i ≤ 1.
For each detection frame, a hard threshold T = 0.3 is used to judge whether it is a positive sample, i.e., for a sample {l_i, s_i}:
s_i ≥ T → {l_i, s_i} is a positive sample
s_i < T → {l_i, s_i} is a negative sample
The set of all positive samples obtained after performing the above operation on all images is denoted P, and the set of all negative samples is denoted N.
(4) The size of the pedestrian detection frame at the current height is estimated, and positive samples whose detection frames differ obviously from the estimated size are rejected.
The pedestrian size is determined experimentally, as follows: a person stands under the camera, the person's target frame is taken as the pedestrian size, and the length and width of the target frame are taken as the person's height and width, respectively.
Let the area of the pedestrian standing at the i-th position under the camera be S_i, the height h_i, and the width w_i. Data are acquired 20 times in total, and the area S, height h and width w of a pedestrian at the current camera height are obtained by averaging.
For each sample {l_i, s_i} in the positive sample set P, whether to remove it from the positive sample set is judged by the following criteria:
|x_l - x_r| > γ·w → remove {l_i, s_i} from P and add it to N
|x_l - x_r|·γ < w → remove {l_i, s_i} from P and add it to N
|y_l - y_r| > γ·h → remove {l_i, s_i} from P and add it to N
|y_l - y_r|·γ < h → remove {l_i, s_i} from P and add it to N
|x_l - x_r|·|y_l - y_r| > γ_S·S → remove {l_i, s_i} from P and add it to N
|x_l - x_r|·|y_l - y_r|·γ_S < S → remove {l_i, s_i} from P and add it to N
In this experiment γ = 1.3, and γ_S is likewise a preset parameter greater than 1.
The positive sample set obtained after the above operations is denoted P_1, and the negative sample set is denoted N_1.
(5) The server side tunes the second neural network with the remaining samples.
The data set D = {P_1, N_1} is used to tune the AlexNet-based SSD (Single Shot MultiBox Detector) trained in step (1), with an initial learning rate of 0.0005, the multistep learning-rate adjustment strategy, parameter gamma set to 0.5, stepvalue set to 10000, and 30,000 training iterations.
(6) The tuned second neural network model is redeployed from the server side to the embedded device.
The tuned model is transmitted back to the embedded device using FTP (File Transfer Protocol), and the device is restarted. Up to this point, the embedded device has been working with the model trained on the universal data set.
In this embodiment, an experiment on the automatic learning method of the pedestrian detector was carried out in a real scene. Referring to FIG. 2, the results before automatic learning are shown in (a)-(d) and the results after automatic learning are shown in (e)-(h); it can be seen from the figures that the invention can accurately locate pedestrians and their corresponding regions in the specific scene.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (7)

1. A method for automatic learning of a pedestrian detector under a specific scene based on deep network enhancement, characterized by comprising the following steps:
(1) training a first neural network and a second neural network with a universal data set at the server side, wherein the second neural network is to be deployed on the embedded device;
(2) capturing images of the current scene with the embedded device during pedestrian detection to obtain newly added image samples, and transmitting the newly added image samples to the server side;
(3) testing the newly added image samples at the server side with the previously trained first neural network, and labeling the samples with the test scores of the first neural network;
(4) estimating the size of the pedestrian detection frame at the current height of the embedded device, calculating the difference between each detection frame in the positive samples and the estimated pedestrian detection frame, removing the sample if the difference exceeds a threshold, and keeping the remaining samples;
(5) tuning the second neural network at the server side with the remaining samples;
(6) redeploying the tuned second neural network model from the server side to the embedded device;
in the step (4), the steps of judging whether to reject a sample are:
for each sample {l_i, s_i} in the positive sample set P, judging whether to remove it from the positive sample set by the following criteria:
if |x_l - x_r| > γ·w, then {l_i, s_i} is removed from P and added to N;
if |x_l - x_r|·γ < w, then {l_i, s_i} is removed from P and added to N;
if |y_l - y_r| > γ·h, then {l_i, s_i} is removed from P and added to N;
if |y_l - y_r|·γ < h, then {l_i, s_i} is removed from P and added to N;
if |x_l - x_r|·|y_l - y_r| > γ_S·S, then {l_i, s_i} is removed from P and added to N;
if |x_l - x_r|·|y_l - y_r|·γ_S < S, then {l_i, s_i} is removed from P and added to N;
the positive sample set obtained after the above operations is denoted P_1 and the negative sample set is denoted N_1, wherein γ and γ_S are both preset parameters greater than 1.
2. The method for automatic learning of the pedestrian detector under the specific scene based on the deep network enhancement as claimed in claim 1, wherein in the step (1), the step of training the first neural network and the second neural network by using the common data set at the server side comprises:
the data which are manually marked under other scenes except the current scene are used as a universal data set, the fast R-CNN based on ResNet-101 is used as a first neural network, and the SSD based on AlexNet is used as a second neural network.
3. The method for automatic learning of the pedestrian detector under the specific scene based on the deep network enhancement as claimed in claim 2, wherein the network parameters of the pre-trained networks used by the first neural network and the second neural network during training are obtained as follows: network parameters for classification are obtained by training on ImageNet, the layers after the last convolutional layer are removed, and the parameters of the remaining convolutional layers are used as initialization parameters for the current training.
4. The method for automatic learning of pedestrian detectors under a specific scenario based on deep network enhancement as claimed in claim 1, wherein in step (2), the embedded device transmits the newly added image samples to the server side using FTP protocol.
5. The automatic learning method of the pedestrian detector under the specific scene based on the deep network enhancement as claimed in claim 1, wherein the embedded device is used for screening the collected image samples in the working process of pedestrian detection, and the steps are as follows:
let the number of pedestrians detected by the current device be N_p; if N_p ≥ T_p, where T_p is a preset threshold, the captured image is used as a newly added image sample and transmitted to the server side; otherwise, the current image is discarded.
6. The method for automatic learning of the pedestrian detector under the specific scene based on the deep network enhancement as claimed in claim 2, wherein in the step (3), the step of testing and labeling the newly added image sample by the first neural network is:
for each image I, the result of testing with the first neural network is recorded as {(l_i, s_i)}, i = 1, ..., n, wherein n is the total number of detection frames, l_i is the position vector of the i-th detection frame, l_i = [x_l, y_l, x_r, y_r], where (x_l, y_l) and (x_r, y_r) are the coordinates of the upper-left and lower-right corners of the detection frame in the image, and s_i is the probability that the i-th detection frame is judged to be a pedestrian, with 0 ≤ s_i ≤ 1;
for each detection frame, a set threshold T is used to judge whether the sample is positive, i.e., for a sample {l_i, s_i}:
if s_i ≥ T, then {l_i, s_i} is a positive sample;
if s_i < T, then {l_i, s_i} is a negative sample;
the set of all positive samples obtained after performing the above operation on all images is denoted P, and the set of all negative samples is denoted N.
7. The method for automatic learning of pedestrian detectors under a specific scene based on deep network enhancement as claimed in claim 6, wherein in step (4), the method for estimating the size of the pedestrian detection frame under the current height of the embedded device is as follows:
a person stands under the camera, the person's target frame is taken as the pedestrian size, and the length and width of the target frame are taken as the person's height and width, respectively;
let the area of the pedestrian when standing at the i-th position under the camera be S_i, the height h_i, and the width w_i; data are acquired multiple times, and the area S, height h and width w of a pedestrian at the current camera height are obtained by averaging.
CN201810264330.XA 2018-03-28 2018-03-28 Automatic learning method of pedestrian detector in specific scene based on deep network enhancement Active CN108549852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810264330.XA CN108549852B (en) 2018-03-28 2018-03-28 Automatic learning method of pedestrian detector in specific scene based on deep network enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810264330.XA CN108549852B (en) 2018-03-28 2018-03-28 Automatic learning method of pedestrian detector in specific scene based on deep network enhancement

Publications (2)

Publication Number Publication Date
CN108549852A CN108549852A (en) 2018-09-18
CN108549852B true CN108549852B (en) 2020-09-08

Family

ID=63517102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810264330.XA Active CN108549852B (en) Automatic learning method of pedestrian detector in specific scene based on deep network enhancement

Country Status (1)

Country Link
CN (1) CN108549852B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614941B (en) * 2018-12-14 2023-02-03 中山大学 Embedded crowd density estimation method based on convolutional neural network model
CN109635750A (en) * 2018-12-14 2019-04-16 广西师范大学 A kind of compound convolutional neural networks images of gestures recognition methods under complex background
CN109816014A (en) * 2019-01-22 2019-05-28 天津大学 Generate method of the deep learning target detection network training with labeled data collection
CN110458114B (en) * 2019-08-13 2022-02-01 杜波 Method and device for determining number of people and storage medium
CN110705630A (en) * 2019-09-27 2020-01-17 聚时科技(上海)有限公司 Semi-supervised learning type target detection neural network training method, device and application
CN110909794B (en) * 2019-11-22 2022-09-13 乐鑫信息科技(上海)股份有限公司 Target detection system suitable for embedded equipment
CN111461120A (en) * 2020-04-01 2020-07-28 济南浪潮高新科技投资发展有限公司 Method for detecting surface defects of convolutional neural network object based on region
CN111582092B (en) * 2020-04-27 2023-12-22 西安交通大学 Pedestrian abnormal behavior detection method based on human skeleton
CN111695504A (en) * 2020-06-11 2020-09-22 重庆大学 Fusion type automatic driving target detection method
CN113706511A (en) * 2021-08-31 2021-11-26 佛山市南海区广工大数控装备协同创新研究院 Composite material damage detection method based on deep learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN105528754A (en) * 2015-12-28 2016-04-27 湖南师范大学 Old people information service system based on dual neural network behavior recognition model
CN106845415A (en) * 2017-01-23 2017-06-13 中国石油大学(华东) A kind of pedestrian based on deep learning becomes more meticulous recognition methods and device
CN106982359A (en) * 2017-04-26 2017-07-25 深圳先进技术研究院 A kind of binocular video monitoring method, system and computer-readable recording medium
CN107341436A (en) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 Gestures detection network training, gestures detection and control method, system and terminal
CN107463892A (en) * 2017-07-27 2017-12-12 北京大学深圳研究生院 Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016006626A (en) * 2014-05-28 2016-01-14 株式会社デンソーアイティーラボラトリ Detector, detection program, detection method, vehicle, parameter calculation device, parameter calculation program, and parameter calculation method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN105528754A (en) * 2015-12-28 2016-04-27 湖南师范大学 Old people information service system based on dual neural network behavior recognition model
CN107341436A (en) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 Gestures detection network training, gestures detection and control method, system and terminal
CN106845415A (en) * 2017-01-23 2017-06-13 中国石油大学(华东) A kind of pedestrian based on deep learning becomes more meticulous recognition methods and device
CN106982359A (en) * 2017-04-26 2017-07-25 深圳先进技术研究院 A kind of binocular video monitoring method, system and computer-readable recording medium
CN107463892A (en) * 2017-07-27 2017-12-12 北京大学深圳研究生院 Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics

Also Published As

Publication number Publication date
CN108549852A (en) 2018-09-18

Similar Documents

Publication Publication Date Title
CN108549852B (en) Automatic learning method of pedestrian detector in specific scene based on deep network enhancement
Zhao et al. Cloud shape classification system based on multi-channel cnn and improved fdm
US9652694B2 (en) Object detection method, object detection device, and image pickup device
CN111783576B (en) Pedestrian re-identification method based on improved YOLOv3 network and feature fusion
CN106960195B (en) Crowd counting method and device based on deep learning
WO2020015492A1 (en) Method and device for identifying key time point of video, computer apparatus and storage medium
WO2019127273A1 (en) Multi-person face detection method, apparatus, server, system, and storage medium
CN110264493B (en) Method and device for tracking multiple target objects in motion state
KR101697161B1 (en) Device and method for tracking pedestrian in thermal image using an online random fern learning
CN109145708B (en) Pedestrian flow statistical method based on RGB and D information fusion
WO2020094088A1 (en) Image capturing method, monitoring camera, and monitoring system
Avgerinakis et al. Recognition of activities of daily living for smart home environments
WO2016149938A1 (en) Video monitoring method, video monitoring system and computer program product
Li et al. Robust people counting in video surveillance: Dataset and system
US20180060653A1 (en) Method and apparatus for annotating a video stream comprising a sequence of frames
Amirgholipour et al. A-CCNN: adaptive CCNN for density estimation and crowd counting
CN105512618B (en) Video tracing method
CN109886170B (en) Intelligent detection, identification and statistics system for oncomelania
CN103530638A (en) Method for matching pedestrians under multiple cameras
CN104200218B (en) A kind of across visual angle action identification method and system based on timing information
WO2013075295A1 (en) Clothing identification method and system for low-resolution video
CN113435355A (en) Multi-target cow identity identification method and system
CN113076860B (en) Bird detection system under field scene
CN114299606A (en) Sleep detection method and device based on front-end camera
CN113177476A (en) Identification method, system and test method for heel key points of standing long jump

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant