CN110287862A - Anti-candid photography detection method based on deep learning - Google Patents

Anti-candid photography detection method based on deep learning

Info

Publication number
CN110287862A
CN110287862A (application CN201910545151.8A)
Authority
CN
China
Prior art keywords
candid photography
convolutional layer
picture
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910545151.8A
Other languages
Chinese (zh)
Other versions
CN110287862B (en)
Inventor
张静
胡锐
周秦
申枭
李云松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201910545151.8A priority Critical patent/CN110287862B/en
Publication of CN110287862A publication Critical patent/CN110287862A/en
Application granted granted Critical
Publication of CN110287862B publication Critical patent/CN110287862B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses an anti-candid photography detection method based on deep learning. The steps are: 1. construct a deep learning target detection network; 2. generate a training set; 3. mark the same candid photography behavior in each picture with bounding boxes at multiple scales; 4. train the deep learning network; 5. detect candid photography behavior; 6. apply image enhancement to pictures in which no candid behavior is found; 7. detect the feature-enhanced images again. By marking the data set with bounding boxes at multiple scales, the present invention overcomes the low detection accuracy caused by the diversity of candid photography actions; by constructing a deep learning network and applying image enhancement to the human-figure region, it ensures that candid photography behavior in surveillance video is detected in real time and with high accuracy.

Description

Anti-candid photography detection method based on deep learning
Technical field
The invention belongs to the technical field of image processing, and further relates to an anti-candid photography detection method based on deep learning in the field of target detection technology. The present invention can detect in real time the candid photography behavior of people captured in video surveillance.
Technical background
Anti-candid photography detection in video surveillance is highly necessary for many privacy-sensitive institutions and units, as it prevents internal security information of the institution or unit from being leaked. In practice, however, manually inspecting surveillance video for candid photography is time-consuming and laborious, and real-time detection is difficult to achieve. To solve these problems, target detection methods are commonly designed so that a computer detects the candid photography behavior in the surveillance video.
Fengyinian Electronics Nantong Co., Ltd., in its patent application "Anti-photographing display system and anti-photographing method based on computer vision" (application number: 201811171034.1, publication number: CN109271814A), provides an anti-photographing display system and method based on computer vision. The steps of the method are: first, load image data based on the RGB color space from a database and filter the data to smooth it; then map the RGB color space to the HSV space and apply morphological processing to the image; finally, compare the detected object contours and sizes with those of mobile phones and digital cameras to judge whether the image contains candid photography. The shortcoming of this method is that it detects only the photographing device; since candid photography actions are highly varied, other behaviors are easily misjudged as candid photography, which considerably degrades detection accuracy.
Shandong Inspur Cloud Service Information Technology Co., Ltd. discloses an anti-candid-photography system and method in the patent application "An anti-candid-photography system and a method of applying it" (application number: 201711077705.3, publication number: CN107784653A). The steps of the method are: first, collect in real time the film image displayed on the screen and output it to a candid-photography judgment module; then collect the auditorium image in real time and output it to the same module; for every image received, calculate the matching degree between the current auditorium image and the received film image, and when the calculated matching degree is greater than or equal to a preset threshold, consider that the current image contains candid photography. The shortcoming of this method is that it uses a traditional matching procedure to compare image matching degrees, whose computational cost is large, so video cannot be processed in real time.
Summary of the invention
The object of the present invention is, in view of the above shortcomings of the prior art, to propose an anti-candid photography detection method based on a deep learning network, solving the problems that detection of candid behavior in video has low accuracy and cannot reach real-time performance.
The idea for realizing the object of the invention is: first build a Yolov3 target detection network composed of four modules and set the parameters of every layer of the network; then construct a data set, marking the same candid photography behavior in each picture with bounding boxes at multiple scales and also marking the human-figure region; next input the marked pictures to train the deep learning network; finally input the pictures collected in real time into the trained network to detect candid behavior, apply local enhancement to the human-figure region of pictures in which no candid behavior is found, and input the locally enhanced pictures into the deep learning network again to detect the pictures a second time.
The specific steps for realizing the present invention are as follows:
(1) Construct the deep learning target detection network:
(1a) Build a Yolov3 target detection network composed of four modules; its specific structure is as follows:
The structure of the first module is, in order: input layer → 1st convolutional layer → 2nd convolutional layer → first convolution submodule → 3rd convolutional layer → second convolution submodule → 4th convolutional layer → third convolution submodule → 5th convolutional layer → fourth convolution submodule → 6th convolutional layer → fifth convolution submodule. The second convolution submodule is composed of four 1st convolution units connected in series; the third convolution submodule is composed of eight 2nd convolution units connected in series; the fourth convolution submodule is composed of eight 3rd convolution units connected in series; the fifth convolution submodule is composed of four 4th convolution units connected in series. The structure of every convolution unit is, in order: two convolutional layers connected in series → a ResNet layer, each ResNet layer connecting the input end of the convolution submodule where it sits to the output end, where the two are merged;
The structure of the second module is, in order: 7th convolutional layer → 8th convolutional layer → 9th convolutional layer → 10th convolutional layer → 11th convolutional layer → 12th convolutional layer → 13th convolutional layer → output layer;
The structure of the third module is, in order: 14th convolutional layer → upsampling layer → 1st concat layer → 15th convolutional layer → 16th convolutional layer → 17th convolutional layer → 18th convolutional layer → 19th convolutional layer → 20th convolutional layer → 21st convolutional layer → output layer;
The structure of the fourth module is, in order: 22nd convolutional layer → upsampling layer → 2nd concat layer → 23rd convolutional layer → 24th convolutional layer → 25th convolutional layer → 26th convolutional layer → 27th convolutional layer → 28th convolutional layer → 29th convolutional layer → output layer;
Connect the fifth convolution submodule in the first module to the 7th convolutional layer in the second module; connect the 11th convolutional layer in the second module to the 14th convolutional layer in the third module; connect the 19th convolutional layer in the third module to the 22nd convolutional layer in the fourth module; connect the fourth convolution submodule in the first module to the 1st concat layer in the third module; and connect the third convolution submodule in the first module to the 2nd concat layer in the fourth module, forming the Yolov3 target detection network;
(1b) Set the parameters of every layer of the deep learning target detection network as follows:
Set the convolution kernel sizes of the 1st to 6th convolutional layers to 3*3 and their channel numbers, in order, to 32, 64, 128, 256, 512, and 1024; set the stride of the 1st convolutional layer to 1 and the strides of the 2nd to 5th convolutional layers to 2;
Set the convolution kernel sizes of the 7th, 9th, and 11th convolutional layers to 1*1, their channel numbers to 512, and all their strides to 1;
Set the convolution kernel sizes of the 8th, 10th, and 12th convolutional layers to 3*3, their channel numbers to 1024, and all their strides to 1;
Set the convolution kernel size of the 14th convolutional layer to 1*1, its channel number to 256, and its stride to 1;
Set the convolution kernel sizes of the 15th, 17th, and 19th convolutional layers to 1*1, their channel numbers to 256, and all their strides to 1;
Set the convolution kernel sizes of the 16th, 18th, and 20th convolutional layers to 3*3, their channel numbers to 512, and all their strides to 1;
Set the convolution kernel size of the 22nd convolutional layer to 1*1, its channel number to 128, and its stride to 1;
Set the convolution kernel sizes of the 23rd, 25th, and 27th convolutional layers to 1*1, their channel numbers to 128, and all their strides to 1;
Set the convolution kernel sizes of the 24th, 26th, and 28th convolutional layers to 3*3, their channel numbers to 256, and all their strides to 1;
Set the convolution kernel sizes of the 13th, 21st, and 29th convolutional layers to 1*1, their channel numbers to 255, and all their strides to 1;
Set the convolution kernel sizes of the two convolutional layers in the 1st convolution submodule and in each of the 1st to 4th convolution units to 1*1 and 3*3 respectively, with all strides set to 1; set the channel numbers of the two convolutional layers to 32 and 64 in the 1st convolution submodule, 64 and 128 in the 1st convolution unit, 128 and 256 in the 2nd convolution unit, 256 and 512 in the 3rd convolution unit, and 512 and 1024 in the 4th convolution unit;
Set the strides of all upsampling layers in the above four modules to 2;
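As a concrete illustration of the repeated building block described above, the following is a minimal PyTorch sketch (illustrative code, not the patent's own implementation) of one convolution unit: a 1*1 convolutional layer and a 3*3 convolutional layer in series, with the ResNet layer realized as a shortcut that merges the unit input back in at the output end; batch normalization and LeakyReLU activations are conventional Yolov3 choices assumed here.

```python
import torch
import torch.nn as nn

class ConvUnit(nn.Module):
    """One convolution unit: 1*1 conv -> 3*3 conv -> ResNet shortcut."""
    def __init__(self, channels):
        super().__init__()
        mid = channels // 2  # e.g. 256 -> 128 -> 256 in the 2nd convolution unit
        self.conv1x1 = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(mid), nn.LeakyReLU(0.1))
        self.conv3x3 = nn.Sequential(
            nn.Conv2d(mid, channels, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.LeakyReLU(0.1))

    def forward(self, x):
        # ResNet layer: merge the input back in at the output end
        return x + self.conv3x3(self.conv1x1(x))

# e.g. the third convolution submodule: eight 2nd convolution units in series
third_submodule = nn.Sequential(*[ConvUnit(256) for _ in range(8)])
print(third_submodule(torch.randn(1, 256, 52, 52)).shape)  # (1, 256, 52, 52)
```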
(2) Generate the training set:
(2a) Collect at least 10,000 pictures to form a deep learning data set, in which 60% of the pictures contain candid photography behavior and 40% do not;
(2b) Randomly extract 80% of all pictures containing candid behavior and, likewise, 80% of all pictures without candid behavior, composing the training set;
(3) Mark the same candid photography behavior in each picture with bounding boxes at multiple scales:
(3a) In all pictures containing the same candid behavior, mark a bounding box around the photographing device used for the candid shot;
(3b) In all pictures containing the same candid behavior, mark a bounding box around the photographing device together with the part of the hand visible on it;
(3c) In all pictures containing the same candid behavior, mark a bounding box around the whole candid-action contour, including the complete hand and the photographing device;
(3d) Mark a bounding box around the human-figure region of every picture in the deep learning data set, obtaining the marked training set pictures;
(4) Train the deep learning network:
Input the marked training set pictures into the Yolov3 target detection network and iteratively update the network parameters; stop training when the loss function drops to 0.1 or below, obtaining the trained Yolov3 target detection network;
(5) Detect candid photography behavior:
(5a) Input the pictures collected in real time in the indoor environment under detection into the trained Yolov3 target detection network, and output, for each picture, a corresponding picture with the detection targets marked together with the target category score values calculated by the network;
(5b) Set the judgment threshold for the candid-behavior score value to 0.5: if the detected candid-behavior score value is less than 0.5, consider that there is no candid behavior; if it is greater than 0.5, consider that candid behavior is present, and output the score value and location information of the candid behavior;
(6) Apply image enhancement to pictures without candid behavior:
(6a) If no candid behavior is detected in a picture, judge whether a human-figure region can be detected: confirm a picture whose detected human-figure region score value is greater than 0.5 as containing a person, and a picture whose score value is less than 0.5 as containing no one;
(6b) If a human-figure region is detected, apply local equalization processing to the detected human-figure region of the picture; if no human-figure region is detected, consider that the picture contains no candid behavior;
(7) Detect the feature-enhanced images again:
Input the images after local histogram equalization into the deep learning network again, and perform candid-behavior detection on the pictures a second time.
Compared with the prior art, the present invention has the following advantages:
First, because the present invention marks the data set with bounding boxes at multiple scales, it overcomes the low detection accuracy of the prior art caused by the diversity of candid photography actions, so that the present invention achieves higher accuracy in detecting candid behavior.
Second, because the present invention builds a deep learning target detection network to detect candid behavior, it overcomes the slow detection speed of the prior art caused by the large computational cost of traditional methods, so that the present invention can detect candid behavior in surveillance video in real time.
Third, because the present invention applies image enhancement to pictures without detected candid behavior, the local features of the pictures are strengthened, so that the present invention obtains a better detection result for candid behavior.
Detailed description of the invention
Fig. 1 is the flow chart of the present invention;
Fig. 2 is the simulation diagram of the present invention.
Specific embodiment
The present invention will be further described below with reference to the accompanying drawings.
Referring to Fig. 1, the specific implementation steps of the present invention are described in further detail.
Step 1, construct the deep learning target detection network.
Build a Yolov3 target detection network composed of four modules.
The structure of the first module is, in order: input layer → 1st convolutional layer → 2nd convolutional layer → first convolution submodule → 3rd convolutional layer → second convolution submodule → 4th convolutional layer → third convolution submodule → 5th convolutional layer → fourth convolution submodule → 6th convolutional layer → fifth convolution submodule. The second convolution submodule is composed of four 1st convolution units connected in series; the third convolution submodule is composed of eight 2nd convolution units connected in series; the fourth convolution submodule is composed of eight 3rd convolution units connected in series; the fifth convolution submodule is composed of four 4th convolution units connected in series. The structure of every convolution unit is, in order: two convolutional layers connected in series → a ResNet layer, each ResNet layer connecting the input end of the convolution submodule where it sits to the output end, where the two are merged;
The structure of the second module is, in order: 7th convolutional layer → 8th convolutional layer → 9th convolutional layer → 10th convolutional layer → 11th convolutional layer → 12th convolutional layer → 13th convolutional layer → output layer;
The structure of the third module is, in order: 14th convolutional layer → upsampling layer → 1st concat layer → 15th convolutional layer → 16th convolutional layer → 17th convolutional layer → 18th convolutional layer → 19th convolutional layer → 20th convolutional layer → 21st convolutional layer → output layer;
The structure of the fourth module is, in order: 22nd convolutional layer → upsampling layer → 2nd concat layer → 23rd convolutional layer → 24th convolutional layer → 25th convolutional layer → 26th convolutional layer → 27th convolutional layer → 28th convolutional layer → 29th convolutional layer → output layer;
Connect the fifth convolution submodule in the first module to the 7th convolutional layer in the second module; connect the 11th convolutional layer in the second module to the 14th convolutional layer in the third module; connect the 19th convolutional layer in the third module to the 22nd convolutional layer in the fourth module; connect the fourth convolution submodule in the first module to the 1st concat layer in the third module; and connect the third convolution submodule in the first module to the 2nd concat layer in the fourth module, forming the Yolov3 target detection network.
Set the parameters of every layer of the deep learning target detection network.
Set the convolution kernel sizes of the 1st to 6th convolutional layers to 3*3 and their channel numbers, in order, to 32, 64, 128, 256, 512, and 1024; set the stride of the 1st convolutional layer to 1 and the strides of the 2nd to 5th convolutional layers to 2;
Set the convolution kernel sizes of the 7th, 9th, and 11th convolutional layers to 1*1, their channel numbers to 512, and all their strides to 1;
Set the convolution kernel sizes of the 8th, 10th, and 12th convolutional layers to 3*3, their channel numbers to 1024, and all their strides to 1;
Set the convolution kernel size of the 14th convolutional layer to 1*1, its channel number to 256, and its stride to 1;
Set the convolution kernel sizes of the 15th, 17th, and 19th convolutional layers to 1*1, their channel numbers to 256, and all their strides to 1;
Set the convolution kernel sizes of the 16th, 18th, and 20th convolutional layers to 3*3, their channel numbers to 512, and all their strides to 1;
Set the convolution kernel size of the 22nd convolutional layer to 1*1, its channel number to 128, and its stride to 1;
Set the convolution kernel sizes of the 23rd, 25th, and 27th convolutional layers to 1*1, their channel numbers to 128, and all their strides to 1;
Set the convolution kernel sizes of the 24th, 26th, and 28th convolutional layers to 3*3, their channel numbers to 256, and all their strides to 1;
Set the convolution kernel sizes of the 13th, 21st, and 29th convolutional layers to 1*1, their channel numbers to 255, and all their strides to 1;
Set the convolution kernel sizes of the two convolutional layers in the 1st convolution submodule and in each of the 1st to 4th convolution units to 1*1 and 3*3 respectively, with all strides set to 1; set the channel numbers of the two convolutional layers to 32 and 64 in the 1st convolution submodule, 64 and 128 in the 1st convolution unit, 128 and 256 in the 2nd convolution unit, 256 and 512 in the 3rd convolution unit, and 512 and 1024 in the 4th convolution unit;
Set the strides of all upsampling layers in the above four modules to 2.
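To show how the four modules fit together, the following PyTorch sketch (illustrative; all names, the assumed backbone outputs, and the 416*416 input with 13/26/52 grids are assumptions, not taken from the patent) wires the second, third, and fourth modules together, with a 1*1 convolution plus 2x upsampling feeding the 1st and 2nd concat layers:

```python
import torch
import torch.nn as nn

def conv(cin, cout, k):
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, stride=1, padding=k // 2, bias=False),
        nn.BatchNorm2d(cout), nn.LeakyReLU(0.1))

class Head(nn.Module):
    """Modules 2-4: alternating 1*1/3*3 convolutional layers, 255-channel output."""
    def __init__(self, cin, mid):
        super().__init__()
        self.body = nn.Sequential(                 # e.g. 7th-11th convolutional layers
            conv(cin, mid, 1), conv(mid, mid * 2, 3),
            conv(mid * 2, mid, 1), conv(mid, mid * 2, 3),
            conv(mid * 2, mid, 1))
        self.out = nn.Sequential(conv(mid, mid * 2, 3),    # e.g. 12th + 13th layers
                                 nn.Conv2d(mid * 2, 255, 1))

    def forward(self, x):
        route = self.body(x)       # branch point handed to the next module
        return self.out(route), route

# Assumed outputs of the third/fourth/fifth convolution submodules of module 1:
c3 = torch.randn(1, 256, 52, 52)
c4 = torch.randn(1, 512, 26, 26)
c5 = torch.randn(1, 1024, 13, 13)

head2 = Head(1024, 512)
y2, r2 = head2(c5)                                                    # second module
up3 = nn.Sequential(conv(512, 256, 1), nn.Upsample(scale_factor=2))   # 14th layer
head3 = Head(256 + 512, 256)
y3, r3 = head3(torch.cat([up3(r2), c4], dim=1))                       # 1st concat layer
up4 = nn.Sequential(conv(256, 128, 1), nn.Upsample(scale_factor=2))   # 22nd layer
head4 = Head(128 + 256, 128)
y4, _ = head4(torch.cat([up4(r3), c3], dim=1))                        # 2nd concat layer
print(y2.shape, y3.shape, y4.shape)   # 255-channel maps on 13/26/52 grids
```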
Step 2, generate the training set:
Collect at least 10,000 pictures to form a deep learning data set, in which 60% of the pictures contain candid photography behavior and 40% do not;
Randomly extract 80% of all pictures containing candid behavior and, likewise, 80% of all pictures without candid behavior, composing the training set.
The candid photography behavior in the pictures refers to the behavior of a person taking pictures in an indoor environment with a handheld photographing device such as a mobile phone, camera, or tablet.
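A minimal sketch of this collection and split, assuming the pictures are already sorted into two folders by whether they contain candid behavior (folder names, paths, and the fixed seed are illustrative):

```python
import random
from pathlib import Path

def make_training_set(root="dataset", ratio=0.8, seed=0):
    """Draw a random 80% of each group (with / without candid behavior)."""
    rng = random.Random(seed)
    train = []
    for group in ("candid", "no_candid"):          # assumed folder layout
        pics = sorted(Path(root, group).glob("*.jpg"))
        rng.shuffle(pics)
        train += pics[: int(len(pics) * ratio)]    # random 80% of the group
    return train

print(f"{len(make_training_set())} training pictures")
```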
Step 3, mark the same candid photography behavior in each picture with bounding boxes at multiple scales:
In all pictures containing the same candid behavior, mark a bounding box around the photographing device used for the candid shot;
In all pictures containing the same candid behavior, mark a bounding box around the photographing device together with the part of the hand visible on it;
In all pictures containing the same candid behavior, mark a bounding box around the whole candid-action contour, including the complete hand and the photographing device.
Mark a bounding box around the human-figure region of every picture in the deep learning data set, obtaining the marked training set pictures.
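For illustration, the multi-scale marks can be stored in the plain-text label format commonly used with Yolov3, one "class cx cy w h" line per bounding box, normalized to the picture size. The class names phone and phone-hand follow the simulation experiment below; the remaining class ids and all pixel coordinates are assumptions:

```python
# Assumed class ids: the first three mark the same candid action at three
# scales, the fourth marks the human-figure region.
CLASSES = {"phone": 0, "phone_hand": 1, "action": 2, "person": 3}

def to_yolo(box, img_w, img_h):
    """Convert a pixel box (x1, y1, x2, y2) to normalized (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h,
            (x2 - x1) / img_w, (y2 - y1) / img_h)

# Illustrative marks for one 1920x1080 picture: device only, device plus
# visible hand, complete action contour, and the human-figure region.
marks = {"phone": (900, 500, 980, 620), "phone_hand": (860, 460, 1020, 660),
         "action": (800, 400, 1100, 760), "person": (700, 200, 1250, 1060)}
with open("sample.txt", "w") as f:
    for name, box in marks.items():
        cx, cy, w, h = to_yolo(box, 1920, 1080)
        f.write(f"{CLASSES[name]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}\n")
```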
Step 4, train the deep learning network:
Input the marked training set pictures into the Yolov3 target detection network and iteratively update the network parameters; stop training when the loss function drops to 0.1 or below, obtaining the trained Yolov3 target detection network.
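A schematic sketch of this training loop (model, data loader, and loss function are placeholders for a Yolov3-style implementation, not the patent's code), showing the iterative parameter update and the stop at a loss of 0.1:

```python
import torch

def train(model, loader, yolo_loss, lr=1e-4, max_epochs=1000):
    """Iteratively update the network parameters; stop once the loss <= 0.1."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(max_epochs):
        for pictures, labels in loader:
            loss = yolo_loss(model(pictures), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
            if loss.item() <= 0.1:   # stop-training threshold from the text
                return model
    return model
```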
Step 5, detect candid photography behavior:
Input the pictures collected in real time in the indoor environment under detection into the trained Yolov3 target detection network, and output, for each picture, a corresponding picture with the detection targets marked together with the target category score values calculated by the network.
Set the judgment threshold for the candid-behavior score value to 0.5: if the detected candid-behavior score value is less than 0.5, consider that there is no candid behavior; if it is greater than 0.5, consider that candid behavior is present, and output the score value and location information of the candid behavior.
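The 0.5 decision rule can be sketched as follows, assuming the post-processed network output arrives as (class name, score value, box) triples; the interface and class names are assumptions:

```python
def judge_candid(detections, threshold=0.5):
    """Print score and location for every candid detection above threshold."""
    found = False
    for name, score, box in detections:
        if name in ("phone", "phone_hand", "action") and score > threshold:
            print(f"candid behavior: score={score:.2f}, location={box}")
            found = True
    return found  # False: all score values below 0.5, no candid behavior

judge_candid([("phone", 0.83, (900, 500, 980, 620)), ("person", 0.91, None)])
```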
Step 6, apply image enhancement to pictures without candid behavior:
If no candid behavior is detected in a picture, judge whether a human-figure region can be detected: confirm a picture whose detected human-figure region score value is greater than 0.5 as containing a person, and a picture whose score value is less than 0.5 as containing no one.
If a human-figure region is detected, apply local equalization processing to the detected human-figure region of the picture; if no human-figure region is detected, consider that the picture contains no candid behavior.
Step 7, detect the feature-enhanced images again:
Input the images after local histogram equalization into the deep learning network again, and perform candid-behavior detection on the pictures a second time.
The steps of feature enhancement for pictures without a candid target are as follows: save the pictures from the surveillance video in which a person is detected but no candid behavior is detected; obtain the pixel coordinate information of the human-figure detections in these pictures; apply histogram equalization to the human-figure region using the pixel coordinate information; and finally obtain the locally equalized pictures.
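Steps 6 and 7 together can be sketched with OpenCV as follows; equalizing only the luminance channel of the human-figure region is an implementation assumption (the text does not specify how color is handled), and detect stands in for a call to the trained Yolov3 network:

```python
import cv2

def enhance_and_redetect(picture_path, person_box, detect):
    """Locally equalize the human-figure region, then run detection again."""
    img = cv2.imread(picture_path)
    x1, y1, x2, y2 = person_box  # pixel coordinates from the human-figure detection
    region = cv2.cvtColor(img[y1:y2, x1:x2], cv2.COLOR_BGR2YCrCb)
    region[:, :, 0] = cv2.equalizeHist(region[:, :, 0])  # equalize luminance only
    img[y1:y2, x1:x2] = cv2.cvtColor(region, cv2.COLOR_YCrCb2BGR)
    return detect(img)           # second pass through the trained network
```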
The effect of the present invention is further described below in combination with a simulation experiment.
1. Simulation experiment conditions:
The hardware platform of the simulation experiment of the present invention: the processor is an Intel Core i7-6850K CPU, the graphics card is an NVIDIA GeForce GTX 1080Ti, and the memory is 128 GB.
The software platform of the simulation experiment of the present invention: Ubuntu 16.04.
2. Simulation experiment content:
In the simulation experiment, 80% of the data set is randomly selected as the training set, 10% as the validation set, and the remaining 10% as the test set, using the method of the present invention. The deep learning network built above is trained on this data set, yielding the trained deep learning network.
The data set used in the simulation experiment consists of indoor pictures of people holding mobile phones in candid shooting actions, taken in March 2019, with a picture size of 1920 × 1080 and jpg picture format.
The trained deep learning network is tested on the test set to detect the candid behavior in the pictures. After the human-figure regions in the pictures are enhanced, both mark modes, phone and phone-hand, obtain correct detection results, as shown in Fig. 2. In Fig. 2, the detection result obtained by marking a bounding box around the mobile phone is denoted phone, and the detection result obtained by marking a bounding box around the candid-action contour of the hand holding the phone is denoted phone-hand.
The detection results before and after local enhancement of the pictures are evaluated with three evaluation indexes: accuracy rate Acc, false positive rate FPR, and missed detection rate FNR, calculated with the following formulas: Acc = (TP + TN) / (TP + FP + TN + FN), FPR = FP / (FP + TN), FNR = FN / (TP + FN), where TP denotes the number of pictures correctly classified as containing the target, FP the number of pictures mistakenly classified as containing the target, TN the number of pictures correctly classified as containing no target, and FN the number of pictures mistakenly classified as containing no target. All calculated results are listed in Table 1:
Table 1. Quantitative analysis of the detection results of the present invention in the simulation experiment

                           Acc      FPR      FNR
Before local enhancement   67.6%    37.4%    28.4%
After local enhancement    76.0%    26.9%    21.6%
As can be seen from Table 1, after local enhancement the accuracy rate Acc rises to 76.0%, the false positive rate FPR falls to 26.9%, and the missed detection rate FNR falls to 21.6%; all three indexes improve on the detection results before local enhancement, demonstrating that locally enhancing the pictures yields more accurate detection results.
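For reference, the three indexes computed from the four counts defined above (a direct transcription of the standard formulas):

```python
def evaluate(tp, fp, tn, fn):
    """Accuracy rate Acc, false positive rate FPR, missed detection rate FNR."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    fpr = fp / (fp + tn)
    fnr = fn / (tp + fn)
    return acc, fpr, fnr
```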
The above simulation experiment shows that: by marking the data set with bounding boxes at multiple scales, the present invention overcomes the low detection accuracy of the prior art caused by the diversity of candid photography actions, so that the present invention achieves higher accuracy during detection; and by building a deep learning target detection network for detection, it overcomes the slow detection speed of the prior art caused by the large computational cost of traditional methods, so that candid-behavior detection in surveillance video reaches real-time performance.

Claims (3)

1. An anti-candid photography detection method based on deep learning, characterized in that a deep learning target detection network is constructed, pictures are marked with bounding boxes at multiple scales, and image enhancement is applied to pictures without candid behavior; the steps of the method are as follows:
(1) Construct the deep learning target detection network:
(1a) Build a Yolov3 target detection network composed of four modules; its specific structure is as follows:
The structure of the first module is, in order: input layer → 1st convolutional layer → 2nd convolutional layer → first convolution submodule → 3rd convolutional layer → second convolution submodule → 4th convolutional layer → third convolution submodule → 5th convolutional layer → fourth convolution submodule → 6th convolutional layer → fifth convolution submodule; the second convolution submodule is composed of four 1st convolution units connected in series; the third convolution submodule is composed of eight 2nd convolution units connected in series; the fourth convolution submodule is composed of eight 3rd convolution units connected in series; the fifth convolution submodule is composed of four 4th convolution units connected in series; the structure of every convolution unit is, in order: two convolutional layers connected in series → a ResNet layer, each ResNet layer connecting the input end of the convolution submodule where it sits to the output end, where the two are merged;
The structure of the second module is, in order: 7th convolutional layer → 8th convolutional layer → 9th convolutional layer → 10th convolutional layer → 11th convolutional layer → 12th convolutional layer → 13th convolutional layer → output layer;
The structure of the third module is, in order: 14th convolutional layer → upsampling layer → 1st concat layer → 15th convolutional layer → 16th convolutional layer → 17th convolutional layer → 18th convolutional layer → 19th convolutional layer → 20th convolutional layer → 21st convolutional layer → output layer;
The structure of the fourth module is, in order: 22nd convolutional layer → upsampling layer → 2nd concat layer → 23rd convolutional layer → 24th convolutional layer → 25th convolutional layer → 26th convolutional layer → 27th convolutional layer → 28th convolutional layer → 29th convolutional layer → output layer;
Connect the fifth convolution submodule in the first module to the 7th convolutional layer in the second module; connect the 11th convolutional layer in the second module to the 14th convolutional layer in the third module; connect the 19th convolutional layer in the third module to the 22nd convolutional layer in the fourth module; connect the fourth convolution submodule in the first module to the 1st concat layer in the third module; and connect the third convolution submodule in the first module to the 2nd concat layer in the fourth module, forming the Yolov3 target detection network;
(1b) Set the parameters of every layer of the deep learning target detection network as follows:
Set the convolution kernel sizes of the 1st to 6th convolutional layers to 3*3 and their channel numbers, in order, to 32, 64, 128, 256, 512, and 1024; set the stride of the 1st convolutional layer to 1 and the strides of the 2nd to 5th convolutional layers to 2;
Set the convolution kernel sizes of the 7th, 9th, and 11th convolutional layers to 1*1, their channel numbers to 512, and all their strides to 1;
Set the convolution kernel sizes of the 8th, 10th, and 12th convolutional layers to 3*3, their channel numbers to 1024, and all their strides to 1;
Set the convolution kernel size of the 14th convolutional layer to 1*1, its channel number to 256, and its stride to 1;
Set the convolution kernel sizes of the 15th, 17th, and 19th convolutional layers to 1*1, their channel numbers to 256, and all their strides to 1;
Set the convolution kernel sizes of the 16th, 18th, and 20th convolutional layers to 3*3, their channel numbers to 512, and all their strides to 1;
Set the convolution kernel size of the 22nd convolutional layer to 1*1, its channel number to 128, and its stride to 1;
Set the convolution kernel sizes of the 23rd, 25th, and 27th convolutional layers to 1*1, their channel numbers to 128, and all their strides to 1;
Set the convolution kernel sizes of the 24th, 26th, and 28th convolutional layers to 3*3, their channel numbers to 256, and all their strides to 1;
Set the convolution kernel sizes of the 13th, 21st, and 29th convolutional layers to 1*1, their channel numbers to 255, and all their strides to 1;
Set the convolution kernel sizes of the two convolutional layers in the 1st convolution submodule and in each of the 1st to 4th convolution units to 1*1 and 3*3 respectively, with all strides set to 1; set the channel numbers of the two convolutional layers to 32 and 64 in the 1st convolution submodule, 64 and 128 in the 1st convolution unit, 128 and 256 in the 2nd convolution unit, 256 and 512 in the 3rd convolution unit, and 512 and 1024 in the 4th convolution unit;
Set the strides of all upsampling layers in the above four modules to 2;
(2) Generate the training set:
(2a) Collect at least 10,000 pictures to form a deep learning data set, in which 60% of the pictures contain candid photography behavior and 40% do not;
(2b) Randomly extract 80% of all pictures containing candid behavior and, likewise, 80% of all pictures without candid behavior, composing the training set;
(3) Mark the same candid photography behavior in each picture with bounding boxes at multiple scales:
(3a) In all pictures containing the same candid behavior, mark a bounding box around the photographing device used for the candid shot;
(3b) In all pictures containing the same candid behavior, mark a bounding box around the photographing device together with the part of the hand visible on it;
(3c) In all pictures containing the same candid behavior, mark a bounding box around the whole candid-action contour, including the complete hand and the photographing device;
(3d) Mark a bounding box around the human-figure region of every picture in the deep learning data set, obtaining the marked training set pictures;
(4) Train the deep learning network:
Input the marked training set pictures into the Yolov3 target detection network and iteratively update the network parameters; stop training when the loss function drops to 0.1 or below, obtaining the trained Yolov3 target detection network;
(5) Detect candid photography behavior:
(5a) Input the pictures collected in real time in the indoor environment under detection into the trained Yolov3 target detection network, and output, for each picture, a corresponding picture with the detection targets marked together with the target category score values calculated by the network;
(5b) Set the judgment threshold for the candid-behavior score value to 0.5: if the detected candid-behavior score value is less than 0.5, consider that there is no candid behavior; if it is greater than 0.5, consider that candid behavior is present, and output the score value and location information of the candid behavior;
(6) Apply image enhancement to pictures without candid behavior:
(6a) If no candid behavior is detected in a picture, judge whether a human-figure region can be detected: confirm a picture whose detected human-figure region score value is greater than 0.5 as containing a person, and a picture whose score value is less than 0.5 as containing no one;
(6b) If a human-figure region is detected, apply local equalization processing to the detected human-figure region of the picture; if no human-figure region is detected, consider that the picture contains no candid behavior;
(7) Detect the feature-enhanced images again:
Input the images after local histogram equalization into the deep learning network again, and perform candid-behavior detection on the pictures a second time.
2. The anti-candid photography detection method based on deep learning according to claim 1, characterized in that the candid photography behavior described in step (2a), step (2b), and step (2c) refers to the behavior of a person taking pictures in an indoor environment with a handheld photographing device such as a mobile phone, camera, or tablet.
3. The anti-candid photography detection method based on deep learning according to claim 1, characterized in that the step of applying image enhancement to pictures without candid behavior described in step (6) is as follows: save the pictures from the surveillance video in which a person is detected but no candid behavior is detected; obtain the pixel coordinate information of the human-figure detections in these pictures; apply histogram equalization to the human-figure region using the pixel coordinate information; and finally obtain the locally equalized pictures.
CN201910545151.8A 2019-06-21 2019-06-21 Anti-candid detection method based on deep learning Active CN110287862B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910545151.8A CN110287862B (en) 2019-06-21 2019-06-21 Anti-candid detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910545151.8A CN110287862B (en) 2019-06-21 2019-06-21 Anti-candid detection method based on deep learning

Publications (2)

Publication Number Publication Date
CN110287862A true CN110287862A (en) 2019-09-27
CN110287862B CN110287862B (en) 2021-04-06

Family

ID=68004329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910545151.8A Active CN110287862B (en) 2019-06-21 2019-06-21 Anti-candid detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN110287862B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040654A (en) * 2018-08-21 2018-12-18 苏州科达科技股份有限公司 Recognition method and device for external capture apparatus, and storage medium
CN110807405A (en) * 2019-10-29 2020-02-18 维沃移动通信有限公司 Detection method of candid camera device and electronic equipment
CN113223036A (en) * 2020-01-21 2021-08-06 李华 Electronic equipment field positioning system
CN113408379A (en) * 2021-06-04 2021-09-17 开放智能机器(上海)有限公司 Mobile phone candid behavior monitoring method and system
CN114067441A (en) * 2022-01-14 2022-02-18 合肥高维数据技术有限公司 Shooting and recording behavior detection method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150125046A1 (en) * 2013-11-01 2015-05-07 Sony Corporation Information processing device and information processing method
CN107784653A (en) * 2017-11-06 2018-03-09 山东浪潮云服务信息科技有限公司 A kind of Anti-sneak-shooting system and method
WO2018066467A1 (en) * 2016-10-03 2018-04-12 大学共同利用機関法人情報・システム研究機構 Wearable article and method for preventing secret photographing of biometric features
CN107911551A (en) * 2017-11-16 2018-04-13 吴英 Intelligent mobile phone platform based on action recognition
CN109460754A (en) * 2019-01-31 2019-03-12 深兰人工智能芯片研究院(江苏)有限公司 A kind of water surface foreign matter detecting method, device, equipment and storage medium
CN109492594A (en) * 2018-11-16 2019-03-19 Xidian University Classroom participants' head-up rate detection method based on deep learning network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150125046A1 (en) * 2013-11-01 2015-05-07 Sony Corporation Information processing device and information processing method
WO2018066467A1 (en) * 2016-10-03 2018-04-12 大学共同利用機関法人情報・システム研究機構 Wearable article and method for preventing secret photographing of biometric features
CN107784653A (en) * 2017-11-06 2018-03-09 山东浪潮云服务信息科技有限公司 A kind of Anti-sneak-shooting system and method
CN107911551A (en) * 2017-11-16 2018-04-13 吴英 Intelligent mobile phone platform based on action recognition
CN109492594A (en) * 2018-11-16 2019-03-19 Xidian University Classroom participants' head-up rate detection method based on deep learning network
CN109460754A (en) * 2019-01-31 2019-03-12 深兰人工智能芯片研究院(江苏)有限公司 A kind of water surface foreign matter detecting method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HONGQUAN QU et al.: "A Pedestrian Detection Method Based on YOLOv3 Model and Image Enhanced by Retinex", 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics *
SHENG Heng et al.: "Laboratory people counting and management system based on Faster R-CNN and IoU optimization", Journal of Computer Applications *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040654A (en) * 2018-08-21 2018-12-18 苏州科达科技股份有限公司 Recognition method and device for external capture apparatus, and storage medium
CN110807405A (en) * 2019-10-29 2020-02-18 维沃移动通信有限公司 Detection method of candid camera device and electronic equipment
CN113223036A (en) * 2020-01-21 2021-08-06 李华 Electronic equipment field positioning system
CN113223036B (en) * 2020-01-21 2022-04-08 湖北讯甲科技有限公司 Electronic equipment field positioning system
CN113408379A (en) * 2021-06-04 2021-09-17 开放智能机器(上海)有限公司 Mobile phone candid behavior monitoring method and system
CN114067441A (en) * 2022-01-14 2022-02-18 合肥高维数据技术有限公司 Shooting and recording behavior detection method and system
CN114067441B (en) * 2022-01-14 2022-04-08 合肥高维数据技术有限公司 Shooting and recording behavior detection method and system

Also Published As

Publication number Publication date
CN110287862B (en) 2021-04-06

Similar Documents

Publication Publication Date Title
CN110287862A (en) Anti-candid photography detection method based on deep learning
CN104166841B (en) Quick detection and recognition method for specified pedestrians or vehicles in a video surveillance network
CN110378232B (en) Rapid examinee position detection method for examination rooms based on an improved SSD dual network
CN103345758B (en) Blind detection method for JPEG image region duplication forgery based on DCT statistical features
CN108416771A (en) Metal material corrosion area detection method based on monocular camera
CN116092199B (en) Employee working state identification method and identification system
US9280209B2 (en) Method for generating 3D coordinates and mobile terminal for generating 3D coordinates
CN112183438B (en) Image identification method for illegal behaviors based on small sample learning neural network
CN113111767A (en) Fall detection method based on deep learning 3D posture assessment
CN110929687A (en) Multi-user behavior recognition system based on key point detection and working method
CN103324852A (en) Four-modal medical imaging diagnosis system based on feature matching
CN112668557A (en) Method for defending image noise attack in pedestrian re-identification system
CN109492594A (en) Classroom participants' head-up rate detection method based on deep learning network
CN109816656A (en) A kind of thermal power plant's negative pressure side system leak source accurate positioning method
CN116071424A (en) Fruit space coordinate positioning method based on monocular vision
CN109684986A (en) A kind of vehicle analysis method and system based on automobile detecting following
CN110321869A (en) Personnel's detection and extracting method based on Multiscale Fusion network
CN104243970A (en) 3D drawn image objective quality evaluation method based on stereoscopic vision attention mechanism and structural similarity
CN105989600A (en) Characteristic point distribution statistics-based power distribution network device appearance detection method and system
CN110321867A (en) Occluded target detection method based on partial constraint network
Mu et al. Salient object detection in low contrast images via global convolution and boundary refinement
CN102148919B (en) Method and system for detecting balls
CN117237990A (en) Method and device for estimating weight of pig farm, electronic equipment and storage medium
CN113538720A (en) Embedded face recognition attendance checking method based on HiSilicon intelligent AI chip
CN116052230A (en) Palm vein recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant