CN110287862A - Anti-sneak-shooting detection method based on deep learning - Google Patents
- Publication number: CN110287862A
- Application number: CN201910545151.8A
- Authority
- CN
- China
- Prior art keywords
- sneak shooting
- convolutional layer
- picture
- behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06N3/045 — Combinations of networks (G—Physics; G06N—Computing arrangements based on specific computational models; G06N3/02—Neural networks; G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/08 — Learning methods (G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks)
- G06V40/20 — Movements or behaviour, e.g. gesture recognition (G06V—Image or video recognition or understanding; G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data)
Abstract
The present invention discloses an anti-sneak-shooting detection method based on deep learning. Its steps are: 1. construct a deep learning target detection network; 2. generate a training set; 3. mark the same sneak-shooting behavior in a picture with frames of several scales; 4. train the deep learning network; 5. detect sneak-shooting behavior; 6. apply image enhancement to pictures in which no sneak-shooting behavior was found; 7. detect the feature-enhanced images again. By marking the data set with frames of several scales, the present invention overcomes the low detection accuracy caused by the diversity of sneak-shooting actions; by constructing a deep learning network and applying image enhancement to the humanoid region, it ensures that sneak-shooting behavior in surveillance video can be detected in real time and with high accuracy.
Description
Technical field
The invention belongs to the technical field of image processing, and further relates to an anti-sneak-shooting detection method based on deep learning in the field of target detection technology. The present invention can detect in real time the sneak-shooting behavior of people captured in video surveillance.
Technical background
Anti-sneak-shooting detection in video surveillance is a very necessary activity in many confidentiality-sensitive organizations and units, as it can prevent internal security information of the organization or unit from being leaked. In practice, however, manually inspecting surveillance video for sneak-shooting behavior is time-consuming and laborious, and real-time detection is difficult to achieve. To solve this problem, target detection methods are commonly designed so that a computer detects the sneak-shooting behavior in the surveillance video.
The patent document "Anti-photographing display system and anti-photographing method based on computer vision" (application number: 201811171034.1, publication number: CN109271814A) of Abundant Benefit Year Electronics Nantong Co., Ltd. provides an anti-photographing display system and anti-photographing method based on computer vision. The steps of the method are: first, image data based on the RGB color space is called in from a database and filtered to make the data smoother; then the RGB color space is mapped to HSV space and morphological processing is applied to the image; finally, the detected object contour and size are compared with those of mobile phones and digital cameras to judge whether the image contains sneak-shooting behavior. The shortcoming of this method is that it detects only the sneak-shooting device, while sneak-shooting actions are diverse, so other behaviors are easily misjudged as sneak-shooting behavior, which has a considerable impact on detection accuracy.
Shandong Inspur Cloud Service Information Technology Co., Ltd. discloses an anti-sneak-shooting system and method in the patent document "An anti-sneak-shooting system and its application method" (application number: 201711077705.3, publication number: CN107784653A). The steps of the method are: first, the film image shown on the screen is collected in real time and exported to a sneak-shooting judgment module; then real-time images of the auditorium are collected and likewise exported to the judgment module; for every image received, the matching degree between the current auditorium image and the received film image is computed, and when the computed matching degree is greater than or equal to a preset matching-degree threshold, the current image is considered to contain sneak-shooting behavior. The shortcoming of this method is that the traditional matching procedure used to compare image matching degrees involves a large amount of computation, so the video cannot be processed in real time.
Summary of the invention
The object of the present invention is, in view of the above shortcomings of the prior art, to propose an anti-sneak-shooting detection method based on a deep learning network, solving the problems that detection of sneak-shooting behavior in video has low accuracy and cannot be performed in real time.
The idea for realizing the object of the invention is as follows: first build a Yolov3 target detection network composed of four modules and set the parameters of every layer of the network; then construct a data set, marking the same sneak-shooting behavior in a picture with frames of several scales and also marking the humanoid region; then input the marked pictures to train the deep learning network; finally input the pictures collected in real time into the trained network to detect sneak-shooting behavior, apply local enhancement to the humanoid region of pictures without sneak-shooting behavior, and input the locally enhanced pictures into the deep learning network again to detect them a second time.
The specific steps of the present invention are as follows:
(1) the target detection network of deep learning is constructed:
(1a) Build a Yolov3 target detection network composed of four modules, with the following specific structure:

The structure of the first module is, in order: input layer → 1st convolutional layer → 2nd convolutional layer → first convolution submodule → 3rd convolutional layer → second convolution submodule → 4th convolutional layer → third convolution submodule → 5th convolutional layer → fourth convolution submodule → 6th convolutional layer → fifth convolution submodule. The second convolution submodule consists of four first convolution units connected in series; the third convolution submodule consists of eight second convolution units connected in series; the fourth convolution submodule consists of eight third convolution units connected in series; the fifth convolution submodule consists of four fourth convolution units connected in series. The structure of every convolution unit is, in order: two convolutional layers connected in series → a ResNet layer; each ResNet layer connects the input end of the convolution submodule in which it is located and merges it into the output end.

The structure of the second module is, in order: 7th convolutional layer → 8th convolutional layer → 9th convolutional layer → 10th convolutional layer → 11th convolutional layer → 12th convolutional layer → 13th convolutional layer → output layer.

The structure of the third module is, in order: 14th convolutional layer → upsampling layer → 1st concat layer → 15th convolutional layer → 16th convolutional layer → 17th convolutional layer → 18th convolutional layer → 19th convolutional layer → 20th convolutional layer → 21st convolutional layer → output layer.

The structure of the fourth module is, in order: 22nd convolutional layer → upsampling layer → 2nd concat layer → 23rd convolutional layer → 24th convolutional layer → 25th convolutional layer → 26th convolutional layer → 27th convolutional layer → 28th convolutional layer → 29th convolutional layer → output layer.

The fifth convolution submodule of the first module is connected to the 7th convolutional layer of the second module; the 11th convolutional layer of the second module is connected to the 14th convolutional layer of the third module; the 19th convolutional layer of the third module is connected to the 22nd convolutional layer of the fourth module. The fourth convolution submodule of the first module is connected to the 1st concat layer of the third module, and the third convolution submodule of the first module is connected to the 2nd concat layer of the fourth module, forming the Yolov3 target detection network.
(1b) The parameters of every layer of the deep learning target detection network are set as follows:

The kernel sizes of the 1st to 6th convolutional layers are all set to 3*3, and their channel numbers are set to 32, 64, 128, 256, 512 and 1024 in turn; the stride of the 1st convolutional layer is set to 1, and the strides of the 2nd to 5th convolutional layers are all set to 2.

The kernel sizes of the 7th, 9th and 11th convolutional layers are all set to 1*1, with channel number 512 and stride 1.

The kernel sizes of the 8th, 10th and 12th convolutional layers are all set to 3*3, with channel number 1024 and stride 1.

The kernel size of the 14th convolutional layer is set to 1*1, with channel number 256 and stride 1.

The kernel sizes of the 15th, 17th and 19th convolutional layers are all set to 1*1, with channel number 256 and stride 1.

The kernel sizes of the 16th, 18th and 20th convolutional layers are all set to 3*3, with channel number 512 and stride 1.

The kernel size of the 22nd convolutional layer is set to 1*1, with channel number 128 and stride 1.

The kernel sizes of the 23rd, 25th and 27th convolutional layers are all set to 1*1, with channel number 128 and stride 1.

The kernel sizes of the 24th, 26th and 28th convolutional layers are all set to 3*3, with channel number 256 and stride 1.

The kernel sizes of the 13th, 21st and 29th convolutional layers are all set to 1*1, with channel number 255 and stride 1.

In the first convolution submodule and in the 1st to 4th convolution units, the kernel sizes of the two convolutional layers are set to 1*1 and 3*3 in turn, and all strides are set to 1. The channel numbers of the two convolutional layers are set, in turn, to 32 and 64 in the first convolution submodule, 64 and 128 in the 1st convolution unit, 128 and 256 in the 2nd convolution unit, 256 and 512 in the 3rd convolution unit, and 512 and 1024 in the 4th convolution unit.

The strides of all upsampling layers in the four modules are set to 2.
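As an illustration of the downsampling structure above, the following sketch traces the spatial size of the feature maps through the first module and out to the three detection grids. It assumes a conventional Yolov3 input size of 416×416 and that all five channel-doubling convolutions (2nd to 6th) downsample with stride 2, as in the standard Darknet-53 backbone; neither assumption is stated explicitly in the text.

```python
# Sketch: feature-map sizes through the backbone described in (1a)/(1b).
# Assumed (not stated in the patent): 416x416 input, stride-2 downsampling
# at each of the five channel-doubling convolutions of the first module.

def conv_out(size, kernel, stride, pad):
    """Output side length of a square convolution with the given padding."""
    return (size + 2 * pad - kernel) // stride + 1

def backbone_scales(input_size=416):
    """Return the three detection-grid sizes produced by modules 2-4."""
    size = conv_out(input_size, 3, 1, 1)      # 1st convolutional layer, stride 1
    for _ in range(5):                        # five stride-2 downsampling convs
        size = conv_out(size, 3, 2, 1)
    # Module 2 predicts on the deepest map; modules 3 and 4 each upsample by 2.
    return [size, size * 2, size * 4]

print(backbone_scales())   # -> [13, 26, 52]
```

The three grid sizes match the usual Yolov3 multi-scale prediction heads, which is consistent with the 255-channel 13th, 21st and 29th convolutional layers set in (1b).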
(2) Generate the training set:

(2a) Collect at least 10,000 pictures to form a deep learning data set, in which 60% of the pictures contain sneak-shooting behavior and 40% do not.

(2b) Randomly extract 80% of all pictures with sneak-shooting behavior and 80% of all pictures without sneak-shooting behavior to form the training set.
(3) Mark the same sneak-shooting behavior in a picture with frames of several scales:

(3a) In every picture containing the same sneak-shooting behavior, mark a peripheral frame around the photographing device used for sneak shooting.

(3b) In every picture containing the same sneak-shooting behavior, mark a peripheral frame around the photographing device together with the part of the hand that appears on it.

(3c) In every picture containing the same sneak-shooting behavior, mark a peripheral frame around the sneak-shooting action contour that includes the complete hand and the photographing device.

(3d) Mark a frame around the humanoid region of every picture in the deep learning data set, obtaining the marked training set pictures.
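The multi-scale marking of step (3) can be illustrated by writing the three nested frames of one sneak-shooting instance, plus the humanoid frame of (3d), in the normalized label format commonly used with Yolov3. The class ids and pixel coordinates below are hypothetical, and the label format itself is an assumption rather than something the patent specifies.

```python
def to_yolo_line(cls_id, box, img_w, img_h):
    """(x1, y1, x2, y2) pixel box -> 'cls cx cy w h' with coordinates
    normalized to the image size, the usual Yolov3 label format."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# Three nested frames for ONE sneak-shooting instance, plus the person frame.
W, H = 1920, 1080                                    # picture size from the text
labels = [
    to_yolo_line(0, (900, 500, 1000, 560), W, H),    # (3a) device only
    to_yolo_line(1, (870, 480, 1030, 620), W, H),    # (3b) device + partial hand
    to_yolo_line(2, (850, 450, 1060, 700), W, H),    # (3c) full hand + device
    to_yolo_line(3, (700, 200, 1200, 1080), W, H),   # (3d) humanoid region
]
print("\n".join(labels))
```

Emitting several frames of different scale for the same behavior is what lets the detector tolerate the diverse appearance of sneak-shooting actions.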
(4) Train the deep learning network:

Input the marked training set pictures into the Yolov3 target detection network and update the network parameters iteratively; stop training when the loss function drops below 0.1, obtaining the trained Yolov3 target detection network.
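The stopping rule of step (4) can be sketched as follows. The geometrically decaying value stands in for a real training loss, and the loss/update callables are illustrative placeholders, not the patent's training code; only the 0.1 threshold comes from the text.

```python
def train_until(loss_fn, update_fn, max_iters=500000, threshold=0.1):
    """Iterate parameter updates and stop as soon as the training loss
    drops below `threshold` (0.1 in the patent)."""
    for it in range(1, max_iters + 1):
        loss = loss_fn()
        if loss < threshold:
            return it, loss
        update_fn()
    return max_iters, loss

# Stand-in loss that decays geometrically, for illustration only.
state = {"loss": 10.0}
it, final = train_until(lambda: state["loss"],
                        lambda: state.__setitem__("loss", state["loss"] * 0.9))
print(it, final)
```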
(5) Detect sneak-shooting behavior:

(5a) Input the pictures collected in real time in the indoor environment to be detected into the trained Yolov3 target detection network, and output, for each picture, the picture with the detection targets marked together with the target-category score computed by the network.

(5b) Set the judgment threshold for the sneak-shooting score to 0.5. If the detected sneak-shooting score is less than 0.5, it is considered that there is no sneak-shooting behavior; if the detected sneak-shooting score is greater than 0.5, it is considered that sneak-shooting behavior is present, and the score and location of the sneak-shooting behavior are output.
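The thresholding logic of step (5b) can be sketched as below. The 0.5 threshold follows the text; the (score, box) tuple layout of a detection is an assumption made for illustration.

```python
SNEAK_THRESHOLD = 0.5

def judge(detections, threshold=SNEAK_THRESHOLD):
    """Keep only detections whose sneak-shooting score exceeds the
    threshold; each detection is assumed to be (score, (x, y, w, h))."""
    hits = [d for d in detections if d[0] > threshold]
    if not hits:
        return None          # no sneak-shooting behavior in this picture
    return max(hits)         # highest-scoring behavior with its location

print(judge([(0.31, (10, 10, 40, 40)), (0.82, (200, 120, 60, 50))]))
# -> (0.82, (200, 120, 60, 50))
```

A `None` result is exactly the case that step (6) handles by enhancing the humanoid region and re-detecting.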
(6) Apply image enhancement to pictures without sneak-shooting behavior:

(6a) If no sneak-shooting behavior is detected in a picture, judge whether a humanoid region can be detected: pictures whose detected humanoid-region score is greater than 0.5 are confirmed as containing a person, and pictures whose detected humanoid-region score is less than 0.5 are confirmed as containing nobody.

(6b) If a humanoid region is detected, apply local equalization processing to the detected humanoid region of the picture; if no humanoid region is detected, the picture is considered to contain no sneak-shooting behavior.
(7) Detect the feature-enhanced images again:

Input the images after local histogram equalization into the deep learning network again and detect sneak-shooting behavior in the pictures a second time.
Compared with the prior art, the present invention has the following advantages:

First, because the present invention marks the data set with frames of several scales, it overcomes the low detection accuracy caused in the prior art by the diversity of sneak-shooting actions, so the present invention has higher accuracy when detecting sneak-shooting behavior.

Second, because the present invention builds a deep learning target detection network for detecting sneak-shooting behavior, it overcomes the slow detection speed caused in the prior art by the large amount of computation of traditional methods, enabling the invention to detect sneak-shooting behavior in surveillance video in real time.

Third, because the present invention applies image enhancement to pictures without detected sneak-shooting behavior, the local features of the pictures are enhanced, so the present invention achieves a better detection effect for sneak-shooting behavior.
Description of the drawings

Fig. 1 is the flow chart of the present invention;

Fig. 2 is a simulation result diagram of the present invention.
Specific embodiment
The present invention will be further described with reference to the accompanying drawings.

Referring to Fig. 1, the specific steps of the present invention are described in further detail.
Step 1: Construct the deep learning target detection network.
Build the Yolov3 target detection network composed of four modules and set the parameters of every layer of the network, with exactly the structure and parameter settings described in steps (1a) and (1b) above.
Step 2: Generate the training set.

Collect at least 10,000 pictures to form a deep learning data set, in which 60% of the pictures contain sneak-shooting behavior and 40% do not. Randomly extract 80% of all pictures with sneak-shooting behavior and 80% of all pictures without sneak-shooting behavior to form the training set.

Sneak-shooting behavior in the pictures refers to the behavior of a person in an indoor environment taking pictures with a handheld photographing device such as a mobile phone, camera or tablet.
Step 3: Mark the same sneak-shooting behavior in a picture with frames of several scales.

In every picture containing the same sneak-shooting behavior, mark a peripheral frame around the photographing device used for sneak shooting; mark a peripheral frame around the photographing device together with the part of the hand that appears on it; and mark a peripheral frame around the sneak-shooting action contour that includes the complete hand and the photographing device.

Mark a frame around the humanoid region of every picture in the deep learning data set, obtaining the marked training set pictures.
Step 4: Train the deep learning network.

Input the marked training set pictures into the Yolov3 target detection network and update the network parameters iteratively; stop training when the loss function drops below 0.1, obtaining the trained Yolov3 target detection network.
Step 5: Detect sneak-shooting behavior.

Input the pictures collected in real time in the indoor environment to be detected into the trained Yolov3 target detection network, and output, for each picture, the picture with the detection targets marked together with the target-category score computed by the network.

Set the judgment threshold for the sneak-shooting score to 0.5. If the detected sneak-shooting score is less than 0.5, it is considered that there is no sneak-shooting behavior; if the detected sneak-shooting score is greater than 0.5, it is considered that sneak-shooting behavior is present, and the score and location of the sneak-shooting behavior are output.
Step 6: Apply image enhancement to pictures without sneak-shooting behavior.

If no sneak-shooting behavior is detected in a picture, judge whether a humanoid region can be detected: pictures whose detected humanoid-region score is greater than 0.5 are confirmed as containing a person, and pictures whose detected humanoid-region score is less than 0.5 are confirmed as containing nobody.

If a humanoid region is detected, apply local equalization processing to the detected humanoid region of the picture; if no humanoid region is detected, the picture is considered to contain no sneak-shooting behavior.
Step 7: Detect the feature-enhanced images again.

Input the images after local histogram equalization into the deep learning network again and detect sneak-shooting behavior in the pictures a second time.

The steps of feature enhancement for images without a detected sneak-shooting target are as follows: save the pictures from the surveillance video in which a person is detected but no sneak-shooting behavior is detected; run humanoid detection on these pictures to obtain the pixel coordinates of the humanoid region; apply histogram equalization to the humanoid region using these pixel coordinates; and finally obtain the locally equalized pictures.
The effect of the present invention is further described below with reference to a simulation experiment.

1. Simulation experiment conditions:

The hardware platform of the simulation experiment of the present invention is: an Intel Core i7-6850K CPU, an NVIDIA GeForce GTX 1080Ti graphics card, and 128 GB of memory.

The software platform of the simulation experiment of the present invention is: ubuntu16.04.
2. Simulation experiment content:

In the simulation experiment of the present invention, 80% of the data set is randomly selected as the training set, 10% as the validation set, and the remaining 10% as the test set. The constructed deep learning network is trained on this data set to obtain the trained deep learning network.

The data set used in the simulation experiment consists of pictures, collected indoors, of people performing the sneak-shooting action with a handheld mobile phone. The pictures were taken in March 2019, with a size of 1920 × 1080 and jpg format.
The trained deep learning network is tested on the test set to detect the sneak-shooting behavior in the pictures. After the humanoid regions in the pictures are enhanced, both marking modes, phone and phone-hand, obtain correct detection results, as shown in Fig. 2. In Fig. 2, the detection result obtained by marking a peripheral frame around the mobile phone is denoted phone, and the detection result obtained by marking a peripheral frame around the sneak-shooting action contour of the hand holding the mobile phone is denoted phone-hand.
The detection results before and after local enhancement of the pictures in the simulation are evaluated with three evaluation indexes: accuracy rate Acc, false-positive rate FPR and miss rate FNR. They are computed by the standard formulas Acc = (TP + TN) / (TP + FP + TN + FN), FPR = FP / (FP + TN) and FNR = FN / (TP + FN), where TP denotes the number of pictures correctly classified as containing a target, FP the number of pictures incorrectly classified as containing a target, TN the number of pictures correctly classified as containing no target, and FN the number of pictures incorrectly classified as containing no target. All computed results are listed in Table 1:
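With the counts defined above, the three indexes can be computed as follows; the example counts are hypothetical and are not taken from Table 1.

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, false-positive rate and miss (false-negative) rate
    from the four picture counts defined in the text."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    return acc, fpr, fnr

# Hypothetical counts for illustration only.
acc, fpr, fnr = metrics(tp=70, fp=10, tn=20, fn=10)
print(f"Acc={acc:.1%}  FPR={fpr:.1%}  FNR={fnr:.1%}")
```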
Table 1. Quantitative analysis of the detection results of the present invention in the simulation experiment

|                          | Acc   | FNR   | FPR   |
| Before local enhancement | 67.6% | 37.4% | 28.4% |
| After local enhancement  | 76.0% | 26.9% | 21.6% |

It can be seen from Table 1 that after local enhancement the accuracy rate Acc of the present invention is 76.0%, the miss rate FNR is 26.9% and the false-positive rate FPR is 21.6%; all three indexes improve on the detection results before local enhancement, proving that locally enhancing the pictures yields more accurate detection results.
The above simulation experiment shows that the present invention marks the data set with frames of several scales, overcoming the low detection accuracy caused in the prior art by the diversity of sneak-shooting actions, so that the present invention has higher accuracy in the detection process. A deep-learning-based target detection network is built for detection, overcoming the slow detection speed caused in the prior art by the large amount of computation of traditional methods, so that the present invention can detect sneak-shooting behavior in surveillance video in real time.
Claims (3)
1. An anti-sneak-shooting detection method based on deep learning, characterized in that a deep learning target detection network is constructed, pictures are marked with frames of several scales, and image enhancement is applied to pictures without sneak-shooting behavior; the steps of the method are as follows:
(1) the target detection network of deep learning is constructed:
It is as follows that (1a) builds the Yolov3 target detection network specific structure that one is made of four modules:
The structure of the first module is, in order: input layer → 1st convolutional layer → 2nd convolutional layer → 1st convolution submodule → 3rd convolutional layer → 2nd convolution submodule → 4th convolutional layer → 3rd convolution submodule → 5th convolutional layer → 4th convolution submodule → 6th convolutional layer → 5th convolution submodule; the 2nd convolution submodule consists of four serially connected 1st convolution units; the 3rd convolution submodule consists of eight serially connected 2nd convolution units; the 4th convolution submodule consists of eight serially connected 3rd convolution units; the 5th convolution submodule consists of four serially connected 4th convolution units; the structure of every convolution unit is, in order: two serially connected convolutional layers → a ResNet layer, and each ResNet layer is connected to the input of the convolution submodule it belongs to and merged into its output;
The structure of the second module is, in order: 7th convolutional layer → 8th convolutional layer → 9th convolutional layer → 10th convolutional layer → 11th convolutional layer → 12th convolutional layer → 13th convolutional layer → output layer;
The structure of the third module is, in order: 14th convolutional layer → upsampling layer → 1st concat layer → 15th convolutional layer → 16th convolutional layer → 17th convolutional layer → 18th convolutional layer → 19th convolutional layer → 20th convolutional layer → 21st convolutional layer → output layer;
The structure of the fourth module is, in order: 22nd convolutional layer → upsampling layer → 2nd concat layer → 23rd convolutional layer → 24th convolutional layer → 25th convolutional layer → 26th convolutional layer → 27th convolutional layer → 28th convolutional layer → 29th convolutional layer → output layer;
Connect the 5th convolution submodule of the first module to the 7th convolutional layer of the second module, the 11th convolutional layer of the second module to the 14th convolutional layer of the third module, and the 19th convolutional layer of the third module to the 22nd convolutional layer of the fourth module; connect the 4th convolution submodule of the first module to the 1st concat layer of the third module and the 3rd convolution submodule of the first module to the 2nd concat layer of the fourth module, forming the Yolov3 target detection network;
(1b) Set the parameters of each layer of the deep-learning target detection network as follows:
Set the convolution kernel sizes of the 1st to 6th convolutional layers all to 3*3 and their channel numbers to 32, 64, 128, 256, 512 and 1024 in turn; set the stride of the 1st convolutional layer to 1 and the strides of the 2nd to 5th convolutional layers to 2;
Set the kernel sizes of the 7th, 9th and 11th convolutional layers all to 1*1, their channel numbers to 512 and their strides to 1;
Set the kernel sizes of the 8th, 10th and 12th convolutional layers all to 3*3, their channel numbers to 1024 and their strides to 1;
Set the kernel size of the 14th convolutional layer to 1*1, its channel number to 256 and its stride to 1;
Set the kernel sizes of the 15th, 17th and 19th convolutional layers all to 1*1, their channel numbers to 256 and their strides to 1;
Set the kernel sizes of the 16th, 18th and 20th convolutional layers all to 3*3, their channel numbers to 512 and their strides to 1;
Set the kernel size of the 22nd convolutional layer to 1*1, its channel number to 128 and its stride to 1;
Set the kernel sizes of the 23rd, 25th and 27th convolutional layers all to 1*1, their channel numbers to 128 and their strides to 1;
Set the kernel sizes of the 24th, 26th and 28th convolutional layers all to 3*3, their channel numbers to 256 and their strides to 1;
Set the kernel sizes of the 13th, 21st and 29th convolutional layers all to 1*1, their channel numbers to 255 and their strides to 1;
Set the kernel sizes of the two convolutional layers in the 1st convolution submodule and in each of the 1st to 4th convolution units to 1*1 and 3*3 respectively, with all strides set to 1; set the channel numbers of the two convolutional layers to 32 and 64 in the 1st convolution submodule, 64 and 128 in the 1st convolution unit, 128 and 256 in the 2nd convolution unit, 256 and 512 in the 3rd convolution unit, and 512 and 1024 in the 4th convolution unit;
Set the strides of all upsampling layers in the above four modules to 2;
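The layer settings of step (1b) can be traced as a simple shape computation. A minimal sketch, assuming a 416*416 input (a common Yolov3 size, not stated in the claim) and assuming the 6th convolutional layer also uses stride 2 as in standard Darknet-53 (its stride is not given in (1b)):

```python
# (name, channel number, stride) for the downsampling path of the first module
downsampling = [
    ("conv1",   32, 1),
    ("conv2",   64, 2),  # followed by the 1st convolution submodule
    ("conv3",  128, 2),  # followed by the 2nd convolution submodule
    ("conv4",  256, 2),  # followed by the 3rd convolution submodule
    ("conv5",  512, 2),  # followed by the 4th convolution submodule
    ("conv6", 1024, 2),  # stride assumed; standard Darknet-53 downsamples here
]

size, scales = 416, {}
for name, channels, stride in downsampling:
    size //= stride                 # each stride-2 layer halves the feature map
    scales[name] = (channels, size)
print(scales)  # the three Yolov3 detection scales sit at 52, 26 and 13
```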
(2) Generate the training set:
(2a) Collect at least 10,000 pictures to form the deep-learning data set, of which 60% contain sneak-shot behavior and 40% do not;
(2b) Randomly extract 80% of all pictures containing sneak-shot behavior and 80% of all pictures containing no sneak-shot behavior to form the training set;
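Step (2) can be sketched as a random per-class draw; the file names below are hypothetical and the 10,000-picture total is the minimum the claim requires:

```python
import random

random.seed(0)  # reproducible draw for illustration
positives = [f"sneak_{i:05d}.jpg" for i in range(6000)]   # 60%: with sneak-shot behavior
negatives = [f"normal_{i:05d}.jpg" for i in range(4000)]  # 40%: without

def draw_80_percent(pictures):
    """Randomly extract 80% of one class, as in step (2b)."""
    return random.sample(pictures, int(len(pictures) * 0.8))

train_set = draw_80_percent(positives) + draw_80_percent(negatives)
print(len(train_set))
```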
(3) Annotate the same sneak-shot behavior in a picture with bounding boxes at multiple scales:
(3a) In all pictures containing the same sneak-shot behavior, mark a bounding box around the photographing device used for the sneak shot;
(3b) In all pictures containing the same sneak-shot behavior, mark a bounding box around the photographing device together with the part of the hand appearing on it;
(3c) In all pictures containing the same sneak-shot behavior, mark a bounding box around the sneak-shot motion trajectory, including the complete hand and the photographing device;
(3d) Mark the humanoid region of every picture in the deep-learning data set with a bounding box, obtaining the annotated training set pictures;
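The multi-scale annotation of step (3) can be sketched as Yolo-format label lines; the class ids and normalized box coordinates below are hypothetical:

```python
# Hypothetical class ids for the four annotation scales of steps (3a)-(3d)
CLASSES = {"device": 0, "device_hand": 1, "hand_device_full": 2, "person": 3}

def yolo_line(cls, box):
    """Format one annotation as a Yolo-style 'class cx cy w h' line."""
    cx, cy, w, h = box  # center and size, normalized to [0, 1]
    return f"{CLASSES[cls]} {cx:.4f} {cy:.4f} {w:.4f} {h:.4f}"

labels = [
    yolo_line("device",           (0.52, 0.40, 0.06, 0.04)),  # (3a) device only
    yolo_line("device_hand",      (0.52, 0.42, 0.10, 0.09)),  # (3b) device + partial hand
    yolo_line("hand_device_full", (0.50, 0.45, 0.18, 0.20)),  # (3c) full hand + device
    yolo_line("person",           (0.50, 0.55, 0.35, 0.80)),  # (3d) humanoid region
]
print("\n".join(labels))
```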
(4) Train the deep-learning network:
Input the annotated training set pictures into the Yolov3 target detection network and iteratively update the network parameters; stop training when the loss function falls to 0.1 or below, obtaining the trained Yolov3 target detection network;
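The stopping rule of step (4) can be sketched as follows; the geometrically decaying loss is a stand-in for the real Yolov3 loss, and only the 0.1 stopping threshold comes from the claim:

```python
def train(initial_loss=5.0, decay=0.95, threshold=0.1, max_iters=10_000):
    """Iterate parameter updates until the loss drops to the threshold or below."""
    loss, iters = initial_loss, 0
    while loss > threshold and iters < max_iters:
        loss *= decay   # stand-in for one gradient update
        iters += 1
    return loss, iters

loss, iters = train()
print(f"stopped after {iters} updates, loss={loss:.4f}")
```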
(5) Detect sneak-shot behavior:
(5a) Input each picture collected in real time from the indoor environment under detection into the trained Yolov3 target detection network, which outputs the picture with the detected targets marked, together with the class score the network computes for each target;
(5b) Set the judgment threshold of the sneak-shot score to 0.5: if the detected sneak-shot score is less than 0.5, conclude that there is no sneak-shot behavior; if the detected sneak-shot score is greater than 0.5, conclude that sneak-shot behavior is present, and output the score and location information of the sneak-shot behavior;
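The 0.5 decision threshold of step (5b) can be sketched as a filter over hypothetical network outputs:

```python
THRESHOLD = 0.5  # judgment threshold from step (5b)

def judge(detections):
    """Keep only the detections judged to be sneak-shot behavior."""
    return [(score, box) for score, box in detections if score > THRESHOLD]

# Hypothetical (score, (x, y, w, h)) outputs from the detection network
detections = [(0.91, (120, 80, 60, 40)), (0.32, (300, 200, 50, 30))]
for score, (x, y, w, h) in judge(detections):
    print(f"sneak-shot behavior: score={score}, box=({x},{y},{w},{h})")
```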
(6) Apply image enhancement to pictures without sneak-shot behavior:
(6a) If no sneak-shot behavior is detected in a picture, judge whether a humanoid region can be detected: confirm pictures whose detected humanoid-region score is greater than 0.5 as containing a person, and pictures whose detected humanoid-region score is less than 0.5 as containing no one;
(6b) If a humanoid region is detected, apply local equalization to the detected humanoid region of the picture; if no humanoid region is detected, conclude that the picture contains no sneak-shot behavior;
(7) Detect the feature-enhanced image again:
Input the image after local histogram equalization into the deep-learning network again, and detect sneak-shot behavior in the picture once more.
2. The anti-sneak-shot detection method based on deep learning according to claim 1, characterized in that the sneak-shot behavior described in steps (2a), (2b) and (2c) refers to the behavior of a person, in an indoor environment, taking pictures with a hand-held photographing device such as a mobile phone, camera or tablet.
3. The anti-sneak-shot detection method based on deep learning according to claim 1, characterized in that the step of applying image enhancement to pictures without sneak-shot behavior described in step (6) is as follows: save the pictures from the surveillance video in which a person is detected but no sneak-shot behavior is detected, obtain the pixel coordinates of the humanoid detections in these pictures, apply histogram equalization to the humanoid region using these pixel coordinates, and finally obtain the locally equalized picture.
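The local equalization of claim 3 can be sketched with a plain histogram equalization applied only to the humanoid region; the region values below are synthetic, and a production system would typically operate on the pixel coordinates returned by the detector (e.g. via OpenCV's equalizeHist):

```python
def equalize(region):
    """Histogram-equalize a 2-D list of 8-bit grayscale values."""
    flat = [p for row in region for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:           # cumulative distribution of pixel values
        running += count
        cdf.append(running)
    cdf_min = next(v for v in cdf if v > 0)
    span = cdf[-1] - cdf_min
    # Map each value through the normalized CDF to stretch contrast to 0..255
    lut = [round((v - cdf_min) / span * 255) if span else 0 for v in cdf]
    return [[lut[p] for p in row] for row in region]

# A synthetic low-contrast humanoid region (values squeezed into 90..102)
region = [[90 + (i + j) % 13 for j in range(8)] for i in range(6)]
equalized = equalize(region)
print(min(map(min, equalized)), max(map(max, equalized)))
```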
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910545151.8A CN110287862B (en) | 2019-06-21 | 2019-06-21 | Anti-candid detection method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910545151.8A CN110287862B (en) | 2019-06-21 | 2019-06-21 | Anti-candid detection method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110287862A true CN110287862A (en) | 2019-09-27 |
CN110287862B CN110287862B (en) | 2021-04-06 |
Family
ID=68004329
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910545151.8A Active CN110287862B (en) | 2019-06-21 | 2019-06-21 | Anti-candid detection method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287862B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109040654A (en) * | 2018-08-21 | 2018-12-18 | 苏州科达科技股份有限公司 | Recognition methods, device and the storage medium of external capture apparatus |
CN110807405A (en) * | 2019-10-29 | 2020-02-18 | 维沃移动通信有限公司 | Detection method of candid camera device and electronic equipment |
CN113223036A (en) * | 2020-01-21 | 2021-08-06 | 李华 | Electronic equipment field positioning system |
CN113408379A (en) * | 2021-06-04 | 2021-09-17 | 开放智能机器(上海)有限公司 | Mobile phone candid behavior monitoring method and system |
CN114067441A (en) * | 2022-01-14 | 2022-02-18 | 合肥高维数据技术有限公司 | Shooting and recording behavior detection method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150125046A1 (en) * | 2013-11-01 | 2015-05-07 | Sony Corporation | Information processing device and information processing method |
CN107784653A (en) * | 2017-11-06 | 2018-03-09 | 山东浪潮云服务信息科技有限公司 | A kind of Anti-sneak-shooting system and method |
WO2018066467A1 (en) * | 2016-10-03 | 2018-04-12 | 大学共同利用機関法人情報・システム研究機構 | Wearable article and method for preventing secret photographing of biometric features |
CN107911551A (en) * | 2017-11-16 | 2018-04-13 | 吴英 | Intelligent mobile phone platform based on action recognition |
CN109460754A (en) * | 2019-01-31 | 2019-03-12 | 深兰人工智能芯片研究院(江苏)有限公司 | A kind of water surface foreign matter detecting method, device, equipment and storage medium |
CN109492594A (en) * | 2018-11-16 | 2019-03-19 | 西安电子科技大学 | Classroom participant's new line rate detection method based on deep learning network |
- 2019
- 2019-06-21 CN CN201910545151.8A patent/CN110287862B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150125046A1 (en) * | 2013-11-01 | 2015-05-07 | Sony Corporation | Information processing device and information processing method |
WO2018066467A1 (en) * | 2016-10-03 | 2018-04-12 | 大学共同利用機関法人情報・システム研究機構 | Wearable article and method for preventing secret photographing of biometric features |
CN107784653A (en) * | 2017-11-06 | 2018-03-09 | 山东浪潮云服务信息科技有限公司 | A kind of Anti-sneak-shooting system and method |
CN107911551A (en) * | 2017-11-16 | 2018-04-13 | 吴英 | Intelligent mobile phone platform based on action recognition |
CN109492594A (en) * | 2018-11-16 | 2019-03-19 | 西安电子科技大学 | Classroom participant's new line rate detection method based on deep learning network |
CN109460754A (en) * | 2019-01-31 | 2019-03-12 | 深兰人工智能芯片研究院(江苏)有限公司 | A kind of water surface foreign matter detecting method, device, equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
HONGQUAN QU et al.: "A Pedestrian Detection Method Based on YOLOv3 Model and Image Enhanced by Retinex", 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics * |
SHENG Heng et al.: "Laboratory people counting and management system based on Faster R-CNN and IoU optimization", Journal of Computer Applications * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109040654A (en) * | 2018-08-21 | 2018-12-18 | 苏州科达科技股份有限公司 | Recognition methods, device and the storage medium of external capture apparatus |
CN110807405A (en) * | 2019-10-29 | 2020-02-18 | 维沃移动通信有限公司 | Detection method of candid camera device and electronic equipment |
CN113223036A (en) * | 2020-01-21 | 2021-08-06 | 李华 | Electronic equipment field positioning system |
CN113223036B (en) * | 2020-01-21 | 2022-04-08 | 湖北讯甲科技有限公司 | Electronic equipment field positioning system |
CN113408379A (en) * | 2021-06-04 | 2021-09-17 | 开放智能机器(上海)有限公司 | Mobile phone candid behavior monitoring method and system |
CN114067441A (en) * | 2022-01-14 | 2022-02-18 | 合肥高维数据技术有限公司 | Shooting and recording behavior detection method and system |
CN114067441B (en) * | 2022-01-14 | 2022-04-08 | 合肥高维数据技术有限公司 | Shooting and recording behavior detection method and system |
Also Published As
Publication number | Publication date |
---|---|
CN110287862B (en) | 2021-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110287862A (en) | Anti-candid detection method based on deep learning | |
CN104166841B (en) | The quick detection recognition methods of pedestrian or vehicle is specified in a kind of video surveillance network | |
CN110378232B (en) | Improved test room examinee position rapid detection method of SSD dual-network | |
CN103345758B (en) | Jpeg image region duplication based on DCT statistical nature distorts blind checking method | |
CN108416771A (en) | Metal material corrosion area detection method based on monocular camera | |
CN116092199B (en) | Employee working state identification method and identification system | |
US9280209B2 (en) | Method for generating 3D coordinates and mobile terminal for generating 3D coordinates | |
CN112183438B (en) | Image identification method for illegal behaviors based on small sample learning neural network | |
CN113111767A (en) | Fall detection method based on deep learning 3D posture assessment | |
CN110929687A (en) | Multi-user behavior recognition system based on key point detection and working method | |
CN103324852A (en) | Four-modal medical imaging diagnosis system based on feature matching | |
CN112668557A (en) | Method for defending image noise attack in pedestrian re-identification system | |
CN109492594A (en) | Classroom participant's new line rate detection method based on deep learning network | |
CN109816656A (en) | A kind of thermal power plant's negative pressure side system leak source accurate positioning method | |
CN116071424A (en) | Fruit space coordinate positioning method based on monocular vision | |
CN109684986A (en) | A kind of vehicle analysis method and system based on automobile detecting following | |
CN110321869A (en) | Personnel's detection and extracting method based on Multiscale Fusion network | |
CN104243970A (en) | 3D drawn image objective quality evaluation method based on stereoscopic vision attention mechanism and structural similarity | |
CN105989600A (en) | Characteristic point distribution statistics-based power distribution network device appearance detection method and system | |
CN110321867A (en) | Shelter target detection method based on part constraint network | |
Mu et al. | Salient object detection in low contrast images via global convolution and boundary refinement | |
CN102148919B (en) | Method and system for detecting balls | |
CN117237990A (en) | Method and device for estimating weight of pig farm, electronic equipment and storage medium | |
CN113538720A (en) | Embedded face recognition attendance checking method based on Haisi intelligent AI chip | |
CN116052230A (en) | Palm vein recognition method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||