CN111723654B - High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization - Google Patents

Info

Publication number
CN111723654B
Authority
CN
China
Prior art keywords
model
data
training
building
altitude parabolic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010398931.7A
Other languages
Chinese (zh)
Other versions
CN111723654A (en)
Inventor
刘加
缑秦征
周勇
寇振宇
黄笑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Electronic System Technology Co ltd
Zhongdian Cloud Computing Technology Co ltd
Original Assignee
China Electronic System Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Electronic System Technology Co ltd
Priority to CN202010398931.7A
Publication of CN111723654A
Application granted
Publication of CN111723654B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a high-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization. The invention relates to the technical field of deep learning and computer vision, and addresses the limited model accuracy of existing high-altitude falling-object detection methods. The invention targets the detection of falling objects and combines background modeling with real-time YOLOv3 detection; it offers high model stability and good detection and recognition performance, can eliminate the influence of external factors, and achieves the aim of 'precise control'. Video shot and transmitted in real time by building cameras serves as the data source, which reduces manual intervention and allows the scene of a falling-object event to be located quickly. The method collects the detected object pictures together with the manual review results, converts the review results into labels for model training, and automatically starts model training once the number of collected object pictures reaches a specified threshold, so that model accuracy improves gradually.

Description

High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization
Technical Field
The invention relates to the technical field of deep learning and computer vision, in particular to a high-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization.
Background
In current urban development and construction, high-rise buildings are increasingly common because of limiting factors such as land and space. The problems that follow from this are also increasing, and among them the safety issues caused by high-altitude parabolas (objects thrown from height) and falling objects are receiving growing attention from all parties. An object that is suddenly thrown or dropped from height can very easily cause a serious personal-injury accident, and the resulting harm is severe and irreversible.
At present, monitoring of high-altitude parabolic behavior relies mainly on irregular patrols organized spontaneously by residents and on warning slogans posted in areas where such incidents are frequent. Irregular patrols cannot monitor high-altitude parabolic behavior completely and consume a great deal of manpower. Warning slogans are easily ignored by passers-by and do not remind them effectively. To further prevent this highly harmful illegal behavior, a high-altitude parabolic detection system based on image processing of camera video streams therefore needs to be built.
Existing high-altitude parabolic detection methods mostly use traditional algorithms such as background difference and inter-frame difference. With these algorithms, objects that are not high-altitude parabolas, such as birds and balloons, easily cause false detections. Moreover, because the existing methods are based on traditional algorithms, their models cannot be optimized spontaneously, so model accuracy remains limited in the long run.
Disclosure of Invention
The invention provides a high-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization, and aims to solve the problem that the accuracy of a model of the existing high-altitude falling object detection method is limited.
In a first aspect, the present invention provides a high altitude parabolic detection method based on background modeling, YOLOv3 and self-optimization, wherein the method comprises:
acquiring building video data shot by a camera;
intercepting pictures of the building video data according to frames to obtain an initial data set;
preprocessing the initial data set to obtain a building picture data set;
dividing the building picture data set into a training set, a verification set and a test set according to a preset proportion;
marking the building picture data in the training set, the verification set and the test set according to object categories formulated in a preset high-altitude parabolic detection strategy to obtain marking data sets corresponding to the building picture data one by one, wherein the marking data sets comprise object marking frame coordinates and object category information;
performing data enhancement processing on the training set and the corresponding labeled data set;
carrying out model training on the training set subjected to data enhancement processing and the corresponding labeled data set by using a YOLOv3 network to obtain a plurality of high-altitude parabolic detection models with different weights;
inputting the verification set and the marked data set of the verification set into a YOLOv3 network, verifying the model while training the model, and obtaining the current accuracy of the model so as to adjust model parameters in time and obtain an optimal weight model;
after the model training is finished, testing the optimal weight model by using the test set and the test set label data set;
acquiring real-time building video data shot by a camera;
searching for moving objects in the real-time building video data by using a background modeling algorithm;
inputting the image of the moving object into an optimal weight model to perform high-altitude parabolic detection to obtain a high-altitude parabolic detection result;
returning the high-altitude parabolic detection result, in the form of a screenshot, to the relevant personnel at the server side of the social governance platform for manual review;
judging whether the moving object is a high-altitude falling object according to the manual review result;
if the moving object is a high-altitude falling object, sending out high-altitude falling object warning information to notify the relevant personnel to handle it;
if the moving object is not a high-altitude falling object, receiving the relevant personnel's modification of the object category information of the moving object, and taking the modified picture sample as a difficult sample;
judging whether the number of the difficult samples exceeds a preset threshold value or not;
and if the number of the difficult samples exceeds a preset threshold value, automatically starting a self-optimization process, combining the difficult samples and a training set into a new training set, and performing model training according to the new training set by using a YOLOv3 network to obtain a high altitude parabolic detection model after training iteration.
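The self-optimization trigger described in the last three steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `HardSamplePool`, the dictionary layout of a sample and the `retrain` callback are all hypothetical, and clearing the pool after retraining is an assumption the text does not state.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HardSamplePool:
    """Collects reviewer-corrected samples and starts retraining past a threshold."""
    threshold: int                              # preset threshold (hypothetical value chosen by the caller)
    retrain: Callable[[List[dict]], None]       # hypothetical hook that would launch YOLOv3 training
    training_set: List[dict] = field(default_factory=list)
    hard_samples: List[dict] = field(default_factory=list)

    def add_hard_sample(self, image_id: str, corrected_label: str) -> bool:
        """Record one sample rejected by manual review; return True if retraining started."""
        self.hard_samples.append({"image": image_id, "label": corrected_label})
        if len(self.hard_samples) > self.threshold:
            # Merge the difficult samples with the training set to form the new training set.
            self.training_set = self.training_set + self.hard_samples
            self.hard_samples = []  # assumption: the pool is emptied once it has been merged
            self.retrain(self.training_set)
            return True
        return False
```

With `threshold=2`, the third rejected sample is the first one that makes the count exceed the threshold, so that call starts retraining on the merged set.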
With reference to the first aspect, in a first implementation manner of the first aspect, in the step of preprocessing the initial data set to obtain a building picture data set, distorted, deformed and blurred picture data are screened out and corrected to obtain the building picture data set.
With reference to the first aspect, in a second implementation manner of the first aspect, the marking building picture data in the training set, the verification set, and the test set according to an object class formulated in a preset high altitude parabolic detection strategy to obtain a marked data set corresponding to the building picture data one to one includes:
and selecting objects in each picture of the training set, the verification set and the test set in the object category by using a rectangular frame, storing the positions of the rectangular frame in the pictures, wherein the positions comprise coordinate information of the upper left corner and the lower right corner of the rectangular frame, marking the category of the objects, generating an XML file from the marked information, and forming a marked data set which corresponds to the marked pictures one by one.
With reference to the second implementable manner of the first aspect, in a third implementable manner of the first aspect, the object categories include moving objects that often cause false detections, such as cars, birds, people, balloons and plastic bags.
With reference to the first aspect, in a fourth implementable manner of the first aspect, the performing data enhancement processing on the training set and the corresponding labeled data set includes:
performing data enhancement processing by using flip transformation, random cropping, color jittering, translation transformation, scale transformation, contrast transformation, noise perturbation, rotation transformation or reflection transformation, and a mixup method, wherein the mixup method is as follows:
$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j$$
wherein $(x_i, y_i)$ and $(x_j, y_j)$ are two samples randomly drawn from the training data, $x$ represents the picture matrix, $y$ represents the label information, $(\tilde{x}, \tilde{y})$ is the enhanced data that participates in model training, and $\lambda \in [0, 1]$.
With reference to the first aspect, in a fifth implementation manner of the first aspect, in the step of performing model training on the training set after data enhancement processing and the corresponding labeled data set by using a YOLOv3 network to obtain a plurality of high-altitude parabolic detection models with different weights, the labeled picture samples and background picture samples are sent to the YOLOv3 network together. The labeled pictures are positive samples; the background pictures contain no object of any object category and, being unlabeled, are negative samples. The positive and negative samples are trained together in the YOLOv3 network, and the high-altitude parabolic detection models with different weights are obtained through iterative training.
With reference to the first aspect, in a sixth implementation manner of the first aspect, in the step of acquiring real-time building video data shot by the camera, the surveillance video is transmitted to the server as a video stream, and the transmission protocol is the ONVIF protocol.
With reference to the first aspect, in a seventh implementable manner of the first aspect, the background modeling algorithm is an adaptive Gaussian mixture background modeling algorithm.
In a second aspect, the present invention provides a high altitude parabolic detection apparatus based on background modeling, YOLOv3 and self-optimization, the apparatus comprising:
the first acquisition unit is used for acquiring building video data shot by the camera;
the intercepting unit is used for intercepting pictures of the building video data according to frames to obtain an initial data set;
the preprocessing unit is used for preprocessing the initial data set to obtain a building picture data set;
the grouping unit is used for dividing the building picture data set into a training set, a verification set and a test set according to a preset proportion;
the marking unit is used for marking the building picture data in the training set, the verification set and the test set according to object types formulated in a preset high-altitude parabolic detection strategy to obtain marking data sets corresponding to the building picture data one by one, and each marking data set comprises object marking frame coordinates and object type information;
the data enhancement unit is used for carrying out data enhancement processing on the training set and the corresponding marked data set;
the model training unit is used for performing model training on the training set subjected to data enhancement processing and the corresponding marked data set by using a YOLOv3 network to obtain a plurality of high-altitude parabolic detection models with different weights;
the model verification unit is used for inputting the verification set and the mark data set of the verification set into a YOLOv3 network, verifying the model while training the model, and obtaining the current accuracy of the model so as to adjust the parameters of the model in time and obtain an optimal weight model;
the model testing unit is used for testing the optimal weight model by utilizing the test set and the test set marking data set after the model training is finished;
the second acquisition unit is used for acquiring real-time building video data shot by the camera;
the searching unit is used for searching a moving object in the real-time building video data by utilizing a background modeling algorithm;
the detection unit is used for inputting the image of the moving object into an optimal weight model to perform high-altitude parabolic detection to obtain a high-altitude parabolic detection result;
the return unit is used for returning the high-altitude parabolic detection result, in the form of a screenshot, to the relevant personnel at the server side of the social governance platform for manual review;
the first judgment unit is used for judging whether the moving object is a high-altitude falling object according to the manual review result;
the notification unit is used for sending out high-altitude falling object warning information to notify the relevant personnel to handle it in the case that the moving object is a high-altitude falling object;
the modifying unit is used for receiving the relevant personnel's modification of the object category information of the moving object in the case that the moving object is not a high-altitude falling object, and taking the modified picture sample as a difficult sample;
the second judging unit is used for judging whether the number of the difficult samples exceeds a preset threshold value or not;
and the merging unit is used for automatically starting a self-optimization process under the condition that the number of the difficult samples exceeds a preset threshold value, merging the difficult samples and the training set into a new training set, and performing model training according to the new training set by using a YOLOv3 network to obtain a high-altitude parabolic detection model after training iteration.
The invention has the following beneficial effects. In the high-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization, video shot and transmitted in real time by building cameras serves as the data source, which reduces manual intervention, allows the scene of a falling-object event to be located quickly, and makes it possible to find and handle falling-object behavior early. The invention is applied to foreign-object detection in the social governance platform, where manual handling can be carried out once high-altitude falling-object behavior is detected. The method collects the detected object pictures together with the results of manual handling, converts those results into labels for model training, and automatically starts model training when the number of collected object pictures reaches the specified threshold. As time goes on, more manual feedback and more online training iterations are obtained, so that the model's performance is gradually optimized and its accuracy gradually improves. The invention targets the detection of falling objects and combines background modeling with real-time YOLOv3 detection; it offers high model stability and good detection and recognition performance, can eliminate the influence of external factors, and achieves the aim of 'precise control'.
Drawings
In order to illustrate the technical solution of the present invention more clearly, the drawings used in the embodiments are briefly described below. Those skilled in the art can obtain other drawings from these drawings without inventive labor.
Fig. 1 is a flowchart of a high altitude parabolic detection method based on background modeling, YOLOv3 and self-optimization provided by the present invention.
Fig. 2 is a schematic diagram of the high altitude parabolic detection apparatus based on background modeling, YOLOv3 and self optimization provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. The technical solutions provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Referring to fig. 1, the present invention provides a high-altitude parabolic detection method based on background modeling, YOLOv3 and self-optimization, the method includes:
Step S101: acquiring building video data shot by the camera.
Data acquisition is an important foundation of the invention. The YOLOv3 algorithm requires data to be input into the network to train the model, and videos or photos of the various objects commonly found in living scenes, including the building video data shot by the camera, can be used as the data set.
Step S102: capturing pictures from the building video data frame by frame to obtain an initial data set.
Step S103: preprocessing the initial data set to obtain a building picture data set.
Specifically, distorted, deformed and blurred picture data are screened out and corrected to obtain the building picture data set.
Step S104: dividing the building picture data set into a training set, a verification set and a test set according to a preset proportion.
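As an illustration of the split into training, verification and test sets, the sketch below shuffles a list of frame file names and cuts it by ratio. The patent only says "a preset proportion"; the 8:1:1 ratio, the fixed seed and the function name `split_dataset` are assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str],
    ratios: Tuple[float, float, float] = (0.8, 0.1, 0.1),  # assumed proportion, not from the patent
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items reproducibly and split them into train, verification and test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, so no picture is dropped
    return train, val, test
```

Splitting before annotation and augmentation, as the method does, keeps augmented copies of a training picture out of the verification and test sets.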
Step S105: marking the building picture data in the training set, the verification set and the test set according to the object categories formulated in a preset high-altitude parabolic detection strategy to obtain marked data sets corresponding one-to-one to the building picture data, wherein the marked data sets comprise object bounding-box coordinates and object category information.
Specifically, in each picture of the training set, the verification set and the test set, every object belonging to one of the object categories is selected with a rectangular frame. The position of the rectangular frame in the picture is stored as the coordinates of its upper-left and lower-right corners, and the category of the object is marked. An XML file is generated from the marked information, forming a marked data set that corresponds one-to-one to the marked pictures. The object categories may include moving objects that often cause false detections, such as cars, birds, people, balloons and plastic bags.
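The annotation step can be illustrated with a small XML writer. The patent only states that an XML file stores the upper-left and lower-right corner coordinates and the category; the Pascal VOC-style element names used here (`annotation`, `object`, `bndbox`, `xmin`, and so on) and the function name `build_annotation` are assumptions for the sake of a concrete example.

```python
import xml.etree.ElementTree as ET


def build_annotation(filename, width, height, objects):
    """Return an XML string for one picture.

    objects is a list of (class_name, xmin, ymin, xmax, ymax) tuples, where
    (xmin, ymin) is the upper-left corner and (xmax, ymax) the lower-right corner.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for name, xmin, ymin, xmax, ymax in objects:
        # Reject boxes that are empty or fall outside the picture.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            raise ValueError("invalid box for %s" % name)
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

One XML document is produced per picture, which matches the one-to-one correspondence between pictures and marked data described above.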
Step S106: performing data enhancement processing on the training set and the corresponding labeled data set.
Because the data volume available to the method is far smaller than that normally required for deep learning, the task can be classified as small-sample learning (few-shot learning), for which the main solutions are data enhancement and meta-learning. The invention mainly uses data enhancement. A large amount of data improves the generalization and accuracy of the model, so when data are limited, expanding the data volume by technical means is an indispensable step.
Specifically, the data enhancement processing may be performed using flip transformation, random cropping, color jittering, translation transformation, scale transformation, contrast transformation, noise perturbation, rotation transformation or reflection transformation, and a mixup method as follows:
$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j$$
wherein $(x_i, y_i)$ and $(x_j, y_j)$ are two samples randomly drawn from the training data, $x$ represents the picture matrix, $y$ represents the label information, $(\tilde{x}, \tilde{y})$ is the enhanced data that participates in model training, and $\lambda \in [0, 1]$. The invention uses the mixup method and adds the resulting pictures to the original data set. Expanding the training samples in this way helps avoid the overfitting caused by too small a data volume. Mixup is a simple, data-independent data enhancement method that constructs virtual training samples. It extends the training distribution by incorporating the prior knowledge that linear interpolation of feature vectors should lead to linear interpolation of the associated labels, and it introduces minimal computational overhead.
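The mixup interpolation can be sketched in a few lines of plain Python. The sketch assumes the picture is a nested list of pixel values and the label is a vector (for example a one-hot class vector); how the patent blends bounding-box labels is not stated, so that part is not modelled. Drawing lambda from a Beta distribution is a common choice and is an assumption here, since the text only requires lambda to lie in [0, 1].

```python
import random


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw a mixing coefficient in [0, 1] from Beta(alpha, alpha) (assumed distribution)."""
    return rng.betavariate(alpha, alpha)


def mixup(x_i, y_i, x_j, y_j, lam: float):
    """Blend two (picture, label) samples: lam * sample_i + (1 - lam) * sample_j."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x_mix = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_i, row_j)]
        for row_i, row_j in zip(x_i, x_j)
    ]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix
```

With `lam = 1` the first sample is returned unchanged and with `lam = 0` the second, so the original samples are the two endpoints of the interpolation.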
Step S107: performing model training on the training set subjected to data enhancement processing and the corresponding marked data set by using a YOLOv3 network to obtain a plurality of high-altitude parabolic detection models with different weights.
Specifically, the marked picture samples and background picture samples are sent to the YOLOv3 network together. The marked pictures are positive samples; the background pictures contain no object of any object category and, being unmarked, are negative samples. The positive and negative samples are trained together in the YOLOv3 network, and the high-altitude parabolic detection models with different weights are obtained through iterative training.
Step S108: inputting the verification set and the marked data set of the verification set into the YOLOv3 network, verifying the model while training it, and obtaining the current accuracy of the model so as to adjust the model parameters in time and obtain an optimal weight model.
Step S109: after the model training is finished, testing the optimal weight model by using the test set and the marked data set of the test set.
Step S110: acquiring real-time building video data shot by the camera.
Specifically, the surveillance video is transmitted to the server as a video stream, and the transmission protocol is the ONVIF protocol.
Step S111: searching for a moving object in the real-time building video data by using a background modeling algorithm.
Specifically, the background modeling algorithm is an adaptive Gaussian mixture background modeling algorithm.
The working principle of adaptive Gaussian mixture background modeling is as follows. In detecting and extracting moving targets, the background is important for identifying and tracking the target, and modeling is a key step in background extraction. Under the assumption that the background is stationary, any meaningful moving object is foreground. Moving-object detection problems fall into two categories: fixed camera and moving camera. For a moving camera, a well-known solution is the optical flow method, which obtains the optical flow field of an image sequence by solving partial differential equations and thereby predicts the motion state of the camera. The optical flow method can also be used with a fixed camera, but its complexity often makes real-time computation difficult. Gaussian mixture background modeling, by contrast, is suited to separating background and foreground in image sequences from a fixed camera. With a fixed camera the background changes slowly, mostly under the influence of illumination, wind and the like. By modeling the background, the foreground is separated from the background in a given image; since the foreground is generally the moving object, this achieves the goal of detecting moving objects.
The Gaussian mixture model has been widely applied to robust background modeling of complex scenes, especially scenes with small repetitive motions such as swaying leaves and bushes, rotating fans, sea waves, rain and snow, and light reflections. The pixel-based Gaussian mixture model is effective for modeling multimodal background distributions, can adapt to background changes such as gradual lighting changes, and can basically meet the real-time requirements of practical applications.
Gaussian mixture background modeling is a background representation method based on the statistical information of pixel samples. The background is represented by statistics of a large number of sample values of a pixel over a long period, such as its probability density (the number of modes and the mean and standard deviation of each mode), and target pixels are then identified by their statistical difference from the background. The method can model complex dynamic backgrounds, although its computational cost is high.
In the Gaussian mixture background model, the color information of different pixels is considered uncorrelated, and each pixel is processed independently. For each pixel in the video image, the change of its value across the image sequence can be regarded as a random process that continuously generates pixel values; that is, the pattern of colors presented by each pixel is described by Gaussian distributions.
For a multimodal Gaussian distribution model, each pixel of an image is modeled as a superposition of several Gaussian distributions with different weights. Each Gaussian distribution corresponds to a state that may generate the color presented by the pixel, and the weight and distribution parameters of each Gaussian distribution are updated over time. When processing color images, the R, G and B channels of an image pixel are assumed to be independent and to have the same variance. For the observation data set $\{x_1, x_2, \ldots, x_N\}$ of the random variable $X$, where $x_t = (r_t, g_t, b_t)$ is the sample of the pixel at time $t$, a single sample point $x_t$ obeys the Gaussian mixture probability density function:
$$P(x_t) = \sum_{i=1}^{k} w_{i,t}\, \eta(x_t, \mu_{i,t}, \tau_{i,t})$$
$$\eta(x_t, \mu_{i,t}, \tau_{i,t}) = \frac{1}{(2\pi)^{3/2}\, |\tau_{i,t}|^{1/2}} \exp\left(-\frac{1}{2}(x_t - \mu_{i,t})^{T} \tau_{i,t}^{-1} (x_t - \mu_{i,t})\right)$$
$$\tau_{i,t} = \sigma_{i,t}^{2} I$$
where $k$ is the total number of distribution modes, $\eta(x_t, \mu_{i,t}, \tau_{i,t})$ is the $i$-th Gaussian distribution at time $t$, $\mu_{i,t}$ is its mean, $\tau_{i,t}$ is its covariance matrix, $\sigma_{i,t}^{2}$ is the variance, $I$ is the three-dimensional identity matrix, and $w_{i,t}$ is the weight of the $i$-th Gaussian distribution at time $t$.
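To make the per-pixel mixture density concrete, the sketch below evaluates it for one RGB sample. It assumes, as the equal-variance channel assumption suggests, that each component's covariance is a single variance times the identity matrix, so the determinant and inverse reduce to powers of the standard deviation. The function names and the `(weight, mean, sigma)` tuple layout are hypothetical.

```python
import math


def gaussian_density(x, mean, sigma: float) -> float:
    """Density of an isotropic Gaussian with covariance sigma**2 * I at point x."""
    d = len(x)  # 3 for an (r, g, b) sample
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mean))
    # |sigma^2 I|^(1/2) equals sigma**d, so the normaliser is (2*pi)^(d/2) * sigma^d.
    norm = (2.0 * math.pi) ** (d / 2.0) * sigma ** d
    return math.exp(-0.5 * sq_dist / sigma ** 2) / norm


def mixture_density(x, components) -> float:
    """Weighted sum of component densities; components is a list of (weight, mean, sigma)."""
    return sum(w * gaussian_density(x, m, s) for w, m, s in components)
```

A pixel value close to the mean of a heavily weighted component receives a high density and is likely background, while a value far from every component receives a low density and is a foreground candidate.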
The adaptive Gaussian mixture model sets the maximum number of Gaussian distributions per pixel to $k_{\max} = 4$. The initial number of Gaussian distributions for each pixel is $k = 1$; the pixel value of each point in the first frame is taken as the initial mean $\mu_0$ of the Gaussian distribution, with variance $\sigma_0 = 30$ and weight $\omega_0 = 0.2$.
If none of the current $k$ Gaussian distributions matches the target pixel and $k < k_{\max}$, then $k = k + 1$ and a new Gaussian distribution is added to the background model, with the current pixel value as its mean and a standard deviation and weight of 30 and 0.2, respectively. If instead $k = k_{\max}$, a new Gaussian distribution is generated with its mean initialized to the current pixel value and a standard deviation and weight of 30 and 0.01, respectively. The new Gaussian distribution immediately replaces the one with the smallest weight among the existing $k_{\max}$ distributions.
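The rule for an unmatched pixel can be sketched as below, using the constants given in the text (at most 4 components, standard deviation 30, weight 0.2 for an added component and 0.01 for a replacement). The `Component` class and the function name are hypothetical, and the sketch does not renormalise the weights because the text does not say whether that is done.

```python
from dataclasses import dataclass
from typing import List, Tuple

K_MAX = 4                    # maximum number of Gaussian distributions per pixel
INIT_STD = 30.0              # standard deviation given to every new distribution
WEIGHT_WHEN_ADDED = 0.2      # weight when there is still room (k < K_MAX)
WEIGHT_WHEN_REPLACING = 0.01  # weight when the weakest distribution is replaced


@dataclass
class Component:
    mean: Tuple[float, float, float]
    std: float
    weight: float


def handle_unmatched_pixel(components: List[Component], pixel: Tuple[float, float, float]) -> List[Component]:
    """Update the per-pixel model in place when no existing distribution matches the pixel."""
    if len(components) < K_MAX:
        components.append(Component(pixel, INIT_STD, WEIGHT_WHEN_ADDED))
    else:
        # Replace the distribution with the smallest weight.
        weakest = min(range(len(components)), key=lambda i: components[i].weight)
        components[weakest] = Component(pixel, INIT_STD, WEIGHT_WHEN_REPLACING)
    return components
```

Giving a replacement the much smaller weight of 0.01 means a one-off outlier does not immediately count as background once the model is already at full capacity.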
In the traditional mixed Gaussian background modeling algorithm, the first successful matching is taken as a matching result. In fact, the new pixel may be successfully matched to the polynomial gaussian distribution, and the first match is not necessarily the best match. In the adaptive Gaussian mixture model, each Gaussian distribution is matched with a new pixel, and the optimal distribution of the matching result is found out, wherein the optimal distribution is obtained by the following formula:
Figure BDA0002488717240000081
If the best match is the same distribution for 10 consecutive frames and k > 1, then k = k - 1 and the distribution with the smallest weight is removed directly.
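The add, replace and prune rules above can be sketched in Python as follows. This is a simplified illustration, not the patented implementation. The best-match criterion is not recoverable from the text, so the sketch assumes the match with the smallest normalized Euclidean distance within 2.5 standard deviations. It also treats the value 30 as a standard deviation and omits the per-frame update of weights and means. The names `Mode`, `PixelModel` and `MATCH_SIGMAS` are hypothetical.

```python
import math
from dataclasses import dataclass

K_MAX = 4              # maximum number of modes per pixel (from the text)
INIT_STD = 30.0        # initial standard deviation (from the text)
NEW_WEIGHT = 0.2       # weight of a mode added while k < k_max
REPLACE_WEIGHT = 0.01  # weight of a mode that replaces the weakest one
PRUNE_AFTER = 10       # consecutive frames with the same best match
MATCH_SIGMAS = 2.5     # ASSUMPTION: match threshold in standard deviations


@dataclass(eq=False)   # identity comparison, so remove() targets one object
class Mode:
    mean: tuple
    std: float
    weight: float


class PixelModel:
    def __init__(self, first_pixel):
        self.modes = [Mode(tuple(first_pixel), INIT_STD, NEW_WEIGHT)]
        self._last_best = None
        self._streak = 0

    def _add_or_replace(self, pixel):
        if len(self.modes) >= K_MAX:
            # k == k_max: the new mode replaces the lowest-weight mode.
            self.modes.remove(min(self.modes, key=lambda m: m.weight))
            weight = REPLACE_WEIGHT
        else:
            weight = NEW_WEIGHT
        self.modes.append(Mode(tuple(pixel), INIT_STD, weight))
        self._last_best, self._streak = None, 0

    def observe(self, pixel):
        """Return the best-matching mode, or None if a new mode was created."""
        # ASSUMPTION: normalized Euclidean distance is the match score.
        scored = [(math.dist(pixel, m.mean) / m.std, m) for m in self.modes]
        matched = [pair for pair in scored if pair[0] <= MATCH_SIGMAS]
        if not matched:
            self._add_or_replace(pixel)
            return None
        best = min(matched, key=lambda pair: pair[0])[1]
        if best is self._last_best:
            self._streak += 1
        else:
            self._last_best, self._streak = best, 1
        if self._streak >= PRUNE_AFTER and len(self.modes) > 1:
            # Same best match for 10 frames: drop the lowest-weight mode.
            self.modes.remove(min(self.modes, key=lambda m: m.weight))
            self._last_best, self._streak = None, 0
        return best


model = PixelModel((100, 100, 100))
for px in [(200, 200, 200), (0, 0, 0), (255, 0, 0)]:
    model.observe(px)                      # each unmatched pixel adds a mode
count_full = len(model.modes)
model.observe((0, 255, 0))                 # k == k_max, so a mode is replaced
count_after_replace = len(model.modes)
min_weight_after_replace = min(m.weight for m in model.modes)
for _ in range(PRUNE_AFTER):
    model.observe((200, 200, 200))         # stable best match triggers pruning
```

In this run the model grows to four modes, replaces one with a low-weight mode when a fifth color appears, and then prunes that low-weight mode once the same distribution has been the best match for 10 consecutive frames.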
The working principle of the YOLOv3 network is explained next:
YOLOv3 adjusts the network structure on the basis of YOLOv1 and YOLOv2, performs object detection using multi-scale features, and replaces softmax with logistic classifiers for object classification. YOLOv3 has no fully connected layer and no pooling layer, so it can accept input images of any size. It consists mainly of 75 convolutional layers, and ResNet-style residual modules are added to the network to mitigate the gradient problems of deep networks.
A ResNet residual network is equivalent to adding a shortcut path to the original CNN structure. The learning process changes from learning features directly to adding certain features on top of previously learned features in order to obtain better features. Thus a complex feature H(x), which was previously learned independently layer by layer, is now modeled as H(x) = F(x) + x, where x is the feature at the start of the shortcut and F(x) is the supplement added to x, which is called the residual. The learning target therefore changes from learning the complete information to learning the residual, which greatly reduces the difficulty of learning high-quality features.
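The relation H(x) = F(x) + x can be illustrated with a few lines of Python. This is only a sketch on plain lists: in a real network F would be a stack of convolutional layers, and the functions `zero_residual` and `small_correction` are hypothetical stand-ins for it.

```python
def residual_block(x, residual_fn):
    """Compute H(x) = F(x) + x, where residual_fn plays the role of F."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x")
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x):
    # If F learns nothing, the block reduces to the identity mapping.
    return [0.0] * len(x)


def small_correction(x):
    # Hypothetical learned residual: a small adjustment of the input.
    return [0.1 * xi for xi in x]


features = [1.0, -2.0, 3.0]
identity_out = residual_block(features, zero_residual)
refined = residual_block(features, small_correction)
```

Because the shortcut already carries x forward, the block only has to learn the correction F(x); with F(x) = 0 it passes its input through unchanged.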
An image typically contains a variety of objects of different sizes, and ideally objects of all sizes are detected in a single pass. The network must therefore be able to "see" objects of different sizes. However, the deeper the network, the smaller the feature map, so smaller objects become harder to detect in later layers. To address this, YOLOv3 detects objects on feature maps of 3 different scales, which allows finer-grained features to be detected. YOLOv3 uses an FPN (Feature Pyramid Network) structure to handle multiple scales at different resolutions and performs target detection on feature maps of different depths. The feature map of a deeper layer is upsampled and fused with the feature map of the current layer, so that low-level and high-level features are combined and detection accuracy improves.
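The upsample-and-fuse step can be sketched on nested lists as follows. The sketch assumes nearest-neighbor 2x upsampling and channel-wise concatenation, which is how the publicly described YOLOv3 architecture merges scales; the patent text itself only says the maps are fused. Feature maps are represented as [channel][row][column] lists, and the function names are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Double the height and width of every channel by repeating values."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def fuse(deep, shallow):
    """Upsample the deep map and concatenate it with the shallow map."""
    up = upsample_nearest_2x(deep)
    if len(up[0]) != len(shallow[0]) or len(up[0][0]) != len(shallow[0][0]):
        raise ValueError("spatial sizes must match after upsampling")
    return up + shallow  # concatenation along the channel axis


deep = [[[1, 2], [3, 4]]]                    # 1 channel, 2x2 (coarse, semantic)
shallow = [[[0] * 4 for _ in range(4)]]      # 1 channel, 4x4 (fine, detailed)
fused = fuse(deep, shallow)                  # 2 channels, 4x4
```

The fused map keeps the fine spatial resolution of the shallow layer while carrying the higher-level features of the deep layer, which is what helps with small objects.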
The softmax layer is replaced with a 1x1 convolutional layer followed by a logistic activation function, which makes it possible to handle multi-label objects. YOLOv3 uses logistic regression when predicting the candidate boxes (bbox). Each candidate box comprises five elements, bbox(x, y, w, h, c), where the first four elements represent the coordinate position and size of the box and the last value is a confidence score.
c = Pr(object) × IOU(bbox, object), where c is the confidence.
Logistic regression assigns an objectness score to the region enclosed by each bbox, and the box with the highest object existence probability score is selected.
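The confidence computation can be illustrated with a short Python sketch. It assumes that (x, y) is the box center and (w, h) the box size, and that confidence is the product of the object probability and the IOU with the reference box, as in the published YOLO formulation. The function names and the example boxes are hypothetical.

```python
def to_corners(x, y, w, h):
    """Convert a center-format box to (x_min, y_min, x_max, y_max)."""
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)


def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def confidence(pr_object, bbox, reference):
    """c = Pr(object) * IOU(bbox, object)."""
    return pr_object * iou(bbox, reference)


reference = (0.0, 0.0, 10.0, 10.0)          # hypothetical labeled object
candidates = [
    to_corners(5.0, 5.0, 10.0, 10.0),       # exactly on the object
    to_corners(10.0, 5.0, 10.0, 10.0),      # shifted half a box to the right
]
scores = [confidence(0.9, box, reference) for box in candidates]
best_index = max(range(len(scores)), key=scores.__getitem__)
```

The first candidate overlaps the reference completely and the second overlaps it by one third, so the first box receives the highest confidence and is the one kept.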
Step S112, inputting the image of the moving object into the optimal weight model to perform high-altitude parabolic detection, so as to obtain a high-altitude parabolic detection result.
Specifically, the image of the moving object is compared against the objects identified by the YOLOv3 target detection model; if the recognition result is an object such as a bird, a balloon or a plastic bag, the moving object is not judged to be a high-altitude falling object. Strategies can also be configured flexibly according to the actual scene, which effectively reduces interference from objects flying in the air.
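A configurable filtering strategy of this kind might look like the following Python sketch. The detection record layout, the function name and the label "bottle" are hypothetical; only the ignored categories bird, balloon and plastic bag come from the text.

```python
# Categories that are not treated as falling objects (from the text);
# the set is meant to be configurable per deployment scene.
DEFAULT_IGNORED = frozenset({"bird", "balloon", "plastic bag"})


def filter_falling_candidates(detections, ignored=DEFAULT_IGNORED):
    """Keep only detections whose label is not in the ignored set."""
    return [d for d in detections if d["label"] not in ignored]


# Hypothetical detector output for three moving objects.
detections = [
    {"label": "bird", "confidence": 0.91},
    {"label": "bottle", "confidence": 0.84},
    {"label": "plastic bag", "confidence": 0.77},
]
candidates = filter_falling_candidates(detections)
# A stricter scene-specific strategy can extend the ignored set.
strict = filter_falling_candidates(detections, ignored=DEFAULT_IGNORED | {"bottle"})
```

Passing the ignored set as a parameter keeps the rule adjustable per scene without touching the detector itself.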
Step S113, returning the high-altitude parabolic detection result, in screenshot form, to the relevant personnel at the social governance platform server for manual review.
Step S114, judging whether the moving object is a high-altitude falling object according to the manual review result.
Step S115, if the moving object is a high-altitude falling object, sending out high-altitude falling object warning information to notify the relevant personnel to handle it.
Step S116, if the moving object is not a high-altitude falling object, receiving the relevant personnel's modification of the object category information of the moving object, and taking the modified picture sample as a difficult sample.
Step S117, judging whether the number of difficult samples exceeds a preset threshold.
Step S118, if the number of difficult samples exceeds the preset threshold, automatically starting a self-optimization process: the difficult samples and the training set are merged into a new training set, and the YOLOv3 network performs model training on the new training set to obtain a high-altitude parabolic detection model after a training iteration.
The data of the invention come mainly from three sources: labeled picture samples, wrongly detected picture samples, and background picture samples that contain no object of the target categories. The labeled picture samples are positive samples and the background pictures are negative samples; together they form a data set X, which is fed to the YOLOv3 network for training. Wrongly detected samples are called difficult samples; their labels are corrected manually and they are then used together with data set X for iterative training. The method adds detection results to the data set to drive a self-optimization process: it collects the pictures of detected objects together with the outcome of manual handling, and converts that outcome into labels usable for model training. When the number of collected object pictures reaches the specified threshold, model training starts automatically. Over time, as more manual feedback accumulates and more online training iterations are performed, the model is gradually optimized and its accuracy gradually improves.
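The threshold-triggered self-optimization described in steps S116 to S118 can be sketched as follows. The class name `HardSampleBuffer`, the sample layout and the `train_fn` callback are hypothetical; in a real system `train_fn` would launch YOLOv3 training. Clearing the buffer after each merge is an assumption, since the text does not say what happens to the difficult samples once they have been merged.

```python
class HardSampleBuffer:
    """Collects manually corrected samples and retrains past a threshold."""

    def __init__(self, training_set, threshold, train_fn):
        self.training_set = list(training_set)
        self.threshold = threshold
        self.train_fn = train_fn          # hypothetical training entry point
        self.hard_samples = []

    def record_rejection(self, image_id, corrected_label):
        """Store a sample that reviewers judged not to be a falling object.

        Returns True if this sample triggered a retraining round.
        """
        self.hard_samples.append((image_id, corrected_label))
        if len(self.hard_samples) > self.threshold:   # "exceeds" per S118
            # Merge the difficult samples into a new training set.
            self.training_set = self.training_set + self.hard_samples
            self.hard_samples = []                    # ASSUMPTION: reset
            self.train_fn(self.training_set)
            return True
        return False


calls = []  # records the size of each training set handed to train_fn
buffer = HardSampleBuffer(
    [("img_000.jpg", "bottle")],
    threshold=2,
    train_fn=lambda data: calls.append(len(data)),
)
results = [buffer.record_rejection(f"img_{i:03d}.jpg", "bird") for i in range(1, 4)]
```

With a threshold of 2, the third corrected sample triggers one retraining round on the merged set of four samples, after which the buffer starts collecting again.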
As shown in Fig. 2, the present invention provides a high-altitude parabolic detection apparatus based on background modeling, YOLOv3 and self-optimization. The apparatus includes:
a first obtaining unit 201, configured to obtain building video data captured by a camera;
an intercepting unit 202, configured to capture pictures from the building video data frame by frame to obtain an initial data set;
a preprocessing unit 203, configured to preprocess the initial data set to obtain a building picture data set;
a grouping unit 204, configured to divide the building picture data set into a training set, a verification set and a test set according to a preset proportion;
a marking unit 205, configured to label the building picture data in the training set, the verification set and the test set according to the object categories defined in a preset high-altitude parabolic detection strategy, so as to obtain labeled data sets in one-to-one correspondence with the building picture data, where each labeled data set includes object bounding-box coordinates and object category information;
a data enhancement unit 206, configured to perform data enhancement processing on the training set and the corresponding labeled data set;
a model training unit 207, configured to perform model training on the data-enhanced training set and the corresponding labeled data set by using the YOLOv3 network, so as to obtain a plurality of high-altitude parabolic detection models with different weights;
a model verification unit 208, configured to input the verification set and its labeled data set into the YOLOv3 network and verify the model while it is being trained, obtaining the current accuracy of the model so that model parameters can be adjusted in time and an optimal weight model obtained;
a model testing unit 209, configured to test the optimal weight model by using the test set and its labeled data set after model training is completed;
a second obtaining unit 210, configured to obtain real-time building video data captured by the camera;
a finding unit 211, configured to find a moving object in the real-time building video data by using a background modeling algorithm;
a detection unit 212, configured to input the image of the moving object into the optimal weight model to perform high-altitude parabolic detection, so as to obtain a high-altitude parabolic detection result;
a returning unit 213, configured to return the high-altitude parabolic detection result, in screenshot form, to the relevant personnel at the social governance platform server for manual review;
a first judging unit 214, configured to judge whether the moving object is a high-altitude falling object according to the manual review result;
a notification unit 215, configured to send out high-altitude falling object warning information to notify the relevant personnel to handle it when the moving object is a high-altitude falling object;
a modifying unit 216, configured to receive the relevant personnel's modification of the object category information of the moving object when the moving object is not a high-altitude falling object, and to take the modified picture sample as a difficult sample;
a second judging unit 217, configured to judge whether the number of difficult samples exceeds a preset threshold; and
a merging unit 218, configured to automatically start a self-optimization process when the number of difficult samples exceeds the preset threshold, merge the difficult samples and the training set into a new training set, and perform model training on the new training set by using the YOLOv3 network to obtain a high-altitude parabolic detection model after a training iteration.
An embodiment of the present invention further provides a storage medium storing a computer program which, when executed by a processor, implements some or all of the steps of each embodiment of the high-altitude parabolic detection method based on background modeling, YOLOv3 and self-optimization provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM).
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented using software plus any required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
Identical and similar parts of the embodiments in this specification may be cross-referenced. In particular, because the embodiments of the high-altitude parabolic detection apparatus based on background modeling, YOLOv3 and self-optimization are substantially similar to the method embodiments, they are described only briefly; for the relevant details, refer to the description of the method embodiments.
The above-described embodiments of the present invention do not limit the scope of the present invention.

Claims (9)

1. A high-altitude parabolic detection method based on background modeling, YOLOv3 and self optimization is characterized by comprising the following steps:
acquiring building video data shot by a camera;
intercepting pictures of the building video data according to frames to obtain an initial data set;
preprocessing the initial data set to obtain a building picture data set;
dividing the building picture data set into a training set, a verification set and a test set according to a preset proportion;
marking the building picture data in the training set, the verification set and the test set according to object types formulated in a preset high-altitude parabolic detection strategy to obtain marking data sets corresponding to the building picture data one by one, wherein the marking data sets comprise object marking frame coordinates and object type information;
performing data enhancement processing on the training set and the corresponding labeled data set;
carrying out model training on the training set subjected to data enhancement processing and the corresponding labeled data set by using a YOLOv3 network to obtain a plurality of high-altitude parabolic detection models with different weights;
inputting the verification set and the mark data set of the verification set into a YOLOv3 network, verifying the model while training the model to obtain the current accuracy of the model so as to adjust the parameters of the model in time and obtain an optimal weight model;
after the model training is finished, testing the optimal weight model by using the test set and the test set label data set;
acquiring real-time building video data shot by a camera;
searching for moving objects in the real-time building video data by using a background modeling algorithm;
inputting the image of the moving object into an optimal weight model to perform high-altitude parabolic detection to obtain a high-altitude parabolic detection result;
returning the high-altitude parabolic detection result, in screenshot form, to the relevant personnel at the social governance platform server for manual review;
judging whether the moving object is a high-altitude falling object according to the manual review result;
if the moving object is a high-altitude falling object, sending out high-altitude falling object warning information to notify the relevant personnel to handle it;
if the moving object is not a high-altitude falling object, receiving the relevant personnel's modification of the object category information of the moving object, and taking the modified picture sample as a difficult sample;
judging whether the number of the difficult samples exceeds a preset threshold value or not;
and if the number of the difficult samples exceeds a preset threshold value, automatically starting a self-optimization process, combining the difficult samples and a training set into a new training set, and performing model training according to the new training set by using a YOLOv3 network to obtain a high altitude parabolic detection model after training iteration.
2. The method as claimed in claim 1, wherein the step of preprocessing the initial data set to obtain a building picture data set includes screening out and correcting distorted or blurred picture data to obtain the building picture data set.
3. The method as claimed in claim 1, wherein the step of marking the building picture data in the training set, the verification set and the test set according to the object categories formulated in the preset high altitude parabolic detection strategy to obtain a marked data set corresponding to the building picture data one by one comprises the steps of:
for each image in the training set, the verification set and the test set, selecting each object belonging to the object categories with a rectangular frame, storing the position of the rectangular frame in the image, the position comprising the coordinates of the upper-left and lower-right corners of the rectangular frame, labeling the category of the object, and generating an XML file from the labeling information, thereby forming a labeled data set in one-to-one correspondence with the labeled images.
4. The method of claim 3, wherein the object categories include moving objects that are commonly found to cause false positives, such as cars, birds, people, balloons, plastic bags, and the like.
5. The method of claim 1, wherein data enhancing the training set and corresponding labeled data set comprises:
performing data enhancement processing by adopting flip transformation, random cropping, color jittering, translation transformation, scale transformation, contrast transformation, noise perturbation, rotation transformation or reflection transformation, and a mixup method, wherein the mixup method is:
$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j$$
wherein (x_i, y_i) and (x_j, y_j) are two samples randomly drawn from the training data, x represents the picture matrix, y represents the label information, (\tilde{x}, \tilde{y}) is the enhanced data that participates in model training, and λ ∈ [0, 1].
6. The method as claimed in claim 1, wherein in the step of performing model training on the data-enhanced training set and the corresponding labeled data set by using a YOLOv3 network to obtain the plurality of high-altitude parabolic detection models with different weights, labeled picture samples and background picture samples are fed into the YOLOv3 network together; the labeled pictures are positive samples, the background pictures contain no object of the object categories, and the unlabeled background pictures are negative samples; the positive and negative samples are trained together in the YOLOv3 network, and the plurality of high-altitude parabolic detection models with different weights are obtained through iterative training.
7. The method as claimed in claim 1, wherein in the step of acquiring real-time building video data shot by the camera, the data transmitted between the monitoring camera and the server is in video-stream format, and the transmission protocol is the ONVIF protocol.
8. The method of claim 1, wherein the background modeling algorithm is an adaptive Gaussian mixture background modeling algorithm.
9. An apparatus for high altitude parabolic detection based on background modeling, YOLOv3 and self optimization, the apparatus comprising:
the first acquisition unit is used for acquiring building video data shot by the camera;
the intercepting unit is used for intercepting pictures of the building video data according to frames to obtain an initial data set;
the preprocessing unit is used for preprocessing the initial data set to obtain a building picture data set;
the grouping unit is used for dividing the building picture data set into a training set, a verification set and a test set according to a preset proportion;
the marking unit is used for marking the building picture data in the training set, the verification set and the test set according to object types formulated in a preset high-altitude parabolic detection strategy to obtain marking data sets corresponding to the building picture data one by one, and each marking data set comprises object marking frame coordinates and object type information;
the data enhancement unit is used for carrying out data enhancement processing on the training set and the corresponding marked data set;
the model training unit is used for performing model training on the training set subjected to data enhancement processing and the corresponding labeled data set by using a YOLOv3 network to obtain a plurality of high-altitude parabolic detection models with different weights;
the model verification unit is used for inputting the verification set and the marked data set of the verification set into a YOLOv3 network, verifying the model while training the model, and obtaining the current accuracy of the model so as to adjust the model parameters in time and obtain an optimal weight model;
the model testing unit is used for testing the optimal weight model by utilizing the test set and the test set marking data set after the model training is finished;
the second acquisition unit is used for acquiring real-time building video data shot by the camera;
the searching unit is used for searching a moving object in the real-time building video data by utilizing a background modeling algorithm;
the detection unit is used for inputting the image of the moving object into an optimal weight model to perform high-altitude parabolic detection to obtain a high-altitude parabolic detection result;
the return unit is used for returning the high-altitude parabolic detection result, in screenshot form, to the relevant personnel at the social governance platform server for manual review;
the first judgment unit is used for judging whether the moving object is a high-altitude falling object according to the manual review result;
the notification unit is used for sending out high-altitude falling object warning information to notify the relevant personnel to handle it when the moving object is a high-altitude falling object;
the modifying unit is used for receiving the relevant personnel's modification of the object category information of the moving object when the moving object is not a high-altitude falling object, and taking the modified picture sample as a difficult sample;
the second judging unit is used for judging whether the number of the difficult samples exceeds a preset threshold value or not;
and the merging unit is used for automatically starting a self-optimization process under the condition that the number of the difficult samples exceeds a preset threshold value, merging the difficult samples and the training set into a new training set, and performing model training according to the new training set by using a YOLOv3 network to obtain a high-altitude parabolic detection model after training iteration.
CN202010398931.7A 2020-05-12 2020-05-12 High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization Active CN111723654B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010398931.7A CN111723654B (en) 2020-05-12 2020-05-12 High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization

Publications (2)

Publication Number Publication Date
CN111723654A CN111723654A (en) 2020-09-29
CN111723654B true CN111723654B (en) 2023-04-07

Family

ID=72564368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010398931.7A Active CN111723654B (en) 2020-05-12 2020-05-12 High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization

Country Status (1)

Country Link
CN (1) CN111723654B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240122

Address after: No. N3013, 3rd Floor, R&D Building N, Artificial Intelligence Science and Technology Park, Wuhan Economic and Technological Development Zone, Wuhan City, Hubei Province, 430058

Patentee after: Zhongdian Cloud Computing Technology Co.,Ltd.

Country or region after: China

Patentee after: CHINA ELECTRONIC SYSTEM TECHNOLOGY Co.,Ltd.

Address before: No.49 Fuxing Road, Haidian District, Beijing 100036

Patentee before: CHINA ELECTRONIC SYSTEM TECHNOLOGY Co.,Ltd.

Country or region before: China