CN111709285A - Epidemic situation protection monitoring method and device based on unmanned aerial vehicle and storage medium - Google Patents


Info

Publication number
CN111709285A
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
network
epidemic situation
monitoring method
Prior art date
Legal status
Pending
Application number
CN202010385377.9A
Other languages
Chinese (zh)
Inventor
吴细
柯琪锐
翟懿奎
陈丽燕
周文略
应自炉
甘俊英
Current Assignee
Wuyi University
Original Assignee
Wuyi University
Priority date
Filing date
Publication date
Application filed by Wuyi University filed Critical Wuyi University
Priority to CN202010385377.9A priority Critical patent/CN111709285A/en
Publication of CN111709285A publication Critical patent/CN111709285A/en
Pending legal-status Critical Current

Classifications

    • G06V 40/166 — Human faces: detection, localisation, normalisation using acquisition arrangements
    • G01J 5/0025 — Radiation pyrometry, e.g. infrared or optical thermometry, for sensing the radiation of moving living bodies
    • G06F 18/23213 — Non-hierarchical clustering using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/044 — Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 — Combinations of networks
    • G06N 3/084 — Backpropagation, e.g. using gradient descent
    • G06V 40/171 — Local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V 40/172 — Classification, e.g. identification
    • G01J 2005/0077 — Imaging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an epidemic protection monitoring method, device and storage medium based on an unmanned aerial vehicle. An input image is acquired by the unmanned aerial vehicle, a face image is extracted from the input image by a pre-trained single-step face detection network, and a body temperature value is obtained from the face image, so that remote, non-contact body temperature detection is realized without requiring inspection personnel to measure temperature at close range. A mask area is identified in the face image by a pre-trained Yolov3 target detection network, and first voice prompt information is played if identification fails, thereby effectively monitoring mask wearing. Crowd density information in the input image is obtained by a pre-trained regional crowd density detection network, and second voice prompt information is played if the crowd density exceeds a preset density threshold, so that gathered crowds can be prompted. Remote monitoring of body temperature, mask wearing and crowd gathering is thus realized, effectively improving the degree of automation of epidemic protection monitoring.

Description

Epidemic situation protection monitoring method and device based on unmanned aerial vehicle and storage medium
Technical Field
The disclosure relates to the technical field of image processing, in particular to an epidemic situation protection monitoring method and device based on an unmanned aerial vehicle and a storage medium.
Background
The COVID-19 novel coronavirus is highly infectious and spreads easily between people through droplets and similar routes, so people need to wear masks to reduce cross-infection; an important symptom of infection is fever. During epidemic prevention and control, people therefore need to wear masks when entering and leaving public places and hospitals, and gatherings must be reduced.
Disclosure of Invention
In order to overcome the defects of the prior art, the purpose of the present disclosure is to provide an epidemic protection monitoring method, device and storage medium based on an unmanned aerial vehicle, which can realize remote automatic epidemic protection monitoring through the unmanned aerial vehicle and reduce the human resources required.
The technical scheme adopted by the disclosure for solving the problems is as follows: in a first aspect, the present disclosure provides an epidemic situation protection monitoring method based on an unmanned aerial vehicle, which is used for the unmanned aerial vehicle, wherein a binocular camera and an infrared thermal imager are arranged in the unmanned aerial vehicle, and the epidemic situation protection monitoring method comprises the following steps:
acquiring an input image, acquiring a face image from the input image through a pre-trained single-step face detection network, and acquiring a body temperature value according to the face image;
recognizing a mask area from the face image through a pre-trained Yolov3 target detection network, and playing first voice prompt information if the recognition fails;
and acquiring crowd density information in the input image through a pre-trained regional crowd density detection network, and playing second voice prompt information if the crowd density information is greater than a preset density threshold value.
Optionally, in an embodiment of the present disclosure, the acquiring a body temperature value according to the face image specifically includes:
acquiring surface temperature information corresponding to the input image;
and setting the surface temperature information corresponding to the face image as a body temperature value.
Optionally, in an embodiment of the present disclosure, the method further includes: and carrying out temperature compensation on the surface temperature information through a PSO-BP neural network algorithm.
Optionally, in an embodiment of the present disclosure, the single-step face detection network includes a first backbone network layer, a pyramid network layer, a prediction network layer, and a PyramidBox loss layer.
Optionally, in an embodiment of the present disclosure, the YOLOV3 target detection network includes a second backbone network layer and a multi-scale prediction network, the second backbone network layer includes 52 convolutional layers and 1 max pooling layer, and a last feature map of the multi-scale prediction network is a 104 × 104 detection feature map.
Optionally, in an embodiment of the present disclosure, the second backbone network layer performs small target detection on the 4-fold downsampling feature map through a K-means++ algorithm.
Optionally, in an embodiment of the present disclosure, the regional population density detection network is a dense scale single column neural network based on scene segmentation.
Optionally, in an embodiment of the present disclosure, the number of scene partitions in the dense-scale single-column neural network is 25.
In a second aspect, the present disclosure provides an epidemic prevention monitoring apparatus based on an unmanned aerial vehicle, including at least one control processor and a memory for communication connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the drone-based epidemic prevention monitoring method as described above.
In a third aspect, the present disclosure provides a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the method for monitoring and controlling epidemic situation based on unmanned aerial vehicle as described above.
In a fourth aspect, the present disclosure also provides a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the drone-based epidemic prevention monitoring method as described above.
One or more technical solutions provided in the embodiments of the present disclosure have at least the following beneficial effects: the input image is obtained by the unmanned aerial vehicle, a face image is obtained from the input image by a pre-trained single-step face detection network, and a body temperature value is obtained from the face image, so that remote non-contact body temperature detection is realized without requiring inspection personnel to measure temperature at close range. A mask area is identified in the face image by a pre-trained Yolov3 target detection network; if identification fails, first voice prompt information is played, automatically detecting whether people wear masks and giving a voice prompt to those who do not, thereby effectively realizing mask-wearing monitoring. Crowd density information in the input image is obtained by a pre-trained regional crowd density detection network, and second voice prompt information is played if the crowd density exceeds a preset density threshold, so that gathered crowds can be prompted. In conclusion, the technical scheme of the present disclosure realizes remote monitoring of body temperature, masks and crowd gathering, effectively improving the automation degree of epidemic protection monitoring.
Drawings
The disclosure is further illustrated with reference to the following figures and examples.
Fig. 1 is a flowchart of an epidemic situation protection monitoring method based on an unmanned aerial vehicle according to an embodiment of the present disclosure;
fig. 2 is a flowchart of an epidemic situation protection monitoring method based on an unmanned aerial vehicle according to another embodiment of the present disclosure;
FIG. 3 is a block diagram of a PSO-BP neural network provided by another embodiment of the present disclosure;
FIG. 4 is a block diagram of a single-step face detection network according to another embodiment of the present disclosure;
FIG. 5(A) is a residual unit of a prior art Yolov3 target detection network;
FIG. 5(B) is a residual unit of the YOLOV3 target detection network in the present disclosure;
FIG. 6 is a block diagram of a YOLOV3 target detection network according to another embodiment of the present disclosure;
FIG. 7 is a block diagram of a dense dilated convolution block in a dense scale single column neural network provided by another embodiment of the present disclosure;
FIG. 8 is a block diagram of a dense scale single column neural network provided by another embodiment of the present disclosure;
FIG. 9(a) is a block diagram of a dense scale single column neural network provided by another embodiment of the present disclosure;
FIG. 9(b) is a block diagram of a dense scale single column neural network provided by another embodiment of the present disclosure;
FIG. 9(c) is a block diagram of a dense scale single column neural network provided by another embodiment of the present disclosure;
FIG. 9(d) is a block diagram of a dense scale single column neural network provided by another embodiment of the present disclosure;
fig. 10 is a schematic device diagram for executing an epidemic situation prevention monitoring method based on an unmanned aerial vehicle according to another embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more clearly understood, the present disclosure is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not intended to limit the disclosure.
It should be noted that, if not conflicted, various features of the embodiments of the disclosure may be combined with each other within the scope of protection of the disclosure. Additionally, while functional block divisions are performed in apparatus schematics, with logical sequences shown in flowcharts, in some cases, steps shown or described may be performed in sequences other than block divisions in apparatus or flowcharts.
Referring to fig. 1, a first embodiment of the present disclosure provides an epidemic situation protection monitoring method based on an unmanned aerial vehicle, which is used for the unmanned aerial vehicle, wherein a binocular camera and an infrared thermal imager are arranged in the unmanned aerial vehicle, and the epidemic situation protection monitoring method includes the following steps:
step S110, acquiring an input image, acquiring a face image from the input image through a pre-trained single-step face detection network, and acquiring a body temperature value according to the face image;
step S120, identifying a mask area from the face image through a pre-trained Yolov3 target detection network, and playing first voice prompt information if the identification fails;
step S130, acquiring crowd density information in the input image through a pre-trained regional crowd density detection network, and playing second voice prompt information if the crowd density information is larger than a preset density threshold.
In an embodiment, the input image can be obtained by a binocular camera carried on the unmanned aerial vehicle, and the body temperature value by an infrared thermal imager carried on it. These devices are prior art rather than the improvement made by the present disclosure, and other devices that acquire the corresponding data can also be adopted; details are not repeated here.
In an embodiment, a single-step face detection network is preferred for detecting the face image: because the unmanned aerial vehicle usually captures images from the air, faces in the input image are small, and wearing a mask further reduces the visible face area. A single-step face detection network is better suited to detecting small and blurry faces, which helps ensure detection accuracy.
In one embodiment, the mask area is identified by a Yolov3 target detection network, which can rapidly identify multiple targets simultaneously and effectively improves the efficiency and accuracy of mask-area identification. It can be understood that if mask-area identification fails, the person is not wearing a mask, so the first voice prompt information can be a message prompting them to wear one; the specific broadcast content can be recorded according to actual requirements and is not detailed here.
In an embodiment, the crowd density in a specific area can be detected by the regional crowd density detection network, and whether a crowd has gathered is determined from the density. For example, a density threshold can be set; when it is exceeded, the number of people in the area is considered too large, and a pre-recorded second voice prompt is played to ask people to stop gathering.
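The per-frame decision logic of steps S110–S130 can be sketched as below. This is a minimal illustration only: the function name, prompt identifiers and default thresholds are assumptions, not values given by the patent.

```python
# Hypothetical sketch of the per-frame decision logic: each analysed
# frame yields mask/temperature/density results, and the drone decides
# which voice prompts (if any) to play. Thresholds are illustrative.

def monitor_frame(mask_detected: bool, body_temp_c: float,
                  crowd_density: float,
                  temp_threshold: float = 37.3,
                  density_threshold: float = 2.0):
    """Return the list of voice prompts to play for one analysed frame."""
    prompts = []
    if not mask_detected:                  # Yolov3 found no mask region
        prompts.append("first_prompt_wear_mask")
    if body_temp_c > temp_threshold:       # fever alarm from thermal reading
        prompts.append("fever_alarm")
    if crowd_density > density_threshold:  # density network output too high
        prompts.append("second_prompt_disperse")
    return prompts
```

A frame with no mask, normal temperature and low density yields only the mask prompt; a masked, feverish person in a crowd yields the fever alarm plus the dispersal prompt.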
Referring to fig. 2, in another embodiment of the present disclosure, step S110 further includes the following refinement steps:
step S111, acquiring surface temperature information corresponding to the input image;
and step S112, setting the surface temperature information corresponding to the face image as a body temperature value.
Based on the above embodiment, an infrared thermal imager carried on the unmanned aerial vehicle captures an infrared thermal image corresponding to the heat distribution field of object surfaces. Using the face image detected by the single-step face detection network, the thermal imager measures the body-surface temperature of the target. It can be understood that a voice alarm module can further be connected: if the measured body temperature is higher than a set threshold, the target is considered to have a fever and a voice alarm is raised; the threshold is set according to actual requirements.
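Reading a body-temperature value from the thermal image region that corresponds to a detected face box can be sketched as follows. The box format, the alignment of the thermal and RGB images, and the use of the hottest pixel in the face region are illustrative assumptions, not specifics from the patent.

```python
import numpy as np

# Minimal sketch: map a detected face box onto the thermal image and
# take the hottest pixel in the region as the body-surface reading.

def face_temperature(thermal: np.ndarray, bbox) -> float:
    """thermal: 2-D array of surface temperatures (deg C), assumed aligned
    with the RGB input image; bbox: (x1, y1, x2, y2) face box."""
    x1, y1, x2, y2 = bbox
    roi = thermal[y1:y2, x1:x2]
    # Exposed skin (forehead, inner canthus) is typically the warmest
    # part of the face region, so the maximum is a reasonable proxy.
    return float(roi.max())

def fever_alarm(temp_c: float, threshold: float = 37.3) -> bool:
    """Trigger the voice alarm when the reading exceeds the set threshold."""
    return temp_c > threshold
```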
Referring to fig. 3, in another embodiment of the present disclosure, further comprising: and carrying out temperature compensation on the surface temperature information through a PSO-BP neural network algorithm.
In one embodiment, the temperature data are fused by a PSO-BP neural network algorithm, effectively compensating for the influence of the ambient temperature. The traditional temperature-compensation approach uses the BP algorithm alone; during training, a BP neural network follows the gradient-descent direction of the error function, falls easily into local optima, overfits severely, converges slowly, takes long to train, and cannot search globally. This embodiment therefore combines the PSO algorithm with the BP algorithm, i.e. the PSO-BP algorithm: the PSO algorithm has strong global optimization capability and the BP algorithm strong local optimization capability, and the purpose of optimizing the BP neural network with PSO is to obtain better initial weights and thresholds for the BP network.
For example, the following describes temperature compensation with one specific example:
N particle swarms are randomly generated at initialization; the position vector of each particle in essence represents all the initial weights and thresholds of the neural network. The PSO algorithm then searches for the globally optimal position vector, i.e. the optimal initial weights and thresholds of the BP network, while minimizing the mean square error; on this basis, the BP algorithm further optimizes the obtained weights and thresholds until they are optimal. Specifically, the following fitness function can be adopted:

E = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{c} \left( \hat{y}_{j,i} - y_{j,i} \right)^{2}

where N is the number of training-set samples, \hat{y}_{j,i} is the ideal output value of the j-th network output node for the i-th sample, y_{j,i} is the actual output value of the j-th network output node for the i-th sample, and c is the number of network output neurons. The method specifically comprises the following steps:
Step one: construct the BP network and initialize the network parameters;
Step two: initialize the PSO algorithm parameters;
Step three: calculate the particle fitness values and determine the individual extrema and the global extremum;
Step four: update the particle velocities and positions;
Step five: if the result of step four meets the termination condition, output the globally optimal particle as the initial weights and thresholds of the BP network; otherwise, re-execute step three;
Step six: train the BP network with the parameters obtained in step five.
On this basis, the BP algorithm optimizes the obtained parameters until the precision requirement is met. In the BP learning process, a certain number of training samples are presented to the network, the actual output is compared with the expected output, and the connection parameters between neurons are modified through back-propagation; when the error between the actual and expected outputs of the network is minimal, the connection parameters of the network are optimal and the whole network is determined.
To compensate for the influence of the ambient temperature and achieve high temperature-measurement precision, a PSO-BP temperature compensation model is established as shown in fig. 3. Since the temperature is detected by the infrared thermal imager in the above embodiment, this embodiment can process the thermal imaging data with the network model and output a compensated value p for the temperature of the target object. The measured value y of the infrared thermal imager and the ambient temperature t are input to the PSO-BP temperature compensation model, which learns from a training sample set a nonlinear mapping between y, t and the actual target value x. Optimized by the PSO-BP algorithm, this mapping reaches high precision, i.e. the output p of the compensation model closely approaches the actual target value x, thereby compensating the temperature data measured by the infrared thermal imager so that the measured body temperature is closer to the target's actual body temperature.
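The PSO-then-BP procedure of steps one to six can be sketched on toy data as below. Everything specific here is an assumption for illustration: the 2-4-1 network size, the PSO constants (30 particles, inertia 0.7, acceleration 1.5), the synthetic relation between measured value, ambient temperature and target, and a finite-difference gradient standing in for analytic back-propagation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID = 2, 4                        # assumed 2-4-1 network size
DIM = N_IN * N_HID + N_HID + N_HID + 1    # 17 parameters in a flat vector

def unpack(vec):
    """Split a flat particle position vector into weights and thresholds."""
    W1 = vec[:8].reshape(N_IN, N_HID)
    b1 = vec[8:12]
    W2 = vec[12:16].reshape(N_HID, 1)
    b2 = vec[16:]
    return W1, b1, W2, b2

def forward(vec, X):
    W1, b1, W2, b2 = unpack(vec)
    return np.tanh(X @ W1 + b1) @ W2 + b2

def fitness(vec, X, d):
    """Mean square error E = (1/N) sum_i sum_j (ideal - actual)^2."""
    return float(np.mean((forward(vec, X).ravel() - d) ** 2))

# Synthetic training set: inputs (measured value y, ambient temperature t),
# with an assumed true relation x = y - 0.1 * (t - 25).
y = rng.uniform(30.0, 40.0, 200)
t = rng.uniform(15.0, 35.0, 200)
X = np.column_stack([y, t])
d = y - 0.1 * (t - 25.0)
Xn = (X - X.mean(0)) / X.std(0)           # standardise inputs and targets
dn = (d - d.mean()) / d.std()

# --- PSO phase: search for good initial weights and thresholds ---
pos = rng.normal(0.0, 0.5, (30, DIM))     # 30 particles
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_e = np.array([fitness(p, Xn, dn) for p in pos])
gbest = pbest[pbest_e.argmin()].copy()

for _ in range(40):
    r1, r2 = rng.random((2, 30, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    e = np.array([fitness(p, Xn, dn) for p in pos])
    improved = e < pbest_e
    pbest[improved], pbest_e[improved] = pos[improved], e[improved]
    gbest = pbest[pbest_e.argmin()].copy()

pso_e = fitness(gbest, Xn, dn)

# --- BP phase: fine-tune from the PSO optimum (finite-difference
# gradient stands in for analytic back-propagation in this sketch) ---
w = gbest.copy()
best_w, best_e = w.copy(), pso_e
for _ in range(200):
    base = fitness(w, Xn, dn)
    grad = np.zeros(DIM)
    for i in range(DIM):
        p = w.copy()
        p[i] += 1e-4
        grad[i] = (fitness(p, Xn, dn) - base) / 1e-4
    w = w - 0.05 * grad
    e = fitness(w, Xn, dn)
    if e < best_e:
        best_w, best_e = w.copy(), e
```

The PSO phase supplies the initial point; the descent phase can then only improve on it, mirroring the division of labour between global and local optimization described above.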
Referring to fig. 4, in another embodiment of the present disclosure, the single-step face detection network includes a first backbone network layer, a pyramid network layer, a prediction network layer, and a PyramidBox loss layer.
In an embodiment, the first backbone network layer may be a backbone network of appropriate scale, with the specific scale adjusted according to actual requirements. It is used for feature extraction and comprises basic convolutional layers and additional convolutional layers: the basic convolutional layers are the conV1_1 to pool5 layers of VGG-16, and the additional convolutional layers convert the fc6 and fc7 layers of VGG-16 into conV_fc layers; the purpose of adding more convolutional layers is to make the network deeper.
In one embodiment, the pyramid network layer may be a low-level feature pyramid network (LFPN) layer. Each LFPN block has the same structure as an FPN block, but the top-down pathway starts from a middle layer rather than the top layer, so the receptive field is approximately half the input size. When a mask is worn, a small, blurry, partially occluded face has different texture features from a large, clear, complete face; and because high-level features are extracted from regions lacking context and can introduce noise, low-level features are better suited to detecting small, blurry, partially occluded faces. This embodiment therefore prefers the low-level feature pyramid network layer.
In one embodiment, the prediction network layer may be an environment-sensitive prediction network layer that absorbs environment information around the target face. Its prediction module can merge two methods, SSH and DSSD: SSH enlarges the receptive field by placing wider convolutional prediction modules with different strides on the layer, and DSSD adds a residual block to each prediction module.
In one embodiment, for each face detection target there is a series of pyramid anchors in the PyramidBox loss layer supervising both the classification and the regression task, where the PyramidBox loss uses SoftMax loss for classification and smooth L1 loss for regression. As a generalization of the multi-box loss, the PyramidBox loss function of an image can be defined as:

L(\{p_{k,i}\}, \{t_{k,i}\}) = \sum_{k} \lambda_{k} L_{k}(\{p_{k,i}\}, \{t_{k,i}\})

where the k-th pyramid-anchor loss is:

L_{k}(\{p_{k,i}\}, \{t_{k,i}\}) = \frac{\lambda}{N_{k,cls}} \sum_{i} L_{k,cls}(p_{k,i}, p^{*}_{k,i}) + \frac{1}{N_{k,reg}} \sum_{i} p^{*}_{k,i} L_{k,reg}(t_{k,i}, t^{*}_{k,i})

where k indexes the pyramid anchors, i indexes the anchors, p_{k,i} is the predicted probability that anchor i belongs to the k-th target, p^{*}_{k,i} is the corresponding ground-truth label, and t_{k,i} and t^{*}_{k,i} are the predicted and ground-truth box coordinates.
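The two loss terms named above, SoftMax (cross-entropy) loss for the classification branch and smooth L1 loss for box regression, can be sketched as follows. Shapes and the mean reduction are illustrative assumptions.

```python
import numpy as np

# Sketch of the two loss terms used in the PyramidBox loss layer.

def softmax_loss(logits: np.ndarray, labels: np.ndarray) -> float:
    """Cross-entropy over softmax. logits: (N, classes); labels: (N,) ids."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_p[np.arange(len(labels)), labels].mean())

def smooth_l1(pred: np.ndarray, target: np.ndarray) -> float:
    """Element-wise smooth L1: 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    x = np.abs(pred - target)
    return float(np.where(x < 1.0, 0.5 * x ** 2, x - 0.5).mean())
```

Smooth L1 behaves quadratically near zero (stable gradients for small errors) and linearly for large errors (robust to outlier boxes), which is why it is the standard choice for the regression branch.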
Referring to fig. 5(A), 5(B), and 6, in another embodiment of the present disclosure, the YOLOV3 target detection network includes a second backbone network layer and a multi-scale prediction network; the second backbone network layer includes 52 convolutional layers and 1 max pooling layer, and the last feature map of the multi-scale prediction network is a 104 × 104 detection feature map.
In one embodiment, mask identification is a binary classification task: only two classes, mask worn and mask not worn, need to be labelled when the data set is produced. Since small targets occupy few pixels and have less distinctive features, a 104 × 104 detection feature map is added after the 52 × 52 feature map of the YOLOV3 target detection network framework: the 8-fold down-sampled feature map output by the framework is up-sampled 2-fold, the up-sampled feature map is spliced with the feature map output by the 2nd residual block, and a feature-fusion target detection layer with 4-fold down-sampled output is established, yielding the network structure shown in fig. 6.
In an embodiment, to obtain more small-target feature information, the second backbone network layer of this embodiment may be the backbone network Darknet53, which comprises 52 convolutional layers and 1 max pooling layer and is formed of 5 residual blocks (Residual block).
As shown in fig. 5(B), to obtain more small-target feature information, 2 residual units are added to the 2nd residual block of the Darknet53 backbone of the YOLOV3 target detection network. Meanwhile, to avoid vanishing gradients and enhance feature reuse, the 6 DBL units before the target detection output layer are changed into 2 DBL units and 2 ResNet units. A DBL unit comprises convolution, batch normalization and a Leaky ReLU activation function. The residual block first applies a 3 × 3 convolution with stride 2 and saves the resulting layer, then applies a 1 × 1 convolution and another 3 × 3 convolution, and adds this result to the saved layer as the final output. Introducing residual units not only increases the network depth but also avoids vanishing gradients.
In an embodiment, the multi-scale prediction network of this embodiment adds a 104 × 104 detection feature map after the 52 × 52 feature map. For example, the output of the 145th layer of the network can be up-sampled again, raising the resolution to 104 × 104 (with 128 channels), and a feature-fusion layer added that splices the feature map of the 11th layer onto the channels of the 146th-layer output feature map; this prevents overfitting and exploits the high-resolution position information extracted by the neural network. The resolutions of the modified 4-layer feature detection maps are 13 × 13, 26 × 26, 52 × 52 and 104 × 104 respectively, i.e. the final feature map is the 104 × 104 feature map.
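The feature-fusion step described above, 2-fold up-sampling of a deeper feature map followed by channel concatenation with a shallower one, can be sketched as below. The channel counts (128 and 64) are illustrative assumptions, not values from the patent.

```python
import numpy as np

# Minimal sketch of the 52x52 -> 104x104 feature-fusion branch:
# nearest-neighbour 2x up-sampling, then channel concatenation.

def upsample2x(fmap: np.ndarray) -> np.ndarray:
    """fmap: (C, H, W) -> (C, 2H, 2W) by nearest-neighbour repetition."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def fuse(deep: np.ndarray, shallow: np.ndarray) -> np.ndarray:
    """Concatenate up-sampled deep features with shallow features
    along the channel axis."""
    return np.concatenate([upsample2x(deep), shallow], axis=0)

deep = np.zeros((128, 52, 52))      # e.g. a 52x52 deep feature map
shallow = np.zeros((64, 104, 104))  # e.g. the 2nd residual block's output
fused = fuse(deep, shallow)         # shape (192, 104, 104)
```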
Compared with the prior-art YOLOV3 target detection network, this YOLOV3 target detection network can extract more features from small feature maps, greatly improving the identification rate and identification accuracy for small mask targets seen from the unmanned aerial vehicle.
In another embodiment of the present disclosure, the second backbone network layer performs small target detection on the 4-fold downsampled feature map through the K-means++ algorithm.
In an embodiment, because the prior-art K-means algorithm requires the K value to be preset according to the distribution of the data points, and its clustering result depends heavily on the choice of the initial seed points, this embodiment adopts the K-means++ algorithm instead of K-means for cluster analysis. K-means++ corrects the weakness of the initial seed selection of K-means. For example, a point may be randomly selected from the input set of data points as the first cluster center; the distance D(x) between each point x in the data set and the nearest already-selected cluster center is then calculated, and each point is assigned to the cluster center that minimizes D(x); following the principle that a point with a larger D(x) has a higher probability of being selected as a cluster center, a new cluster center is chosen from the points that are not yet cluster centers; these steps are repeated until all K cluster centers have been selected, after which the standard K-means algorithm is run with the selected K cluster centers as initial center points. Since only the first seed point is selected uniformly at random, the K-means++ algorithm keeps the initial seed points as far apart as possible.
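The seeding procedure described above can be sketched in pure Python (an illustrative sketch of standard K-means++ seeding with D(x)-squared weighting; the point set, distance function and seed are hypothetical, not taken from the patent):

```python
import random

def kmeans_pp_init(points, k, dist, seed=0):
    """K-means++ seeding: the first center is uniform at random; each
    further center is drawn with probability proportional to D(x)^2,
    the squared distance to the nearest already-chosen center."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(dist(p, c) for c in centers) ** 2 for p in points]
        total = sum(d2)
        if total == 0:  # all points coincide with centers; fall back
            centers.append(rng.choice(points))
            continue
        r = rng.uniform(0, total)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

pts = [(0, 0), (0, 1), (10, 10), (10, 11), (-9, 5)]
euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
centers = kmeans_pp_init(pts, 3, dist=euclid)
print(len(centers))  # 3
```

The squared-distance weighting is what spreads the seeds apart: a point far from every existing center dominates the sampling mass.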
In one embodiment, the K-means++ algorithm may perform cluster analysis on the data set using Avg IOU (average intersection over union) as the measure of the target cluster analysis. The clustered Avg IOU objective function f can be expressed as
$$ f = \arg\max \frac{\sum_{k=1}^{K} \sum_{j=1}^{n_k} I_{IOU}\left(B_j, C_k\right)}{n} $$
Wherein B represents a data sample, i.e., a mask target in the data set; C represents a cluster center; n_k represents the number of samples in the kth cluster center; n represents the total number of samples; K represents the number of cluster centers; I_IOU(B, C) represents the intersection over union between the cluster center box and the sample box; i represents a sample serial number; j represents the serial number of a sample within a cluster center. The purpose of selecting the anchor boxes by cluster analysis is to require fewer anchor boxes for the same IOU between the data samples' labeled boxes and the anchor boxes, reducing the computational load of the model and accelerating position regression.
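A sketch of the Avg IOU measure on width-height boxes, as commonly used for YOLO-style anchor clustering, where boxes are compared as if aligned at a common origin (the anchor and box dimensions below are made up for illustration; the patent gives no concrete numbers):

```python
def iou_wh(box, center):
    """IoU of two boxes aligned at a common origin, compared only by
    width and height (YOLO-style anchor clustering distance basis)."""
    w1, h1 = box
    w2, h2 = center
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union

def avg_iou(boxes, centers):
    """Mean IoU between each sample box and its best-matching cluster
    center -- the objective f that the cluster analysis maximizes."""
    return sum(max(iou_wh(b, c) for c in centers) for b in boxes) / len(boxes)

anchors = [(10, 14), (23, 27)]       # hypothetical cluster centers C
boxes = [(10, 14), (12, 16), (22, 30)]  # hypothetical labeled boxes B
print(round(avg_iou(boxes, anchors), 3))  # 0.865
```

In clustering, 1 − IoU serves as the distance, so maximizing Avg IOU and minimizing the clustering distance are the same objective.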
In another embodiment of the present disclosure, the regional population density detection network is a dense scale single column neural network based on scene partitions.
With reference to fig. 7 and fig. 8, the following describes the operation principle of the area population density detection network according to this embodiment with a specific example:
as shown in fig. 8, the dense-scale single-column neural network of this embodiment mainly includes a backbone composed of the first 10 layers of the VGG-16 network, 3 dense dilated convolution blocks with dense residual connections, and 3 convolutional layers for crowd density map regression. The convolutional layers adopted have smaller kernels but more layers, which helps balance accuracy against computational cost and suits accurate, fast crowd counting. The dense dilated convolution blocks with dense residual connections expand the scale diversity and the receptive field of the features. The network therefore has denser scale diversity and can effectively handle the large scale variation and large density-level differences found in both dense and sparse scenes, coping with large scale variation so as to estimate the density map accurately.
Referring to fig. 7, in an embodiment where the dense dilated convolution block contains 3 dilated convolutional layers, the dilation rates of the network can be set according to the application scenario. Setting the dilation rates reasonably retains information from denser scales, reduces the gap between receptive field sizes, and allows every pixel of the receptive field to participate in feature computation, thereby overcoming the gridding effect. Each dilated layer within a block may be densely connected to the other layers, so that each layer can access all subsequent layers and pass on the information that needs to be retained. Combined with the dense residual connections, this handles large crowd-scale variation over a wide range and outputs features with dense scales and large receptive fields, so that crowd information at different scales is captured and the accuracy of the network improves.
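The receptive-field arithmetic behind stacked dilated convolutions can be illustrated as follows (the dilation rates 1, 2, 3 are an assumption for the example; the embodiment leaves the rates configurable per scenario):

```python
def effective_kernel(k, rate):
    """Effective kernel size of a dilated convolution: k + (k-1)(r-1)."""
    return k + (k - 1) * (rate - 1)

def receptive_field(kernels_rates):
    """Receptive field of a stack of stride-1 dilated conv layers."""
    rf = 1
    for k, r in kernels_rates:
        rf += effective_kernel(k, r) - 1
    return rf

# Three 3x3 dilated layers with rates 1, 2, 3: consecutive rates of this
# kind avoid the gridding effect because every pixel in the receptive
# field is touched by some layer.
print(receptive_field([(3, 1), (3, 2), (3, 3)]))  # 13
```

A single 3x3 layer sees 3x3; the three-layer dilated stack sees 13x13 without any pooling, which is how the block enlarges the receptive field while keeping resolution.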
In one embodiment, dense residual connections are adopted to improve the architecture of the whole convolution block, further improving information flow while preventing the network from becoming too wide. It can be understood that the output of a dense dilated convolution block can directly access every layer of the subsequent dense dilated convolution block, realizing continuous information transfer; compared with the ordinary residual connection, this connection pattern further expands the scale diversity and adaptively retains the features suited to a specific scene as information flows.
It should be noted that the convolutional layers of this embodiment may be used for crowd density map regression; a specific example follows:
in order to ensure global and local density-level consistency between the estimated crowd density map and the real crowd density map, a multi-scale density-level consistency loss function is introduced and combined with a Euclidean loss function to measure global and local consistency. The Euclidean loss function measures the pixel-level estimation error between the estimated density map and the ground truth, with the specific formula:
$$ L_e = \frac{1}{N} \sum_{i=1}^{N} \left\| G(X_i; \theta) - D_i \right\|_2^2 $$
wherein N is the number of images in a batch, G(X_i; θ) is the density map estimated for training image X_i with network parameters θ, and D_i is the actual density map of X_i. The multi-scale density-level consistency loss function measures the global and local density-level consistency between the estimated density map and the ground truth, with the specific formula:
$$ L_c = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{S} \frac{1}{k_j^2} \left\| P_{k_j}\big(G(X_i; \theta)\big) - P_{k_j}(D_i) \right\|_1 $$
where S is the number of scale levels used for the consistency check, P is the average pooling operation, and k_j is the specified output size of the average pooling. A scale level is a crowd-density level that divides the density map into different sub-regions, i.e., different locations, and forms a pooled representation. Comparing density levels across scales, the estimated density map must be consistent with the actual situation at every scale. It should be noted that the number of scale levels and the output size of each scale control the trade-off between training speed and estimation accuracy; both may be set according to the application scenario until global and local density-level consistency between the estimated and real crowd density maps is obtained. The final objective function of the whole network is the weighted sum of the two loss functions: L = L_e + λL_c.
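A minimal sketch of the combined objective under simplifying assumptions (single image, square maps, pooling output sizes (1, 2, 4) and λ chosen for illustration; none of these values are fixed by the embodiment):

```python
def euclidean_loss(est, gt):
    """Pixel-wise squared L2 error between estimated and true density maps."""
    return sum((e - g) ** 2
               for er, gr in zip(est, gt) for e, g in zip(er, gr))

def avg_pool(m, out):
    """Average-pool a square map down to an out x out grid (P in the text);
    assumes the side length is divisible by `out` for brevity."""
    n = len(m)
    s = n // out
    return [[sum(m[i * s + a][j * s + b] for a in range(s) for b in range(s))
             / (s * s) for j in range(out)] for i in range(out)]

def consistency_loss(est, gt, sizes=(1, 2, 4)):
    """Multi-scale density-level consistency: normalized L1 gap between
    pooled representations of the two maps at each scale level k_j."""
    loss = 0.0
    for k in sizes:
        pe, pg = avg_pool(est, k), avg_pool(gt, k)
        loss += sum(abs(a - b) for ra, rb in zip(pe, pg)
                    for a, b in zip(ra, rb)) / (k * k)
    return loss / len(sizes)

def total_loss(est, gt, lam=100.0):
    """Weighted objective L = L_e + lambda * L_c (lambda illustrative)."""
    return euclidean_loss(est, gt) + lam * consistency_loss(est, gt)

same = [[0.1] * 4 for _ in range(4)]
print(total_loss(same, same))  # 0.0
```

The pooled comparison at size 1 checks the global count, while larger output sizes check local density levels, which is the global/local trade-off the text describes.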
In one embodiment, during model training, a geometry-adaptive kernel is used to generate the density maps for the dense-crowd scene images in the data set, while a fixed Gaussian kernel is used to generate the density maps for the images with relatively sparse crowds.
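The fixed-Gaussian-kernel case can be sketched as follows (σ and the head points are illustrative; the geometry-adaptive variant, which scales σ with local head spacing, is omitted):

```python
import math

def density_map(points, size, sigma=1.5):
    """Fixed-Gaussian-kernel density map: a normalized 2-D Gaussian is
    centred on each annotated head point, so the map integrates to the
    head count (sigma is an illustrative choice)."""
    dm = [[0.0] * size for _ in range(size)]
    for (px, py) in points:
        weights, cells = [], []
        for y in range(size):
            for x in range(size):
                w = math.exp(-((x - px) ** 2 + (y - py) ** 2)
                             / (2 * sigma ** 2))
                weights.append(w)
                cells.append((y, x))
        total = sum(weights)  # normalize so each person contributes 1
        for w, (y, x) in zip(weights, cells):
            dm[y][x] += w / total
    return dm

dm = density_map([(2, 2), (10, 11)], size=16)
count = sum(sum(row) for row in dm)
print(round(count, 6))  # 2.0
```

Summing the map recovers the head count, which is why density-map regression doubles as crowd counting.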
Referring to fig. 9(a) -9 (d), in another embodiment of the present disclosure, the number of scene partitions in a dense scale single column neural network is 25.
The following illustrates a scene block with one specific example:
in the prior art, the complete scene in an input image is generally divided evenly into 9 blocks. Although a crowd-gathering event occurs within one block and pedestrians in other blocks have no influence on the analysis of that block, this way of dividing is too simple: if a crowd gathers at the junction of blocks, its members are split among several blocks, the count within each block may not meet the crowd threshold, and the gathering event goes undetected. This embodiment therefore increases the number of blocks and distributes the additional blocks along the borders and at the intersections of the original blocks. Specifically, according to the shooting angle and field of view of the patrolling unmanned aerial vehicle, the scene segmentation method of this embodiment first divides the scene evenly into the 9 blocks shown in fig. 9(a); it then adds corresponding blocks at the 6 positions where blocks adjoin vertically, giving the result shown in fig. 9(b); on that basis it adds corresponding blocks at the 6 positions where blocks adjoin horizontally, as shown in fig. 9(c); and finally it adds corresponding blocks at the 4 intersection positions, giving the final 25-block result shown in fig. 9(d). Since the 25 blocks overlap each other, all areas of the scene are completely covered, so the main body of a crowd in the input image can always be fully contained in one block. This blocking method avoids the faulty feature extraction, and hence missed detection, that occurs when people gather at easily confused positions. When the gathering is too large for one block, the crowd is covered by several blocks and its density can still be detected.
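The 9 + 6 + 6 + 4 = 25 overlapping blocks can be enumerated as follows (a sketch assuming axis-aligned blocks of one-third the frame size centred on the boundaries; the exact offsets are an assumption, since the patent fixes only the counts and qualitative positions):

```python
def scene_blocks(w, h):
    """The 25 overlapping scene blocks: the 3x3 base grid, 6 blocks
    straddling the two horizontal boundaries, 6 straddling the two
    vertical boundaries, and 4 centred on the boundary intersections.
    Each block is returned as (x, y, block_w, block_h)."""
    bw, bh = w // 3, h // 3
    blocks = []
    # 9 base blocks (fig. 9(a))
    for r in range(3):
        for c in range(3):
            blocks.append((c * bw, r * bh, bw, bh))
    # 6 blocks across the vertically adjoining boundaries (fig. 9(b))
    for r in (1, 2):
        for c in range(3):
            blocks.append((c * bw, r * bh - bh // 2, bw, bh))
    # 6 blocks across the horizontally adjoining boundaries (fig. 9(c))
    for r in range(3):
        for c in (1, 2):
            blocks.append((c * bw - bw // 2, r * bh, bw, bh))
    # 4 blocks centred on the four boundary intersections (fig. 9(d))
    for r in (1, 2):
        for c in (1, 2):
            blocks.append((c * bw - bw // 2, r * bh - bh // 2, bw, bh))
    return blocks

blocks = scene_blocks(1920, 1080)
print(len(blocks))  # 25
```

Because every border and intersection position now has a block centred on it, a crowd whose main body straddles a base-grid boundary still falls entirely inside one of the 25 blocks.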
Referring to fig. 10, another embodiment of the present disclosure also provides an epidemic situation protection monitoring apparatus 1000 based on an unmanned aerial vehicle, including: a memory 1100, a control processor 1200 and a computer program stored in the memory 1100 and executable on the control processor 1200. When the control processor executes the computer program, it implements the drone-based epidemic prevention monitoring method of any of the above embodiments, for example, executing the above-described method steps S110 to S130 in fig. 1 and method steps S111 to S112 in fig. 2.
The control processor 1200 and the memory 1100 may be connected by a bus or other means; connection by a bus is taken as the example in fig. 10.
The memory 1100, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer-executable programs. Further, the memory 1100 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 1100 may optionally include a memory remotely located from the control processor 1200, which may be connected to the drone-based epidemic prevention monitoring apparatus 1000 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The above-described apparatus embodiments are merely illustrative; the units described as separate components may or may not be physically separate, i.e., they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Furthermore, another embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions, which are executed by one or more control processors, for example, by one control processor 1200 in fig. 1, and may cause the one or more control processors 1200 to execute the drone-based epidemic prevention monitoring method in the above method embodiment, for example, execute the above-described method steps S110 to S130 in fig. 1 and method steps S111 to S112 in fig. 2.
It should be noted that, since the apparatus for executing the epidemic situation prevention monitoring method based on the unmanned aerial vehicle in the embodiment is based on the same inventive concept as the above-mentioned epidemic situation prevention monitoring method based on the unmanned aerial vehicle, the corresponding contents in the method embodiment are also applicable to the embodiment of the apparatus, and are not described in detail here.
Through the above description of the embodiments, those skilled in the art can clearly understand that the embodiments can be implemented by software plus a general-purpose hardware platform. Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
While the present disclosure has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims.

Claims (10)

1. The epidemic situation protection monitoring method based on the unmanned aerial vehicle is characterized by being applied to an unmanned aerial vehicle in which a binocular camera and an infrared thermal imager are arranged, and the epidemic situation protection monitoring method comprises the following steps:
acquiring an input image, acquiring a face image from the input image through a pre-trained single-step face detection network, and acquiring a body temperature value according to the face image;
recognizing a mask area from the face image through a pre-trained Yolov3 target detection network, and playing first voice prompt information if the recognition fails;
and acquiring crowd density information in the input image through a pre-trained regional crowd density detection network, and playing second voice prompt information if the crowd density information is greater than a preset density threshold value.
2. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 1, wherein the obtaining of the body temperature value according to the face image specifically comprises:
acquiring surface temperature information corresponding to the input image;
and setting the surface temperature information corresponding to the face image as a body temperature value.
3. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 2, further comprising: and carrying out temperature compensation on the surface temperature information through a PSO-BP neural network algorithm.
4. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 1, characterized in that: the single-step face detection network comprises a first trunk network layer, a pyramid network layer, a prediction network layer and a PyramidBox loss layer.
5. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 1, characterized in that: the Yolov3 target detection network comprises a second backbone network layer and a multi-scale prediction network, wherein the second backbone network layer comprises 52 convolutional layers and 1 maximum value pooling layer, and the last feature map of the multi-scale prediction network is a detection feature map of 104 x 104.
6. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 5, characterized in that: the second backbone network layer performs small target detection on the 4-fold downsampled feature map through the K-means++ algorithm.
7. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 1, characterized in that: the regional crowd density detection network is a dense scale single-column neural network based on scene blocking.
8. The epidemic situation protection monitoring method based on the unmanned aerial vehicle according to claim 7, characterized in that: the number of scene blocks in the dense-scale single-column neural network is 25.
9. An epidemic situation protection monitoring device based on an unmanned aerial vehicle, characterized by comprising at least one control processor and a memory communicatively connected with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the drone-based epidemic prevention monitoring method of any one of claims 1-8.
10. A computer-readable storage medium characterized in that: the computer-readable storage medium stores computer-executable instructions for causing a computer to perform the epidemic situation protection monitoring method based on the unmanned aerial vehicle according to any one of claims 1 to 8.
CN202010385377.9A 2020-05-09 2020-05-09 Epidemic situation protection monitoring method and device based on unmanned aerial vehicle and storage medium Pending CN111709285A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010385377.9A CN111709285A (en) 2020-05-09 2020-05-09 Epidemic situation protection monitoring method and device based on unmanned aerial vehicle and storage medium

Publications (1)

Publication Number Publication Date
CN111709285A true CN111709285A (en) 2020-09-25

Family

ID=72536631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010385377.9A Pending CN111709285A (en) 2020-05-09 2020-05-09 Epidemic situation protection monitoring method and device based on unmanned aerial vehicle and storage medium

Country Status (1)

Country Link
CN (1) CN111709285A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105962908A (en) * 2016-06-28 2016-09-28 深圳市元征科技股份有限公司 Flying body temperature detector control method and device
CN109858424A (en) * 2019-01-25 2019-06-07 佳都新太科技股份有限公司 Crowd density statistical method, device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
孙永生 等: "无人系统在新冠肺炎疫情防控中的应用实践" *
邓黄潇: "基于迁移学习与RetinaNet的口罩佩戴检测的方法" *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112153352A (en) * 2020-10-20 2020-12-29 上海理工大学 Unmanned aerial vehicle epidemic situation monitoring auxiliary method and device based on deep learning
CN112507783A (en) * 2020-10-29 2021-03-16 上海交通大学 Mask face detection, identification, tracking and temperature measurement method based on attention mechanism
CN112488647A (en) * 2020-11-25 2021-03-12 京东方科技集团股份有限公司 Attendance system and method, storage medium and electronic equipment
CN112507948A (en) * 2020-12-18 2021-03-16 Oppo广东移动通信有限公司 Mask wearing prompting method and related device
CN112802412A (en) * 2020-12-31 2021-05-14 中国海洋大学 Day and night commuting anti-gathering optical radar capable of carrying unmanned aerial vehicle in epidemic situation
KR102508595B1 (en) * 2021-01-28 2023-03-09 경북대학교 산학협력단 Method and apparatus for measuring body temperature using histogram and deep learning
KR20220109120A (en) * 2021-01-28 2022-08-04 경북대학교 산학협력단 Method and apparatus for measuring body temperature using histogram and deep learning
CN113052237A (en) * 2021-03-25 2021-06-29 中国工商银行股份有限公司 Target object detection method and device and server
CN113155293A (en) * 2021-04-06 2021-07-23 内蒙古工业大学 Human body remote sensing temperature measurement monitoring and recognition system based on unmanned aerial vehicle
US11983913B2 (en) 2021-05-14 2024-05-14 Honeywell International Inc. Video surveillance system with crowd size estimation
CN113314230A (en) * 2021-05-27 2021-08-27 创新奇智(上海)科技有限公司 Intelligent epidemic prevention method, device, equipment and storage medium based on big data
CN113591607A (en) * 2021-07-12 2021-11-02 辽宁科技大学 Station intelligent epidemic prevention and control system and method
CN113591607B (en) * 2021-07-12 2023-07-04 辽宁科技大学 Station intelligent epidemic situation prevention and control system and method
CN114373121A (en) * 2021-09-08 2022-04-19 武汉众智数字技术有限公司 Method and system for improving small target detection of yolov5 network
CN117456449A (en) * 2023-10-13 2024-01-26 南通大学 Efficient cross-modal crowd counting method based on specific information
CN117311801A (en) * 2023-11-27 2023-12-29 湖南科技大学 Micro-service splitting method based on networking structural characteristics
CN117311801B (en) * 2023-11-27 2024-04-09 湖南科技大学 Micro-service splitting method based on networking structural characteristics

Similar Documents

Publication Publication Date Title
CN111709285A (en) Epidemic situation protection monitoring method and device based on unmanned aerial vehicle and storage medium
CN110929578B (en) Anti-shielding pedestrian detection method based on attention mechanism
CN111767882B (en) Multi-mode pedestrian detection method based on improved YOLO model
CN110378381B (en) Object detection method, device and computer storage medium
CN111797983A (en) Neural network construction method and device
CN108805016B (en) Head and shoulder area detection method and device
CN113420607A (en) Multi-scale target detection and identification method for unmanned aerial vehicle
CN110222718B (en) Image processing method and device
CN107220603A (en) Vehicle checking method and device based on deep learning
CN110991444A (en) Complex scene-oriented license plate recognition method and device
CN107563299B (en) Pedestrian detection method using RecNN to fuse context information
CN113326735B (en) YOLOv 5-based multi-mode small target detection method
CN113033523B (en) Method and system for constructing falling judgment model and falling judgment method and system
CN110263920A (en) Convolutional neural networks model and its training method and device, method for inspecting and device
CN111401215B (en) Multi-class target detection method and system
CN105404894A (en) Target tracking method used for unmanned aerial vehicle and device thereof
CN112036381B (en) Visual tracking method, video monitoring method and terminal equipment
CN116343330A (en) Abnormal behavior identification method for infrared-visible light image fusion
CN107944403A (en) Pedestrian's attribute detection method and device in a kind of image
CN112464930A (en) Target detection network construction method, target detection method, device and storage medium
CN107403451A (en) Adaptive binary feature monocular vision odometer method and computer, robot
CN115376125A (en) Target detection method based on multi-modal data fusion and in-vivo fruit picking method based on target detection model
CN114764856A (en) Image semantic segmentation method and image semantic segmentation device
CN113065379B (en) Image detection method and device integrating image quality and electronic equipment
KR101612779B1 (en) Method of detecting view-invariant, partially occluded human in a plurality of still images using part bases and random forest and a computing device performing the method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200925