CN111813997B - Intrusion analysis method, device, equipment and storage medium - Google Patents


Info

Publication number
CN111813997B
CN111813997B (application CN202010935740.XA)
Authority
CN
China
Prior art keywords
coordinate
frame
preset
image
coordinate frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010935740.XA
Other languages
Chinese (zh)
Other versions
CN111813997A (en)
Inventor
王龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202010935740.XA priority Critical patent/CN111813997B/en
Publication of CN111813997A publication Critical patent/CN111813997A/en
Application granted granted Critical
Publication of CN111813997B publication Critical patent/CN111813997B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G06F16/784 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/732 Query formulation
    • G06F16/7328 Query by example, e.g. a complete video frame or video sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/75 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of smart-city monitoring and discloses an intrusion analysis method, device, equipment and storage medium for improving the accuracy and monitoring efficiency of human intrusion analysis in community management. The method comprises the following steps: pulling an original monitoring video stream from a video monitoring platform and performing frame extraction and analysis on it to obtain corresponding video images to be identified; inputting the video images to be recognized into a preset deep learning target detector to obtain coordinate frames, suppressing redundant frames with a non-maximum suppression algorithm, and clustering and merging the remaining frames to obtain target pictures; recognizing the preprocessed pictures with a preset convolutional neural network model to obtain a pedestrian intrusion behavior recognition result; and returning the pedestrian intrusion behavior recognition result to the video monitoring platform.

Description

Intrusion analysis method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of monitoring of smart cities, in particular to an intrusion analysis method, an intrusion analysis device, intrusion analysis equipment and a storage medium.
Background
The smart community is a new concept of community management and a new mode of social management innovation: it makes full use of a new generation of information technology to provide residents with a safe, comfortable and convenient living environment. Pedestrian intrusion analysis is an important component of community management, and intrusion analysis is currently performed with video-monitoring-based techniques. However, traditional video monitoring systems suffer from large manual-discrimination errors and low intrusion-analysis accuracy, which leads to inefficient monitoring of personnel intrusion and creates hidden dangers to community safety.
Disclosure of Invention
The invention mainly aims to solve the problems of low intrusion analysis accuracy and low monitoring efficiency in monitoring community personnel intrusion by using a traditional video system.
To achieve the above object, a first aspect of the present invention provides an intrusion analysis method, including:
pulling an original monitoring video stream from a video monitoring platform, and performing frame extraction analysis on the original monitoring video stream to obtain a corresponding video image to be identified;
inputting the video image to be recognized to a preset deep learning target detector to obtain a corresponding original anchor point coordinate frame, and reserving the original anchor point coordinate frame with the probability value larger than a preset threshold value to obtain a corresponding candidate coordinate frame;
performing suppression processing on the candidate coordinate frame according to a non-maximum suppression algorithm to obtain a coordinate frame of each pedestrian in the video image to be recognized;
carrying out clustering and merging processing on the coordinate frames of all the pedestrians to obtain new clustered and merged coordinate frames;
and capturing and storing the video image to be recognized according to the new coordinate frame to obtain a group of corresponding target pictures, inputting the target pictures into a preset convolutional neural network model, and obtaining a pedestrian intrusion behavior recognition result output by the convolutional neural network model.
Optionally, in another implementation manner of the first aspect of the present invention, the preset deep learning target detector is trained in advance;
the pre-training of the preset deep learning target detector specifically includes:
the method comprises the steps that an original monitoring video stream with preset data volume is pulled from a video monitoring platform for frame extraction and analysis, an obtained video image is used as a training data set, the training data set comprises an image part and a label part, the label part is used for calibrating a human body image, and the training data set is divided into a training set and a verification set;
carrying out image data enhancement processing on the training set, and inputting the processed image into a deep learning target detector;
and verifying the output of the deep learning target detector according to the verification set, and finishing model training when the loss function value of the verification set is not decreased any more.
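The stopping criterion above, ending training when the verification-set loss no longer decreases, can be sketched in a few lines. This is a minimal illustration, not the patent's implementation; `train_step`, `val_loss_fn`, `max_epochs`, and `patience` are assumed names and values:

```python
def train_with_early_stop(train_step, val_loss_fn, max_epochs=100, patience=1):
    """Train until the verification-set loss stops decreasing.

    `train_step` runs one epoch over the augmented training set;
    `val_loss_fn` returns the current verification-set loss.
    `max_epochs` and `patience` are assumed hyperparameters.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch in range(max_epochs):
        train_step(epoch)
        loss = val_loss_fn()
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:  # loss no longer decreases: finish training
                break
    return best
```

With `patience=1`, training stops on the first epoch whose verification loss fails to improve, which matches the criterion as stated.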
Optionally, in another implementation manner of the first aspect of the present invention, the performing a suppression process on the candidate coordinate frame according to a non-maximum suppression algorithm to obtain a coordinate frame of each pedestrian in the video image to be identified includes:
sorting the candidate coordinate frames from high to low according to the probability values, and taking the candidate coordinate frame with the highest probability value as a first reference frame;
traversing the candidate coordinate frames ranked below the highest probability value, computing the intersection-over-union of each with the first reference frame, and removing the candidate coordinate frames that reach a preset high-overlap criterion;
and repeating the traversal and removal on the candidate coordinate frames remaining after those reaching the preset high-overlap criterion were removed, to obtain the coordinate frame of each pedestrian in the video image to be identified.
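The three steps above describe standard non-maximum suppression. A minimal sketch follows; the `(x1, y1, x2, y2)` box format and the 0.5 overlap threshold are assumptions, not values fixed by the patent:

```python
def iou(a, b):
    # intersection-over-union of two (x1, y1, x2, y2) frames
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    # sort candidate frames by probability value, high to low
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        ref = order.pop(0)  # highest-probability frame becomes the reference
        keep.append(ref)
        # remove remaining frames that overlap the reference too much,
        # then repeat with the next-best remaining frame as reference
        order = [i for i in order if iou(boxes[ref], boxes[i]) < iou_thresh]
    return keep
```

Each surviving index corresponds to one pedestrian's coordinate frame in the image.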
Optionally, in another implementation manner of the first aspect of the present invention, the clustering and merging the coordinate frames of all the pedestrians to obtain a new coordinate frame for clustering and merging includes:
presetting inter-class distances, wherein the inter-class distances are used for classifying and delimiting frames of the same class;
sorting all the coordinate frames of the pedestrians from high to low according to probability values;
selecting a coordinate frame with the highest probability value from the coordinate frames which are not marked and classified as a second reference frame;
sequentially traversing the coordinate frames of pedestrians whose probability values rank below the second reference frame, and calculating the Euclidean distance between the center-point coordinates of each coordinate frame and those of the second reference frame;
when the Euclidean distance between the center-point coordinates of a coordinate frame and those of the second reference frame is smaller than the inter-class distance, classifying that coordinate frame into the same class as the second reference frame and marking it as classified;
after one traversal is finished, repeatedly traversing the coordinate frames of the pedestrians until all the coordinate frames of the pedestrians belong to the corresponding categories;
and combining the coordinate frames of the pedestrians classified into one category into a corresponding new coordinate frame, so as to obtain the new coordinate frame corresponding to the clustering and combination of the coordinate frames of all the pedestrians.
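The steps above amount to a greedy clustering pass: take the highest-probability unclassified frame as reference, gather every frame whose center distance falls below the preset inter-class distance, and merge each cluster into one enclosing frame. A sketch, with the inter-class distance value and box format as assumptions:

```python
import math

def center(box):
    # center-point coordinates of an (x1, y1, x2, y2) frame
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def cluster_merge(boxes, scores, inter_class_dist=50.0):
    """Greedily cluster pedestrian frames by center-point distance, then
    merge each cluster into the smallest enclosing frame."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    assigned = [False] * len(boxes)
    merged = []
    for ref in order:
        if assigned[ref]:
            continue
        assigned[ref] = True      # this frame acts as the reference frame
        cluster = [boxes[ref]]
        cx, cy = center(boxes[ref])
        for i in order:           # frames ranked below the reference
            if assigned[i]:
                continue
            px, py = center(boxes[i])
            if math.hypot(px - cx, py - cy) < inter_class_dist:
                assigned[i] = True  # same class as the reference
                cluster.append(boxes[i])
        # merge the class into one new coordinate frame
        merged.append((min(b[0] for b in cluster), min(b[1] for b in cluster),
                       max(b[2] for b in cluster), max(b[3] for b in cluster)))
    return merged
```

The outer loop repeats until every pedestrian frame has been assigned to a class, mirroring the repeated traversal described above.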
Optionally, in another implementation manner of the first aspect of the present invention, the method further includes:
pre-training the preset convolutional neural network model, wherein the convolutional neural network model is used for identifying whether the pedestrian in the video image to be identified has an intrusion behavior;
the pre-training of the preset convolutional neural network model specifically includes:
acquiring preprocessed target pictures of a preset data volume as a data set, and, according to a preset intrusion-behavior judgment standard, dividing the target pictures with intrusion behaviors into a positive class and those without intrusion behaviors into a negative class;
preprocessing the target pictures and inputting the processed images into a convolutional neural network model initialized with pre-trained parameters;
setting the initial learning rate of the model to a first preset value, optimizing the model with stochastic gradient descent, adjusting the learning rate to a second preset value when training reaches a preset number of iterations, and then finishing model training.
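The schedule described above, a first preset learning rate switched to a second once a preset iteration count is reached, is a step schedule for stochastic gradient descent. A sketch with assumed values (the patent does not fix the rates or the switch point):

```python
def learning_rate(iteration, first_lr=0.01, second_lr=0.001, switch_at=10000):
    """Step schedule for SGD: train with the first preset learning rate,
    then switch to the second once the preset iteration count is reached.
    All three numeric values here are assumptions."""
    return first_lr if iteration < switch_at else second_lr
```

An optimizer would query this function once per iteration and apply the returned rate to its parameter update.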
Optionally, in another implementation manner of the first aspect of the present invention, the inputting the target picture into a preset convolutional neural network model to obtain a pedestrian intrusion behavior recognition result output by the convolutional neural network model specifically includes:
inputting each image among the target pictures into a preset convolutional neural network model, and outputting, through a Sigmoid function, a category probability corresponding to each image, wherein the higher the category probability, the higher the probability that a pedestrian intrusion occurs in the corresponding target picture;
and determining whether the pedestrian has the identification result of the intrusion behavior in each target picture obtained by the output of the convolutional neural network model according to the category probability.
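The Sigmoid-based decision can be sketched as follows; the 0.5 cutoff is an assumption, since the patent does not fix the decision threshold:

```python
import math

def sigmoid(z):
    # maps a raw model output (logit) to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def classify_intrusion(logits, threshold=0.5):
    """Turn the model's raw output for each target picture into a category
    probability and a binary intrusion verdict (threshold is assumed)."""
    return [{"probability": sigmoid(z), "intrusion": sigmoid(z) >= threshold}
            for z in logits]
```

A higher category probability marks the corresponding target picture as more likely to contain a pedestrian intrusion, matching the description above.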
Optionally, in another implementation manner of the first aspect of the present invention, after the capturing and saving the video image to be recognized according to the new coordinate frame to obtain a corresponding batch of target pictures, the method further includes:
preprocessing the target pictures, wherein the preprocessing at least comprises scaling, cropping and/or flipping the target pictures;
the inputting the target picture into a preset convolutional neural network model comprises:
and inputting the preprocessed image into a preset convolutional neural network model.
A second aspect of the present invention provides an intrusion analysis device, including:
the video image to be identified acquisition module is used for pulling an original monitoring video stream from a video monitoring platform and carrying out frame extraction analysis on the original monitoring video stream to obtain a corresponding video image to be identified;
a candidate coordinate frame obtaining module, configured to input the video image to be identified to a preset deep learning target detector, to obtain a corresponding original anchor coordinate frame, and reserve the original anchor coordinate frame with a probability value greater than a preset threshold value, to obtain a corresponding candidate coordinate frame;
the suppression processing module is used for suppressing the candidate coordinate frame according to a non-maximum suppression algorithm to obtain a coordinate frame of each pedestrian in the video image to be recognized;
the clustering and merging module is used for clustering and merging the coordinate frames of all the pedestrians to obtain a new clustering and merging coordinate frame;
and the recognition output module is used for capturing and storing the video image to be recognized according to the new coordinate frame to obtain a group of corresponding target pictures, inputting the target pictures into a preset convolutional neural network model, and obtaining a pedestrian intrusion behavior recognition result output by the convolutional neural network model.
Optionally, in another implementation manner of the second aspect of the present invention, the apparatus further includes:
the target detector training module is used for training the preset deep learning target detector in advance;
the target detector training module specifically comprises:
the system comprises a training data set acquisition unit, a frame extraction analysis unit and a verification unit, wherein the training data set acquisition unit is used for pulling an original monitoring video stream with a preset data volume from a video monitoring platform to perform frame extraction analysis, and an obtained video image is used as a training data set, the training data set comprises an image part and a label part, and the label part is used for calibrating a human body image and dividing the training data set into a training set and a verification set;
the data enhancement unit is used for carrying out image data enhancement processing on the training set and inputting the processed image into the deep learning target detector;
and the model verification unit is used for verifying the output of the deep learning target detector according to the verification set, and finishing model training when the loss function value of the verification set is not decreased any more.
Optionally, in another implementation manner of the second aspect of the present invention, the suppression processing module includes:
the first reference frame acquisition unit is used for sequencing the candidate coordinate frames from high to low according to the probability values, and taking the candidate coordinate frame with the highest probability value as a first reference frame;
the traversal and removal unit is used for traversing the candidate coordinate frames ranked below the highest probability value, computing the intersection-over-union of each with the first reference frame, and removing the candidate coordinate frames that reach a preset high-overlap criterion;
and the repeated traversal unit is used for repeating the traversal and removal on the remaining candidate coordinate frames to obtain the coordinate frame of each pedestrian in the video image to be identified.
Optionally, in another implementation manner of the second aspect of the present invention, the cluster merging module includes:
the classification and demarcation unit is used for presetting inter-class distance which is used for classifying and demarcating the same class of frames;
the probability value sorting unit is used for sorting the coordinate frames of all the pedestrians from high to low according to the probability values;
the second reference frame acquiring unit is used for selecting the coordinate frame with the highest probability value from the coordinate frames which are not marked and classified as a second reference frame;
the Euclidean distance calculating unit is used for sequentially traversing the coordinate frames of pedestrians whose probability values rank below the second reference frame, and calculating the Euclidean distance between the center-point coordinates of each coordinate frame and those of the second reference frame;
a classified coordinate frame obtaining unit, configured to, when a euclidean distance between a center point coordinate of a certain coordinate frame and a center point coordinate of the second reference frame is smaller than the inter-class distance, classify the coordinate frame into the same class frame, and mark the classified coordinate frame;
the repeated traversal unit is used for repeatedly traversing the coordinate frames of the pedestrians after one traversal is finished until all the coordinate frames of the pedestrians belong to the corresponding categories;
and the new coordinate frame acquisition unit is used for merging the coordinate frames of the pedestrians classified into one category into a corresponding new coordinate frame so as to obtain the new coordinate frames corresponding to the coordinate frame clustering and merging of all the pedestrians.
Optionally, in another implementation manner of the second aspect of the present invention, the apparatus further includes:
the convolutional neural network model pre-training module is used for pre-training the preset convolutional neural network model, and the convolutional neural network model is used for identifying whether the pedestrian in the video image to be identified has an intrusion behavior;
the convolutional neural network model pre-training module comprises:
the test set obtaining and classification unit is used for acquiring preprocessed target pictures of a preset data volume as a data set and, according to a preset intrusion-behavior judgment standard, dividing the target pictures with intrusion behaviors into a positive class and those without intrusion behaviors into a negative class;
the image preprocessing and input unit is used for preprocessing the target pictures and inputting the processed images into a convolutional neural network model initialized with pre-trained parameters;
and the model verification and training ending unit is used for setting the initial learning rate of the model to a first preset value, optimizing the model with stochastic gradient descent, adjusting the learning rate to a second preset value when training reaches the preset number of iterations, and finishing model training.
Optionally, in another implementation manner of the second aspect of the present invention, the identification output module includes:
the class probability obtaining unit is used for inputting each image among the target pictures into a preset convolutional neural network model and outputting, through a Sigmoid function, the category probability corresponding to each image, wherein the higher the category probability, the higher the probability that a pedestrian intrusion occurs in the corresponding target picture;
and the identification result acquisition unit is used for determining whether the pedestrian has the identification result of the intrusion behavior in each target picture obtained by the output of the convolutional neural network model according to the category probability.
Optionally, in another implementation manner of the second aspect of the present invention, the identification output module further includes:
the target picture preprocessing unit is used for preprocessing the target pictures, wherein the preprocessing at least comprises scaling, cropping and/or flipping the target pictures;
and the input unit is used for inputting the preprocessed image into a preset convolutional neural network model.
A third aspect of the present invention provides an intrusion analysis device comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line; the at least one processor invokes the instructions in the memory to cause the intrusion analysis device to perform the method of the first aspect.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.
In the technical scheme provided by the invention, an original monitoring video stream is pulled from a video monitoring platform, and the original monitoring video stream is subjected to frame extraction and analysis to obtain a corresponding video image to be identified; inputting the video image to be recognized to a preset deep learning target detector to obtain a corresponding original anchor point coordinate frame, and reserving the original anchor point coordinate frame with the probability value larger than a preset threshold value to obtain a corresponding candidate coordinate frame; performing suppression processing on the candidate coordinate frame according to a non-maximum suppression algorithm to obtain a coordinate frame of each pedestrian in the video image to be recognized; carrying out clustering combination processing on the coordinate frames of all the pedestrians to obtain a new clustering combination coordinate frame; and capturing and storing the video image to be recognized according to the new coordinate frame to obtain a group of corresponding target pictures, inputting the target pictures into a preset convolutional neural network model, and obtaining a pedestrian intrusion behavior recognition result output by the convolutional neural network model. The embodiment of the invention obtains the video image by pulling and decoding from the video platform, returns the calculation result to the video platform after algorithm processing and calculation, and improves the accuracy rate and the monitoring efficiency of the personnel intrusion analysis in the community management.
Drawings
In order to more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings used in describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a process diagram of an embodiment of an intrusion analysis method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an embodiment of an intrusion analysis device according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an embodiment of an intrusion analysis device according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides an intrusion analysis method, an intrusion analysis device, intrusion analysis equipment and a storage medium, which are used for improving the accuracy and monitoring efficiency of personnel intrusion analysis in community management.
In order to enable those skilled in the art to better understand the solution of the invention, the embodiments of the invention will be described below in conjunction with the accompanying drawings.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Pedestrian intrusion analysis is an important component of community management, and intrusion analysis is currently performed with video-monitoring-based techniques. However, traditional video monitoring systems suffer from large manual-discrimination errors and low intrusion-analysis accuracy, which leads to inefficient monitoring of personnel intrusion and creates hidden dangers to community safety. The invention provides a pedestrian intrusion analysis method for community management: an original monitoring video stream is pulled from a video monitoring platform and subjected to frame extraction and analysis to obtain corresponding video images to be identified; the video images to be recognized are input to a preset deep learning target detector to obtain coordinate frames, redundant frames are suppressed by a non-maximum suppression algorithm, and the remaining frames are clustered and merged to obtain target pictures; the preprocessed pictures are recognized by a preset convolutional neural network model to obtain a pedestrian intrusion behavior recognition result, which is returned to the video monitoring platform, thereby improving the accuracy and monitoring efficiency of personnel intrusion analysis in community management. Each step is explained in detail below.
Referring to fig. 1, an embodiment of an intrusion analysis method according to an embodiment of the present invention includes:
step 101, pulling an original monitoring video stream from a video monitoring platform, and performing frame extraction analysis on the original monitoring video stream to obtain a corresponding video image to be identified;
step 102, inputting the video image to be recognized to a preset deep learning target detector to obtain a corresponding original anchor point coordinate frame, and reserving the original anchor point coordinate frame with the probability value larger than a preset threshold value to obtain a corresponding candidate coordinate frame;
103, suppressing the candidate coordinate frame according to a non-maximum suppression algorithm to obtain a coordinate frame of each pedestrian in the video image to be recognized;
104, clustering and combining the coordinate frames of all the pedestrians to obtain a new clustered and combined coordinate frame;
and 105, capturing screenshots of the video image to be recognized according to the new coordinate frames and saving them to obtain a corresponding batch of target pictures, inputting the target pictures into a preset convolutional neural network model, and obtaining the pedestrian intrusion behavior recognition result output by the convolutional neural network model.
Specifically, in community management the monitoring images come from various video monitoring platforms, so the original monitoring video stream is acquired from a community or public-security video-monitoring networking platform, including community monitoring cameras and sharing platforms. The original monitoring video stream is pulled from the video monitoring platform and subjected to frame extraction and analysis to obtain the corresponding video images to be identified. The video monitoring platform provides an RTSP video stream address (URL); frames can be extracted from the video stream by fast decoding and frame extraction with ffmpeg, by reading and frame extraction with matlab, or by frame extraction with opencv, and details are not repeated here.
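Actual decoding would use one of the tools named above (for instance `cv2.VideoCapture` on the RTSP URL, or ffmpeg); the frame-sampling arithmetic can be sketched independently of the decoder. The one-frame-per-second interval is an assumption, since the patent leaves the extraction rate unspecified:

```python
def sample_indices(total_frames, fps, seconds_per_frame=1.0):
    """Indices of the frames to keep when sampling roughly one frame every
    `seconds_per_frame` seconds from a decoded stream. The interval is an
    assumption; the patent only names ffmpeg/matlab/opencv for decoding."""
    step = max(1, int(round(fps * seconds_per_frame)))
    return list(range(0, total_frames, step))
```

A decoding loop would read frames sequentially and keep only those whose index appears in this list, yielding the video images to be identified.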
Further, the video image to be identified is subjected to image scaling processing. Optionally, the video image to be recognized is scaled to the fixed size set by the preset deep learning target detector during pre-training, for example 416 × 416, which then becomes the input size of the video image to be recognized.
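The scaling arithmetic can be sketched as below. The patent only says the image is scaled to a fixed size such as 416 × 416; the aspect-ratio-preserving "letterbox" padding shown here is an assumption, since that is how YOLOv3-style detectors are commonly fed, and the function name is illustrative.

```python
def letterbox_params(w: int, h: int, target: int = 416):
    """Compute the resize scale and the horizontal/vertical padding needed
    to fit a w x h image into a target x target square while keeping the
    aspect ratio (the remaining border would be filled with a constant)."""
    scale = min(target / w, target / h)
    new_w, new_h = round(w * scale), round(h * scale)
    pad_x = (target - new_w) // 2   # left/right border width
    pad_y = (target - new_h) // 2   # top/bottom border height
    return scale, new_w, new_h, pad_x, pad_y
```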
In step 102, the video image to be recognized is input to a preset deep learning target detector to obtain corresponding original anchor point coordinate frames, and the original anchor point coordinate frames with probability values larger than a preset threshold are retained to obtain the corresponding candidate coordinate frames. It should be noted that the preset deep learning target detector is obtained through pre-training: inputting an image to the target detector yields the original anchor point coordinate frames for the pedestrians in the image, together with the model probability value corresponding to each coordinate frame, and the original anchor point coordinate frames whose probability value is larger than the preset threshold are filtered out and kept as the candidate coordinate frames for pedestrian intrusion analysis.
Optionally, in another embodiment of the intrusion analysis method, the method further includes:
and pre-training the preset deep learning target detector. Wherein the pre-training of the preset deep learning target detector specifically comprises:
pulling an original monitoring video stream of a preset data volume from the video monitoring platform for frame extraction and analysis, and using the obtained video images as a training data set, wherein the training data set comprises an image part and a label part, the label part is used for calibrating the human body images, and the training data set is divided into a training set and a verification set;
carrying out image data enhancement processing on the training set, and inputting the processed image into a deep learning target detector;
and verifying the output of the deep learning target detector according to the verification set, and finishing model training when the loss function value of the verification set is not decreased any more.
Optionally, image data enhancement processing may be performed on the training set and the processed images input to the deep learning target detector, with the loss function, an initial model learning rate of 0.001, and a decreasing learning-rate schedule set in advance, wherein the probability value in the loss function uses a cross-entropy loss and the coordinate loss uses a mean-squared-error loss; the output of the deep learning target detector is verified against the verification set, and when the loss function value on the verification set no longer decreases, the learning rate is adjusted by a factor of 0.1 and model training is finished.
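A toy version of the combined loss described above — binary cross-entropy on the objectness probability plus mean-squared error on the box coordinates — might look like this; it is a minimal sketch (scalar probability, one box), not the full YOLOv3 multi-scale loss.

```python
import math

def detection_loss(pred_prob, true_prob, pred_box, true_box):
    """Cross-entropy on the probability value plus mean-squared error
    on the coordinate values, as described in the training setup."""
    eps = 1e-7
    p = min(max(pred_prob, eps), 1 - eps)  # clip to avoid log(0)
    ce = -(true_prob * math.log(p) + (1 - true_prob) * math.log(1 - p))
    mse = sum((a - b) ** 2 for a, b in zip(pred_box, true_box)) / len(pred_box)
    return ce + mse
```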
Specifically, an original monitoring video stream of a preset data volume is pulled from the video monitoring platform in advance for frame extraction and analysis, and the obtained video images are used as a training data set comprising an image part and a label part, the label part calibrating the human body images; the training data set is divided into a training set and a verification set, the model being trained on the training set and its convergence verified on the verification set. The deep learning target detector preferably adopts a YOLOv3 model (the third version of the YOLO series of target detection algorithms, You Only Look Once, version 3). Through the YOLOv3 target detection framework, tens of thousands of original anchor point coordinate frames can be obtained; these are screened according to their probability values, frames above the threshold are retained and frames below it removed, yielding the coordinate frames of pedestrians.
Furthermore, in the process of training and identifying with the deep learning target detector, the picture part is in jpg or png format. For each human figure in the label part, the horizontal direction is the x-axis, the vertical direction is the y-axis, and the upper-left corner of the picture is the origin for the center-point coordinates of each human-contour rectangular frame; each label consists of the center-point coordinates and the width and height of the rectangular frame containing each human contour in the picture, and each label is stored as an xml file or a txt file.
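The label convention above can be sketched as a small parser. The exact field layout of a txt label line is not spelled out in the text, so the whitespace-separated "cx cy w h" format assumed here is illustrative.

```python
def parse_txt_label(line: str):
    """Parse one txt label line of the assumed form 'cx cy w h' (center
    coordinates plus width and height, origin at the top-left corner of
    the picture, x to the right, y downward) into top-left / bottom-right
    corner coordinates of the human-contour rectangle."""
    cx, cy, w, h = map(float, line.split())
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```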
And further, before the model is trained, performing data enhancement processing on the images in the training set, wherein the image data enhancement processing comprises image random horizontal turning, image color temperature saturation adjustment and image random rotation. According to the invention, the formats of the picture part and the label part are specifically limited, and the definition and the standardization processing are carried out on the input image in the training process, so that the accuracy of identifying each human body in the image in the model training process is improved, the reliability of model training is improved, and the identification accuracy of the deep learning target detector is enhanced.
Optionally, the training-phase optimization method is designed as Adam + SGD, that is, the Adam algorithm is used to increase the convergence rate in the early stage of training, and the SGD algorithm is used to ensure convergence of the model in the later stage. The probability value uses a cross-entropy loss function, the coordinate loss uses a mean-squared-error loss function, and regularization is set to suppress overfitting, further enhancing the reliability and recognition accuracy of the deep learning target detector.
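The Adam + SGD schedule can be illustrated on a one-dimensional toy problem; the step counts and learning rates below are assumptions chosen for the toy objective, not values from the patent.

```python
import math

def minimize(grad, x0, adam_steps=200, sgd_steps=100,
             adam_lr=0.1, sgd_lr=0.01):
    """Run Adam first for fast early convergence, then plain SGD to
    settle, mirroring the Adam + SGD schedule described above."""
    x, m, v = x0, 0.0, 0.0
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, adam_steps + 1):       # phase 1: Adam
        g = grad(x)
        m = b1 * m + (1 - b1) * g            # first-moment estimate
        v = b2 * v + (1 - b2) * g * g        # second-moment estimate
        m_hat = m / (1 - b1 ** t)            # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= adam_lr * m_hat / (math.sqrt(v_hat) + eps)
    for _ in range(sgd_steps):               # phase 2: plain SGD
        x -= sgd_lr * grad(x)
    return x

# Minimizing f(x) = (x - 3)^2, whose gradient is 2(x - 3),
# ends near the minimum at x = 3.
```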
In the step 103, the candidate coordinate frame is subjected to suppression processing according to a non-maximum suppression algorithm, so as to obtain a coordinate frame of each pedestrian in the video image to be identified. Optionally, in another embodiment of the intrusion analysis method according to the present invention, the step 103 includes:
sorting the candidate coordinate frames from high to low according to the probability values, and taking the candidate coordinate frame with the highest probability value as a first reference frame;
traversing the candidate coordinate frames ranked after the one with the highest probability value, comparing the intersection-over-union of each candidate coordinate frame with the first reference frame, and removing the candidate coordinate frames that reach a preset high-overlap criterion;
and repeatedly traversing the candidate coordinate frames remaining after removal of those reaching the preset high-overlap criterion, to obtain the coordinate frame of each pedestrian in the video image to be identified.
It can be understood that when target detection is performed on an image, many candidate frames are generated; for example, several frames may be generated on the face of the same pedestrian, while only one coordinate frame is needed per pedestrian. The candidate coordinate frames can therefore be suppressed by the non-maximum suppression algorithm, and the coordinate frame with the highest probability value obtained through repeated traversal and iteration is taken as the coordinate frame of the target pedestrian. The non-maximum suppression algorithm greatly improves the accuracy of the coordinate frame obtained for each pedestrian in the video image to be recognized, and thereby the accuracy of monitoring and analyzing pedestrian intrusion.
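The suppression procedure of step 103 can be sketched as standard non-maximum suppression; the 0.5 overlap threshold is an assumed value, since the patent only speaks of a preset criterion.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, probs, overlap_thresh=0.5):
    """Repeatedly take the highest-probability box as the reference frame
    and drop every remaining box whose IoU with it exceeds the threshold;
    returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: probs[i], reverse=True)
    keep = []
    while order:
        ref = order.pop(0)
        keep.append(ref)
        order = [i for i in order if iou(boxes[ref], boxes[i]) <= overlap_thresh]
    return keep
```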
In the step 104, the coordinate frames of all the pedestrians are subjected to clustering and merging processing to obtain a new coordinate frame subjected to clustering and merging.
Optionally, in another embodiment of the intrusion analysis method according to the present invention, the step 104 includes:
presetting inter-class distances, wherein the inter-class distances are used for classifying and delimiting frames of the same class;
sorting all the coordinate frames of the pedestrians from high to low according to probability values;
selecting a coordinate frame with the highest probability value from the coordinate frames which are not marked and classified as a second reference frame;
sequentially traversing coordinate frames of pedestrians with probability values behind the second reference frame, and calculating Euclidean distances between the coordinates of the center point of each coordinate frame and the coordinates of the center point of the second reference frame;
when the Euclidean distance between the center point coordinate of one coordinate frame and the center point coordinate of the second reference frame is smaller than the inter-class distance, the coordinate frames are classified into the same class frame and marked as classified coordinate frames;
after one traversal is finished, repeatedly traversing the coordinate frames of the pedestrians until all the coordinate frames of the pedestrians belong to the corresponding categories;
and combining the coordinate frames of the pedestrians classified into one category into a corresponding new coordinate frame, so as to obtain the new coordinate frame corresponding to the clustering and combination of the coordinate frames of all the pedestrians.
Specifically, the obtained pedestrian coordinate frames are sorted from high to low by the probability value of each frame. An inter-class distance is set, here the square root of one fifth of the image area; the unmarked frame with the highest probability is selected as the reference frame, and the frames ranked behind it are traversed in turn: when the Euclidean distance between the center point of a frame and the center point of the reference frame is less than the set inter-class distance, that frame is defined as belonging to the same class, marked as a classified coordinate frame, and the traversal continues. When one traversal is completed, the process is repeated until every frame has been attributed to some class. Further, the coordinate frames classified into one class are merged into a new coordinate frame, set as the smallest-area frame containing the frames of that class: specifically, for each class, the smallest horizontal and vertical coordinates are found among the top-left vertices of all frames in the class and denoted x1 and y1, and the largest horizontal and vertical coordinates are found among the bottom-right vertices and denoted x2 and y2; then (x1, y1) and (x2, y2) are the top-left and bottom-right vertices of the new frame. According to the invention, clustering and merging the coordinate frames of all pedestrians into new coordinate frames and attributing each new frame to a class makes the target detection more precise, correspondingly improves the accuracy of pedestrian identification, and improves monitoring accuracy in the subsequent intrusion recognition.
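The clustering-and-merging step can be sketched directly from the description: the inter-class distance is the square root of one fifth of the image area, centers within that distance of the reference frame join its class, and each class is merged into its smallest enclosing frame.

```python
import math

def cluster_and_merge(boxes, probs, image_area):
    """Greedily cluster boxes (x1, y1, x2, y2) whose centre points lie
    within the inter-class distance of the current reference frame, then
    merge each cluster into the smallest enclosing frame."""
    d = math.sqrt(image_area / 5)  # inter-class distance from the text
    order = sorted(range(len(boxes)), key=lambda i: probs[i], reverse=True)
    assigned = [None] * len(boxes)
    clusters = []
    for i in order:                       # highest unassigned = reference
        if assigned[i] is not None:
            continue
        cluster, assigned[i] = [i], len(clusters)
        cx = (boxes[i][0] + boxes[i][2]) / 2
        cy = (boxes[i][1] + boxes[i][3]) / 2
        for j in order:                   # frames ranked behind it
            if assigned[j] is not None:
                continue
            jx = (boxes[j][0] + boxes[j][2]) / 2
            jy = (boxes[j][1] + boxes[j][3]) / 2
            if math.hypot(jx - cx, jy - cy) < d:
                assigned[j] = len(clusters)
                cluster.append(j)
        clusters.append(cluster)
    merged = []
    for cluster in clusters:              # smallest enclosing frame
        merged.append((min(boxes[k][0] for k in cluster),
                       min(boxes[k][1] for k in cluster),
                       max(boxes[k][2] for k in cluster),
                       max(boxes[k][3] for k in cluster)))
    return merged
```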
In step 105, after capturing and storing screenshots of the video image to be recognized according to the new coordinate frames to obtain a corresponding batch of target pictures, the method further includes: preprocessing the target pictures, wherein the preprocessing at least comprises scaling, cropping and/or flipping the target pictures. Inputting the target pictures into the preset convolutional neural network model then comprises: inputting the preprocessed images into the preset convolutional neural network model.
Screenshots of the video image to be identified are captured and stored according to the new coordinate frames to obtain a corresponding batch of target pictures, and the target pictures are preprocessed; by storing the screenshots of the new coordinate frames of the video image to be recognized, target pictures with better pedestrian recognition accuracy can be obtained, and the further preprocessing improves picture clarity.
In another embodiment of the intrusion analysis method of the present invention, preprocessing the target pictures specifically includes: processing the target pictures according to a set image scaling size, a random cropping size and a random flipping mode, so that the images recognized by the neural network model are more effective.
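Two of the preprocessing operations can be sketched on a plain nested-list image (rows of pixel values); a deterministic center crop stands in for the random crop, and real pipelines would use an array library instead.

```python
def horizontal_flip(img):
    """Flip an image, given as rows of pixel values, left-to-right."""
    return [row[::-1] for row in img]

def center_crop(img, crop_h, crop_w):
    """Crop a crop_h x crop_w window from the centre of the image
    (a random crop would pick top/left offsets at random instead)."""
    h, w = len(img), len(img[0])
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]
```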
Further, after the step 105, the preprocessed image may be input into a preset convolutional neural network model, so as to obtain a pedestrian intrusion behavior recognition result output by the convolutional neural network model, and the pedestrian intrusion behavior recognition result is returned to the video monitoring platform.
In another embodiment of the intrusion analysis method of the present invention, the inputting the target picture into a preset convolutional neural network model to obtain a pedestrian intrusion behavior recognition result output by the convolutional neural network model specifically includes:
inputting each image in the target pictures into a preset convolutional neural network model, and outputting the category probability corresponding to each image obtained through a Sigmoid function, wherein the higher the category probability, the higher the probability that the pedestrian in the corresponding target picture is intruding;
and determining, according to the category probability, the recognition result output by the convolutional neural network model of whether the pedestrian in each target picture exhibits intrusion behavior.
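The decision step above can be sketched as follows; the 0.5 decision threshold is an assumption, since the patent only says the result is determined according to the category probability.

```python
import math

def sigmoid(z):
    """Map a raw network output (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def intrusion_decision(logit, threshold=0.5):
    """Return the category probability from the Sigmoid function and
    whether it crosses the (assumed) intrusion threshold."""
    p = sigmoid(logit)
    return p, p >= threshold
```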
In another embodiment of the intrusion analysis method of the present invention, the intrusion analysis method further includes:
pre-training the preset convolutional neural network model, wherein the convolutional neural network model is used for identifying whether the pedestrian in the video image to be identified has an intrusion behavior;
the pre-training of the preset convolutional neural network model specifically includes:
acquiring a preset data volume of the preprocessed target pictures as a test set, and, according to a preset intrusion-behavior judgment standard, dividing the target pictures with intrusion behavior into a positive class and the target pictures without intrusion behavior into a negative class;
preprocessing the target pictures and inputting the processed images into a convolutional neural network model initialized with pre-trained model parameters;
setting the initial learning rate of the model to a first preset probability value, setting the model optimization to use stochastic gradient descent, adjusting the learning rate to a second preset probability value when training reaches a preset number of iterations, and finishing model training. Optionally, the first preset probability value is set to 0.001 and the second preset probability value to 0.0001; that is, when training reaches the predetermined number of iterations, the learning rate is adjusted from the first preset probability value to one tenth of its original value.
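The learning-rate schedule just described is a simple step schedule, which can be sketched as:

```python
def learning_rate(step, switch_step, first=0.001, second=0.0001):
    """Train at the first preset rate, then drop to one tenth of it
    once the preset iteration count is reached."""
    return first if step < switch_step else second
```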
By pre-training the convolutional neural network model, a deep learning model for identifying whether the pedestrian in the video image to be identified has intrusion behavior can be obtained.
Specifically, each obtained image is input to the preset convolutional neural network model, and the category probability corresponding to each image is obtained through a Sigmoid function, giving the likelihood that the pedestrian in the target picture is intruding: the higher the category probability, the higher the probability of intrusion in the corresponding target picture. Whether intrusion behavior exists can then be judged from the obtained category probability; for example, images of normal walking, breaking in through doors, climbing over walls and the like are obtained in advance as distinguishing categories, and the convolutional neural network model is trained on them. Further, the coordinate values of the coordinate frames and the probability of intrusion behavior are finally obtained and packaged into structured data, which can be returned to the video monitoring platform so as to prompt and warn monitoring personnel in real time of intrusion behavior in the community, thereby improving the intelligence and safety of community management.
In summary, the intrusion analysis method provided by the invention obtains the corresponding video image to be identified by pulling the original monitoring video stream from the video monitoring platform and performing frame extraction and analysis on it; the video image to be recognized is input to a preset deep learning target detector, recognition is performed through the coordinate frames of the detector, suppression is performed through a non-maximum suppression algorithm, and clustering and merging yield the target pictures; the preprocessed images are then recognized through a preset convolutional neural network model to obtain the output pedestrian intrusion behavior recognition result, which is returned to the video monitoring platform, greatly improving the accuracy and efficiency of intrusion analysis of personnel in community management.
With reference to fig. 2, the intrusion analysis method in the embodiment of the present invention is described above, and an intrusion analysis apparatus in the embodiment of the present invention is described below, where an embodiment of the intrusion analysis apparatus in the embodiment of the present invention includes:
the video image to be identified acquisition module 11 is configured to pull an original monitoring video stream from a video monitoring platform, and perform frame extraction analysis on the original monitoring video stream to obtain a corresponding video image to be identified;
a candidate coordinate frame obtaining module 12, configured to input the video image to be identified to a preset deep learning target detector, so as to obtain a corresponding original anchor coordinate frame, and reserve the original anchor coordinate frame with a probability value greater than a preset threshold value, so as to obtain a corresponding candidate coordinate frame;
the suppression processing module 13 is configured to perform suppression processing on the candidate coordinate frame according to a non-maximum suppression algorithm to obtain a coordinate frame of each pedestrian in the video image to be identified;
the clustering and merging module 14 is configured to perform clustering and merging processing on the coordinate frames of all the pedestrians to obtain a new clustering and merging coordinate frame;
and the recognition output module 15 is used for capturing and storing the video image to be recognized according to the new coordinate frame to obtain a group of corresponding target pictures, inputting the target pictures into a preset convolutional neural network model, and obtaining a pedestrian intrusion behavior recognition result output by the convolutional neural network model.
Optionally, in another embodiment of the intrusion analysis device of the present invention, the device further includes:
the target detector training module is used for training the preset deep learning target detector in advance;
the target detector training module specifically comprises:
the system comprises a training data set acquisition unit, a frame extraction analysis unit and a verification unit, wherein the training data set acquisition unit is used for pulling an original monitoring video stream with a preset data volume from a video monitoring platform to perform frame extraction analysis, and an obtained video image is used as a training data set, the training data set comprises an image part and a label part, and the label part is used for calibrating a human body image and dividing the training data set into a training set and a verification set;
the data enhancement unit is used for carrying out image data enhancement processing on the training set and inputting the processed image into the deep learning target detector;
and the model verification unit is used for verifying the output of the deep learning target detector according to the verification set, and finishing model training when the loss function value of the verification set is not decreased any more.
Optionally, in another embodiment of the intrusion analysis device of the present invention, the suppression processing module 13 includes:
the first reference frame acquisition unit is used for sequencing the candidate coordinate frames from high to low according to the probability values, and taking the candidate coordinate frame with the highest probability value as a first reference frame;
the traversal and de-duplication unit is used for traversing the candidate coordinate frames ranked after the one with the highest probability value, comparing the intersection-over-union of each candidate coordinate frame with the first reference frame, and removing the candidate coordinate frames reaching the preset high-overlap criterion;
and the repeated-traversal and confidence acquisition unit is used for repeatedly traversing the candidate coordinate frames remaining after removal of those reaching the preset high-overlap criterion, to obtain the coordinate frame of each pedestrian in the video image to be identified.
Optionally, in another embodiment of the intrusion analysis device of the present invention, the cluster merging module 14 includes:
the classification and demarcation unit is used for presetting inter-class distance which is used for classifying and demarcating the same class of frames;
the probability value sorting unit is used for sorting the coordinate frames of all the pedestrians from high to low according to the probability values;
the second reference frame acquiring unit is used for selecting the coordinate frame with the highest probability value from the coordinate frames which are not marked and classified as a second reference frame;
the Euclidean distance calculating unit is used for sequentially traversing coordinate frames of pedestrians with probability values arranged behind the second reference frame and calculating Euclidean distances between the center point coordinates of each coordinate frame and the center point coordinates of the second reference frame;
a classified coordinate frame obtaining unit, configured to, when a euclidean distance between a center point coordinate of a certain coordinate frame and a center point coordinate of the second reference frame is smaller than the inter-class distance, classify the coordinate frame into the same class frame, and mark the classified coordinate frame;
the repeated traversal unit is used for repeatedly traversing the coordinate frames of the pedestrians after one traversal is finished until all the coordinate frames of the pedestrians belong to the corresponding categories;
and the new coordinate frame acquisition unit is used for merging the coordinate frames of the pedestrians classified into one category into a corresponding new coordinate frame so as to obtain the new coordinate frames corresponding to the coordinate frame clustering and merging of all the pedestrians.
Optionally, in another embodiment of the intrusion analysis device of the present invention, the device further includes:
the convolutional neural network model pre-training module is used for pre-training the preset convolutional neural network model, and the convolutional neural network model is used for identifying whether the pedestrian in the video image to be identified has an intrusion behavior;
the convolutional neural network model pre-training module comprises:
the test set obtaining and positive and negative classification unit is used for obtaining the target picture with the preset data size after pretreatment as a test set, and classifying the target picture with intrusion behaviors into a positive classification and classifying the target picture without intrusion behaviors into a negative classification according to a preset intrusion behavior judgment standard;
the picture preprocessing and input unit is used for preprocessing the target pictures and inputting the processed images into a convolutional neural network model initialized with pre-trained model parameters;
and the model verification and training-ending unit is used for setting the initial learning rate of the model to a first preset probability value, setting the model optimization to use stochastic gradient descent, adjusting the learning rate to a second preset probability value when training reaches the preset number of iterations, and finishing model training.
Optionally, in another embodiment of the intrusion analysis device of the present invention, the identification output module includes:
the category probability acquisition unit is used for inputting each image in the target pictures into a preset convolutional neural network model and outputting the category probability corresponding to each image obtained through a Sigmoid function, wherein the higher the category probability, the higher the probability that the pedestrian in the corresponding target picture is intruding;
and the identification result acquisition unit is used for determining whether the pedestrian has the identification result of the intrusion behavior in each target picture obtained by the output of the convolutional neural network model according to the category probability.
Optionally, in another embodiment of the intrusion analysis device of the present invention, the identification output module further includes:
the target picture preprocessing unit is used for preprocessing the target pictures, wherein the preprocessing at least comprises scaling, cropping and/or flipping the target pictures;
and the input unit is used for inputting the preprocessed image into a preset convolutional neural network model.
It should be noted that the apparatus in the embodiment of the present invention may be configured to implement all technical solutions in the foregoing method embodiments, and the functions of each functional module may be implemented specifically according to the method in the foregoing method embodiments, and the specific implementation process may refer to the relevant description in the foregoing example, which is not described herein again.
Fig. 2 describes the intrusion analysis apparatus in the embodiment of the present invention in detail from the perspective of modular functional entities; the following describes the intrusion analysis device in the embodiment of the present invention in detail from the perspective of hardware processing.
Fig. 3 is a schematic structural diagram of an intrusion analysis device 300 according to an embodiment of the present invention. The intrusion analysis device 300 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 301 (e.g., one or more processors), a memory 309, and one or more storage media 308 (e.g., one or more mass storage devices) storing applications 307 or data 306. The memory 309 and the storage media 308 may be transient storage or persistent storage. The program stored on a storage medium 308 may include one or more modules (not shown), each of which may include a series of instruction operations for the intrusion analysis device. Further, the processor 301 may be configured to communicate with the storage medium 308 and execute the series of instruction operations in the storage medium 308 on the intrusion analysis device 300.
The intrusion analysis device 300 may also include one or more power supplies 302, one or more wired or wireless network interfaces 303, one or more input-output interfaces 304, and/or one or more operating systems 305, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the intrusion analysis device configuration shown in fig. 3 is not intended to be limiting of intrusion analysis devices and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium, which may be non-volatile or volatile. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An intrusion analysis method, comprising:
pulling an original monitoring video stream from a video monitoring platform, and performing frame extraction and parsing on the original monitoring video stream to obtain corresponding video images to be identified;
inputting the video images to be identified into a preset deep learning target detector to obtain corresponding original anchor coordinate frames, and retaining the original anchor coordinate frames whose probability values are greater than a preset threshold to obtain corresponding candidate coordinate frames;
performing suppression processing on the candidate coordinate frames according to a non-maximum suppression algorithm to obtain a coordinate frame for each pedestrian in the target identification image;
clustering and merging the coordinate frames of all the pedestrians to obtain new merged coordinate frames;
and capturing and storing the target identification image according to the new coordinate frames to obtain a corresponding group of target pictures, inputting the target pictures into a preset convolutional neural network model, and obtaining the pedestrian intrusion behavior recognition result output by the convolutional neural network model.
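The frame-extraction step at the start of claim 1 amounts to periodic sampling of the pulled stream. The sketch below is a hedged illustration: the sampling interval and the generator interface are assumptions, not details taken from the patent.

```python
def extract_frames(stream, interval=25):
    """Yield every `interval`-th frame of an iterable video stream.

    `interval` is a hypothetical value; a real deployment would derive it
    from the camera frame rate and the required analysis latency.
    """
    for i, frame in enumerate(stream):
        if i % interval == 0:
            yield frame
```

Each yielded frame would then be passed to the detector as a "video image to be identified".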
2. The intrusion analysis method according to claim 1, further comprising:
pre-training the preset deep learning target detector;
the pre-training of the preset deep learning target detector specifically includes:
pulling an original monitoring video stream of a preset data volume from the video monitoring platform for frame extraction and parsing, and using the obtained video images as a training data set, wherein the training data set comprises an image part and a label part, the label part is used to annotate human body images, and the training data set is divided into a training set and a verification set;
performing image data enhancement processing on the training set, and inputting the processed images into the deep learning target detector;
and verifying the output of the deep learning target detector against the verification set, and finishing model training when the loss function value on the verification set no longer decreases.
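The stopping rule in claim 2 (end training when the validation-set loss no longer decreases) is, in effect, early stopping. A minimal sketch, where the patience window is an assumption the patent does not specify:

```python
def should_stop(val_losses, patience=1):
    """Return True once the last `patience` validation losses have all
    failed to improve on the best loss recorded before them."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return all(loss >= best_before for loss in val_losses[-patience:])
```

With `patience=1` this stops as soon as one epoch fails to improve, matching the claim's "no longer decreases" wording literally; larger patience values are more robust to a noisy validation loss.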
3. The intrusion analysis method according to claim 1, wherein the performing suppression processing on the candidate coordinate frames according to a non-maximum suppression algorithm to obtain a coordinate frame for each pedestrian in the target identification image comprises:
sorting the candidate coordinate frames from high to low by probability value, and taking the candidate coordinate frame with the highest probability value as a first reference frame;
traversing the remaining candidate coordinate frames, comparing the intersection-over-union of each with the first reference frame, and removing the candidate coordinate frames that reach a preset high-overlap criterion;
and repeating the traversal and removal on the candidate coordinate frames that remain after those reaching the preset high-overlap criterion are removed, so as to obtain the coordinate frame of each pedestrian in the target identification image.
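The suppression loop in claim 3 is standard non-maximum suppression. A self-contained sketch, where the `(x1, y1, x2, y2)` box format and the 0.5 IoU threshold are assumptions, not values taken from the patent:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(candidates, iou_threshold=0.5):
    """candidates: list of (probability, box) pairs; returns the kept pairs.

    Repeatedly takes the highest-probability box as the reference frame and
    discards remaining boxes whose IoU with it exceeds the threshold.
    """
    remaining = sorted(candidates, key=lambda c: c[0], reverse=True)
    kept = []
    while remaining:
        ref = remaining.pop(0)          # current highest-probability reference frame
        kept.append(ref)
        remaining = [c for c in remaining
                     if iou(ref[1], c[1]) <= iou_threshold]
    return kept
```

For example, two heavily overlapping detections of one pedestrian collapse to the higher-probability one, while a distant detection survives as a separate pedestrian.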
4. The intrusion analysis method according to claim 1, wherein the clustering and merging of the coordinate frames of all the pedestrians to obtain new merged coordinate frames comprises:
presetting an inter-class distance, wherein the inter-class distance is used to delimit which frames belong to the same class;
sorting the coordinate frames of all the pedestrians from high to low by probability value;
selecting, from the coordinate frames not yet marked as classified, the coordinate frame with the highest probability value as a second reference frame;
sequentially traversing the pedestrian coordinate frames whose probability values rank after the second reference frame, and calculating the Euclidean distance between the center point coordinates of each coordinate frame and those of the second reference frame;
when the Euclidean distance between the center point of a coordinate frame and that of the second reference frame is smaller than the inter-class distance, classifying that coordinate frame into the same class as the second reference frame and marking it as classified;
after one traversal is finished, repeating the traversal over the pedestrian coordinate frames until every pedestrian coordinate frame belongs to a category;
and merging the pedestrian coordinate frames classified into one category into a corresponding new coordinate frame, thereby obtaining the new coordinate frames resulting from clustering and merging all the pedestrian coordinate frames.
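The clustering-and-merging procedure in claim 4 can be illustrated as follows. The inter-class distance value and the choice to merge each group into its enclosing rectangle are assumptions consistent with, but not dictated by, the claim wording:

```python
import math

def center(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def cluster_merge(boxes_with_prob, inter_class_distance):
    """boxes_with_prob: list of (probability, (x1, y1, x2, y2)) pairs.

    Repeatedly takes the unclassified frame with the highest probability as
    the reference frame, groups every unclassified frame whose center lies
    within the inter-class distance of it, and merges each group into one
    enclosing coordinate frame.
    """
    unclassified = sorted(boxes_with_prob, key=lambda b: b[0], reverse=True)
    merged = []
    while unclassified:
        rcx, rcy = center(unclassified[0][1])   # reference frame's center
        group, rest = [], []
        for prob, box in unclassified:
            cx, cy = center(box)
            if math.hypot(cx - rcx, cy - rcy) < inter_class_distance:
                group.append(box)               # same class as the reference
            else:
                rest.append((prob, box))
        unclassified = rest
        # merge the class into a single enclosing coordinate frame
        merged.append((min(b[0] for b in group), min(b[1] for b in group),
                       max(b[2] for b in group), max(b[3] for b in group)))
    return merged
```

Merging nearby pedestrians into one frame lets the later screenshot step crop a whole group at once rather than one crop per person.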
5. The intrusion analysis method according to claim 1, further comprising:
pre-training the preset convolutional neural network model, wherein the convolutional neural network model is used for identifying whether the pedestrian in the video image to be identified has an intrusion behavior;
the pre-training of the preset convolutional neural network model specifically includes:
acquiring preprocessed target pictures of a preset data volume as a data set, and, according to a preset intrusion behavior judgment standard, dividing the target pictures with intrusion behavior into a positive class and the target pictures without intrusion behavior into a negative class;
preprocessing the target pictures and inputting the processed images into a convolutional neural network model loaded with pre-trained model parameters;
and setting the initial learning rate of the model to a preset value, optimizing the model by stochastic gradient descent, reducing the learning rate to one tenth of its original value when training reaches a preset number of iterations, and then finishing the model training.
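The optimizer setup in claim 5 amounts to stochastic gradient descent with a step learning-rate schedule. A sketch of just the schedule, where the initial rate and the decay point are assumed values the patent leaves open:

```python
def learning_rate(iteration, initial_lr=0.01, decay_at=10000):
    """Step schedule: the rate drops to one tenth of the initial value
    once training reaches the preset iteration count."""
    return initial_lr / 10 if iteration >= decay_at else initial_lr
```

In a framework such as PyTorch the same effect is typically obtained with a step scheduler (e.g. a step size matching the preset iteration count and a decay factor of 0.1).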
6. The intrusion analysis method according to claim 1, wherein the inputting of the target pictures into a preset convolutional neural network model to obtain the pedestrian intrusion behavior recognition result output by the convolutional neural network model specifically comprises:
inputting each image among the target pictures into the preset convolutional neural network model, and outputting, through a Sigmoid function, a category probability corresponding to each image, wherein a higher category probability indicates a higher probability that the corresponding target picture contains a pedestrian intrusion;
and determining, according to the category probability, the recognition result output by the convolutional neural network model of whether a pedestrian in each target picture exhibits intrusion behavior.
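The final decision in claim 6 maps the network's raw output through a Sigmoid to a category probability. A minimal sketch; the 0.5 decision threshold is an assumption the patent leaves to the implementer:

```python
import math

def sigmoid(x):
    """Standard logistic function mapping a raw logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify_intrusion(logit, threshold=0.5):
    """Return (category probability, intrusion decision) for one image's
    raw network output; a higher probability means a likelier intrusion."""
    p = sigmoid(logit)
    return p, p >= threshold
```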
7. The intrusion analysis method according to claim 1, wherein, after capturing and storing the target identification image according to the new coordinate frames to obtain the corresponding group of target pictures, the method further comprises:
preprocessing the target pictures, wherein the preprocessing at least comprises scaling, cropping and/or flipping the target pictures;
and the inputting of the target pictures into a preset convolutional neural network model comprises:
inputting the preprocessed images into the preset convolutional neural network model.
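The scaling, cropping, and flipping preprocessing in claim 7 can be illustrated on a toy image stored as a nested list of pixel rows. A real pipeline would use an image library; nearest-neighbour scaling here is one assumed choice of interpolation:

```python
def crop(image, top, left, height, width):
    """Cut out a height-by-width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def horizontal_flip(image):
    """Mirror each pixel row left-to-right."""
    return [list(reversed(row)) for row in image]

def scale_nearest(image, new_h, new_w):
    """Resize to new_h x new_w by nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]
```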
8. An intrusion analysis device, comprising:
the to-be-identified video image acquisition module, configured to pull an original monitoring video stream from a video monitoring platform and perform frame extraction and parsing on the original monitoring video stream to obtain corresponding video images to be identified;
the candidate coordinate frame obtaining module, configured to input the video images to be identified into a preset deep learning target detector to obtain corresponding original anchor coordinate frames, and retain the original anchor coordinate frames whose probability values are greater than a preset threshold to obtain corresponding candidate coordinate frames;
the suppression processing module, configured to perform suppression processing on the candidate coordinate frames according to a non-maximum suppression algorithm to obtain a coordinate frame for each pedestrian in the target identification image;
the clustering and merging module, configured to cluster and merge the coordinate frames of all the pedestrians to obtain new merged coordinate frames;
and the recognition output module, configured to capture and store the target identification image according to the new coordinate frames to obtain a corresponding group of target pictures, input the target pictures into a preset convolutional neural network model, and obtain the pedestrian intrusion behavior recognition result output by the convolutional neural network model.
9. An intrusion analysis device, comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line; the at least one processor invokes the instructions in the memory to cause the intrusion analysis device to perform the intrusion analysis method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the intrusion analysis method according to any one of claims 1 to 7.
CN202010935740.XA 2020-09-08 2020-09-08 Intrusion analysis method, device, equipment and storage medium Active CN111813997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010935740.XA CN111813997B (en) 2020-09-08 2020-09-08 Intrusion analysis method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010935740.XA CN111813997B (en) 2020-09-08 2020-09-08 Intrusion analysis method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111813997A CN111813997A (en) 2020-10-23
CN111813997B true CN111813997B (en) 2020-12-29

Family

ID=72860175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010935740.XA Active CN111813997B (en) 2020-09-08 2020-09-08 Intrusion analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111813997B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112653675A (en) * 2020-12-12 2021-04-13 海南师范大学 Intelligent intrusion detection method and device based on deep learning
CN112668496A (en) * 2020-12-31 2021-04-16 深圳市商汤科技有限公司 Intrusion detection method, device, equipment and storage medium
CN113128340B (en) * 2021-03-16 2022-09-02 广州华微明天软件技术有限公司 Personnel intrusion detection method and device
CN113326793B (en) * 2021-06-15 2024-04-05 上海有个机器人有限公司 Remote pedestrian position identification method, system and storage medium
CN113435316A (en) * 2021-06-25 2021-09-24 平安国际智慧城市科技股份有限公司 Intelligent bird repelling method and device, electronic equipment and storage medium
CN113344900B (en) * 2021-06-25 2023-04-18 北京市商汤科技开发有限公司 Airport runway intrusion detection method, airport runway intrusion detection device, storage medium and electronic device
CN114550060A (en) * 2022-02-25 2022-05-27 北京小龙潜行科技有限公司 Perimeter intrusion identification method and system and electronic equipment
CN116612494A (en) * 2023-05-05 2023-08-18 交通运输部水运科学研究所 Pedestrian target detection method and device in video monitoring based on deep learning

Citations (2)

Publication number Priority date Publication date Assignee Title
CN109815886A (en) * 2019-01-21 2019-05-28 南京邮电大学 A kind of pedestrian and vehicle checking method and system based on improvement YOLOv3
CN109934072A (en) * 2017-12-19 2019-06-25 浙江宇视科技有限公司 Personnel statistical method and device

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN106022237B (en) * 2016-05-13 2019-07-12 电子科技大学 A kind of pedestrian detection method of convolutional neural networks end to end
US20180150704A1 (en) * 2016-11-28 2018-05-31 Kwangwoon University Industry-Academic Collaboration Foundation Method of detecting pedestrian and vehicle based on convolutional neural network by using stereo camera
CN109460787B (en) * 2018-10-26 2020-08-18 北京交通大学 Intrusion detection model establishing method and device and data processing equipment
CN111368688A (en) * 2020-02-28 2020-07-03 深圳市商汤科技有限公司 Pedestrian monitoring method and related product

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN109934072A (en) * 2017-12-19 2019-06-25 浙江宇视科技有限公司 Personnel statistical method and device
CN109815886A (en) * 2019-01-21 2019-05-28 南京邮电大学 A kind of pedestrian and vehicle checking method and system based on improvement YOLOv3

Also Published As

Publication number Publication date
CN111813997A (en) 2020-10-23

Similar Documents

Publication Publication Date Title
CN111813997B (en) Intrusion analysis method, device, equipment and storage medium
US11188783B2 (en) Reverse neural network for object re-identification
Zhang et al. Research on face detection technology based on MTCNN
CN107358149B (en) Human body posture detection method and device
CN112734775B (en) Image labeling, image semantic segmentation and model training methods and devices
WO2021139324A1 (en) Image recognition method and apparatus, computer-readable storage medium and electronic device
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
CN112381075B (en) Method and system for carrying out face recognition under specific scene of machine room
CN112200081A (en) Abnormal behavior identification method and device, electronic equipment and storage medium
JP7007829B2 (en) Information processing equipment, information processing methods and programs
CN110728252B (en) Face detection method applied to regional personnel motion trail monitoring
CN112487886A (en) Method and device for identifying face with shielding, storage medium and terminal
CN113869449A (en) Model training method, image processing method, device, equipment and storage medium
CN115690615B (en) Video stream-oriented deep learning target recognition method and system
CN111353385B (en) Pedestrian re-identification method and device based on mask alignment and attention mechanism
US20170053172A1 (en) Image processing apparatus, and image processing method
CN115862113A (en) Stranger abnormity identification method, device, equipment and storage medium
WO2023279799A1 (en) Object identification method and apparatus, and electronic system
CN113591758A (en) Human behavior recognition model training method and device and computer equipment
CN112036520A (en) Panda age identification method and device based on deep learning and storage medium
CN112528903B (en) Face image acquisition method and device, electronic equipment and medium
CN116959099A (en) Abnormal behavior identification method based on space-time diagram convolutional neural network
CN110795995B (en) Data processing method, device and computer readable storage medium
CN115797970B (en) Dense pedestrian target detection method and system based on YOLOv5 model
Wang et al. Text detection algorithm based on improved YOLOv3

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Intrusion analysis method, device, equipment and storage medium

Effective date of registration: 20210429

Granted publication date: 20201229

Pledgee: Shenzhen hi tech investment small loan Co.,Ltd.

Pledgor: Ping An International Smart City Technology Co.,Ltd.

Registration number: Y2021980003211

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230206

Granted publication date: 20201229

Pledgee: Shenzhen hi tech investment small loan Co.,Ltd.

Pledgor: Ping An International Smart City Technology Co.,Ltd.

Registration number: Y2021980003211