CN112861998A

CN112861998A - Neural network model construction method, safety channel abnormity monitoring method and system

Info

Publication number: CN112861998A
Application number: CN202110281503.0A
Authority: CN
Inventors: 李斯; 赵齐辉
Original assignee: Dongpu Software Co Ltd
Current assignee: Dongpu Software Co Ltd
Priority date: 2021-03-16
Filing date: 2021-03-16
Publication date: 2021-05-28

Abstract

The application provides a neural network model construction method suitable for safety channel abnormality monitoring, comprising: providing an initial neural network model, wherein the initial neural network model is constructed by introducing an SPP module into a YOLOv3 network; creating a safety channel sample data set, wherein the sample data set comprises a training data set and a verification data set; and inputting the training data set and the verification data set of the safety channel sample data set into the initial neural network model for training and evaluation, respectively, to obtain a target neural network model. The application also provides a method and a system for monitoring safety channel abnormalities that apply the neural network model.

Description

Neural network model construction method, safety channel abnormity monitoring method and system

Technical Field

The application relates to the technical field of security protection, in particular to a neural network model construction method, and a safety channel abnormity monitoring method and system applying the neural network model.

Background

Safe evacuation refers to the process of guiding people to a safe area; for example, when a fire occurs, people are guided to a place that is not threatened by the fire. To ensure safe evacuation from a dangerous area, a building should be provided with the necessary evacuation facilities, such as a level door, evacuation stairs, a platform bridge, escape holes, evacuation protection areas and the like. Densely occupied places such as malls, office buildings and schools are subject to specific regulations on the allowable safe evacuation time and on the number, width and spacing of exits.

In recent years, the express delivery industry has developed rapidly, and logistics distribution has become an important, fast-growing field. In the few decades over which the express industry has developed, it has gradually changed people's living and consumption habits and has become an integral part of their work and life. Express delivery work builds on the four traditional links of 'marketing', 'distribution', 'transportation' and 'delivery', to which basic functions of modern logistics such as warehousing, collection and payment, and information processing have been added. As distribution centers and express delivery network points expand, problems in personnel evacuation design become increasingly apparent, and the safety channel is particularly important for ensuring that personnel can be evacuated rapidly to a safe area when a fire occurs. In many cases, however, sundries piled up in the safety channel hinder evacuation and escape. Existing logistics centers usually install a monitoring device at the safety channel, and security personnel check the video in a monitoring room. With manual analysis, however, the video stream cannot be fully utilized and serves only as an aid, and manual monitoring carries the risk of delayed detection and missed reports. There is therefore a need to improve existing safety channel monitoring methods.

Disclosure of Invention

The application aims to provide a neural network model construction method suitable for safety channel abnormality monitoring, so as to achieve real-time and accurate monitoring of the safety channel.

The purpose of the application is realized by adopting the following technical scheme:

In a first aspect, the present application provides a neural network model construction method suitable for safety channel abnormality monitoring, including: providing an initial neural network model, wherein the initial neural network model is constructed by introducing an SPP module into a YOLOv3 network; creating a safety channel sample data set, wherein the sample data set comprises a training data set and a verification data set; and inputting the training data set and the verification data set of the safety channel sample data set into the initial neural network model for training and evaluation, respectively, to obtain a target neural network model. The advantage of this technical scheme is that, while the difficulty of training the deep YOLOv3 network is reduced, an SPP module consisting of several parallel pooling layers with different kernel sizes is added to the YOLOv3 network and works together with residual blocks consisting of consecutive convolutional layers, so that the network is deepened and the precision is improved. The SPP module feeds the intermediate feature map output by the YOLOv3 network into pooling layers of different scales for down-sampling to obtain multi-scale feature maps. These multi-scale feature maps are concatenated into one large feature map, which further strengthens the idea of cross-layer concatenation, so that subsequent training of the neural network converges faster, with better feature learning capability and a lower loss value.

In some optional embodiments, the initial neural network model is based on the PyTorch framework. The advantage of this technical scheme is that the initial neural network model is built in a PyTorch framework environment and comprises a data set processing file, a network model file, a network training file, a network test file, a visualization file and the like, which together handle decompression and division of the data and optimization, testing and visualization of the model, so that the model is built more quickly, debugging time is reduced, and the research and development period is shortened.

In some optional embodiments, the backbone network of the YOLOv3 network employs GhostNet. The advantage of this technical scheme is that, with GhostNet as the backbone network, a frame of image can be pooled in different blocks and one feature extracted from each block as a dimension, which ensures that the finally obtained features have consistent dimensions and avoids information loss and non-uniform dimensions; pooling kernels are set to obtain multiple features of each image, which fuses local and global features, improves the detection precision of the feature map and helps to obtain an accurate detection frame.

In a second aspect, the present application provides a safety channel abnormality monitoring method, including: acquiring video information of a safety channel area in real time, wherein the video information comprises multiple frames of images of the safety channel area; inputting at least one frame of image of the video information into a target neural network model to obtain an initial detection result, wherein the target neural network model is obtained by training an initial neural network model, and the initial neural network model is constructed by introducing an SPP module into a YOLOv3 network; and judging whether the safety channel is abnormal according to the initial detection result. The advantages of this technical scheme are as follows: the video information of the safety channel acquired in real time is analyzed automatically, and the neural network model determines from the analyzed images whether the safety channel is abnormal, so that monitoring personnel are spared long periods of manual monitoring, the monitoring efficiency of the safety channel is improved, and results are generated automatically with high real-time performance; human oversights caused by manual monitoring are avoided, which improves the accuracy and safety of safety channel monitoring; and by introducing the SPP module into the YOLOv3 network, subsequent training of the neural network converges faster, with better feature learning capability and a lower loss value.

In some optional embodiments, the judging whether the safety channel is abnormal according to the initial detection result includes: scoring the initial detection result using a soft non-maximum suppression (Soft-NMS) method; and judging whether the safety channel is abnormal according to the initial detection result and the score corresponding to the initial detection result. The advantage of this technical scheme is that, compared with existing multi-scale sliding window detection, the soft non-maximum suppression method generates and classifies detection frames for specific object classes and sets an overlap threshold for each object class, which can reduce false positives in the detection results; the method requires no additional training, is easy to implement, and can be readily integrated into the image detection process for the safety channel.

In some optional embodiments, the initial detection result includes a detection result of a plurality of frames of images detected continuously, and the determining, according to the initial detection result and a score corresponding to the initial detection result, whether the secure channel is abnormal includes:

if the detection result of the first frame image in the multi-frame images is 'no obstacle' and the score corresponding to the first frame image is above a first threshold value, judging that the safety channel is 'normal'; or,

if the detection result of the first frame image in the multi-frame images is 'obstacle existence' and the score corresponding to the first frame image is above a second threshold value, and the detection results of the second frame image to the Nth frame image are all 'no obstacle', judging that the safety channel is 'normal', wherein N is an integer greater than 2; or,

if the detection result of the first frame image in the multi-frame images is 'no obstacle' and the score corresponding to the first frame image is below the first threshold value, and the detection results of the second frame image to the Nth frame image are all 'no obstacle', judging that the safety channel is 'normal'. The advantage of this technical scheme is that different threshold values are set for the 'no obstacle' and 'obstacle existence' detection results. When the detection result of the first frame image is 'no obstacle' and its score is high, the state of the safety channel is determined to be 'normal'. When the detection result of the first frame image is 'obstacle existence', or is 'no obstacle' with a score below the set threshold, multiple frames are detected continuously, which realizes dynamic detection of the safety channel condition, is applicable to moving objects in the safety channel, and avoids errors caused by people passing through temporarily.

In some optional embodiments, the determining, according to the initial detection result and the score corresponding to the initial detection result, whether the secure channel is abnormal further includes:

if the detection result of the first frame image in the multi-frame images is 'no obstacle' and the corresponding score is below the first threshold value, continuing to detect a second frame image and a third frame image in the multi-frame images;

and if the detection results of the second frame image and the third frame image in the multi-frame images are both 'obstacle existence' and the corresponding scores of the second frame image and the third frame image are both above a second threshold value, judging that the safety channel is 'abnormal'. The technical scheme has the advantages that the multiple images are jointly detected, the abnormity of the safety channel is determined according to the detection results of the multiple images, and the risk of false report caused by the analysis error of the individual image frame can be avoided.
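The multi-frame rules above can be sketched as a small decision function. This is a minimal illustration only: the function name `judge_channel`, the `(label, score)` tuple layout, the reading of "above" as `>=` and "below" as `<`, and the `"undetermined"` fallback for cases the rules do not cover are all assumptions made for the example, not part of the application.

```python
NO_OBSTACLE = "no obstacle"
OBSTACLE = "obstacle existence"


def judge_channel(frames, t1, t2):
    """Judge the channel state from consecutive (label, score) detections.

    t1 is the first threshold ('no obstacle' results) and t2 the second
    threshold ('obstacle existence' results). Returns 'normal', 'abnormal',
    or 'undetermined' (an assumed fallback for uncovered cases).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    first_label, first_score = frames[0]
    rest = frames[1:]
    # Frames 2..N with N > 2 means at least two follow-up frames.
    rest_clear = len(rest) >= 2 and all(label == NO_OBSTACLE for label, _ in rest)

    # Rule 1: confident 'no obstacle' on the first frame.
    if first_label == NO_OBSTACLE and first_score >= t1:
        return "normal"
    # Rule 2: confident obstacle first, but all following frames are clear.
    if first_label == OBSTACLE and first_score >= t2 and rest_clear:
        return "normal"
    if first_label == NO_OBSTACLE and first_score < t1:
        # Rule 3: low-confidence clear frame confirmed by the following frames.
        if rest_clear:
            return "normal"
        # Rule 4: frames 2 and 3 both show a confident obstacle.
        if len(rest) >= 2 and all(
            label == OBSTACLE and score >= t2 for label, score in rest[:2]
        ):
            return "abnormal"
    return "undetermined"


print(judge_channel([(NO_OBSTACLE, 0.5), (OBSTACLE, 0.8), (OBSTACLE, 0.9)], 0.8, 0.7))
```

In this sketch a person walking through the channel would typically trigger rule 2 (an obstacle in the first frame that disappears in later frames), which is the temporary-passage case the text describes.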

In some optional embodiments, the initial detection result includes a plurality of detection frames with class marks, and the judging whether the safety channel is abnormal according to the initial detection result includes: determining the overlapping detection frames among the plurality of detection frames with class marks; for the overlapping detection frames, reducing the scores of detection frames whose overlap rate is higher than a set threshold by a soft non-maximum suppression (Soft-NMS) method to obtain an optimal detection frame; and judging whether the safety channel is abnormal according to the optimal detection frame. The advantage of this technical scheme is that the scores of detection frames in adjacent areas are adjusted rather than suppressed completely, which can improve precision at high recall. A non-maximum suppression method that completely suppresses adjacent areas is more likely to miss objects to be detected in scenes with heavy overlap, whereas the soft non-maximum suppression method can still markedly improve object detection performance at low recall.
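The score-reduction step described above corresponds to what is commonly called Soft-NMS. The following is a minimal sketch of its linear-decay variant for a single object class; the function names, the `(x1, y1, x2, y2)` box layout, the default thresholds and the choice of linear decay (score multiplied by `1 - IoU`) are assumptions for illustration, since the application does not specify the decay function.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, iou_threshold=0.3, score_threshold=0.001):
    """Linear Soft-NMS: decay, rather than delete, overlapping boxes.

    Returns (box, score) pairs in the order they were selected.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            if overlap > iou_threshold:
                score *= 1.0 - overlap  # soften the score instead of removing the box
            if score >= score_threshold:
                decayed.append((box, score))
        candidates = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for box, score in soft_nms(boxes, scores):
    print(box, round(score, 3))
```

To follow the per-class overlap thresholds mentioned in the text, the function would be called once per object class with that class's own `iou_threshold`.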

In a third aspect, the present application provides a safety channel abnormality monitoring system suitable for an express distribution center, including: video acquisition equipment arranged facing the safety channel and used for acquiring video information of a safety channel area, wherein the video information comprises multiple frames of images of the safety channel area; and a processor connected with the video acquisition equipment and comprising a target neural network model, wherein the target neural network model is obtained by training an initial neural network model, the initial neural network model is constructed by introducing an SPP module into a YOLOv3 network, and the backbone network of the YOLOv3 network employs GhostNet. The processor is used for inputting at least one frame of image of the video information into the target neural network model to obtain an initial detection result, and for judging whether the safety channel is abnormal according to the initial detection result. The advantages of this technical scheme are as follows: the video acquisition equipment acquires video stream information that is analyzed to determine whether the safety channel area of the express distribution center is abnormal, so that monitoring personnel are spared long periods of manual monitoring and monitoring efficiency is improved; human oversights caused by manual monitoring are avoided, which improves monitoring accuracy and safety; and for a place such as an express distribution center, where staff traffic is dense and goods are numerous, monitoring is performed with high real-time performance.

In a fourth aspect, the present application provides an electronic device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of any of the above methods when executing the computer program.

In a fifth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of any of the methods described above.

Drawings

The present application is further described below with reference to the drawings and examples.

Fig. 1 is a schematic flowchart of a neural network model construction method according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a YOLOv3-SPP network model constructed according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of an SPP module provided in an embodiment of the present application;

Fig. 4 is a schematic structural diagram of GhostNet provided in an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a Ghost BottleNeck provided in an embodiment of the present application;

Fig. 6 is a schematic flowchart of a safety channel abnormality monitoring method according to an embodiment of the present application;

Fig. 7 is a schematic flowchart of another safety channel abnormality monitoring method according to an embodiment of the present application;

Fig. 8 is a schematic diagram of processing a frame of image using a target neural network model according to an embodiment of the present application;

Fig. 9 is a schematic flowchart of judging whether a safety channel is abnormal according to an initial detection result according to an embodiment of the present application;

Fig. 10 is a schematic diagram of a plurality of class-labeled detection frames obtained by using a target neural network model according to an embodiment of the present application;

Fig. 11 is a schematic structural diagram of a safety channel abnormality monitoring device 100 according to an embodiment of the present application;

Fig. 12 is a schematic structural diagram of a state determination module 120 provided in an embodiment of the present application;

Fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

Fig. 14 is a schematic structural diagram of a program product for implementing a safety channel abnormality monitoring method or neural network model construction method according to an embodiment of the present application;

Fig. 15 is a schematic structural diagram of a safety channel abnormality monitoring system according to an embodiment of the present application;

Fig. 16 is a schematic structural diagram of another safety channel abnormality monitoring system according to an embodiment of the present application.

Detailed Description

The present application is further described with reference to the accompanying drawings and the detailed description, and it should be noted that, in the present application, the embodiments or technical features described below may be arbitrarily combined to form a new embodiment without conflict.

Referring to Fig. 1, an embodiment of the present application provides a neural network model construction method suitable for safety channel abnormality monitoring, applicable in particular to the security systems of places such as shopping malls, schools, factories, logistics distribution centers and hospitals. The neural network model construction method comprises steps S101 to S103.

Step S101: providing an initial neural network model, wherein the initial neural network model is constructed by introducing an SPP module into a YOLOv3 network;

Step S102: creating a safety channel sample data set, wherein the safety channel sample data set comprises a training data set and a verification data set;

Step S103: inputting the training data set and the verification data set of the safety channel sample data set into the initial neural network model for training and evaluation, respectively, to obtain a target neural network model.

In a specific embodiment, as shown in Fig. 2, the initial neural network model introduces a Spatial Pyramid Pooling (SPP) module into the YOLOv3 network to form a YOLOv3-SPP network model. Specifically, the SPP module is integrated between the 5th convolution and the 6th convolution of the first detection scale in the YOLOv3 network, which completes the construction of the YOLOv3-SPP-based target detection model.

As shown in Fig. 3, the SPP module in the embodiment of the present application consists of several parallel pooling layers with kernel sizes of 1 and Φ₁, Φ₂, ···, Φ_k. To reduce the difficulty of training a deep network, YOLOv3 builds residual blocks from a larger number of consecutive 3 × 3 and 1 × 1 convolutional layers, which deepens the network further: the number of network layers is increased to 53 and the precision improves markedly. YOLOv3 predicts by multi-scale fusion and further strengthens the cross-layer concatenation idea of YOLOv2 by concatenating the shallow-layer features of the neural network with its deep-layer features. Finally, three feature maps of different scales, y1, y2 and y3, are used for detection: the y1 feature map is 13 × 13, the y2 feature map is 26 × 26, and the y3 feature map is 52 × 52.

Before the SPP module, the YOLOv3 network outputs a feature map of size 13 × 13 × 512. The SPP module feeds this feature map into pooling layers of different scales for down-sampling, obtaining feature maps at 4 scales, each of size 13 × 13 × 512. Finally, the SPP module concatenates the extracted multi-scale deep features into a 13 × 13 × 2048 feature map that fuses the different scales. In the embodiment of the present application, introducing the SPP module into the YOLOv3 network allows subsequent training of the neural network to converge faster, with better feature learning capability and a lower loss value.
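The pooling-and-concatenation behaviour of the SPP module can be illustrated in plain Python on a toy feature map. This is a sketch under stated assumptions: max pooling with stride 1 and "same" padding (so the spatial size is preserved) is assumed, the kernel sizes 5, 9 and 13 are taken from the pooling kernels mentioned elsewhere in this description, and a 2-channel map stands in for the 512-channel one to keep the example small. A real implementation would use the framework's pooling layers.

```python
def max_pool_same(channel, k):
    """Max-pool one H x W channel with a k x k window, stride 1, 'same' padding."""
    h, w = len(channel), len(channel[0])
    r = k // 2
    pooled = []
    for i in range(h):
        row = []
        for j in range(w):
            # Out-of-range positions are simply skipped, which is equivalent
            # to padding the border with negative infinity.
            window = [
                channel[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


def spp(feature_map, kernel_sizes=(5, 9, 13)):
    """Concatenate the input with its pooled copies along the channel axis."""
    branches = [feature_map]  # identity branch (kernel size 1)
    for k in kernel_sizes:
        branches.append([max_pool_same(ch, k) for ch in feature_map])
    return [ch for branch in branches for ch in branch]


# Toy input: 2 channels of 13 x 13 values (channels-first layout).
fmap = [[[(c + 1) * (i * 13 + j) for j in range(13)] for i in range(13)] for c in range(2)]
fused = spp(fmap)
print(len(fused), len(fused[0]), len(fused[0][0]))  # channels, height, width
```

With four branches the channel count grows fourfold while the spatial size is unchanged, which mirrors the 512-to-2048 channel increase described above.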

In one embodiment, the initial neural network model is based on the PyTorch framework: the initial neural network model is built in a PyTorch framework environment and comprises a data set processing file, a network model file, a network training file, a network test file, a visualization file and the like. The data set processing file decompresses and processes the data sets used for neural network training and testing according to specified rules, and converts and divides them; the network model file defines the structure of the neural network, including the number of filters in each layer, the filter sizes and parameters, and the initialization mode; the network training file defines how the neural network model is trained with the training data set, with the training results fed back to the model file to optimize the neural network model; the network test file defines how the trained network is tested with the test data set and, if necessary, the network model can be retrained after debugging and tested again; the visualization file presents the network test results as charts, so that the builder can observe the test results directly and debug the neural network conveniently. In the embodiment of the present application, because the initial neural network model is based on the PyTorch framework, the model is built more quickly, debugging time is reduced, and the research and development period is shortened.

In one embodiment, the backbone network of the YOLOv3 network employs GhostNet; see Fig. 4 for the specific network parameters of GhostNet. High-resolution safety channel image frames can suffer information loss or size inconsistency during preprocessing and multi-scale prediction. In the present application, GhostNet can pool a frame of image in different blocks and extract one feature from each block as a dimension, which ensures that the finally obtained features have consistent dimensions and avoids information loss and non-uniform dimensions. In this embodiment, to fuse local and global features, the largest pooling kernel is set to 13 and the other two kernels are reduced to 9 and 5 respectively, so that multiple features of each image are obtained, the detection precision of the feature map is improved, and an accurate detection frame is more readily obtained.

The backbone network in the embodiment of the present application adopts GhostNet, which differs considerably from earlier backbone networks built on conventional convolution schemes. First, whereas those schemes make extensive use of 1 × 1 pointwise convolution, the basic operation in the Ghost module can use a customized kernel. Second, conventional convolution methods use pointwise convolution to process features across channels and depthwise convolution to process spatial information; by contrast, GhostNet first uses ordinary convolution to generate a small number of intrinsic feature maps and then uses cheap linear operations to expand the features and increase the number of channels. Third, in earlier efficient architectures the operation applied to each feature map is limited to depthwise or shift operations, whereas the linear operations in GhostNet can be highly diverse. Finally, identity mapping is performed in parallel with the linear transformations in the Ghost module, so that the intrinsic feature maps are preserved.

In one embodiment, the GhostNet model may include convolutional layers, a pooling layer and a fully connected layer, wherein the convolutional layers may include a plurality of Ghost BottleNeck modules. Illustratively, the GhostNet model specifically includes: Conv2d 3×3, multiple Ghost BottleNeck modules, Conv2d 1×1, AvgPool 7×7, Conv2d 1×1, and a fully connected layer. Fig. 5 shows the structure of the Ghost BottleNeck module, where the stride of the Ghost BottleNeck module is 2. The Ghost BottleNeck consists mainly of two stacked Ghost modules: the first acts as an expansion layer that increases the number of channels; the second reduces the number of channels to match the shortcut path. A shortcut then connects the inputs and outputs of these two Ghost modules. In the embodiment of the present application, the shortcut path is implemented by a downsampling layer and a depthwise convolution with a stride of 2, so that the Ghost BottleNeck module both learns features and performs downsampling. Starting from a set of original feature maps, the present application applies a series of linear transformations to generate, at little cost, many "phantom" feature maps (Ghost feature maps) that reveal the required information in the original features. The Ghost module is plug-and-play: Ghost BottleNecks are obtained by stacking Ghost modules, and from them the lightweight neural network GhostNet is built.
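To make the "little cost" claim concrete, the sketch below compares the weight count of an ordinary convolution with that of a Ghost module producing the same number of output channels. The ratio `s` (total output maps per intrinsic map) and the cheap-operation kernel size `d` follow the published GhostNet design and are assumptions here, as the application does not state them; biases and normalization parameters are ignored.

```python
def conv_params(in_ch, out_ch, k):
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return in_ch * out_ch * k * k


def ghost_params(in_ch, out_ch, k, d=3, s=2):
    """Weights of a Ghost module with ratio s and d x d cheap operations.

    out_ch // s intrinsic maps come from an ordinary k x k convolution;
    the remaining maps come from per-channel (depthwise) d x d operations.
    """
    if out_ch % s != 0:
        raise ValueError("out_ch must be divisible by s in this sketch")
    intrinsic = out_ch // s
    primary = in_ch * intrinsic * k * k      # ordinary convolution
    cheap = (s - 1) * intrinsic * d * d      # one d x d kernel per ghost map
    return primary + cheap


ordinary = conv_params(32, 64, 1)
ghost = ghost_params(32, 64, 1, d=3, s=2)
print(ordinary, ghost)
```

As the input channel count grows, the primary convolution dominates and the saving approaches a factor of `s`.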

Making a secure channel sample dataset in an embodiment may comprise:

Firstly, collecting in advance a sufficient number of sample images in which the safety channel has no obstacles and sample images in which the safety channel has obstacles;

Secondly, labeling the sample images to form a sample data set. Specifically, the labelImg tool is used to mark the position of the safety channel and the positions of obstacles in each sample image. The data set is converted into the format of the VOC2007 data set, with separate storage locations established as follows: labels stores label information such as safety channel doors, goods, indicator lights, walls, etc.; ImageSets stores the training data set, the test data set and the verification data set; JPEGImages and images store the original sample images; and Annotations stores the labeling information of each original sample image.

Subsequently, the safety channel sample data set is divided by script code into a training data set, a verification data set and a test data set, which may account for 60%, 30% and 10% of the samples, respectively. The trained model may be saved as a .pth file.
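The division step can be sketched with a short script. The function name, the fixed random seed and the use of image identifiers as samples are assumptions for illustration; the percentages follow the 60/30/10 proportion given above.

```python
import random


def split_dataset(sample_ids, percents=(60, 30, 10), seed=0):
    """Shuffle sample ids and split them into train / val / test subsets."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = len(ids) * percents[0] // 100
    n_val = len(ids) * percents[1] // 100
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test set
    }


splits = split_dataset([f"img_{i:04d}" for i in range(100)])
print({name: len(items) for name, items in splits.items()})
```

Each list could then be written to a text file under ImageSets, one identifier per line, in the usual VOC2007 manner.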

The embodiment of the present application further provides a safety channel abnormality monitoring method which, as shown in Fig. 6, includes:

Step S201: acquiring video information of a safety channel area in real time, wherein the video information comprises multiple frames of images of the safety channel area.

Optionally, video information of the safety channel area can be acquired in real time by camera devices such as an ordinary bullet camera, an infrared night-vision waterproof camera or a dome camera. The video acquisition equipment can acquire panoramic video information, and several video acquisition devices can be distributed at different positions; correspondingly, the panoramic video information can be obtained by stitching together the videos captured by several video acquisition devices at different directions or angles. Specifically, the video acquisition device can be a wide-angle camera with a field of view exceeding 180 degrees, which helps to fully capture video information of the area corresponding to the equipment cabinet. Of course, the video acquisition device may also be a network camera or a local camera. For example, a network camera uses an RJ45 interface and pulls streams via GB28181; a local camera uses a USB interface or a Camera Serial Interface (CSI), and video frames are read through file handles.

Optionally, multiple frames of images may be obtained by parsing the video information: the video information is loaded and an AVAsset object is created from it, where AVAsset is an abstract class for modeling timed audiovisual media and its attributes can be set according to actual requirements; the video information is parsed based on the AVAsset object to obtain its basic information; and multiple frames of images are then extracted according to a preset time and size, each frame containing complete image data of the safety channel. Optionally, a clustering method may be used to further remove redundancy among the frames, reducing the large amount of redundant information between image frames and saving storage space.
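Extracting frames at a preset time interval reduces to choosing which frame indices to keep. The sketch below shows only that index arithmetic; the function name and its parameters (total frame count, frame rate, sampling interval in seconds) are assumptions, and the AVAsset-based parsing itself is not reproduced.

```python
def sample_frame_indices(total_frames, fps, interval_s):
    """Return the indices of frames taken every interval_s seconds."""
    if total_frames < 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval_s must be > 0")
    step = max(1, round(fps * interval_s))  # never step by less than one frame
    return list(range(0, total_frames, step))


# A 10-second clip at 25 frames per second, sampled every 2 seconds.
print(sample_frame_indices(250, 25, 2))
```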

Step S202: inputting at least one frame of image of video information into a target neural network model to obtain an initial detection result, wherein the target neural network model is obtained through training of the initial neural network model, and the initial neural network model is obtained by introducing an SPP module into a YOLOv3 network for construction.

Step S203: judging whether the safety channel is abnormal according to the initial detection result.

In the embodiment of the present application, the video information of the safety channel acquired in real time is analyzed automatically, and the neural network model determines from the analyzed images whether the safety channel is abnormal, so that monitoring personnel are spared long periods of manual monitoring, the monitoring efficiency of the safety channel is improved, and results are generated automatically with high real-time performance; human oversights caused by manual monitoring are avoided, which improves the accuracy and safety of safety channel monitoring; and by introducing the SPP module into the YOLOv3 network, subsequent training of the neural network converges faster, with better feature learning capability and a lower loss value.

In one embodiment, the backbone network of the YOLOv3 network employs GhostNet, as shown in fig. 4. With this backbone, a frame of image is pooled over blocks of different sizes, and one feature is extracted from each block to serve as a dimension, so that the finally obtained features have a consistent dimension and the problems of information loss and non-uniform dimensions are avoided; by setting the pooling kernels, a plurality of features are obtained for each feature map, which fuses local features with global features, improves the detection precision on the feature map, and helps obtain an accurate detection frame.
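By way of illustration only, the multi-kernel pooling and feature fusion described above can be sketched on a single-channel feature map. The kernel sizes 5, 9 and 13 are an assumption (they are commonly used in YOLOv3-SPP but are not stated in this passage), and the names `max_pool_same` and `spp_block` are hypothetical. Each pooled map keeps the input size, so the maps can be stacked as channels: small kernels retain local detail and large kernels summarise a wider, more global context.

```python
from typing import List, Sequence

Grid = List[List[float]]


def max_pool_same(grid: Grid, k: int) -> Grid:
    """Stride-1 max pooling that keeps the input size (k is assumed odd).

    Windows are clipped at the border, which is equivalent to padding
    with negative infinity.
    """
    h, w = len(grid), len(grid[0])
    r = k // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                grid[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(grid: Grid, kernel_sizes: Sequence[int] = (5, 9, 13)) -> List[Grid]:
    """Return the input map followed by one pooled map per kernel size."""
    return [grid] + [max_pool_same(grid, k) for k in kernel_sizes]


if __name__ == "__main__":
    feature_map = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for channel in spp_block(feature_map, kernel_sizes=(3,)):
        print(channel)
```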

In an embodiment, the method for monitoring the safety channel abnormality, as shown in fig. 7, includes:

step S301: acquiring video information of a safe channel area in real time, wherein the video information comprises a plurality of frames of images of the safe channel area;

step S302: inputting at least one frame of image of the video information into a target neural network model to obtain an initial detection result;

step S303: scoring the initial detection result by using a soft non-maximum suppression (Soft-NMS) method;

step S304: and judging whether the safety channel is abnormal or not according to the initial detection result and the score corresponding to the initial detection result.

In the embodiment of the application, compared with existing multi-scale sliding window detection, the soft non-maximum suppression (Soft-NMS) method generates and classifies detection frames for objects of specific classes, and an overlap threshold is set for each specific object class, so that false positives in the detection result can be reduced; the method requires no additional training, is easy to implement, and can be easily integrated into the image detection process of the safety channel.

Specifically, as shown in fig. 8, a frame of image of the secure channel processed by the target neural network model is taken as an example for explanation:

step S401: inputting the current frame image into a target neural network model to obtain a first detection result;

step S402: judging whether an obstacle exists, specifically, judging whether an obstacle exists in the current frame image according to the first detection result; if the judgment result is "Y" (an obstacle exists in the current frame image), proceeding to step S405; if the judgment result is "N" (no obstacle exists in the current frame image), proceeding to step S403;

step S403: scoring the first detection result by using a soft non-maximum suppression (Soft-NMS) method;

step S404: it is determined whether the score is greater than or equal to a first threshold. If the judgment result is "Y" (the score corresponding to the first detection result exceeds or equals to the first threshold), go to step S406; if the judgment result is "N" (the score corresponding to the first detection result is less than the first threshold), go to step S405;

step S405: performing next frame image detection;

step S406: confirm the secure channel is "normal".
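By way of illustration only, steps S401 to S406 can be sketched as a single decision function applied to one frame. The function name `check_frame` and the convention of returning `None` for "go on to the next frame" are assumptions of this sketch; the detection result and its Soft-NMS score are assumed to have been computed already by the target neural network model.

```python
from typing import Optional


def check_frame(obstacle_present: bool, score: float, first_threshold: float) -> Optional[str]:
    """Return "normal" (step S406) or None, meaning "detect the next frame" (step S405)."""
    if obstacle_present:
        # S402 answered "Y": move on to the next frame (S405).
        return None
    if score >= first_threshold:
        # S404 answered "Y": the "no obstacle" result is trusted (S406).
        return "normal"
    # S404 answered "N": the result is not reliable enough (S405).
    return None


if __name__ == "__main__":
    print(check_frame(False, 0.92, 0.8))
    print(check_frame(False, 0.40, 0.8))
    print(check_frame(True, 0.95, 0.8))
```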

In an embodiment, taking a current frame image as a first frame image in a plurality of frame images as an example for explanation, if a detection result of the first frame image in the plurality of frame images is "no obstacle exists" and a score corresponding to the first frame image is above a first threshold, it is determined that the security channel is "normal", that is, the detection result of the first frame image has extremely high reliability, and subsequent image frame detection is not required; if the detection result of the first frame image in the multiple frame images is "no obstacle", but the score corresponding to the first frame image is below the first threshold value, and the reliability of the detection result of the first frame image is to be confirmed, the detection result needs to be combined with other subsequent frame images to determine the state of the secure channel. Further, a "normal" command for the safety channel may be generated, and the command may be executed by a signal indicator, for example, a signal lamp displays green after receiving the command. It is to be understood that, in the embodiment of the present application, the score being above the first threshold includes the case where the score is equal to or greater than the first threshold, and the score being below the first threshold further includes the case where the score is less than the first threshold; a score above the second threshold includes a score equal to or greater than the second threshold, and a score below the second threshold further includes a score less than the second threshold.

In one embodiment, if the detection result of the first frame image is "obstacle present" and the corresponding score is above the second threshold, detection continues on one or more subsequent frames of images. For example, the second frame image may be detected first; if the detection result of the second frame image is "no obstacle", the third frame image, ..., and the Nth frame image continue to be detected, and if the detection results of the second frame image to the Nth frame image are all "no obstacle", the safety channel is determined to be "normal", where N is an integer greater than 2. In this embodiment, the first threshold corresponds to the case where the detection result is "no obstacle" and the second threshold corresponds to the case where the detection result is "obstacle present"; that is, a corresponding threshold is set for each of the two detection results. Optionally, the specific values of the first threshold and the second threshold may be set according to actual situations; for example, the first threshold and the second threshold may be set to be equal or unequal.

In one embodiment, if the detection result of the first frame image is "no obstacle" but the corresponding score is below the first threshold, detection continues on one or more subsequent frames of images. For example, the second frame image may be detected first; if the detection result of the second frame image is "no obstacle", the third frame image, ..., and the Nth frame image continue to be detected, and if the detection results of the second frame image to the Nth frame image are all "no obstacle", the safety channel is determined to be "normal". In this embodiment, dynamic detection of the state of the safety channel is realized by continuously detecting multiple frames of images; the method is applicable to detecting moving objects in the safety channel and avoids errors caused by people temporarily passing through. Of course, in order to further improve the accuracy of the detection result, the detection results of the second frame image to the Nth frame image may also be scored, and the safety channel is determined to be "normal" only when the detection results of all N-1 frame images are "no obstacle" and the score of the detection result of each frame image is equal to or greater than the first threshold.

In an embodiment, if the detection result of the first frame image is "obstacle present" and the corresponding score is below the second threshold, the second frame image and the third frame image are continuously detected, and if the detection results of the second frame image and the third frame image are both "obstacle present" and the corresponding score is above the second threshold, the safety channel is determined to be "abnormal". The multiple images are jointly detected, the abnormity of the safety channel is determined according to the detection results of the multiple images, and the risk of false reporting caused by the analysis error of individual image frames can be avoided. Further, an 'abnormal' instruction of the safety channel can be correspondingly generated, the instruction can be executed by a signal indicator light, and if the signal light receives the instruction, the red color is displayed; the alarm can also be executed by a buzzer, such as a buzzer for sounding an alarm.
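By way of illustration only, the four multi-frame cases described in the embodiments above can be combined into one function. This is one possible reading of those embodiments, not a definitive implementation: the function name `judge_channel`, the representation of each frame as an `(obstacle_present, score)` pair, and the extra return value "undetermined" for combinations the text does not cover are all assumptions of this sketch.

```python
from typing import List, Tuple

Frame = Tuple[bool, float]  # (obstacle_present, score); hypothetical representation


def judge_channel(frames: List[Frame], first_threshold: float, second_threshold: float) -> str:
    """Return "normal", "abnormal", or "undetermined" (the last one is an assumption)."""
    first_obstacle, first_score = frames[0]
    rest = frames[1:]

    # Reliable "no obstacle" on the first frame: no further frames are needed.
    if not first_obstacle and first_score >= first_threshold:
        return "normal"

    # Low-score "obstacle present": confirm with the second and third frames.
    if first_obstacle and first_score < second_threshold:
        confirm = rest[:2]
        if len(confirm) == 2 and all(obs and s >= second_threshold for obs, s in confirm):
            return "abnormal"
        return "undetermined"

    # Remaining cases (low-score "no obstacle", or high-score "obstacle present"):
    # the channel is "normal" if frames 2..N all report "no obstacle".
    if rest and all(not obs for obs, _ in rest):
        return "normal"
    return "undetermined"


if __name__ == "__main__":
    print(judge_channel([(True, 0.3), (True, 0.8), (True, 0.9)], 0.5, 0.5))
```

The high-score "obstacle present" case returning "normal" when later frames are clear reflects the stated aim of tolerating people who pass through the channel temporarily.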

In an embodiment, the initial detection result may include a plurality of detection frames with class marks; correspondingly, judging whether the safety channel is abnormal according to the initial detection result, as shown in fig. 9, includes:

step S501: determining overlapping detection frames existing in a plurality of detection frames with class marks;

step S502: for the overlapping detection frames, reducing the scores of the detection frames whose overlap rate is higher than a set threshold by the soft non-maximum suppression (Soft-NMS) method to obtain an optimal detection frame;

step S503: and judging whether the safety channel is abnormal or not according to the optimal detection frame. The state of the safety channel is either "normal" or "abnormal": "normal" may indicate that the safety channel is clear or that no obstruction exists, and "abnormal" may indicate that the safety channel is blocked or that an obstruction exists.

Fig. 10 shows a plurality of detection boxes with class labels obtained by detecting a secure channel image using a target neural network model (YOLOv3-SPP) according to an embodiment of the present application. The detection frames with class labels may include: a safe exit sign detection frame; an illuminating lamp detection frame; a wall detection frame; a security access door detection frame, etc. Further, in some places the detection frames overlap.

In one embodiment, let $M$ denote the detection frame with the highest current score, $b_i$ the $i$-th detection frame, $\mathrm{iou}(M, b_i)$ the overlap-based weighting term for the two frames, $N_t$ a threshold, and $s_i$ the score of the $i$-th detection frame, where $i$ is a positive integer with $0 < i \le N$ and $N$ is the number of overlapping detection frames in the image. In this embodiment, the score reset function of the soft non-maximum suppression (Soft-NMS) method is expressed as:

$$
s_i =
\begin{cases}
s_i, & \mathrm{iou}(M, b_i) < N_t \\
s_i \left(1 - \mathrm{iou}(M, b_i)\right), & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}
$$

With this score reset function, when the iou between detection frames is less than the threshold $N_t$, the score is left unchanged; when the iou reaches or exceeds the threshold $N_t$, the score $s_i$ is multiplied by $(1 - \mathrm{iou})$ to obtain the processed score. Since $1 - \mathrm{iou}$ lies between 0 and 1, this lowers the scores of candidate frames with a high iou. In this way, detection frames close to $M$ are attenuated strongly, while detection frames far from $M$ are not affected.
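By way of illustration only, the linear score reset described above can be sketched as follows. Boxes are assumed to be `(x1, y1, x2, y2)` tuples; the default threshold value, the `min_score` cut-off used to discard boxes whose score has decayed to almost nothing, and the names `iou`, `reset_score` and `soft_nms` are assumptions of this sketch and are not taken from the application.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def reset_score(score: float, overlap: float, n_t: float) -> float:
    """Linear reset: unchanged below the threshold, scaled by (1 - iou) otherwise."""
    return score if overlap < n_t else score * (1.0 - overlap)


def soft_nms(boxes: Sequence[Box], scores: Sequence[float],
             n_t: float = 0.3, min_score: float = 0.001) -> List[Tuple[Box, float]]:
    """Return (box, score) pairs in the order they were selected as M."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # M is the remaining box with the highest current score.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        m_box, m_score = remaining.pop(best)
        kept.append((m_box, m_score))
        # Decay the scores of boxes that overlap M instead of deleting them.
        remaining = [(b, reset_score(s, iou(m_box, b), n_t)) for b, s in remaining]
        remaining = [(b, s) for b, s in remaining if s >= min_score]
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(demo_boxes, demo_scores):
        print(box, round(score, 3))
```

In the demo, the second box overlaps the first with an iou of 0.5, so its score is halved rather than being removed outright, while the distant third box keeps its score.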

In the embodiment of the application, the scores of detection frames in the adjacent area are lowered rather than completely suppressed, which can improve precision at high recall; whereas a non-maximum suppression method that completely suppresses adjacent areas is more likely to miss objects to be detected in a highly overlapping environment, the soft non-maximum suppression method can still noticeably improve object detection performance at low recall.

It should be noted that the above description of the monitoring method flow is provided for illustrative purposes only and is not intended to limit the scope of the present application. Many variations and modifications may be made to the teachings of the present application by those of ordinary skill in the art in light of the present disclosure. However, such changes and modifications do not depart from the scope of the present application. In some embodiments, the safety channel abnormity monitoring method flow may be accomplished with one or more additional operations not described and/or without one or more of the operations discussed. For example, the weighting function based on the degree of overlap of detection frames may employ a score reset function with more parameters, such as a Gompertz function or the like. Also for example, the Ghost bottleneck module may be set to Stride 1, or both.

Referring to fig. 11, an embodiment of the present application further provides a device 100 for monitoring an anomaly of a secure tunnel, where a specific implementation manner of the device is consistent with the implementation manner and the achieved technical effect described in the embodiment of the foregoing method, and details are not repeated.

The secure tunnel abnormality monitoring apparatus 100 includes: an obtaining module 110, configured to obtain video information of a secure channel area in real time, where the video information includes multi-frame images of the secure channel area; and a state determining module 120, configured to input at least one frame of image of the video information into a target neural network model to obtain an initial detection result, and determine whether the secure channel is abnormal according to the initial detection result, where the target neural network model is obtained by training the initial neural network model, the initial neural network model is obtained by introducing an SPP module into a YOLOv3 network, and a backbone network of the YOLOv3 network employs GhostNet.

Referring to fig. 12, in a specific implementation, the state determination module 120 may include: a parsing unit 121, configured to parse the video information of the secure channel area into multi-frame images; a target neural network model unit 122, in which a target neural network model is stored, where the target neural network model is constructed by introducing an SPP module into the YOLOv3 network and a backbone network of the YOLOv3 network adopts GhostNet, and where the target neural network model unit 122 processes at least one frame of image of the video information to obtain an initial detection result; and a judging unit 123, configured to judge whether the secure channel is abnormal according to the initial detection result.

Further, the status determining module 120 may further include a Soft-NMS unit 124, configured to determine overlapping detection boxes existing in the plurality of detection boxes with class labels and, for the overlapping detection frames, to reduce the scores of the detection frames whose overlap rate is higher than a set threshold by the soft non-maximum suppression (Soft-NMS) method to obtain the optimal detection frame. Further, the determining unit 123 may determine whether the secure channel is abnormal according to the initial detection result and the optimal detection frame.

Further, the apparatus may further include a prompt instruction generating module 130, configured to determine whether to generate a prompt message according to the state of the secure channel. Specifically, when it is determined that the safety channel is "abnormal" (an obstacle or a blockage is present), the prompt instruction generation module 130 generates a prompt message or a prompt instruction. When the safe channel is determined to be "normal" (no obstacle or clear), the prompt instruction generation module 130 does not generate prompt information or prompt instructions.

Referring to fig. 13, an embodiment of the present application further provides an electronic device 200, where the electronic device 200 includes at least one memory 210, at least one processor 220, and a bus 230 connecting different platform systems.

The memory 210 may include readable media in the form of volatile memory, such as Random Access Memory (RAM)211 and/or cache memory 212, and may further include Read Only Memory (ROM) 213.

The memory 210 further stores a computer program, and the computer program can be executed by the processor 220, so that the processor 220 executes the steps of the model obtaining method in the embodiment of the present application, and a specific implementation manner of the method is consistent with the implementation manner and the achieved technical effect described in the embodiment of the model obtaining method, and details of some contents are not repeated.

Memory 210 may also include a utility 214 having at least one program module 215, such program modules 215 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

Accordingly, the processor 220 may execute the computer programs described above, and may execute the utility 214.

Bus 230 may be a local bus representing one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or any other type of bus structure.

The electronic device 200 may also communicate with one or more external devices 240, such as a keyboard, pointing device, bluetooth device, etc., and may also communicate with one or more devices capable of interacting with the electronic device 200, and/or with any devices (e.g., routers, modems, etc.) that enable the electronic device 200 to communicate with one or more other computing devices. Such communication may be through input-output interface 250. Also, the electronic device 200 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 260. The network adapter 260 may communicate with other modules of the electronic device 200 via the bus 230. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms, to name a few.

The embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and when the computer program is executed, the steps of any one of the methods in the embodiments of the present application are implemented, and a specific implementation manner of the steps is consistent with the implementation manner and the achieved technical effect described in the embodiments of the methods, and some details are not repeated.

Fig. 14 shows a program product 300 for implementing the above-mentioned secure channel anomaly monitoring/neural network model building method provided by this embodiment, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be executed on a terminal device, such as a personal computer. However, the program product 300 of the present invention is not so limited, and in this application, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Program product 300 may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer readable storage medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable storage medium may also be any readable medium that can communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. Program code embodied on a readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the C language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

The embodiment of the present application further provides a safety channel abnormity monitoring system, which is suitable for an express distribution center and, as shown in fig. 15, includes a video acquisition device 310 and a processor 320 connected to the video acquisition device 310. The video acquisition device 310 is arranged facing the safety channel door of the express distribution center and is used for acquiring video information of the safety channel area in real time; this video information covers the door, safety indicator, illuminating lamp, walls, etc. of the safety channel. The processor 320 is configured to receive the video information and determine a status of the secure channel according to the video information, where the status of the secure channel includes "normal" or "abnormal" (obstacle, jam) of the secure channel. The arrows in fig. 15 indicate the direction of instruction or information propagation. The connection of adjacent devices may be any form of wired or wireless network, or any combination thereof. By way of example only, video capture device 310 and processor 320 may be connected by a network, which may include a cable network, a wired network, a fiber optic network, a telecommunications network, an intranet, the internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a bluetooth network, a zigbee network, a Near Field Communication (NFC) network, and the like, or any combination thereof. In some embodiments, the network may include one or more network access points. For example, the network may include wired or wireless network access points, such as base stations and/or internet switching points.

In an embodiment, the video capture device 310 is disposed over the secure tunnel for real-time capturing video information of the secure tunnel area. The specific process may refer to the description of step S201.

Processor 320 may be a single server or a group of servers. The server groups may be centralized or distributed (e.g., processor 320 may be a distributed system). For example, the processor 320 may be local or remote. Also for example, processor 320 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tiered cloud, and the like, or any combination thereof. In particular, processor 320 may include a processing engine. A processing engine may process information and/or data to perform one or more functions described herein. For example, the processing engine includes a target neural network model, and the processing engine may parse video information into multiple frames of images, obtain an initial detection result according to at least one frame of image of the video information, and determine whether the security channel is abnormal according to the initial detection result. More specifically, processor 320 may include the aforementioned YOLOv3-SPP network model as shown in FIG. 2. There may be one or more processing engines (e.g., a single-chip processing engine or a multi-chip processing engine). By way of example only, the processing engine may include one or more hardware processors, such as a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an application specific instruction set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, and the like, or any combination thereof.

In a specific embodiment, the processor 320 is further configured to: determine overlapping detection boxes existing in the plurality of detection boxes with class marks; for the overlapping detection frames, reduce the scores of the detection frames whose overlap rate is higher than a set threshold by the soft non-maximum suppression (Soft-NMS) method to obtain an optimal detection frame; and judge whether the safety channel is abnormal according to the optimal detection frame. A prompt message generation module is used for determining whether to generate the prompt message according to the state of the secure channel. The processor 320 is also configured to generate a prompt message or a prompt instruction when it is determined that the safety channel is "abnormal" (an obstacle exists). When the safety channel is judged to be "normal" (no obstacle exists), no prompt message or prompt instruction is generated. The prompt message may include one or more of a sound alarm message, a light alarm message, or a pop-up prompt message.

Fig. 16 is a schematic diagram of a safety monitoring system suitable for an equipment cabinet according to another embodiment of the present application. Compared with fig. 15, the difference is that the system further comprises a display 340 connected to the processor 320; the display 340 can be arranged in a control center of the express distribution center, in a different room from the alarm device 330 and the video capture device 310, and can display the image frame of the video information corresponding to the prompt message. Optionally, the display 340 includes one or a combination of Liquid Crystal Displays (LCDs), Light Emitting Diode (LED) based displays, flat panel displays, curved screens, television devices, Cathode Ray Tubes (CRTs), touch screens, and the like.

The display 340 may also be accompanied by a keyboard, a mouse, a touch screen, a microphone, and the like, so as to realize interaction between the control center of the express distribution center and the device.

While the present application is described in terms of various aspects, including exemplary embodiments, the principles of the invention should not be limited to the disclosed embodiments, but are also intended to cover various modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

Claims

1. A neural network model construction method is suitable for monitoring the abnormity of a safety channel, and is characterized by comprising the following steps:

providing an initial neural network model, wherein the initial neural network model is constructed by introducing an SPP module into a YOLOv3 network;

manufacturing a safe channel sample data set, wherein the safe channel sample data set comprises a training data set and a verification data set;

and respectively inputting the training data set and the verification data set in the safe channel sample data set into an initial neural network model for training and evaluation to obtain a target neural network model.

2. The method of claim 1, wherein the initial neural network model is based on a PyTorch framework.

3. The neural network model building method of claim 1, wherein the backbone network of the YOLOv3 network employs GhostNet.

4. A safety channel abnormity monitoring method is characterized by comprising the following steps:

acquiring video information of a safe channel area in real time, wherein the video information comprises multi-frame images of the safe channel area;

inputting at least one frame of image of the video information into a target neural network model to obtain an initial detection result, wherein the target neural network model is obtained by training the initial neural network model, and the initial neural network model is obtained by introducing an SPP module into a YOLOv3 network for construction;

and judging whether the safety channel is abnormal or not according to the initial detection result.

5. The secure channel anomaly monitoring method according to claim 4, wherein the initial neural network model is based on a PyTorch framework.

6. The method for monitoring the abnormity of the safety channel according to claim 4, wherein the step of judging whether the safety channel has abnormity according to the initial detection result comprises the following steps:

scoring the initial detection result by using a soft non-maximum suppression (Soft-NMS) method;

and judging whether the safety channel is abnormal or not according to the initial detection result and the score corresponding to the initial detection result.

7. The method for monitoring the abnormity of the safety channel according to claim 6, wherein the initial detection result comprises a detection result of a plurality of continuously detected images, and the judging whether the safety channel has abnormity according to the initial detection result and the score corresponding to the initial detection result comprises:

and if the detection result of the first frame image in the multi-frame images is 'no obstacle' and the score corresponding to the first frame image is below a first threshold value, and the detection results of the second frame image to the Nth frame image are 'no obstacle', judging that the safety channel is 'normal'.

8. The method for monitoring the abnormity of the safety channel according to claim 7, wherein the step of judging whether the safety channel has abnormity according to the initial detection result and the score corresponding to the initial detection result further comprises the following steps:

and if the detection results of the second frame image and the third frame image in the multi-frame images are both 'obstacle existence' and the corresponding scores of the second frame image and the third frame image are both above a second threshold value, judging that the safety channel is 'abnormal'.

9. The method for monitoring the exception of the secure channel according to claim 4, wherein the initial detection result includes a plurality of detection frames with class marks, and the determining whether the exception exists in the secure channel according to the initial detection result includes:

determining overlapping detection boxes existing in the plurality of detection boxes with class marks;

for the overlapping detection frames, reducing the scores of the detection frames whose overlap rate is higher than a set threshold by a soft non-maximum suppression (Soft-NMS) method to obtain an optimal detection frame;

and judging whether the safety channel is abnormal or not according to the optimal detection frame.

10. A safety channel abnormity monitoring system, suitable for an express distribution center, characterized by comprising:

the video acquisition equipment is arranged towards the safe channel and is used for acquiring video information of a safe channel area, and the video information comprises multi-frame images of the safe channel area;

the processor is connected with the video acquisition equipment and comprises a target neural network model, the target neural network model is obtained through training of an initial neural network model, the initial neural network model is obtained by introducing an SPP module into a YOLOv3 network, a trunk network of the YOLOv3 network adopts GhostNet, and the processor is used for inputting at least one frame of image of the video information into the target neural network model to obtain an initial detection result; and judging whether the safety channel is abnormal or not according to the initial detection result.