CN113052139A - Deep learning double-flow network-based climbing behavior detection method and system - Google Patents


Publication number
CN113052139A (application CN202110448771.7A)
Authority
CN
China
Prior art keywords
detection
pedestrian
network
frame
video
Prior art date
Legal status
Pending
Application number
CN202110448771.7A
Other languages
Chinese (zh)
Inventor
张泉
赵曼
刘海峰
任广鑫
张明
季坤
吴迪
甄超
王坤
王刘芳
郑浩
Current Assignee
Hefei Zhongke Leinao Intelligent Technology Co ltd
State Grid Anhui Electric Power Co Ltd
Original Assignee
Hefei Zhongke Leinao Intelligent Technology Co ltd
State Grid Anhui Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by Hefei Zhongke Leinao Intelligent Technology Co ltd, State Grid Anhui Electric Power Co Ltd filed Critical Hefei Zhongke Leinao Intelligent Technology Co ltd
Priority: CN202110448771.7A
Publication: CN113052139A
Legal status: Pending

Classifications

    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06F18/24 Classification techniques
    • G06N3/08 Neural networks; learning methods
    • G06T7/269 Analysis of motion using gradient-based methods
    • G06T2207/10016 Video; image sequence
    • G06T2207/10024 Color image
    • G06T2207/20081 Training; learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30196 Human being; person
    • G06V2201/07 Target detection


Abstract

The invention discloses a climbing behavior detection method and system based on a deep learning double-flow network, belonging to the technical field of behavior recognition by machine vision and comprising the following steps: S1: target detection, tracking and numbering; S2: cropping target video segments; S3: random sampling; S4: action classification. The classification network obtained through learning has good robustness and classifies accurately under different illumination and weather conditions, so that multi-person behavior detection under complex conditions is realized; cropping the video removes redundant background information and greatly improves algorithm execution efficiency, while the pedestrian-tracking random sampling method effectively improves detection efficiency, making the method well worth popularizing.

Description

Deep learning double-flow network-based climbing behavior detection method and system
Technical Field
The invention relates to the technical field of behavior recognition by machine vision, in particular to a climbing behavior detection method and system based on a deep learning double-flow network.
Background
Climbing behavior detection is an important module in the field of intelligent video surveillance and is widely applied in video monitoring systems for public places. It discovers in time that a person is climbing a fence or enclosing wall, automatically issues a corresponding warning or notification, and reduces the investment of security manpower. Climbing behavior recognition mainly solves two problems: first, the detection problem, i.e. using a detector to determine whether a person is present in the image; second, the recognition problem, i.e. extracting the person's motion features and recognizing the behavior through a classifier.
An existing method for detecting personnel climbing behavior uses behavior recognition: it computes star-skeleton features of the human body from the human silhouette, classifies the skeleton features into 4 states (walking, climbing up, crossing over and climbing down), and judges that a person has crossed the enclosure when the 3 states climbing up, crossing over and climbing down appear in succession. This method is idealized: it works only in an ideal environment containing a single person and performs very poorly in real application environments. Some traditional vision methods extract human behavior features and model and classify them with an HMM or a Bayesian network, but they face the problems that target occlusion is severe and hand-designed behavior features are hard to extract. Others compute optical flow for a moving target, model the flow with an HMM or a Bayesian network, analyze it with a classifier, and thereby detect certain abnormal behaviors. A climbing behavior detection method based on a deep learning double-flow network is therefore proposed.
Disclosure of Invention
The technical problem to be solved by the invention is how to overcome the poor practical performance and high computational complexity of existing behavior recognition methods for detecting personnel climbing behavior; to this end, a climbing behavior detection method based on a deep learning double-flow network is provided.
The invention solves the above technical problem through the following technical scheme, comprising the following steps:
s1: target detection, tracking and numbering
Carrying out pedestrian detection on the original video by using a target detection network to obtain a pedestrian detection frame; tracking by using the detection frame and the time sequence information of the video to obtain a target number;
s2: cropping a target video segment
Cropping the original video according to the detection frames and target numbers, removing the area outside each detection frame, and saving each numbered pedestrian into a separate video segment;
s3: random sampling
For each video segment obtained in step S2, randomly sampling a set number of frames out of every set number of frames, and calculating the optical flow information of each pixel of the sampled frames using a dense optical flow method;
s4: action classification
Sending the color image and optical flow information of each sampled frame into a climbing binary-classification double-flow network, classifying the set of frames, and determining whether climbing behavior exists.
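The four steps S1 to S4 can be sketched as a single processing loop. The sketch below is a minimal, hypothetical outline in Python: the detector/tracker, cropping routine, optical flow routine and classifier are passed in as callables, and every function name here is illustrative rather than taken from the patent.

```python
import random

def detect_climbing(frames, detect_track, crop_resize, optical_flow, classify,
                    pool_size=30, n_samples=3):
    """Sketch of S1-S4: detect and track pedestrians, collect cropped frames
    per pedestrian ID, randomly sample frames from each full pool, compute
    optical flow, and classify the group as climbing / non-climbing."""
    pools = {}    # pedestrian id -> list of cropped frames (the S2 buffer pools)
    results = {}  # pedestrian id -> latest classification result
    for frame in frames:
        for pid, box in detect_track(frame):                 # S1: boxes + stable IDs
            pools.setdefault(pid, []).append(crop_resize(frame, box))  # S2: crop
            if len(pools[pid]) == pool_size:
                sampled = random.sample(pools[pid], n_samples)          # S3: sample
                flows = [optical_flow(f) for f in sampled]              # S3: flow
                results[pid] = classify(sampled, flows)                 # S4: classify
                pools[pid] = []                 # empty the pool, keep the ID
    return results
```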
Further, in step S1, the target detection network adopted is a YOLO network; multiple pedestrian targets in the video are processed simultaneously by the YOLO network, and the contour region of each pedestrian target is framed with a rectangular bounding box to serve as a candidate region for subsequent processing.
Further, the specific process of step S2 is as follows:
s21: cutting the candidate area obtained in the step S1, cutting a square area by taking the maximum value of the length or width of the rectangular surrounding frame of each area as the side length and the central point of the rectangular frame as the cutting center, and then adjusting the size of the image to a set size;
s22: an image buffer pool of 30 is created for each person based on the pedestrian number obtained in step S1, the resized image is placed in the buffer pool, and when the image buffer reaches 30 sheets, step S3 is performed.
Furthermore, the buffer pool is a temporary video store created for each pedestrian target, one per pedestrian number. When its contents reach 30 images, the pool is emptied and storage of new images restarts, with the corresponding pedestrian number unchanged. When the pool's contents are not updated for a long time, the pedestrian with the corresponding number has left the video monitoring range, and the pool is destroyed once a preset update timeout is exceeded.
Further, in step S3, the dense optical flow calculation uses the Farneback algorithm: take the images of two adjacent frames, regard each image as a two-dimensional signal (a function of pixel position), set a neighborhood around each pixel (generally a (2n+1) × (2n+1) square), and fit a functional relation between gray value and position by least squares; the image's original Cartesian two-dimensional signal space is thereby transformed into another vector space, and the pixel displacement difference between the two frames yields the optical flow.
Further, in the step S4, the training process of the dual-stream network is as follows:
s41: making a binary data set for training, extracting each frame of the acquired video segment, performing pedestrian detection, pedestrian tracking and cutting according to the previous method, calculating the optical flow of each frame, and storing the cut original image and the optical flow image of the corresponding position according to the manually marked type, wherein the image is climbed as a positive sample and the image is not climbed as a negative sample;
s42: and randomly selecting 3 cut original frames and optical flows of corresponding areas from a positive sample library as positive samples, selecting negative samples by using the same method, sending the negative samples into a double-flow network for classification training, and obtaining and storing the double-flow network after training.
Further, in the step S4, the structure of the dual-stream network is as follows:
the invention also provides a climbing behavior detection system based on the deep learning double-flow network, which adopts the detection method to detect the climbing behavior and comprises the following steps:
the target detection module is used for detecting pedestrians in the original video by using a target detection network to obtain a pedestrian detection frame; tracking by using the detection frame and the time sequence information of the video to obtain a target number;
the segment cutting module is used for cutting the original video according to the detection frame and the target number, removing the area outside the detection frame and storing the pedestrian with each number into a video segment again;
the random sampling module is used for randomly sampling and setting the number of sampling frames for each set frame number aiming at each video clip, and calculating by using a dense optical flow method to obtain the optical flow information of each pixel of the sampling frames;
the motion classification module is used for sending the color image and the optical flow information of each sampling frame into a climbing two-classification double-flow network, classifying the set frame number and determining whether climbing behaviors exist or not;
the central processing module is used for sending instructions to other modules to complete related actions;
the target detection module, the fragment cutting module, the random sampling module and the action classification module are all electrically connected with the central processing module.
Compared with the prior art, the invention has the following advantages: in the climbing behavior detection method based on a deep learning double-flow network, the classification network obtained through learning has good robustness and classifies accurately under different illumination and weather conditions, so that multi-person behavior detection under complex conditions is realized; cropping the video removes redundant background information and greatly improves algorithm execution efficiency, while the pedestrian-tracking random sampling method effectively improves detection efficiency, making the method well worth popularizing.
Drawings
FIG. 1 is a schematic overall flow chart of a second embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating classification of an image buffer pool by the climbing binary-classification double-flow network in the second embodiment of the present invention;
fig. 3 is a structure diagram of a TSN dual-flow network in the second embodiment of the present invention;
FIG. 4a is a structural diagram of a spatial convolution network according to a second embodiment of the present invention;
fig. 4b is a structural diagram of a time convolution network in the second embodiment of the present invention.
Detailed Description
The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.
Example one
This embodiment provides a technical scheme: a climbing behavior detection method based on a deep learning double-flow network, comprising the following steps:
s1: target detection, tracking and numbering
Carrying out pedestrian detection on the original video by using a target detection network to obtain a pedestrian detection frame; tracking by using the detection frame and the time sequence information of the video to obtain a target number;
s2: cropping a target video segment
Cropping the original video according to the detection frames and target numbers, removing the area outside each detection frame, and saving each numbered pedestrian into a separate video segment;
s3: random sampling
For each video segment obtained in step S2, randomly sampling a set number of frames out of every set number of frames, and calculating the optical flow information of each pixel of the sampled frames using a dense optical flow method;
s4: action classification
Sending the color image and optical flow information of each sampled frame into a climbing binary-classification double-flow network, classifying the set of frames, and determining whether climbing behavior exists.
In this embodiment, in step S1, the target detection network adopted is a YOLO network; multiple pedestrian targets in the video are processed simultaneously by the YOLO network, and the contour region of each pedestrian target is framed with a rectangular bounding box to serve as a candidate region for subsequent processing.
In this embodiment, the specific process of step S2 is as follows:
s21: cutting the candidate area obtained in the step S1, cutting a square area by taking the maximum value of the length or width of the rectangular surrounding frame of each area as the side length and the central point of the rectangular frame as the cutting center, and then adjusting the size of the image to a set size;
s22: an image buffer pool of 30 is created for each person based on the pedestrian number obtained in step S1, the resized image is placed in the buffer pool, and when the image buffer reaches 30 sheets, step S3 is performed.
In this embodiment, the buffer pool is a temporary video store created for each pedestrian target, one per pedestrian number; when its contents reach 30 images, the pool is emptied and storage of new images restarts, with the corresponding pedestrian number unchanged. When the pool's contents are not updated for a long time, the pedestrian with the corresponding number has left the video monitoring range, and the pool is destroyed once a preset update timeout is exceeded.
In this embodiment, in step S3, the dense optical flow calculation uses the Farneback algorithm: take the images of two adjacent frames, regard each image as a two-dimensional signal (a function of pixel position), set a neighborhood around each pixel (generally a (2n+1) × (2n+1) square), and fit a functional relation between gray value and position by least squares; the image's original Cartesian two-dimensional signal space is thereby transformed into another vector space, and the pixel displacement difference between the two frames yields the optical flow.
In this embodiment, in step S4, the training process of the dual-stream network is as follows:
s41: making a binary data set for training, extracting each frame of the acquired video segment, performing pedestrian detection, pedestrian tracking and cutting according to the previous method, calculating the optical flow of each frame, and storing the cut original image and the optical flow image of the corresponding position according to the manually marked type, wherein the image is climbed as a positive sample and the image is not climbed as a negative sample;
s42: and randomly selecting 3 cut original frames and optical flows of corresponding areas from a positive sample library as positive samples, selecting negative samples by using the same method, sending the negative samples into a double-flow network for classification training, and obtaining and storing the double-flow network after training.
This embodiment also provides a climbing behavior detection system based on the deep learning double-flow network, which performs climbing behavior detection using the above method and comprises:
the target detection module is used for detecting pedestrians in the original video by using a target detection network to obtain a pedestrian detection frame; tracking by using the detection frame and the time sequence information of the video to obtain a target number;
the segment cutting module is used for cutting the original video according to the detection frame and the target number, removing the area outside the detection frame and storing the pedestrian with each number into a video segment again;
the random sampling module is used for randomly sampling and setting the number of sampling frames for each set frame number aiming at each video clip, and calculating by using a dense optical flow method to obtain the optical flow information of each pixel of the sampling frames;
the motion classification module is used for sending the color image and the optical flow information of each sampling frame into a climbing two-classification double-flow network, classifying the set frame number and determining whether climbing behaviors exist or not;
the central processing module is used for sending instructions to other modules to complete related actions;
the target detection module, the fragment cutting module, the random sampling module and the action classification module are all electrically connected with the central processing module.
Example two
As shown in fig. 1, this embodiment provides a climbing behavior detection method based on a deep learning double-flow network; the specific process is as follows:
step 1: firstly, using a target detection network to detect pedestrians in an original video to obtain a detection frame of the pedestrians; tracking by using the detection frame and the time sequence information of the video to obtain a target number;
the target detection network adopted in the embodiment is a YOLO network, the YOLO network can simultaneously detect a plurality of pedestrian targets by using a YOLO target detection algorithm, and the outline area of each pedestrian target is selected by using a rectangular bounding box to serve as a subsequent processing candidate area.
Step 2: crop the original video according to the detection frames and target numbers, remove the area outside each detection frame, and save each numbered pedestrian into a separate video segment;
the method comprises the following specific steps:
cutting the candidate regions obtained in the step 1, cutting a square region by taking the maximum value of the length or width of a rectangular frame of each region as the side length and the central point of the rectangular frame as a cutting center, and then adjusting the size of the image to 224 × 224;
and (3) establishing an image buffer pool for each person according to the pedestrian number obtained in the step (1) and the sampling frequency set during the double-current network training, and placing the image with the adjusted size into the buffer pool. And when the image buffer reaches the set number, performing the step 3. The size of the buffer pool is set according to the data sampling frequency during the training of the TSN dual-flow network, and if the TSN is trained and every 30 frames are sliced, the size of the buffer pool is also 30 during inference.
The buffer pool is a temporary video store created for each pedestrian target, one per pedestrian number; when its contents reach 30 images, the pool is emptied and storage of new images restarts, with the corresponding pedestrian number unchanged. When the pool's contents are not updated for a long time, the pedestrian with the corresponding number has left the video monitoring range, and the pool is destroyed once a preset update timeout is exceeded.
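The per-pedestrian buffer pool with a capacity of 30 and a destroy-on-timeout rule can be sketched as a small class; the class and method names and the concrete timeout mechanism below are illustrative, not prescribed by the patent.

```python
import time

class FramePool:
    """Per-pedestrian image buffer pool: fills up to `capacity` (30 in this
    embodiment), hands back the full batch and empties itself, and reports
    itself stale when not updated within `timeout` seconds so the caller
    can destroy it (the pedestrian has left the monitored area)."""
    def __init__(self, pedestrian_id, capacity=30, timeout=5.0):
        self.pedestrian_id = pedestrian_id
        self.capacity = capacity
        self.timeout = timeout
        self.frames = []
        self.last_update = time.monotonic()

    def add(self, frame):
        """Store a cropped frame; return the full batch once the pool fills."""
        self.frames.append(frame)
        self.last_update = time.monotonic()
        if len(self.frames) == self.capacity:
            batch, self.frames = self.frames, []   # empty the pool, keep the ID
            return batch
        return None

    def stale(self):
        """True once the pedestrian has not been seen for `timeout` seconds."""
        return time.monotonic() - self.last_update > self.timeout
```

A supervising loop would call `add` for each new cropped frame, classify any returned batch, and periodically destroy pools for which `stale()` is true.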
Existing algorithms can generally be applied only to an ideal environment with a single person in the video and perform poorly in real environments. The present method can detect climbing behavior when multiple people are present in the original video and gives better results in practical applications. As shown in fig. 2, n pedestrians exist in the original video; they are detected by the target detection network, their detection frames are marked, tracking is performed using the detection frames and the video's temporal information, and each pedestrian is numbered. The video is then re-edited according to the detection frames and numbers, and each numbered pedestrian is saved into its own video segment, each segment keeping only the area inside the detection frame of the corresponding number: video 1 is the region inside person 1's detection frame cut out from the original video, and video n is the region inside person n's detection frame.
Step 3: randomly sample 3 frames out of every 30 frames of each video segment, and compute the optical flow information of each pixel of the 3 frames using a dense optical flow method.
The dense optical flow calculation adopts the Farneback algorithm: take the images of two adjacent frames, regard each image as a two-dimensional signal (a function of pixel position), set a neighborhood around each pixel (generally a (2n+1) × (2n+1) square; with n = 2 this is a 5 × 5 square), and fit a functional relation between gray value and position by least squares. The image's original Cartesian two-dimensional signal space is thereby transformed into another vector space, and the pixel displacement difference between the two frames yields the optical flow.
Compared with the sparse optical flow method, the dense optical flow method has a better registration effect and can compare pedestrians' action changes in the image more accurately, providing more precise temporal information for the subsequent climbing recognition. Accurate temporal information reduces the number of samples required, so the random sampling method effectively improves detection efficiency.
Step 4: send the color images and optical flow information of the 3 frames into the climbing binary-classification double-flow network, classify the 30-frame group, and determine whether climbing behavior exists.
Climbing recognition is a binary classification task: each image buffer pool is classified as either climbing or non-climbing.
The double-flow network training process first requires making a binary classification dataset: extract every frame of the collected video segments, perform pedestrian detection, tracking and cropping as described above, compute each frame's optical flow, and store the cropped original images and the corresponding optical flow images according to the manually annotated class, with climbing as positive samples and non-climbing as negative samples. During training, randomly select 3 cropped original frames and the optical flow of the corresponding regions from the positive sample library as a positive sample, select negative samples in the same way, and send them into the double-flow network for classification training.
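Drawing one training example (3 cropped frames plus the matching optical flow crops) from a sample library can be sketched as below; the list-of-(rgb, flow)-pairs layout of the library is an assumption for illustration.

```python
import random

def sample_training_example(library, n_frames=3, rng=random):
    """Draw n_frames (rgb_crop, flow_crop) pairs without replacement from one
    sample library (positive = climbing, negative = non-climbing); the same
    routine serves both classes, mirroring the random selection above."""
    pairs = rng.sample(library, n_frames)
    rgb = [p[0] for p in pairs]
    flow = [p[1] for p in pairs]
    return rgb, flow
```

Calling it once on the positive library and once on the negative library yields one positive and one negative example per training step.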
As shown in fig. 3, the double-flow network adopted in this embodiment is a TSN (Temporal Segment Network). The TSN is a variant of the Siamese (twin) network, divided into a temporal convolutional network and a spatial convolutional network whose parameters are not shared. The three color images each pass through the spatial convolutional network, and the three outputs are fused by a segmental consensus function into the spatial segment consensus; the optical flow information of the three images passes through the temporal convolutional network, and the three outputs are fused by the segmental consensus function into the temporal segment consensus. Fusing the predictions of all modalities then yields the final prediction result.
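The segmental consensus and two-stream fusion can be sketched with numpy, taking averaging as the consensus function and an equal-weight sum as the stream fusion; both are common choices for TSN but are assumptions here, since the text does not fix them.

```python
import numpy as np

def tsn_predict(spatial_scores, temporal_scores, flow_weight=1.0):
    """spatial_scores / temporal_scores: (n_segments, n_classes) per-segment
    class scores from the spatial and temporal streams (here 3 x 2: three
    sampled frames, climbing vs non-climbing). Average over segments to get
    each stream's segment consensus, fuse the two streams, and return the
    predicted class index."""
    spatial_consensus = np.mean(spatial_scores, axis=0)    # spatial segment consensus
    temporal_consensus = np.mean(temporal_scores, axis=0)  # temporal segment consensus
    fused = spatial_consensus + flow_weight * temporal_consensus
    return int(np.argmax(fused))
```

With class index 0 as climbing and 1 as non-climbing (an arbitrary labeling), the returned index classifies the whole buffer pool.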
As shown in fig. 4a and 4b, which show the structures of the spatial convolutional network and the temporal convolutional network in the TSN double-flow network, the two networks differ only slightly at the input layer: the input to the spatial convolutional network is 224 × 224 × 3, i.e. the three RGB channels, and the input to the temporal convolutional network is 224 × 224 × 2, i.e. the two optical flow channels, vertical and horizontal.
After training, the double-flow network can perform binary classification on the 3 frames of 224 × 224 RGB images and optical flow images, thereby classifying the buffer pool.
The classification network obtained through learning has good robustness, and can be accurately classified under different illumination and different weather conditions.
In summary, in the climbing behavior detection method based on a deep learning double-flow network of this embodiment, the classification network obtained through learning has good robustness and classifies accurately under different illumination and weather conditions, so that multi-person behavior detection under complex conditions is realized; cropping the video removes redundant background information and greatly improves algorithm execution efficiency, while the pedestrian-tracking random sampling method effectively improves detection efficiency, making the method well worth popularizing.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (7)

1. A climbing behavior detection method based on a deep learning double-flow network is characterized by comprising the following steps:
s1: target detection, tracking and numbering
Carrying out pedestrian detection on the original video by using a target detection network to obtain a pedestrian detection frame; tracking by using the detection frame and the time sequence information of the video to obtain a target number;
s2: cropping a target video segment
Cutting the original video according to the detection frame and the target number, removing the area outside the detection frame, and storing the pedestrian with each number into a video segment again;
s3: random sampling
For each video segment obtained in step S2, randomly sampling a set number of frames out of every set number of frames, and calculating the optical flow information of each pixel of the sampled frames using a dense optical flow method;
s4: action classification
Sending the color image and optical flow information of each sampled frame into a climbing binary-classification double-flow network, classifying the set of frames, and determining whether climbing behavior exists.
2. The climbing behavior detection method based on a deep learning dual-stream network according to claim 1, characterized in that: in step S1, the target detection network is a YOLO network; multiple pedestrian targets in the video are processed simultaneously through the YOLO network, and the contour region of each pedestrian target is framed by a rectangular bounding box that serves as the candidate region for subsequent processing.
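A hedged sketch of the pedestrian-filtering stage implied by claim 2 — keeping only confident "person" detections from a YOLO-style detector's raw output — could look like the following. The detection tuple format, threshold, and class id are assumptions (class 0 is "person" in the common COCO convention), and no actual YOLO inference is shown:

```python
def person_boxes(detections, conf_thresh=0.5, person_class=0):
    """Filter raw detector output down to pedestrian bounding boxes.

    `detections` is assumed to be a list of (class_id, confidence,
    (x1, y1, x2, y2)) tuples, as typical YOLO post-processing would
    produce; only confident 'person' detections are kept.
    """
    return [box for cls, conf, box in detections
            if cls == person_class and conf >= conf_thresh]

dets = [(0, 0.92, (10, 20, 60, 180)),    # pedestrian, confident
        (2, 0.88, (100, 40, 300, 160)),  # car: ignored
        (0, 0.30, (5, 5, 15, 40))]       # pedestrian, too uncertain
boxes = person_boxes(dets)
```

Each retained box then becomes a candidate region for the cropping step of claim 3.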
3. The climbing behavior detection method based on a deep learning dual-stream network according to claim 2, characterized in that step S2 comprises:
S21: cropping each candidate region obtained in step S1, taking the larger of the length and width of the region's rectangular bounding box as the side length and the center point of the rectangle as the crop center to cut out a square region, and then resizing the image to a set size;
S22: creating an image buffer pool with a capacity of 30 for each person according to the pedestrian numbers obtained in step S1, placing the resized images into the pool, and executing step S3 once the buffer holds 30 images.
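The square-crop geometry of step S21 is simple enough to state exactly. A minimal sketch (function name assumed; real code would also clamp the square to the image borders, which this omits):

```python
def square_crop_box(x1, y1, x2, y2):
    """Expand a rectangular bounding box to a square, per step S21:
    side = max(width, height), centered on the rectangle's center.
    Note: the result may extend past the image edge; clamping or
    padding is left out of this sketch."""
    w, h = x2 - x1, y2 - y1
    side = max(w, h)
    cx, cy = x1 + w / 2, y1 + h / 2
    return (cx - side / 2, cy - side / 2, cx + side / 2, cy + side / 2)

# a tall pedestrian box (50 wide, 160 high) becomes a 160x160 square
sx1, sy1, sx2, sy2 = square_crop_box(10, 20, 60, 180)
```

The crop is then resized to the classifier's fixed input size, so elongated pedestrian boxes are not distorted by anisotropic scaling.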
4. The climbing behavior detection method based on a deep learning dual-stream network according to claim 3, characterized in that: the buffer pool is a temporary video store created for each pedestrian target and corresponds to that pedestrian's number; when its contents reach 30 images it is emptied and resumes storing new images without changing the associated pedestrian number; if the pool's contents are not updated for a long time, the pedestrian with the corresponding number is deemed to have left the video surveillance range, and the pool is destroyed once the preset update timeout is exceeded.
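The buffer-pool lifecycle of claim 4 (fill to 30, flush, stay bound to the same pedestrian id, destroy when stale) can be sketched as a small class. This is an illustration only; the class name and the timeout value are assumptions, and the 30-image capacity comes from the claim:

```python
import time

class PedestrianBuffer:
    """Per-pedestrian image buffer per claim 4: returns its contents
    every `capacity` images, and is considered stale (eligible for
    destruction) after `timeout` seconds without an update."""

    def __init__(self, pedestrian_id, capacity=30, timeout=5.0):
        self.pedestrian_id = pedestrian_id
        self.capacity = capacity
        self.timeout = timeout
        self.images = []
        self.last_update = time.monotonic()

    def add(self, image):
        """Store one cropped frame; return the full batch when the
        buffer fills, emptying it but keeping the same pedestrian id."""
        self.images.append(image)
        self.last_update = time.monotonic()
        if len(self.images) >= self.capacity:
            batch, self.images = self.images, []
            return batch
        return None

    def is_stale(self, now=None):
        """True once no frame has arrived within the timeout window."""
        now = time.monotonic() if now is None else now
        return now - self.last_update > self.timeout
```

A supervising loop would call `add` on each tracked detection and periodically destroy any buffer whose `is_stale` check passes.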
5. The climbing behavior detection method based on a deep learning dual-stream network according to claim 1, characterized in that: in step S3, the dense optical flow is computed with the Farneback algorithm: two adjacent frames are taken and each image is treated as a two-dimensional signal function; a neighborhood is set around every pixel, and a least-squares fit is used to build a functional relation between gray value and position, mapping the two-dimensional signal space of the image in the original Cartesian coordinate system into another vector space; the pixel displacement difference between the two frames is then obtained, yielding the optical flow.
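The least-squares gray-value/position relation in claim 5 is the core of gradient-based flow estimation. A pure-NumPy sketch of that principle — a single global least-squares solve on the brightness-constancy equation, far simpler than Farneback's per-neighborhood polynomial expansion — recovers a known translation between two synthetic frames:

```python
import numpy as np

def lstsq_flow(prev, curr):
    """Estimate one global (u, v) displacement by least squares on
    Ix*u + Iy*v = -It. Dense methods such as Farneback solve a
    related fit per pixel neighborhood; this collapses it to a single
    solve purely for illustration."""
    Iy, Ix = np.gradient(prev)    # spatial gradients (axis0 = y)
    It = curr - prev              # temporal gradient
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# synthetic pair: second frame is the first shifted by 1 pixel in x
y, x = np.mgrid[0:64, 0:64].astype(float)
prev = np.sin(0.2 * x) + np.cos(0.15 * y)
curr = np.sin(0.2 * (x - 1)) + np.cos(0.15 * y)
u, v = lstsq_flow(prev, curr)
```

In practice one would call a library implementation of Farneback's method on consecutive grayscale frames rather than this global fit; the sketch only shows why a least-squares relation between intensity and position yields a displacement.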
6. The climbing behavior detection method based on a deep learning dual-stream network according to claim 1, characterized in that: in step S4, the dual-stream network is trained as follows:
S41: preparing a binary-classification training set: extracting every frame of the collected video segments; performing pedestrian detection, pedestrian tracking, and cropping as described above; computing the optical flow of each frame; and storing the cropped original images and the optical flow images of the corresponding positions according to the manually annotated class, with climbing images as positive samples and non-climbing images as negative samples;
S42: randomly selecting 3 cropped original frames, together with the optical flows of the corresponding regions, from the positive sample library as positive samples; selecting negative samples in the same way; feeding both into the dual-stream network for classification training; and saving the trained dual-stream network.
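The sample selection in step S42 hinges on keeping each cropped frame paired with the optical flow of its own region. A hedged sketch (all file names and library contents here are hypothetical placeholders, not the patent's data):

```python
import random

def pick_training_samples(frame_flow_pairs, k=3, rng=None):
    """Draw k (cropped frame, corresponding-region flow) pairs at
    random from one sample library, keeping every frame aligned with
    its own flow image, as in step S42."""
    rng = rng or random.Random()
    return rng.sample(frame_flow_pairs, k)

# hypothetical libraries: parallel (frame, flow) file pairs
positive_library = [(f"pos_frame_{i}.jpg", f"pos_flow_{i}.npy") for i in range(50)]
negative_library = [(f"neg_frame_{i}.jpg", f"neg_flow_{i}.npy") for i in range(50)]
pos_batch = pick_training_samples(positive_library, 3, random.Random(0))
neg_batch = pick_training_samples(negative_library, 3, random.Random(1))
```

Sampling pairs (rather than frames and flows independently) is what guarantees the spatial stream and temporal stream of the network see the same region of the same moment during training.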
7. A climbing behavior detection system based on a deep learning dual-stream network, characterized in that it detects climbing behavior by the method of any one of claims 1 to 6 and comprises:
a target detection module, configured to perform pedestrian detection on the original video with a target detection network to obtain pedestrian detection boxes, and to track with the detection boxes and the temporal information of the video to assign each target a number;
a segment cropping module, configured to crop the original video according to the detection boxes and target numbers, remove the regions outside the detection boxes, and re-save each numbered pedestrian as a separate video segment;
a random sampling module, configured to randomly sample a set number of frames out of every set frame count of each video segment and compute the optical flow information of every pixel of the sampled frames with a dense optical flow method;
an action classification module, configured to feed the color image and the optical flow information of each sampled frame into a binary-classification climbing dual-stream network, classify the set of frames, and determine whether climbing behavior is present;
and a central processing module, configured to send instructions to the other modules to complete the related actions;
wherein the target detection module, the segment cropping module, the random sampling module, and the action classification module are all electrically connected to the central processing module.
CN202110448771.7A 2021-04-25 2021-04-25 Deep learning double-flow network-based climbing behavior detection method and system Pending CN113052139A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110448771.7A CN113052139A (en) 2021-04-25 2021-04-25 Deep learning double-flow network-based climbing behavior detection method and system

Publications (1)

Publication Number Publication Date
CN113052139A true CN113052139A (en) 2021-06-29

Family

ID=76520585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110448771.7A Pending CN113052139A (en) 2021-04-25 2021-04-25 Deep learning double-flow network-based climbing behavior detection method and system

Country Status (1)

Country Link
CN (1) CN113052139A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509859A (en) * 2018-03-09 2018-09-07 南京邮电大学 A kind of non-overlapping region pedestrian tracting method based on deep neural network
CN110378259A (en) * 2019-07-05 2019-10-25 桂林电子科技大学 A kind of multiple target Activity recognition method and system towards monitor video
US20200218901A1 (en) * 2019-01-03 2020-07-09 James Harvey ELDER System and method for automated video processing of an input video signal using tracking of a single moveable bilaterally-targeted game-object
CN111611912A (en) * 2020-05-19 2020-09-01 北京交通大学 Method for detecting pedestrian head lowering abnormal behavior based on human body joint points
US10814815B1 (en) * 2019-06-11 2020-10-27 Tangerine Innovation Holding Inc. System for determining occurrence of an automobile accident and characterizing the accident
CN111985402A (en) * 2020-08-20 2020-11-24 广东电网有限责任公司电力科学研究院 Substation security fence crossing behavior identification method, system and equipment
CN112183240A (en) * 2020-09-11 2021-01-05 山东大学 Double-current convolution behavior identification method based on 3D time stream and parallel space stream
CN112270310A (en) * 2020-11-24 2021-01-26 上海工程技术大学 Cross-camera pedestrian multi-target tracking method and device based on deep learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023000253A1 (en) * 2021-07-22 2023-01-26 京东方科技集团股份有限公司 Climbing behavior early-warning method and apparatus, electronic device, and storage medium
US11990010B2 (en) 2021-07-22 2024-05-21 Boe Technology Group Co., Ltd. Methods and apparatuses for early warning of climbing behaviors, electronic devices and storage media
CN113537165A (en) * 2021-09-15 2021-10-22 湖南信达通信息技术有限公司 Detection method and system for pedestrian alarm
CN113537165B (en) * 2021-09-15 2021-12-07 湖南信达通信息技术有限公司 Detection method and system for pedestrian alarm

Similar Documents

Publication Publication Date Title
CN111767882B (en) Multi-mode pedestrian detection method based on improved YOLO model
CN103824070B (en) A kind of rapid pedestrian detection method based on computer vision
WO2021098261A1 (en) Target detection method and apparatus
CN103839065B (en) Extraction method for dynamic crowd gathering characteristics
CN108062525B (en) Deep learning hand detection method based on hand region prediction
CN112001339A (en) Pedestrian social distance real-time monitoring method based on YOLO v4
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
CN113936256A (en) Image target detection method, device, equipment and storage medium
CN111814638B (en) Security scene flame detection method based on deep learning
CN109859246B (en) Low-altitude slow unmanned aerial vehicle tracking method combining correlation filtering and visual saliency
CN110852222A (en) Campus corridor scene intelligent monitoring method based on target detection
CN104392461B (en) A kind of video tracing method based on textural characteristics
CN107103299B (en) People counting method in monitoring video
CN113052139A (en) Deep learning double-flow network-based climbing behavior detection method and system
CN114399734A (en) Forest fire early warning method based on visual information
CN115841649A (en) Multi-scale people counting method for urban complex scene
Zhu et al. Towards automatic wild animal detection in low quality camera-trap images using two-channeled perceiving residual pyramid networks
CN112560619A (en) Multi-focus image fusion-based multi-distance bird accurate identification method
CN111008994A (en) Moving target real-time detection and tracking system and method based on MPSoC
CN111582074A (en) Monitoring video leaf occlusion detection method based on scene depth information perception
CN111723773A (en) Remnant detection method, device, electronic equipment and readable storage medium
CN108898098A (en) Early stage video smoke detection method based on monitor supervision platform
CN110414430B (en) Pedestrian re-identification method and device based on multi-proportion fusion
CN110348329B (en) Pedestrian detection method based on video sequence interframe information
CN111461076A (en) Smoke detection method and smoke detection system combining frame difference method and neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination