CN112163537B - Pedestrian abnormal behavior detection method, system, terminal and storage medium - Google Patents

Info

Publication number
CN112163537B
Authority
CN
China
Prior art keywords
pedestrian
key point
space
skeleton key
time diagram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011069611.3A
Other languages
Chinese (zh)
Other versions
CN112163537A (en)
Inventor
胡金星
杨戈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN202011069611.3A
Publication of CN112163537A
Application granted
Publication of CN112163537B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application relates to a pedestrian abnormal behavior detection method, a pedestrian abnormal behavior detection system, a terminal and a storage medium. Comprising the following steps: extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system; constructing a skeleton key point space-time diagram of the pedestrian according to the three-dimensional space information of each skeleton key point; inputting the skeleton key point space-time diagram of the pedestrian into a space-time diagram neural network model, calculating to obtain an abnormal score value of the pedestrian, and judging whether the pedestrian behaves abnormally according to the abnormal score value. The application can reduce the loss of shooting projection information, improve the pedestrian action detail detection capability and more effectively identify the pedestrian abnormal behavior.

Description

Pedestrian abnormal behavior detection method, system, terminal and storage medium
Technical Field
The application belongs to the technical field of pedestrian pattern recognition, and particularly relates to a pedestrian abnormal behavior detection method, a system, a terminal and a storage medium.
Background
As the urban scale is continuously enlarged, the urban public space is more complex, the urban population is more huge, and great challenges are brought to urban safety management. With the continuous construction and development of smart cities, surveillance videos are widely popularized and continuously upgraded, and pedestrian anomaly detection based on the surveillance videos plays a significant role in the field of urban safety management.
At present, pedestrian abnormal behavior detection based on monitoring video mainly relies on single-viewpoint video: abnormal behavior is analyzed from the motion and texture features of the whole video frame or of the whole pedestrian, or is detected and recognized from human body key points. Analyzing abnormality from human body key points with a single monitoring video has certain limitations in practical use, specifically in the following points:
1. The pedestrian behavior mode monitoring analysis of the cross-video cannot be performed, and the pedestrian three-dimensional key points are difficult to effectively recover by combining a plurality of viewpoint monitoring videos for analysis.
2. Due to the lack of video geographic calibration, modeling analysis, situation analysis and predictive early warning are difficult to perform on large-scale pedestrian behaviors.
3. Because manually annotating data requires a great deal of labor, coarser-grained annotation (for example, annotating only that a video segment contains a pedestrian anomaly event, without annotating the specific time or the specific pedestrian) is generally preferred to save labor cost. Methods that use such coarse-grained annotation data to train an anomaly detection model in a weakly supervised manner are still lacking at present.
Disclosure of Invention
The application provides a pedestrian abnormal behavior detection method, a pedestrian abnormal behavior detection system, a pedestrian abnormal behavior detection terminal and a storage medium, and aims to at least solve one of the technical problems in the prior art to a certain extent.
In order to solve the problems, the application provides the following technical scheme:
a pedestrian abnormal behavior detection method, comprising:
extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system;
constructing a skeleton key point space-time diagram of the pedestrian according to the three-dimensional space information of each skeleton key point;
Inputting the skeleton key point space-time diagram of the pedestrian into a space-time diagram neural network model, calculating to obtain an abnormal score value of the pedestrian, and judging whether the pedestrian behaves abnormally according to the abnormal score value.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the method for extracting the skeleton key point information of the pedestrian from all video frames under at least two paths of monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system further comprises the following steps:
And respectively carrying out geographic calibration on at least two paths of monitoring videos based on a unified space geographic coordinate frame, and converting image pixel coordinates and world geographic coordinates of the monitoring videos.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the converting the image pixel coordinates and the world geographic coordinates of the monitoring video comprises the following steps:
Respectively selecting control points of each path of monitoring video by adopting a geographic coordinate system, and acquiring pixel coordinates and geographic coordinates of the control points;
acquiring camera parameters and correcting image distortion by using a camera calibration method;
selecting a rectangular interested ground area, performing perspective transformation and geographic registration on the monitoring video, and performing conversion between image pixel coordinates and world geographic coordinates:
In the above formula, $(x'_i, y'_i)$ are the image-plane coordinates of the rectangle's corner points, $(x_i, y_i)$ are the world ground-plane coordinates of the rectangle's corner points obtained from the control points, $M$ is the known camera intrinsic matrix, $s$ is the known scale factor, $R$ is the unknown rotation matrix, and $T$ is the unknown translation vector; $R$ and $T$ are solved from the 4 corner coordinate pairs of the rectangle to obtain the mapping from image coordinates inside the rectangle to world coordinates:
The technical scheme adopted by the embodiment of the application further comprises the following steps: the extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system comprises the following steps:
Extracting, preferably based on a human body posture estimation model, J skeleton key points for each of m (m ≥ 1) pedestrians, wherein the skeleton key point information comprises the image coordinates (u, v) of each key point and the confidence c of each key point;
Identifying the same pedestrian under different viewpoints at the same timestamp and under the same viewpoint at different timestamps according to the at least two paths of monitoring videos, taking the minimum bounding rectangle of the skeleton key points of each pedestrian as the bounding box of that pedestrian, solving the geographic coordinates of the pixel at the middle of the bottom edge of the bounding box, and taking these geographic coordinates as the pedestrian target geographic position;
extracting the re-identification features of the image inside each pedestrian's bounding box based on a ResNet re-identification feature extraction model;
Performing target matching on the same pedestrian in at least two viewpoints based on the pedestrian target geographic position and the re-recognition feature;
the projection matrix p of each viewpoint is obtained through a calibration method, and the two-dimensional coordinates of each same-name key point j (1 ≤ j ≤ J) of the same pedestrian under different viewpoints at the same timestamp t are transformed by a direct linear transformation algorithm into unique three-dimensional space coordinates; solving the three-dimensional space coordinates of each skeleton key point gives the three-dimensional space coordinate information of each skeleton key point.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the solving of the three-dimensional space coordinates of each bone key point comprises:
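$\min_X \lVert AX \rVert$
In the above formula, $X = (x, y, z, 1)^T$ is the three-dimensional space coordinate in homogeneous form, $A = \begin{pmatrix} u_1 p_1^3 - p_1^1 \\ v_1 p_1^3 - p_1^2 \\ \vdots \\ u_N p_N^3 - p_N^1 \\ v_N p_N^3 - p_N^2 \end{pmatrix}$, $N \ge 2$, $p_n^k$ is the kth row of the projection matrix of the nth viewpoint, and $u_N$, $v_N$ are the image coordinate components of the skeleton key point at viewpoint N.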
The technical scheme adopted by the embodiment of the application further comprises the following steps: the constructing the skeleton key point space-time diagram of the pedestrian according to the three-dimensional space information of each skeleton key point comprises the following steps:
Based on the three-dimensional space information of each skeleton key point, taking the skeleton key points of the same pedestrian as the vertices of the pedestrian's skeleton key point space-time diagram, wherein the feature information $f$ ($f \in \mathbb{R}^{1 \times 4}$) of each vertex consists of the three-dimensional coordinates in the same space coordinate system and the confidence;
Establishing edge connections between the vertices according to the edge connection construction rule to obtain the pedestrian skeleton key point space-time diagram, which is described by its adjacency matrix;
The edge connection construction rule includes:
Establishing edge connections between skeleton key points that are naturally connected in the human body within the same video frame or timestamp;
and establishing edge connections between same-name skeleton key points in temporally adjacent frames.
The technical scheme adopted by the embodiment of the application further comprises the following steps: inputting the skeleton key point space-time diagram of the pedestrian into a space-time diagram neural network model, calculating to obtain an abnormal score value of the pedestrian, and judging whether the behavior of the pedestrian is abnormal according to the abnormal score value comprises the following steps:
The input of the space-time diagram neural network model is the skeleton key point space-time diagram of each pedestrian. The space-time diagram neural network model is trained in a weakly supervised manner. When training samples are prepared, the skeleton key points extracted from each monitoring video with a duration of L frames are divided in time into n skeleton key point space-time diagram segments to form a multi-example packet; an example packet containing a pedestrian abnormal behavior event is marked as a positive multi-example packet $B_a$, and an example packet containing no pedestrian abnormal behavior event is marked as a negative multi-example packet $B_n$. When the model is trained, m positive multi-example packets and m negative multi-example packets are randomly selected as a mini-batch and input into the space-time diagram neural network model, which outputs an abnormal score value for each pedestrian; when the abnormal score value of a pedestrian is greater than or equal to a set abnormal score threshold T, the pedestrian's behavior is judged to be abnormal.
The embodiment of the application adopts another technical scheme that: a pedestrian abnormal behavior detection system comprising:
a skeleton key point extraction module: the method comprises the steps of extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos, and obtaining three-dimensional space coordinate information of each skeleton key point under the same coordinate system;
and a space-time diagram construction module: the skeleton key point space-time diagram is used for constructing the pedestrian according to the three-dimensional space information of each skeleton key point;
An abnormality detection module: used for inputting the skeleton key point space-time diagram of the pedestrian into a space-time diagram neural network model, calculating an abnormal score value of the pedestrian, and judging whether the pedestrian behaves abnormally according to the abnormal score value.
The embodiment of the application adopts the following technical scheme: a terminal comprising a processor, a memory coupled to the processor, wherein,
The memory stores program instructions for implementing the pedestrian abnormal behavior detection method;
the processor is used for executing the program instructions stored in the memory to control pedestrian abnormal behavior detection.
The embodiment of the application adopts the following technical scheme: a storage medium storing program instructions executable by a processor for performing the pedestrian abnormal behavior detection method.
Compared with the prior art, the embodiment of the application has the beneficial effects that: according to the pedestrian abnormal behavior detection method, system, terminal and storage medium, three-dimensional space coordinate information of skeleton key points is obtained through combining multi-angle monitoring videos, a time connection relation and a space connection relation of the skeleton key points are constructed in a space-time diagram structure mode, a space-time diagram neural network model is adopted to fit mapping between space-time sequences of the skeleton key points and abnormal score values, information loss caused by shooting projection can be reduced, action detail detection capability of pedestrians is improved, and abnormal behaviors of the pedestrians are detected more effectively.
Drawings
FIG. 1 is a flowchart of a pedestrian abnormal behavior detection method of an embodiment of the application;
FIG. 2 is a schematic diagram of a pedestrian abnormal behavior detection system according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a terminal structure according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, a flowchart of a pedestrian abnormal behavior detection method according to an embodiment of the application is shown. The pedestrian abnormal behavior detection method provided by the embodiment of the application comprises the following steps of:
S10: respectively carrying out geographic calibration on the multi-channel monitoring video based on a unified space geographic coordinate frame, and converting image pixel coordinates and world geographic coordinates of the monitoring video;
In the step, in a three-dimensional video monitoring network formed by a plurality of monitoring cameras, the invention carries out multi-viewpoint monitoring on each pedestrian in the same time period and the same area, namely, each pedestrian is monitored by at least two monitoring cameras at the same time. The specific implementation process of the conversion of the image pixel coordinates and the world geographic coordinates comprises the following steps:
S11: adopting a unified geographic coordinate system (such as WGS 84), selecting control points for each path of monitoring video respectively, and acquiring pixel coordinates and geographic coordinates of the control points;
S12: obtaining camera parameters and correcting image distortion by using a camera calibration method (such as Zhang Zhengyou's camera calibration method);
S13: selecting a rectangular ground region of interest, performing perspective transformation and geographic registration on the monitoring video, and realizing the conversion between image pixel coordinates and world geographic coordinates; the 4 corner coordinate pairs of the rectangular ground region of interest satisfy the following perspective transformation relation:
In formula (1), $(x'_i, y'_i)$ are the image-plane coordinates of the rectangle's corner points, $(x_i, y_i)$ are the world ground-plane coordinates of the rectangle's corner points (acquired from the control points), $M$ is the known camera intrinsic matrix, $s$ is the known scale factor, $R$ is the unknown rotation matrix, and $T$ is the unknown translation vector; $R$ and $T$ are solved from the 4 corner coordinate pairs of the rectangle to obtain the mapping from image coordinates inside the rectangle to world coordinates:
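As an illustration of step S13, the sketch below estimates a pixel-to-ground mapping from the 4 corner pairs. It is a simplification under stated assumptions, not the patent's own formulas: instead of solving R and T separately, it folds s, M, R and T into a single 3x3 homography H, which is possible when all world points lie on the ground plane. The function names and the sample coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(pixel_pts, world_pts):
    """Return H (3x3, with H[2][2] fixed to 1) mapping pixel (u, v) to ground (x, y)."""
    rows, rhs = [], []
    for (u, v), (x, y) in zip(pixel_pts, world_pts):
        # x = (h1*u + h2*v + h3) / (h7*u + h8*v + 1), rearranged to be linear in h
        rows.append([u, v, 1, 0, 0, 0, -x * u, -x * v])
        rhs.append(x)
        # y = (h4*u + h5*v + h6) / (h7*u + h8*v + 1)
        rows.append([0, 0, 0, u, v, 1, -y * u, -y * v])
        rhs.append(y)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def pixel_to_world(H, u, v):
    """Apply H to a pixel and divide by the homogeneous scale."""
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    x = (H[0][0] * u + H[0][1] * v + H[0][2]) / w
    y = (H[1][0] * u + H[1][1] * v + H[1][2]) / w
    return x, y


# Hypothetical corners: a trapezoid in the image, a 10 m x 20 m rectangle on the ground.
pixel_corners = [(100, 400), (540, 400), (420, 200), (220, 200)]
world_corners = [(0, 0), (10, 0), (10, 20), (0, 20)]
H = fit_homography(pixel_corners, world_corners)
```

Once H is fitted, `pixel_to_world(H, u, v)` converts any pixel inside the rectangle to ground coordinates. In practice the world coordinates would first be expressed in a projected, metric coordinate system rather than raw latitude and longitude.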
S20: extracting skeleton key point information of each pedestrian from all video frames under the multipath monitoring video, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system;
in this step, skeletal key points include human body parts such as the head (left eye, right eye, nose, left ear, right ear), neck, left shoulder, right shoulder, left elbow, right elbow, left hand, right hand, left hip, right hip, left knee, right knee, left foot, right foot, and the like. The three-dimensional space coordinate information of the skeleton key points under the same coordinate system comprises the three-dimensional space coordinates of each key point under the same coordinate system and the corresponding confidence coefficient. The three-dimensional space coordinate information extraction method specifically comprises the following steps:
S21: preferably, J skeleton key points of each of m (m ≥ 1) pedestrians are extracted based on a human body posture estimation model (such as OpenPose), wherein the skeleton key point information comprises the image coordinates (u, v) of each key point and the confidence c of each key point;
S22: identifying the same pedestrian under different viewpoints at the same timestamp and under the same viewpoint at different timestamps according to the multi-channel monitoring video, taking the minimum bounding rectangle of the extracted skeleton key points of each pedestrian as the bounding box of that pedestrian, solving the geographic coordinates of the pixel at the middle of the bottom edge of the bounding box, and taking these geographic coordinates as the pedestrian target geographic position;
S23: extracting the re-identification features of the image inside each pedestrian's bounding box based on a ResNet re-identification feature extraction model;
S24: performing target matching on the same pedestrian under different viewpoints based on the geographic position and the re-recognition characteristic of the pedestrian target;
Further, the pedestrian target matching includes two cases of pedestrian target matching under the same time stamp and different viewpoints and pedestrian target matching under the same viewpoint and different time stamps. Wherein,
The pedestrian target matching under the same timestamp and different viewpoints is specifically as follows: for a target pedestrian in the current viewpoint, the pedestrians in another viewpoint that lie within a radius of R meters of the target pedestrian's geographic position are taken as the matching candidate set $\{l_m\}$ of the target pedestrian; the re-identification feature distance between the target pedestrian and each pedestrian in the matching candidate set $\{l_m\}$ is calculated, and the candidate with the minimum re-identification feature distance under that viewpoint is taken as a successful match, the skeleton key points of the matched pedestrians being marked with the common pedestrian label $l_m$; otherwise, a match for the target pedestrian continues to be searched for under another viewpoint; if no match is found after all viewpoints have been traversed, the unmatched target pedestrian is discarded. In the embodiment of the application, the Hungarian algorithm and a cycle consistency strategy may be adopted to obtain a better pedestrian matching result.
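The cross-view matching rule above can be sketched as follows. This is one possible reading, not the patent's implementation: candidates are gated by geographic distance, then the candidate with the smallest feature distance is chosen per viewpoint. Euclidean distance for both the geographic gate and the feature comparison, the dictionary layout, and all names are assumptions made for illustration.

```python
import math


def match_across_views(target, views, radius_m):
    """Return {view_id: matched pedestrian id} for one target pedestrian.

    target: dict with 'geo' (x, y in meters) and 'feat' (re-identification vector).
    views:  dict mapping view_id to a list of pedestrian dicts with 'id', 'geo', 'feat'.
    An empty result means the target found no match and would be discarded.
    """
    matches = {}
    for view_id, pedestrians in views.items():
        # Candidate set: pedestrians within radius_m of the target's geographic position.
        nearby = [p for p in pedestrians
                  if math.dist(target["geo"], p["geo"]) <= radius_m]
        if nearby:
            # Among the candidates, keep the one with the smallest feature distance.
            best = min(nearby, key=lambda p: math.dist(target["feat"], p["feat"]))
            matches[view_id] = best["id"]
    return matches


target = {"geo": (0.0, 0.0), "feat": [1.0, 0.0]}
views = {
    "cam2": [
        {"id": "a", "geo": (0.5, 0.5), "feat": [0.9, 0.1]},
        {"id": "b", "geo": (0.2, 0.1), "feat": [0.0, 1.0]},
        {"id": "c", "geo": (10.0, 10.0), "feat": [1.0, 0.0]},
    ],
    "cam3": [{"id": "d", "geo": (8.0, 0.0), "feat": [1.0, 0.0]}],
}
result = match_across_views(target, views, radius_m=2.0)
```

This greedy per-target choice can assign two targets to the same candidate; the Hungarian algorithm mentioned in the text resolves such conflicts by solving the assignment for all targets jointly.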
The pedestrian target matching under the same viewpoint and different timestamps is specifically as follows: the re-identification feature of the target pedestrian identified under each timestamp is extracted, the feature distance between this feature and the re-identification feature of each pedestrian in the matching candidate set $\{l_m\}$ is calculated, and the pedestrian with the smallest feature distance is taken as the match of the target pedestrian, where distance() is the feature distance function.
In the embodiment of the application, the pedestrian targets under the same time stamp and different viewpoints are matched with the pedestrian targets under different time stamps and the same viewpoint respectively through two independent re-recognition models; during model training, the pedestrian re-identification models under the same time stamp and different viewpoints are trained by adopting images under different viewpoints to learn global matching features, and the pedestrian re-identification models under the same viewpoint and different time stamps are trained by adopting images under the same viewpoints to learn local matching features.
In the embodiment of the application, the pedestrian target matching under the same viewpoint and different timestamps may also preferably be performed based on a multi-target tracking model: if the intersection over union (IoU) between the bounding box of a tracked pedestrian at a given timestamp and the minimum bounding rectangle of a set of skeleton key points is the largest, the two are matched as the same pedestrian.
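A minimal sketch of the overlap test described above, assuming axis-aligned boxes stored as (u_min, v_min, u_max, v_max) in pixel coordinates. The helper names and the sample boxes are hypothetical, and the tracker that would supply the per-pedestrian boxes is not shown.

```python
def bounding_rect(keypoints):
    """Minimum axis-aligned rectangle enclosing a list of (u, v) key points."""
    us = [u for u, _ in keypoints]
    vs = [v for _, v in keypoints]
    return (min(us), min(vs), max(us), max(vs))


def iou(a, b):
    """Intersection over union of two boxes (u_min, v_min, u_max, v_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def best_track(skeleton_rect, track_boxes):
    """Return the track id whose box overlaps the skeleton rectangle most."""
    return max(track_boxes, key=lambda tid: iou(skeleton_rect, track_boxes[tid]))


rect = bounding_rect([(1, 2), (3, 5), (2, 4)])
tracks = {"p1": (0, 0, 4, 6), "p2": (10, 10, 12, 12)}
matched = best_track(rect, tracks)
```

A practical system would likely also require the best IoU to exceed some minimum value before accepting the match; that threshold is not specified in the text.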
S25: the projection matrix p of each viewpoint (camera) is obtained through a calibration method, and a direct linear transformation algorithm is used to transform the two-dimensional coordinates of each same-name key point j (1 ≤ j ≤ J) of the same pedestrian under N different viewpoints at the same timestamp t into unique three-dimensional space coordinates; solving the three-dimensional space coordinates of each skeleton key point gives the three-dimensional space coordinate information of each skeleton key point;
Since each skeleton key point has N confidences under the N viewpoints, the N confidences of each skeleton key point are averaged and the mean is used as the new confidence, thereby obtaining the three-dimensional space information of each skeleton key point j. The three-dimensional space coordinate solving equation is:
$\min_X \lVert AX \rVert$ (3)
In formula (3), $X = (x, y, z, 1)^T$ is the three-dimensional space coordinate in homogeneous form, $A = \begin{pmatrix} u_1 p_1^3 - p_1^1 \\ v_1 p_1^3 - p_1^2 \\ \vdots \\ u_N p_N^3 - p_N^1 \\ v_N p_N^3 - p_N^2 \end{pmatrix}$, $N \ge 2$, $p_n^k$ is the kth row of the projection matrix of the nth camera, and $u_N$, $v_N$ are the image coordinate components of the skeleton key point at viewpoint N.
Based on the above, the embodiment of the application performs pedestrian target matching under different viewpoints based on the geographic position and the re-recognition feature, solves the problem of limited matching performance based on single feature, and improves the robustness of pedestrian target matching under different viewpoints.
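The triangulation in step S25 can be illustrated with the sketch below. It uses the inhomogeneous variant of direct linear transformation: the homogeneous coordinate of X is fixed to 1 and the remaining three unknowns are found by least squares through the normal equations, rather than minimizing ‖AX‖ over unit-norm X with a singular value decomposition. That substitution, the function names, and the toy cameras are assumptions chosen to keep the example dependency-free.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [a[i][:] + [b[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def triangulate(projections, observations):
    """Return (x, y, z, c) for one key point seen from N >= 2 viewpoints.

    projections:  list of 3x4 projection matrices, one per viewpoint.
    observations: list of (u, v, confidence), one per viewpoint.
    """
    rows = []
    for P, (u, v, _conf) in zip(projections, observations):
        # Two rows of A per viewpoint: u * p3 - p1 and v * p3 - p2.
        rows.append([u * P[2][k] - P[0][k] for k in range(4)])
        rows.append([v * P[2][k] - P[1][k] for k in range(4)])
    # With X = (x, y, z, 1), each row gives r[0]x + r[1]y + r[2]z = -r[3].
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [-sum(r[i] * r[3] for r in rows) for i in range(3)]
    x, y, z = solve3(ata, atb)
    # The new confidence is the mean of the per-view confidences.
    c = sum(o[2] for o in observations) / len(observations)
    return x, y, z, c


# Toy setup: two identity-intrinsics cameras, the second shifted 1 unit along x.
P1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
P2 = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]]
# Projections of the point (0.5, 0.2, 4.0) into each camera, with confidences.
obs = [(0.125, 0.05, 0.8), (-0.125, 0.05, 0.6)]
point = triangulate([P1, P2], obs)
```

With noise-free observations the recovered point equals the original one; with real detections the least-squares solution spreads the reprojection error across the viewpoints.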
S30: constructing the skeleton key point space-time diagram of each pedestrian in the cross-video scene based on the three-dimensional space information of each skeleton key point;
In the step, the construction mode of the pedestrian skeleton key point space-time diagram specifically comprises the following steps:
S31: based on the three-dimensional space information of each skeleton key point, the skeleton key points of the same pedestrian $l_m$ are taken as the vertices of the skeleton key point space-time diagram, wherein the feature information $f$ ($f \in \mathbb{R}^{1 \times 4}$) of each vertex consists of the three-dimensional coordinates in the same space coordinate system and the confidence (namely (x, y, z, c), where x, y and z are the three-dimensional space coordinate components and c is the confidence);
S32: establishing edge connections between the vertices according to the edge connection construction rule to obtain the pedestrian skeleton key point space-time diagram, which is described by its adjacency matrix;
the edge connection construction rule comprises the following:
a: establishing edge connections between skeleton key points that are naturally connected in the human body within the same video frame or timestamp;
b: establishing edge connections between same-name skeleton key points in temporally adjacent frames.
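The two edge rules can be sketched as an adjacency-matrix construction over a space-time diagram (a spatio-temporal graph). The vertex indexing `t * num_joints + j`, the function name, and the three-joint toy skeleton are assumptions for illustration; the vertex features (x, y, z, c) would be stored in a separate array with the same indexing.

```python
def build_spacetime_adjacency(num_frames, num_joints, bones):
    """Return the adjacency matrix of a skeleton space-time diagram.

    Vertex index = t * num_joints + j for joint j in frame t.
    bones: list of (j1, j2) pairs of joints naturally connected in the body.
    """
    n = num_frames * num_joints
    adj = [[0] * n for _ in range(n)]

    def connect(a, b):
        adj[a][b] = 1
        adj[b][a] = 1

    for t in range(num_frames):
        base = t * num_joints
        # Rule a: natural body connections within the same frame.
        for j1, j2 in bones:
            connect(base + j1, base + j2)
        # Rule b: the same-name joint in the temporally adjacent frame.
        if t + 1 < num_frames:
            for j in range(num_joints):
                connect(base + j, base + num_joints + j)
    return adj


# Toy skeleton: 3 joints in a chain (0-1-2), observed over 2 frames.
adjacency = build_spacetime_adjacency(2, 3, [(0, 1), (1, 2)])
```

For this toy case the graph has 4 spatial edges (2 per frame) and 3 temporal edges, so the symmetric matrix contains 14 ones.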
S40: inputting a space-time diagram of skeleton key points of each pedestrian into a space-time diagram neural network model, calculating to obtain an abnormal score value of each pedestrian, and judging whether the pedestrian behaviors are abnormal according to a set abnormal score value threshold;
In this step, the space-time diagram neural network model consists of a plurality of space-time convolution layers, pooling layers and fully connected layers: hidden features are computed from the global pedestrian key point space-time diagram by the space-time convolution layers and pooling layers, and the mapping between the hidden features and the abnormal score value is established by the fully connected layers. Pedestrian abnormal behavior detection is a regression task based on the space-time diagram neural network model f: the input of the model is the skeleton key point space-time diagram of each pedestrian, and the output is the abnormal score value of that pedestrian.
In the embodiment of the application, the space-time diagram neural network model is trained in a weakly supervised manner (such as multi-example learning, also known as multiple-instance learning). When training samples are prepared, the skeleton key points extracted from each monitoring video with a duration of L frames are divided in time into n skeleton key point space-time diagram segments to form a multi-example packet; an example packet containing a pedestrian abnormal behavior event is marked as a positive multi-example packet $B_a$, and an example packet without a pedestrian abnormal behavior event is marked as a negative multi-example packet $B_n$. When the neural network model is trained, m positive multi-example packets and m negative multi-example packets are randomly selected as a mini-batch and input into the space-time diagram neural network model, and the model outputs the abnormal score value of each pedestrian; the pedestrian's behavior is judged abnormal when the abnormal score value is greater than or equal to the set abnormal score threshold T, and normal when it is smaller than T. In the embodiment of the application, taking video sampled at 30 frames per second as an example, each human body key point space-time diagram segment in an example packet of the training samples is 150 frames long (5 seconds of video), and m = 30; in practical applications, the video length and the number of example packets may be tuned according to, for example, the duration of the abnormal behavior of interest.
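The sample preparation and decision rule described above might be sketched as follows. Equal-length segments (with any leftover frames dropped), the dictionary layout, and all function names are assumptions; the frames stand in for per-frame skeleton key point data.

```python
import random


def make_bag(frames, n_segments, label):
    """Split one video's per-frame data into n_segments equal segments (one bag).

    label is 1 for a video containing an abnormal event, 0 otherwise; it applies
    to the whole bag, not to individual segments.
    """
    seg_len = len(frames) // n_segments
    segments = [frames[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]
    return {"segments": segments, "label": label}


def sample_minibatch(pos_bags, neg_bags, m, rng=random):
    """Randomly pick m positive and m negative bags to form one mini-batch."""
    return rng.sample(pos_bags, m) + rng.sample(neg_bags, m)


def is_abnormal(score, threshold=0.5):
    """Decision rule: abnormal when the score reaches the threshold T."""
    return score >= threshold


# A 300-frame clip split into 2 segments of 150 frames each.
bag = make_bag(list(range(300)), n_segments=2, label=1)
pos = [make_bag(list(range(300)), 2, 1) for _ in range(3)]
neg = [make_bag(list(range(300)), 2, 0) for _ in range(3)]
batch = sample_minibatch(pos, neg, m=2)
```

Only the bag-level label is needed, which is what makes the coarse video-level annotation sufficient for training.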
In the embodiment of the application, the loss function of the space-time diagram neural network model is as follows:
In formula (4), $f(\cdot)$ is the space-time diagram neural network, $G_a$ is a global pedestrian key point space-time diagram in a positive example packet, $G_n$ is a global pedestrian key point space-time diagram in a negative example packet, $i \in B_a$, $j \in B_n$, and $\lambda_1$, $\lambda_2$ are weight parameters.
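Formula (4) itself does not appear in this text, so the following is only a guess at its shape, not the patent's loss. The symbols described (scores over a positive packet and a negative packet, plus two weight parameters) are consistent with the multiple-instance ranking loss commonly used in weakly supervised video anomaly detection: a hinge term that pushes the highest positive-packet score above the highest negative-packet score, a temporal smoothness term weighted by λ1, and a sparsity term weighted by λ2. The function name and the sample numbers are hypothetical.

```python
def mil_ranking_loss(pos_scores, neg_scores, lambda1, lambda2):
    """Multiple-instance ranking loss over one positive and one negative bag.

    pos_scores: f(G_a^i) for each segment i of a positive bag, in time order.
    neg_scores: f(G_n^j) for each segment j of a negative bag.
    """
    # Hinge ranking term on the top-scoring segment of each bag.
    hinge = max(0.0, 1.0 - max(pos_scores) + max(neg_scores))
    # Smoothness: neighbouring segments should receive similar scores.
    smooth = sum((a - b) ** 2 for a, b in zip(pos_scores, pos_scores[1:]))
    # Sparsity: abnormal segments are expected to be rare within a bag.
    sparse = sum(pos_scores)
    return hinge + lambda1 * smooth + lambda2 * sparse


loss = mil_ranking_loss([0.1, 0.9, 0.2], [0.3, 0.2, 0.1], lambda1=0.5, lambda2=0.1)
```

Here the hinge term is 0.4, the smoothness term is 1.13 and the sparsity term is 1.2, giving 0.4 + 0.5 × 1.13 + 0.1 × 1.2 = 1.085.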
Further, the last layer of the space-time diagram neural network model is a single-neuron fully connected layer whose activation function is the sigmoid function, so the abnormal score value lies in the interval (0, 1). With the abnormal score threshold T=0.5, abnormal behavior is distinguished as follows: when the abnormal score value lies in [0.5, 1), the behavior of the pedestrian is judged abnormal, and when it lies in (0, 0.5), the behavior of the pedestrian is judged normal.
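The decision rule can be illustrated with a short sketch. The function names are hypothetical, and the raw output of the last layer is represented by a plain number rather than by a trained network. A score equal to the threshold is treated as abnormal, following the "greater than or equal to T" rule stated for the training description.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable sigmoid that maps a raw output into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative inputs
    return e / (1.0 + e)


def is_abnormal(score: float, threshold: float = 0.5) -> bool:
    """A pedestrian is flagged when the abnormal score reaches the threshold T."""
    return score >= threshold


score_high = sigmoid(2.0)   # roughly 0.88, flagged as abnormal
score_low = sigmoid(-2.0)   # roughly 0.12, treated as normal
```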
Based on the above, the pedestrian abnormal behavior detection method of the embodiment of the application acquires the three-dimensional space coordinate information of the skeleton key points by combining multi-angle monitoring videos, constructs the temporal and spatial connection relations of the skeleton key points in the form of a space-time diagram structure, and uses the space-time diagram neural network model to fit the mapping between the space-time sequences of the skeleton key points and the abnormal score values. This can reduce the information loss caused by shooting projection, improve the ability to detect the action details of pedestrians, and detect abnormal pedestrian behaviors more effectively. Meanwhile, the space-time diagram neural network model is trained in a weakly supervised manner, which reduces the labor cost of finely labeled data.
Fig. 2 is a schematic structural diagram of a pedestrian abnormal behavior detection system according to an embodiment of the application. The pedestrian abnormal behavior detection system 40 of the embodiment of the application includes:
Coordinate conversion module 41: used for respectively performing geographic calibration on the multiple monitoring videos based on a unified spatial geographic coordinate frame, and converting between the image pixel coordinates and the world geographic coordinates of the monitoring videos;
Skeleton key point extraction module 42: used for extracting skeleton key point information of each pedestrian from all video frames of the multiple monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point in the same coordinate system;
Space-time diagram construction module 43: used for constructing a pedestrian skeleton key point space-time diagram in a cross-video scene according to the three-dimensional space information of each skeleton key point;
Abnormality detection module 44: used for inputting the skeleton key point space-time diagram of each pedestrian into the space-time diagram neural network model, calculating the abnormal score value of each pedestrian, and judging whether the behavior of the pedestrian is abnormal according to the set abnormal score threshold.
Fig. 3 is a schematic diagram of a terminal structure according to an embodiment of the application. The terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the pedestrian abnormal behavior detection method described above.
The processor 51 is configured to execute program instructions stored in the memory 52 to control pedestrian abnormal behavior detection.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capabilities. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor or any conventional processor.
Fig. 4 is a schematic structural diagram of a storage medium according to an embodiment of the application. The storage medium of the embodiment of the present application stores a program file 61 capable of implementing all the methods described above. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code, or a terminal device such as a computer, a server, a mobile phone, or a tablet.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A pedestrian abnormal behavior detection method, characterized by comprising:
extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system;
constructing a skeleton key point space-time diagram of the pedestrian according to the three-dimensional space information of each skeleton key point;
inputting a space-time diagram of the skeleton key points of the pedestrian into a space-time diagram neural network model, calculating to obtain an abnormal score value of the pedestrian, and judging whether the pedestrian behaves abnormally according to the abnormal score value;
The extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system comprises the following steps:
extracting, in an optimized manner, J pieces of skeleton key point information for each of m (m ≥ 1) pedestrians based on a human body posture estimation model, wherein the skeleton key point information comprises the image coordinates (u, v) of each key point and the confidence c of each key point;
identifying the same pedestrians under different viewpoints at the same timestamp and under the same viewpoint at different timestamps according to the at least two paths of monitoring videos, taking the minimum bounding rectangle of the skeleton key point information of each pedestrian as the bounding box of that pedestrian, solving the geographic coordinates of the pixel coordinates on the bottom edge of the bounding box, and taking these geographic coordinates as the target geographic position of the pedestrian;
extracting re-identification features of the image within the bounding box of each pedestrian based on a ResNet re-identification feature extraction model;
performing target matching on the same pedestrian in at least two viewpoints based on the pedestrian target geographic position and the re-identification features;
obtaining the projection matrix p of each viewpoint through a calibration method, transforming, through a direct linear transformation algorithm, the two-dimensional coordinates of each same-name key point j (1 ≤ j ≤ J) of the same pedestrian under different viewpoints at the same timestamp t into unique three-dimensional space coordinates, and solving the three-dimensional space coordinates of each skeleton key point to obtain the three-dimensional space coordinate information of each skeleton key point.
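The bounding-box step of the claim above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, key points with zero confidence are assumed to be undetected and are skipped, and the midpoint of the bottom edge is assumed as the representative bottom-edge pixel. The conversion of that pixel to geographic coordinates is not shown.

```python
from typing import Iterable, Tuple

Keypoint = Tuple[float, float, float]   # (u, v, c): image coordinates and confidence
Box = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max)


def bounding_box(keypoints: Iterable[Keypoint], min_conf: float = 0.0) -> Box:
    """Minimum axis-aligned rectangle enclosing the detected key points."""
    pts = [(u, v) for u, v, c in keypoints if c > min_conf]
    if not pts:
        raise ValueError("no key point above the confidence threshold")
    us = [p[0] for p in pts]
    vs = [p[1] for p in pts]
    return min(us), min(vs), max(us), max(vs)


def bottom_edge_midpoint(box: Box) -> Tuple[float, float]:
    """Pixel assumed to touch the ground; image v grows downward, so v_max is the bottom."""
    u_min, _v_min, u_max, v_max = box
    return (u_min + u_max) / 2.0, v_max


keypoints = [(10.0, 20.0, 0.9), (30.0, 80.0, 0.8), (50.0, 40.0, 0.0)]
box = bounding_box(keypoints)    # the zero-confidence point is ignored
foot = bottom_edge_midpoint(box)
```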
2. The pedestrian abnormal behavior detection method according to claim 1, wherein the steps of extracting skeleton key point information of a pedestrian from all video frames under at least two paths of monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system further comprise:
respectively carrying out geographic calibration on the at least two paths of monitoring videos based on a unified spatial geographic coordinate frame, and converting between the image pixel coordinates and the world geographic coordinates of the monitoring videos.
3. The pedestrian abnormal behavior detection method according to claim 2, wherein the converting between the image pixel coordinates and the world geographic coordinates of the monitoring videos includes:
Respectively selecting control points of each path of monitoring video by adopting a geographic coordinate system, and acquiring pixel coordinates and geographic coordinates of the control points;
acquiring camera parameters and correcting image distortion by using a camera calibration method;
selecting a rectangular interested ground area, performing perspective transformation and geographic registration on the monitoring video, and performing conversion between image pixel coordinates and world geographic coordinates:
In the above formula, (x_i, y_i) are the image plane coordinates of the rectangle corner points, (X_i, Y_i) are the world ground plane coordinates of the rectangle corner points obtained from the control points, M is the known intrinsic parameter matrix of the camera, s is the known scale factor, R is the unknown rotation matrix, and T is the unknown translation vector; R and T are solved from the 4 corner coordinate pairs of the rectangle to obtain the mapping from image coordinates within the rectangle to world coordinates.
4. The pedestrian anomaly detection method of claim 1, wherein the solving the three-dimensional spatial coordinates of each skeletal keypoint comprises:
$\min_{X} \lVert AX \rVert$
In the above formula, X is the three-dimensional space coordinate to be solved, A is the coefficient matrix formed from the N (N ≥ 2) viewpoints using the kth rows of the projection matrix of each viewpoint, and u_N and v_N are, respectively, the image coordinate components of the skeleton key point at viewpoint N.
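The definition of A is not legible in this text. In the standard direct linear transformation, each viewpoint contributes two rows, u·p³ − p¹ and v·p³ − p², where pᵏ is the kth row of that viewpoint's projection matrix. The sketch below assumes that construction. To stay dependency-free it also swaps the usual SVD solution for an inhomogeneous variant, fixing the last homogeneous component of X to 1 and solving the 3×3 normal equations. Function names and the two toy cameras are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def det3(m: Matrix) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def project(p: Matrix, point: Sequence[float]) -> Tuple[float, float]:
    """Project a 3D point with a 3x4 projection matrix."""
    x, y, z = point
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in p]
    return h[0] / h[2], h[1] / h[2]


def triangulate(projections: Sequence[Matrix],
                image_pts: Sequence[Tuple[float, float]]) -> Tuple[float, ...]:
    """Least-squares 3D point from N >= 2 views (inhomogeneous DLT variant)."""
    rows = []
    for p, (u, v) in zip(projections, image_pts):
        rows.append([u * p[2][k] - p[0][k] for k in range(4)])  # u * p3 - p1
        rows.append([v * p[2][k] - p[1][k] for k in range(4)])  # v * p3 - p2
    # With X = (x, y, z, 1), each row r gives r[:3] . (x, y, z) = -r[3].
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(-r[i] * r[3] for r in rows) for i in range(3)]
    d = det3(ata)
    if abs(d) < 1e-12:
        raise ValueError("views do not constrain the point")
    solution = []
    for k in range(3):  # Cramer's rule on the normal equations
        mk = [[atb[i] if j == k else ata[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(mk) / d)
    return tuple(solution)


# Two toy cameras: one at the origin, one shifted by one unit along x.
cam1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
cam2 = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
truth = (0.5, 0.2, 2.0)
estimate = triangulate([cam1, cam2], [project(cam1, truth), project(cam2, truth)])
```

With noise-free observations both variants recover the same point; with noisy key points the homogeneous SVD solution is generally preferred because it also handles points far from the cameras.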
5. The pedestrian abnormal behavior detection method according to claim 4, wherein the constructing a skeleton key point space-time diagram of the pedestrian from the three-dimensional spatial information of each skeleton key point comprises:
based on the three-dimensional space information of each skeleton key point, taking the skeleton key points of the same pedestrian as the vertices of the skeleton key point space-time diagram of the pedestrian, wherein the characteristic information f (f ∈ R^{1×4}) of each vertex consists of the three-dimensional coordinates in the same space coordinate system and the confidence;
establishing edge connections for the vertices according to the edge connection construction rule to obtain the skeleton key point space-time diagram of the pedestrian, wherein the edge connections are recorded in an adjacency matrix of the space-time diagram;
The edge connection construction rule includes:
Establishing edge connection for skeleton key points with natural connection structures of human bodies under the same video frame or timestamp;
and establishing edge connections between same-name skeleton key points that are adjacent in time sequence.
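The two edge connection rules above can be sketched as an adjacency matrix builder. This is an illustration under stated assumptions: the vertex numbering (frame index times the number of key points, plus the key point index), the function name, and the three-key-point toy skeleton are hypothetical, and the per-vertex feature vectors are omitted.

```python
from typing import List, Sequence, Tuple


def build_adjacency(num_frames: int, num_joints: int,
                    bones: Sequence[Tuple[int, int]]) -> List[List[int]]:
    """Adjacency matrix of a skeleton space-time diagram.

    Vertex (t, j) is stored at index t * num_joints + j.
    """
    n = num_frames * num_joints
    adj = [[0] * n for _ in range(n)]

    def idx(t: int, j: int) -> int:
        return t * num_joints + j

    for t in range(num_frames):
        # Rule 1: natural human body connections within the same frame.
        for a, b in bones:
            adj[idx(t, a)][idx(t, b)] = 1
            adj[idx(t, b)][idx(t, a)] = 1
        # Rule 2: the same-name key point in the next frame.
        if t + 1 < num_frames:
            for j in range(num_joints):
                adj[idx(t, j)][idx(t + 1, j)] = 1
                adj[idx(t + 1, j)][idx(t, j)] = 1
    return adj


# Toy skeleton: 3 key points connected in a chain, observed over 2 frames.
adjacency = build_adjacency(num_frames=2, num_joints=3, bones=[(0, 1), (1, 2)])
```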
6. The pedestrian abnormal behavior detection method according to claim 5, wherein the inputting the skeleton key point space-time diagram of the pedestrian into the space-time diagram neural network model, calculating an abnormal score value of the pedestrian, and judging whether the behavior of the pedestrian is abnormal according to the abnormal score value comprises:
the input of the space-time diagram neural network model is the skeleton key point space-time diagram of each pedestrian; the space-time diagram neural network model is trained in a weakly supervised manner; when a training sample is made, the skeleton key points extracted from each monitoring video with a duration of L frames are divided in time into n skeleton key point space-time diagram segments to form a multi-example packet, an example packet containing a pedestrian abnormal behavior event is marked as a positive multi-example packet B_a, and an example packet containing no pedestrian abnormal behavior event is marked as a negative multi-example packet B_n; when the model is trained, m positive multi-example packets and m negative multi-example packets are randomly selected as a mini-batch and input into the space-time diagram neural network model, the abnormal score value of each pedestrian is output, and the behavior of the pedestrian is judged abnormal when the abnormal score value of the pedestrian is greater than or equal to a set abnormal score threshold T.
7. A pedestrian abnormal behavior detection system for implementing the pedestrian abnormal behavior detection method according to any one of claims 1 to 6; the detection system includes:
a skeleton key point extraction module: used for extracting skeleton key point information of pedestrians from all video frames under at least two paths of monitoring videos, and acquiring three-dimensional space coordinate information of each skeleton key point under the same coordinate system;
a space-time diagram construction module: used for constructing the skeleton key point space-time diagram of the pedestrian according to the three-dimensional space information of each skeleton key point;
an abnormality detection module: used for inputting the skeleton key point space-time diagram of the pedestrian into a space-time diagram neural network model, calculating an abnormal score value of the pedestrian, and judging whether the pedestrian behaves abnormally according to the abnormal score value.
8. A terminal comprising a processor, a memory coupled to the processor, wherein,
The memory stores program instructions for implementing the pedestrian abnormal behavior detection method of any one of claims 1 to 6;
the processor is used for executing the program instructions stored in the memory to control pedestrian abnormal behavior detection.
9. A storage medium storing program instructions executable by a processor for performing the pedestrian abnormal behavior detection method according to any one of claims 1 to 6.
CN202011069611.3A 2020-09-30 2020-09-30 Pedestrian abnormal behavior detection method, system, terminal and storage medium Active CN112163537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011069611.3A CN112163537B (en) 2020-09-30 2020-09-30 Pedestrian abnormal behavior detection method, system, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011069611.3A CN112163537B (en) 2020-09-30 2020-09-30 Pedestrian abnormal behavior detection method, system, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN112163537A CN112163537A (en) 2021-01-01
CN112163537B true CN112163537B (en) 2024-04-26

Family

ID=73861126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011069611.3A Active CN112163537B (en) 2020-09-30 2020-09-30 Pedestrian abnormal behavior detection method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN112163537B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011510B (en) * 2021-03-25 2021-12-24 推想医疗科技股份有限公司 Bronchial classification and model training method and device and electronic equipment
CN113095196B (en) * 2021-04-02 2022-09-30 山东师范大学 Human body abnormal behavior detection method and system based on graph structure attitude clustering
CN113128383A (en) * 2021-04-07 2021-07-16 杭州海宴科技有限公司 Recognition method for campus student cheating behavior
CN113065515B (en) * 2021-04-22 2023-02-03 上海交通大学 Abnormal behavior intelligent detection method and system based on similarity graph neural network
CN113139504B (en) * 2021-05-11 2023-02-17 支付宝(杭州)信息技术有限公司 Identity recognition method, device, equipment and storage medium
CN113869123A (en) * 2021-08-27 2021-12-31 浙江大华技术股份有限公司 Crowd-based event detection method and related device
CN114187666B (en) * 2021-12-23 2022-09-02 中海油信息科技有限公司 Identification method and system for watching mobile phone while walking
CN114550287B (en) * 2022-01-27 2024-06-21 福建和盛高科技产业有限公司 Method for detecting abnormal behaviors of personnel in transformer substation scene based on key points of human body
CN114973097A (en) * 2022-06-10 2022-08-30 广东电网有限责任公司 Method, device, equipment and storage medium for recognizing abnormal behaviors in electric power machine room
CN116307743B (en) * 2023-05-23 2023-08-04 浙江安邦护卫科技服务有限公司 Escort safety early warning method, system, equipment and medium based on data processing
CN117292329B (en) * 2023-11-24 2024-03-08 烟台大学 Method, system, medium and equipment for monitoring abnormal work of building robot

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108648237A (en) * 2018-03-16 2018-10-12 中国科学院信息工程研究所 A kind of space-location method of view-based access control model
CN109858390A (en) * 2019-01-10 2019-06-07 浙江大学 The Activity recognition method of human skeleton based on end-to-end space-time diagram learning neural network
CN110084161A (en) * 2019-04-17 2019-08-02 中山大学 A kind of rapid detection method and system of skeleton key point
CN110135319A (en) * 2019-05-09 2019-08-16 广州大学 A kind of anomaly detection method and its system
CN111079600A (en) * 2019-12-06 2020-04-28 长沙海格北斗信息技术有限公司 Pedestrian identification method and system with multiple cameras
CN111191599A (en) * 2019-12-27 2020-05-22 平安国际智慧城市科技股份有限公司 Gesture recognition method, device, equipment and storage medium
CN111462200A (en) * 2020-04-03 2020-07-28 中国科学院深圳先进技术研究院 Cross-video pedestrian positioning and tracking method, system and equipment
CN111582022A (en) * 2020-03-26 2020-08-25 深圳大学 Fusion method and system of mobile video and geographic scene and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160042227A1 (en) * 2014-08-06 2016-02-11 BAE Systems Information and Electronic Systems Integraton Inc. System and method for determining view invariant spatial-temporal descriptors for motion detection and analysis

Also Published As

Publication number Publication date
CN112163537A (en) 2021-01-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant