CN112200081A - Abnormal behavior identification method and device, electronic equipment and storage medium

Publication number
CN112200081A
Authority
CN
China
Prior art keywords
abnormal behavior
pedestrian
image
coordinate frame
coordinate
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011077906.5A
Other languages
Chinese (zh)
Inventor
侯冰基
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202011077906.5A
Publication of CN112200081A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and provides an abnormal behavior identification method, an abnormal behavior identification device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring an image dataset; based on a pedestrian target detector, carrying out pedestrian detection on each image in the image data set to obtain a pedestrian coordinate frame; clustering the pedestrian coordinate frames according to the inter-class distance, and combining the clustered pedestrian coordinate frames to obtain a new coordinate frame; intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples; training a convolutional neural network based on a plurality of screenshot samples to obtain an abnormal behavior classifier model; identifying an image to be detected according to the pedestrian target detector and the abnormal behavior classifier model to obtain the probability of the abnormal behavior; and determining an abnormal behavior recognition result according to the abnormal behavior probability. The method can be applied to the fields of intelligent security, intelligent traffic, intelligent communities, intelligent life and the like which need abnormal behavior identification, so that the development of intelligent cities is promoted.

Description

Abnormal behavior identification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an abnormal behavior identification method and device, electronic equipment and a storage medium.
Background
As population density rises, group abnormal events such as fights occur frequently in communities and parks, posing a serious threat to public safety. At present, abnormal behavior is usually recognized with algorithms based on pose estimation, in which a classifier is trained on the extracted human body key points. This approach depends heavily on the accuracy of key point recognition, but when a fight occurs there is often heavy occlusion, and recognition becomes inaccurate, particularly when people gather.
Therefore, how to improve the recognition rate of the abnormal behavior is an urgent technical problem to be solved.
Disclosure of Invention
In view of the above, it is desirable to provide an abnormal behavior recognition method, apparatus, electronic device and storage medium, which can improve the recognition rate of abnormal behavior.
A first aspect of the present invention provides an abnormal behavior recognition method, including:
acquiring an image dataset;
based on a pre-trained pedestrian target detector, carrying out pedestrian detection on each image in the image data set to obtain a pedestrian coordinate frame;
clustering the pedestrian coordinate frames according to the set inter-class distance, and combining the clustered pedestrian coordinate frames to obtain a new coordinate frame;
intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples;
training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model;
according to the pedestrian target detector and the abnormal behavior classifier model, identifying an image to be detected to obtain abnormal behavior probability;
and determining an abnormal behavior identification result according to the abnormal behavior probability.
In one possible implementation manner, before the acquiring the image dataset, the abnormal behavior identification method further includes:
acquiring a training set and a verification set;
setting an initial learning rate and an adjusted learning rate, wherein the initial learning rate is greater than the adjusted learning rate;
training a YOLOv3 framework by using the training set based on the initial learning rate and an Adam optimization algorithm to obtain an intermediate model;
inputting the verification set into the intermediate model, and, when the loss function of the verification set reaches convergence, training the intermediate model based on the adjusted learning rate and a stochastic gradient descent (SGD) algorithm;
and when the loss function of the verification set reaches convergence again, determining the model at that convergence as the pedestrian target detector.
In a possible implementation manner, the clustering the pedestrian coordinate frame according to the set inter-class distance includes:
sorting the pedestrian coordinate frames in descending order of the probability of each pedestrian coordinate frame to obtain sorted frames;
determining the pedestrian coordinate frame with the highest probability as a reference frame;
sequentially traversing the remaining frames ranked after the reference frame in the sorted frames, and calculating the Euclidean distance between the reference frame and each remaining frame;
and clustering the pedestrian coordinate frames according to the set inter-class distance and the Euclidean distance.
In a possible implementation manner, the merging the clustered pedestrian coordinate frames to obtain a new coordinate frame includes:
merging the clustered pedestrian coordinate frames to obtain a first coordinate frame;
judging whether the width of the first coordinate frame is smaller than a first threshold value or not, and judging whether the height of the first coordinate frame is smaller than a second threshold value or not;
if the width of the first coordinate frame is smaller than a first threshold value and the height of the first coordinate frame is smaller than a second threshold value, acquiring a central point of the first coordinate frame and acquiring preset width and preset height;
judging whether a second coordinate frame based on the central point, the preset width and the preset height exceeds the boundary coordinate of the image or not;
and if the second coordinate frame based on the central point, the preset width and the preset height does not exceed the boundary coordinate of the image, determining that the second coordinate frame is a new coordinate frame.
In a possible implementation manner, the abnormal behavior identification method further includes:
and if a second coordinate frame based on the central point, the preset width and the preset height exceeds the boundary coordinate of the image, determining a new coordinate frame based on the central point and the boundary coordinate.
In one possible implementation, the training a convolutional neural network based on the plurality of screenshot samples, and obtaining an abnormal behavior classifier model includes:
receiving a positive sample and a negative sample which are manually marked for the plurality of screenshot samples, wherein the positive sample is a sample with abnormal behaviors, and the negative sample is a sample without abnormal behaviors;
setting a learning rate, a stochastic gradient descent (SGD) algorithm and a training count threshold;
training a ResNet18 framework with the positive samples and the negative samples, based on the learning rate and the SGD algorithm;
and when the number of training iterations reaches the training count threshold, determining that the training is finished, and determining the model at that point as the abnormal behavior classifier model.
In a possible implementation manner, the recognizing, according to the pedestrian target detector and the abnormal behavior classifier model, an image to be detected, and obtaining the abnormal behavior probability includes:
detecting an image to be detected by using the pedestrian target detector to obtain a target coordinate frame of each pedestrian in the image to be detected;
clustering and combining the target coordinate frames to obtain coordinate frames to be detected;
intercepting a screenshot to be detected corresponding to the coordinate frame to be detected from the image to be detected;
preprocessing the screenshot to be detected to obtain a preprocessed image;
and identifying the preprocessed image by using the abnormal behavior classifier model to obtain the abnormal behavior probability.
A second aspect of the present invention provides an abnormal behavior recognition apparatus including:
an acquisition module for acquiring an image dataset;
the detection module is used for detecting pedestrians for each image in the image data set based on a pre-trained pedestrian target detector to obtain a pedestrian coordinate frame;
the merging module is used for clustering the pedestrian coordinate frames according to the set inter-class distance and merging the clustered pedestrian coordinate frames to obtain a new coordinate frame;
the intercepting module is used for intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples;
the training module is used for training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model;
the identification module is used for identifying the image to be detected according to the pedestrian target detector and the abnormal behavior classifier model to obtain the probability of abnormal behavior;
and the determining module is used for determining the abnormal behavior recognition result according to the abnormal behavior probability.
A third aspect of the present invention provides an electronic device comprising a processor and a memory, the processor being configured to implement the abnormal behavior identification method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the abnormal behavior recognizing method.
According to the above technical scheme, the method can be applied to fields that require abnormal behavior identification, such as intelligent security, intelligent traffic, intelligent communities and intelligent life, thereby promoting the development of intelligent cities. In the method, a plurality of screenshot samples are obtained through model processing from an image dataset, which is easier to acquire than a video dataset, and are used to train a convolutional neural network, so that more, and more complete, data can support model training. Meanwhile, because the method reduces reliance on the hand-crafted key point features of the prior art, the trained model is more accurate, and the accuracy of abnormal behavior identification can be improved.
Drawings
Fig. 1 is a flowchart of a method for identifying abnormal behavior according to a preferred embodiment of the present invention.
Fig. 2 is a functional block diagram of an abnormal behavior recognition apparatus according to a preferred embodiment of the present disclosure.
Fig. 3 is a schematic structural diagram of an electronic device implementing the abnormal behavior recognition method according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first" and "second" in the description and claims of the present application and the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
The electronic device is a device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction, and the hardware thereof includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like. The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers. The user device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), or the like.
Referring to fig. 1, fig. 1 is a flowchart illustrating an abnormal behavior recognition method according to a preferred embodiment of the present invention. The order of the steps in the flowchart may be changed, and some steps may be omitted.
S11, acquiring an image dataset.
Wherein the image dataset comprises a plurality of images.
Specifically, the acquiring the image dataset includes:
acquiring a video stream from video monitoring equipment, performing frame extraction on the video stream to obtain a plurality of images after frame extraction, and constructing an image data set; or
acquiring a plurality of images from the network through a crawler technology, and constructing an image dataset.
In this alternative embodiment, a video stream may be obtained from the video monitoring device. The stream is typically addressed by an RTSP (Real Time Streaming Protocol) URL (Uniform Resource Locator); the server may parse the RTSP stream and convert it into video images frame by frame through cyclic reading to construct the image dataset. Alternatively, a plurality of images can be acquired from the network through crawler technology to construct the image dataset. Compared with a traditional video dataset, the image dataset in this embodiment is easier to acquire and can provide more, and more complete, data for subsequent model training, which improves the accuracy of model training.
S12, based on a pre-trained pedestrian target detector, carrying out pedestrian detection on each image in the image dataset to obtain a pedestrian coordinate frame.
Specifically, each image in the image dataset may be scaled to a fixed size (e.g., 416 × 416) and input to the trained pedestrian target detector, which outputs tens of thousands of anchor-based original coordinate frames, each with a corresponding probability value. The original coordinate frames are screened by probability value: those with a probability value greater than a preset threshold are retained, and those with a probability value smaller than the preset threshold are removed. Non-maximum suppression (NMS) is then performed on the retained original coordinate frames: they are sorted by probability from high to low, the frame with the highest current probability is selected as a reference frame, the frames ranked after the reference frame are traversed, the intersection over union (IoU) of each such frame with the reference frame is computed, and frames that overlap the reference frame heavily are removed. The traversal is repeated to obtain the final pedestrian coordinate frame corresponding to each person in the image.
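The screening and suppression procedure described above can be sketched in plain Python. This is a minimal illustration rather than the detector's actual post-processing: the `(x1, y1, x2, y2, score)` tuple layout, the function names, and the default thresholds `score_thr=0.5` and `iou_thr=0.45` are assumptions chosen for the example, since the text only speaks of a "preset threshold".

```python
from typing import List, Tuple

# A detection is (x1, y1, x2, y2, score); this tuple layout is an assumption.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes: List[Box],
                   score_thr: float = 0.5,
                   iou_thr: float = 0.45) -> List[Box]:
    """Drop low-probability frames, then greedily suppress overlapping ones."""
    # Keep only frames above the probability threshold, sorted high to low.
    candidates = sorted((b for b in boxes if b[4] > score_thr),
                        key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    while candidates:
        ref = candidates.pop(0)  # highest remaining probability = reference frame
        kept.append(ref)
        # Remove every remaining frame that overlaps the reference too much.
        candidates = [b for b in candidates if iou(ref, b) <= iou_thr]
    return kept
```

Each pass picks the most probable remaining frame as the reference, so every frame that survives corresponds to a distinct pedestrian under the chosen IoU threshold.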
Optionally, the method further includes:
acquiring a training set and a verification set;
setting an initial learning rate and an adjusted learning rate, wherein the initial learning rate is greater than the adjusted learning rate;
training a YOLOv3 framework by using the training set based on the initial learning rate and an Adam optimization algorithm to obtain an intermediate model;
inputting the verification set into the intermediate model, and, when the loss function of the verification set reaches convergence, training the intermediate model based on the adjusted learning rate and a stochastic gradient descent (SGD) algorithm;
and when the loss function of the verification set reaches convergence again, determining the model at that convergence as the pedestrian target detector.
In this alternative embodiment, a pedestrian dataset is obtained in advance. The pedestrian dataset includes images and labels; the images are in a common format such as jpg or png. With the horizontal direction of the image as the x axis, the vertical direction as the y axis, and the upper left corner of the image as the origin, each label marks the coordinates of the center point of the rectangular frame containing each pedestrian in the image, together with the width and height of the rectangular frame. The labels can be stored as xml or txt files. Further, the pedestrian dataset is divided into a training set and a verification set: the training set is used for training the pedestrian target detector, and the verification set is used for judging when training is finished. Optionally, the optimization method in the training stage is designed as Adam + SGD, that is, the Adam algorithm is used to speed up convergence in the early stage of training, and the stochastic gradient descent (SGD) algorithm is used to ensure model convergence in the later stage. The probability loss is set to a cross-entropy loss function, the coordinate loss is set to a mean square error loss function, L2 regularization is set to suppress overfitting, and a data enhancement mode is set, which may include but is not limited to random horizontal flipping, adjustment of image color temperature and saturation, and random rotation. A decreasing learning rate is also set; for example, the initial learning rate is set to 0.001 and the adjusted learning rate is set to 0.0001.
When training with the initial learning rate, the loss value at the first convergence of the verification set loss function is the first minimum value. When training then continues with the adjusted learning rate, the verification set loss function leaves the converged state and gradually converges again as training progresses; the loss value at this second convergence is the second minimum value, and training ends. The first minimum value is greater than the second minimum value.
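The two-stage schedule can be illustrated with a small framework-independent controller that watches the verification loss and decides which optimizer setting applies. This is a sketch under stated assumptions: the text does not define how convergence is detected, so the `patience` and `min_delta` criterion, the class name `TwoPhaseSchedule`, and the `"adam"`/`"sgd"` labels are hypothetical; only the learning rates 0.001 and 0.0001 come from the example values in the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TwoPhaseSchedule:
    """Tracks verification loss and switches optimizer settings on convergence."""
    initial_lr: float = 0.001    # example value from the text
    adjusted_lr: float = 0.0001  # example value from the text
    patience: int = 3            # assumption: epochs without improvement
    min_delta: float = 1e-4      # assumption: smallest change counted as improvement
    phase: int = 1               # 1 = Adam stage, 2 = SGD stage, 3 = finished
    best: float = float("inf")
    stale: int = 0
    minima: List[float] = field(default_factory=list)

    def settings(self) -> Tuple[str, float]:
        """Optimizer label and learning rate for the current stage."""
        if self.phase == 1:
            return ("adam", self.initial_lr)
        return ("sgd", self.adjusted_lr)

    def update(self, val_loss: float) -> bool:
        """Record one epoch's verification loss; return True when training ends."""
        if self.phase == 3:
            return True
        if val_loss < self.best - self.min_delta:
            self.best, self.stale = val_loss, 0
        else:
            self.stale += 1
        if self.stale >= self.patience:  # treated here as "converged"
            self.minima.append(self.best)  # first, then second minimum value
            self.phase += 1
            self.stale = 0
        return self.phase == 3
```

A training loop would call `settings()` before each epoch to choose the optimizer and learning rate, and stop once `update()` returns `True`; `minima` then holds the first and second minimum values.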
S13, clustering the pedestrian coordinate frames according to the set inter-class distance, and combining the clustered pedestrian coordinate frames to obtain a new coordinate frame.
Specifically, the clustering the pedestrian coordinate frame according to the set inter-class distance includes:
sorting the pedestrian coordinate frames in descending order of the probability of each pedestrian coordinate frame to obtain sorted frames;
determining the pedestrian coordinate frame with the highest probability as a reference frame;
sequentially traversing the remaining frames ranked after the reference frame in the sorted frames, and calculating the Euclidean distance between the reference frame and each remaining frame;
and clustering the pedestrian coordinate frames according to the set inter-class distance and the Euclidean distance.
In this alternative embodiment, the number of cluster classes does not need to be set; only the inter-class distance needs to be set. The inter-class distance may be chosen according to the size of the image and the size of the pedestrians in it, for example one fifth of the square root of the image area. When the Euclidean distance between the center point of a remaining frame and the center point of the reference frame is smaller than the set inter-class distance, the remaining frame is determined to belong to the same class as the reference frame. The remaining frames are traversed repeatedly and the pedestrian coordinate frames are clustered according to the set inter-class distance and the Euclidean distance until every pedestrian coordinate frame belongs to a class.
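The clustering rule can be sketched as a greedy grouping by center distance. This is an illustrative reading of the steps above, not a definitive implementation: the `(x1, y1, x2, y2, score)` tuple layout and the names `center` and `cluster_boxes` are assumptions, and the inter-class distance is passed in as a plain pixel value `max_dist`.

```python
import math
from typing import List, Tuple

# (x1, y1, x2, y2, score); this tuple layout is an assumption.
Box = Tuple[float, float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Center point of a corner-format frame."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def cluster_boxes(boxes: List[Box], max_dist: float) -> List[List[Box]]:
    """Group frames whose centers lie within max_dist of a reference frame."""
    # Sort by probability so the most confident frame becomes the reference.
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)
    clusters: List[List[Box]] = []
    while remaining:
        ref = remaining.pop(0)
        cx, cy = center(ref)
        cluster, rest = [ref], []
        for b in remaining:
            bx, by = center(b)
            # Same class if the center-to-center distance is below the limit.
            if math.hypot(bx - cx, by - cy) < max_dist:
                cluster.append(b)
            else:
                rest.append(b)
        clusters.append(cluster)
        remaining = rest  # repeat with the frames not yet assigned
    return clusters
```

Because no cluster count is fixed in advance, the number of groups follows from the data: the loop ends only when every frame has been assigned to some class.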
Specifically, the merging the clustered pedestrian coordinate frames to obtain a new coordinate frame includes:
merging the clustered pedestrian coordinate frames to obtain a first coordinate frame;
judging whether the width of the first coordinate frame is smaller than a first threshold value or not, and judging whether the height of the first coordinate frame is smaller than a second threshold value or not;
if the width of the first coordinate frame is smaller than a first threshold value and the height of the first coordinate frame is smaller than a second threshold value, acquiring a central point of the first coordinate frame and acquiring preset width and preset height;
judging whether a second coordinate frame based on the central point, the preset width and the preset height exceeds the boundary coordinate of the image or not;
and if the second coordinate frame based on the central point, the preset width and the preset height does not exceed the boundary coordinate of the image, determining that the second coordinate frame is a new coordinate frame.
In this alternative embodiment, the pedestrian coordinate frames grouped into one class are merged into a new coordinate frame, defined as the smallest-area frame that encloses all frames of the class. Specifically, for each class, the smallest abscissa and ordinate among the top-left vertices of all frames of the class are denoted x1 and y1, and the largest abscissa and ordinate among the bottom-right vertices are denoted x2 and y2; (x1, y1) and (x2, y2) are then the top-left and bottom-right vertices of the first coordinate frame. In addition, a first threshold and a second threshold may be preset, for example both set to 224 pixels. If the width of the first coordinate frame is smaller than the first threshold and its height is smaller than the second threshold, it is further determined whether a second coordinate frame based on the center point, the preset width and the preset height exceeds the boundary coordinates of the image; if it does not, the second coordinate frame is determined to be the new coordinate frame. The preset width and the preset height may be set according to the first threshold and the second threshold, for example both 224 pixels. In this way, the accuracy of the subsequent model in identifying abnormal behavior can be improved.
Optionally, the method further includes:
and if a second coordinate frame based on the central point, the preset width and the preset height exceeds the boundary coordinate of the image, determining a new coordinate frame based on the central point and the boundary coordinate.
In this alternative embodiment, if the boundary coordinates of the image are exceeded, truncation is required with the boundary coordinates of the image as a boundary, i.e. a new coordinate frame is determined based on the center point and the boundary coordinates. By the method, the accuracy of the subsequent model for identifying the abnormal behavior can be improved.
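The merging, enlargement and truncation rules can be sketched together as follows. The function names and integer pixel coordinates are assumptions made for illustration; the 224-pixel defaults are the example values from the text. The text does not say what happens when the first coordinate frame is already at least as large as a threshold in one dimension, so returning it unchanged in that case is also an assumption.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, assumed layout


def merge_cluster(boxes: List[Rect]) -> Rect:
    """Smallest frame enclosing every frame of one class (first coordinate frame)."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def new_coordinate_frame(boxes: List[Rect], img_w: int, img_h: int,
                         first_thr: int = 224, second_thr: int = 224,
                         preset_w: int = 224, preset_h: int = 224) -> Rect:
    """Merge a class of frames and enlarge small results to the preset size."""
    x1, y1, x2, y2 = merge_cluster(boxes)
    if (x2 - x1) < first_thr and (y2 - y1) < second_thr:
        # Second coordinate frame: preset size around the same center point.
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
        x1, y1 = cx - preset_w // 2, cy - preset_h // 2
        x2, y2 = x1 + preset_w, y1 + preset_h
        # Truncate at the image boundary if the enlarged frame exceeds it.
        x1, y1 = max(0, x1), max(0, y1)
        x2, y2 = min(img_w, x2), min(img_h, y2)
    # Assumption: a first coordinate frame that is already large is kept as is.
    return (x1, y1, x2, y2)
```

For a 640 × 480 image, `new_coordinate_frame([(100, 100, 150, 200), (140, 120, 180, 220)], 640, 480)` yields a 224 × 224 frame centered on the merged group, while a group near the top-left corner is truncated at the image edge.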
S14, intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples.
The sub-image corresponding to the new coordinate frame is cropped from the image and used as a screenshot sample.
S15, training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model.
Specifically, the training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model includes:
receiving a positive sample and a negative sample which are manually marked for the plurality of screenshot samples, wherein the positive sample is a sample with abnormal behaviors, and the negative sample is a sample without abnormal behaviors;
setting a learning rate, a stochastic gradient descent (SGD) algorithm and a training count threshold;
training a ResNet18 framework with the positive samples and the negative samples, based on the learning rate and the SGD algorithm;
and when the number of training iterations reaches the training count threshold, determining that the training is finished, and determining the model at that point as the abnormal behavior classifier model.
In this optional embodiment, after the plurality of screenshot samples are obtained, they can be classified by manual labeling: screenshot samples with abnormal behavior are labeled as positive samples, and screenshot samples without abnormal behavior are labeled as negative samples. A convolutional neural network such as ResNet18 can be selected and loaded with model parameters pre-trained on ImageNet. Training parameters may then be set, for example: scaling input images to 300 × 300, random cropping to 224 × 224, random flipping, a learning rate of 0.001, SGD as the optimization method, a training count threshold, and a reduction of the learning rate to 0.0001 when training reaches the predetermined training count threshold. After mean subtraction and variance normalization are applied to the positive samples and negative samples, model training starts and continues until the preset training count threshold is reached, at which point training is finished.
S16, identifying the image to be detected according to the pedestrian target detector and the abnormal behavior classifier model to obtain an abnormal behavior probability.
Specifically, identifying the image to be detected according to the pedestrian target detector and the abnormal behavior classifier model to obtain the abnormal behavior probability includes:
detecting an image to be detected by using the pedestrian target detector to obtain a target coordinate frame of each pedestrian in the image to be detected;
clustering and combining the target coordinate frames to obtain coordinate frames to be detected;
intercepting a screenshot to be detected corresponding to the coordinate frame to be detected from the image to be detected;
preprocessing the screenshot to be detected to obtain a preprocessed image;
and identifying the preprocessed image by using the abnormal behavior classifier model to obtain the abnormal behavior probability.
In this optional embodiment, the preprocessing may include scaling the screenshot to be detected to the random-crop size used during training, for example a fixed size of 224 × 224, and applying mean subtraction and variance normalization with the same mean and variance that were used during training.
S17, determining the abnormal behavior recognition result according to the abnormal behavior probability.
The abnormal behavior probability lies in the range [0, 1]; the higher the value, the more likely it is that the image to be detected contains abnormal behavior.
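The preprocessing described above can be sketched in plain Python as follows. This is a minimal illustration, not the method of the embodiment: the function names are hypothetical, nearest-neighbor scaling is chosen only for brevity, and the `TRAIN_MEAN` and `TRAIN_STD` values are placeholder statistics (the figures commonly quoted for ImageNet), since the description only requires that they match the values used during training.

```python
from typing import List, Sequence

Pixel = Sequence[float]        # one RGB pixel with values in [0, 255]
Image = List[List[Pixel]]      # rows of pixels (height x width x 3)

# ASSUMPTION: placeholder per-channel statistics; use the training-time values.
TRAIN_MEAN = (0.485, 0.456, 0.406)
TRAIN_STD = (0.229, 0.224, 0.225)


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Scale an image to out_h x out_w with nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image: Image, mean=TRAIN_MEAN, std=TRAIN_STD) -> List[List[List[float]]]:
    """Map pixels to [0, 1], subtract the mean, divide by the std per channel."""
    return [
        [[(v / 255.0 - m) / s for v, m, s in zip(px, mean, std)] for px in row]
        for row in image
    ]


def preprocess(crop: Image, size: int = 224) -> List[List[List[float]]]:
    """Resize a screenshot to the training crop size, then normalize it."""
    return normalize(resize_nearest(crop, size, size))
```

The result of `preprocess` is what would be fed to the abnormal behavior classifier model; keeping the size and statistics identical to training avoids a mismatch between training and inference inputs.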
Optionally, the method further includes:
if the abnormal behavior probability exceeds a preset probability threshold, packaging the coordinate frame to be detected, the screenshot to be detected and the abnormal behavior probability into structured data;
and sending the structured data to safety early warning equipment.
In this optional implementation, when the abnormal behavior probability exceeds a preset probability threshold (for example, 0.5), the coordinate frame to be detected, the screenshot to be detected, and the abnormal behavior probability may be packaged as structured data and sent to the safety early warning equipment. This helps a user of the safety early warning equipment to notice the abnormality in time and to take corresponding preventive measures, for example raising an alarm, so that social safety may be improved and personal safety may be ensured.
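The thresholding and packaging step can be sketched as below. The description does not specify the structured data format, so JSON with a base64-encoded screenshot is an assumption made for this example, and the function name `package_alert` and the field names are hypothetical.

```python
import base64
import json
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # x1, y1, x2, y2 of the coordinate frame


def package_alert(box: Box, screenshot: bytes, probability: float,
                  threshold: float = 0.5) -> Optional[str]:
    """Return a JSON message if the probability exceeds the threshold, else None."""
    if probability <= threshold:
        return None
    payload = {
        "box": list(box),                                   # coordinate frame to be detected
        "screenshot_b64": base64.b64encode(screenshot).decode("ascii"),
        "abnormal_probability": probability,
    }
    return json.dumps(payload)
```

A caller would send the returned string to the safety early warning equipment over whatever transport is available, and skip sending when the function returns `None`.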
Optionally, for data privacy and security, the abnormal behavior recognition result may be uploaded to the blockchain.
In the method flow described in fig. 1, the convolutional neural network is trained on an image data set that is easy to obtain, which provides more, and more complete, data for model training and improves the accuracy of model training. Meanwhile, the image to be detected is identified by the trained pedestrian target detector and the abnormal behavior classifier model, which greatly reduces the reliance on manually designed features, so that abnormal behaviors are identified more accurately and the recognition rate of abnormal behaviors is improved.
As the above embodiments show, the present invention can be applied to fields that need to identify abnormal behaviors, such as intelligent security, intelligent traffic, intelligent communities, and intelligent living, thereby promoting the development of smart cities.
The above description is only a specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and it will be apparent to those skilled in the art that modifications may be made without departing from the inventive concept of the present invention, and these modifications are within the scope of the present invention.
Referring to fig. 2, fig. 2 is a functional block diagram of an abnormal behavior recognition apparatus according to a preferred embodiment of the present invention.
In some embodiments, the abnormal behavior recognition apparatus runs in an electronic device. The abnormal behavior recognition apparatus may include a plurality of functional modules composed of program code segments. The program code of each segment may be stored in the memory and executed by at least one processor to perform part or all of the steps of the abnormal behavior recognition method described in fig. 1; for details, refer to the description of fig. 1, which is not repeated here.
In this embodiment, the abnormal behavior recognition apparatus may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: an acquisition module 201, a detection module 202, a merging module 203, an intercepting module 204, a training module 205, an identification module 206, and a determining module 207. A module as referred to herein is a series of computer program segments that are stored in the memory, can be executed by at least one processor, and perform a fixed function.
The acquisition module 201 is configured to acquire an image data set.
The detection module 202 is configured to perform pedestrian detection on each image in the image data set based on a pre-trained pedestrian target detector, so as to obtain pedestrian coordinate frames.
The merging module 203 is configured to cluster the pedestrian coordinate frames according to the set inter-class distance, and merge the clustered pedestrian coordinate frames to obtain a new coordinate frame.
The intercepting module 204 is configured to intercept the image according to the new coordinate frame to obtain a plurality of screenshot samples.
The training module 205 is configured to train a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model.
The identification module 206 is configured to identify the image to be detected according to the pedestrian target detector and the abnormal behavior classifier model, so as to obtain the abnormal behavior probability.
The determining module 207 is configured to determine an abnormal behavior recognition result according to the abnormal behavior probability.
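One way to picture how the recognition-side modules fit together is the sketch below. It is an illustration only: the class name `AbnormalBehaviorRecognizer`, its field names, and the use of plain callables are assumptions, and the detector, merging logic, cropping, and classifier are supplied from outside rather than implemented here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # x1, y1, x2, y2


@dataclass
class AbnormalBehaviorRecognizer:
    detect: Callable[[Any], List[Box]]        # role of the detection module 202
    merge: Callable[[List[Box]], List[Box]]   # role of the merging module 203
    crop: Callable[[Any, Box], Any]           # role of the intercepting module 204
    classify: Callable[[Any], float]          # model produced by the training module 205
    threshold: float = 0.5                    # example threshold from the description

    def recognize(self, image: Any) -> List[Tuple[Box, float]]:
        """Role of the identification module 206: one probability per merged frame."""
        return [
            (box, self.classify(self.crop(image, box)))
            for box in self.merge(self.detect(image))
        ]

    def decide(self, results: List[Tuple[Box, float]]) -> List[Tuple[Box, float, bool]]:
        """Role of the determining module 207: flag frames above the threshold."""
        return [(box, p, p > self.threshold) for box, p in results]
```

Passing the stages in as callables keeps the sketch independent of any particular detector or classifier library; the division into modules is only a logical one, as the description itself notes.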
In the abnormal behavior recognition apparatus described in fig. 2, the convolutional neural network is trained on an image data set that is easy to obtain, which provides more, and more complete, data for model training and improves the accuracy of model training. Meanwhile, the image to be detected is identified by the trained pedestrian target detector and the abnormal behavior classifier model, which greatly reduces the reliance on manually designed features, so that abnormal behaviors are identified more accurately and the recognition rate of abnormal behaviors is improved.
As shown in fig. 3, fig. 3 is a schematic structural diagram of an electronic device implementing the abnormal behavior recognition method according to a preferred embodiment of the present invention. The electronic device 3 comprises a memory 31, at least one processor 32, a computer program 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
Those skilled in the art will appreciate that the schematic diagram shown in fig. 3 is merely an example of the electronic device 3 and does not limit it. The electronic device 3 may include more or fewer components than those shown, may combine some components, or may use different components; for example, the electronic device 3 may further include an input/output device, a network access device, and the like.
The at least one Processor 32 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. The processor 32 may be a microprocessor or the processor 32 may be any conventional processor or the like, and the processor 32 is a control center of the electronic device 3 and connects various parts of the whole electronic device 3 by various interfaces and lines.
The memory 31 may be used to store the computer program 33 and/or the module/unit, and the processor 32 may implement various functions of the electronic device 3 by running or executing the computer program and/or the module/unit stored in the memory 31 and calling data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data) created according to the use of the electronic device 3, and the like. In addition, the memory 31 may include non-volatile and volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other storage devices.
With reference to fig. 1, the memory 31 of the electronic device 3 stores a plurality of instructions to implement an abnormal behavior recognition method, and the processor 32 can execute the plurality of instructions to implement:
acquiring an image dataset;
based on a pre-trained pedestrian target detector, carrying out pedestrian detection on each image in the image data set to obtain a pedestrian coordinate frame;
clustering the pedestrian coordinate frames according to the set inter-class distance, and combining the clustered pedestrian coordinate frames to obtain a new coordinate frame;
intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples;
training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model;
according to the pedestrian target detector and the abnormal behavior classifier model, identifying an image to be detected to obtain abnormal behavior probability;
and determining an abnormal behavior identification result according to the abnormal behavior probability.
Specifically, for how the processor 32 implements the above instructions, refer to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
In the electronic device 3 described in fig. 3, the convolutional neural network is trained on an image data set that is easy to obtain, which provides more, and more complete, data for model training and improves the accuracy of model training. Meanwhile, the image to be detected is identified by the trained pedestrian target detector and the abnormal behavior classifier model, which greatly reduces the reliance on manually designed features, so that abnormal behaviors are identified more accurately and the recognition rate of abnormal behaviors is improved.
The integrated modules/units of the electronic device 3 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned. The units or means recited in the system claims may also be implemented by software or hardware.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An abnormal behavior recognition method, comprising:
acquiring an image dataset;
based on a pre-trained pedestrian target detector, carrying out pedestrian detection on each image in the image data set to obtain a pedestrian coordinate frame;
clustering the pedestrian coordinate frames according to the set inter-class distance, and combining the clustered pedestrian coordinate frames to obtain a new coordinate frame;
intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples;
training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model;
according to the pedestrian target detector and the abnormal behavior classifier model, identifying an image to be detected to obtain abnormal behavior probability;
and determining an abnormal behavior identification result according to the abnormal behavior probability.
2. The abnormal behavior recognition method of claim 1, wherein prior to the acquiring the image dataset, the abnormal behavior recognition method further comprises:
acquiring a training set and a verification set;
setting an initial learning rate and an adjusted learning rate, wherein the initial learning rate is greater than the adjusted learning rate;
training a YOLOv3 framework by using the training set based on the initial learning rate and an Adam optimization algorithm to obtain an intermediate model;
inputting the verification set into the intermediate model, and training the intermediate model based on the adjusted learning rate and the stochastic gradient descent (SGD) algorithm when a loss function of the verification set reaches convergence;
and when the loss functions of the verification set reach convergence, determining the model at the current convergence as a pedestrian target detector.
3. The abnormal behavior recognition method according to claim 1, wherein the clustering the pedestrian coordinate frame according to the set inter-class distance comprises:
sequencing the pedestrian coordinate frames according to the sequence of the probability of each pedestrian coordinate frame from high to low to obtain sequencing frames;
determining the pedestrian coordinate frame with the highest probability as a reference frame;
sequentially traversing the rest frames sequenced behind the reference frame in the sequencing frames, and calculating the Euclidean distance between the reference frame and each rest frame;
and clustering the pedestrian coordinate frames according to the set inter-class distance and the Euclidean distance.
4. The abnormal behavior recognition method according to claim 1, wherein the merging the clustered pedestrian coordinate frames to obtain a new coordinate frame comprises:
merging the clustered pedestrian coordinate frames to obtain a first coordinate frame;
judging whether the width of the first coordinate frame is smaller than a first threshold value or not, and judging whether the height of the first coordinate frame is smaller than a second threshold value or not;
if the width of the first coordinate frame is smaller than a first threshold value and the height of the first coordinate frame is smaller than a second threshold value, acquiring a central point of the first coordinate frame and acquiring preset width and preset height;
judging whether a second coordinate frame based on the central point, the preset width and the preset height exceeds the boundary coordinate of the image or not;
and if the second coordinate frame based on the central point, the preset width and the preset height does not exceed the boundary coordinate of the image, determining that the second coordinate frame is a new coordinate frame.
5. The abnormal behavior recognition method according to claim 4, further comprising:
and if a second coordinate frame based on the central point, the preset width and the preset height exceeds the boundary coordinate of the image, determining a new coordinate frame based on the central point and the boundary coordinate.
6. The abnormal behavior recognition method of claim 1, wherein training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model comprises:
receiving a positive sample and a negative sample which are manually marked for the plurality of screenshot samples, wherein the positive sample is a sample with abnormal behaviors, and the negative sample is a sample without abnormal behaviors;
setting a learning rate, a stochastic gradient descent (SGD) algorithm, and a training iteration threshold;
training a ResNet18 framework with the positive samples and the negative samples, based on the learning rate and the SGD algorithm;
and when the number of training iterations reaches the training iteration threshold, determining that training is finished and taking the final model as the abnormal behavior classifier model.
7. The abnormal behavior recognition method according to claim 1, wherein the recognizing the image to be detected according to the pedestrian target detector and the abnormal behavior classifier model to obtain the abnormal behavior probability comprises:
detecting an image to be detected by using the pedestrian target detector to obtain a target coordinate frame of each pedestrian in the image to be detected;
clustering and combining the target coordinate frames to obtain coordinate frames to be detected;
intercepting a screenshot to be detected corresponding to the coordinate frame to be detected from the image to be detected;
preprocessing the screenshot to be detected to obtain a preprocessed image;
and identifying the preprocessed image by using the abnormal behavior classifier model to obtain the abnormal behavior probability.
8. An abnormal behavior recognition apparatus, characterized in that the abnormal behavior recognition apparatus comprises:
an acquisition module for acquiring an image dataset;
the detection module is used for detecting pedestrians for each image in the image data set based on a pre-trained pedestrian target detector to obtain a pedestrian coordinate frame;
the merging module is used for clustering the pedestrian coordinate frames according to the set inter-class distance and merging the clustered pedestrian coordinate frames to obtain a new coordinate frame;
the intercepting module is used for intercepting the image according to the new coordinate frame to obtain a plurality of screenshot samples;
the training module is used for training a convolutional neural network based on the plurality of screenshot samples to obtain an abnormal behavior classifier model;
the identification module is used for identifying the image to be detected according to the pedestrian target detector and the abnormal behavior classifier model to obtain the probability of abnormal behavior;
and the determining module is used for determining the abnormal behavior recognition result according to the abnormal behavior probability.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the abnormal behavior recognition method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements the abnormal behavior recognition method according to any one of claims 1 to 7.
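The clustering and merging steps recited in claims 3 to 5 can be sketched as follows. This is an illustration under several assumptions that the claims leave open: the Euclidean distance is measured between frame centers, clustering is repeated greedily on the frames left over after each reference frame, merging takes the smallest frame enclosing the cluster, a frame that is not small is kept unchanged, and a frame that crosses the image boundary is clipped to it. All function and parameter names are hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2


def center(b: Box) -> Tuple[float, float]:
    return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)


def cluster_boxes(scored: List[Tuple[Box, float]], max_dist: float) -> List[List[Box]]:
    """Sort frames by probability, then group frames near each reference frame."""
    remaining = [b for b, _ in sorted(scored, key=lambda item: item[1], reverse=True)]
    clusters = []
    while remaining:
        ref = remaining.pop(0)            # highest-probability frame left
        cluster, rest = [ref], []
        for b in remaining:               # traverse frames ranked after the reference
            near = math.dist(center(ref), center(b)) <= max_dist
            (cluster if near else rest).append(b)
        clusters.append(cluster)
        remaining = rest
    return clusters


def merge_cluster(cluster: List[Box]) -> Box:
    """ASSUMPTION: the merged frame is the smallest frame enclosing the cluster."""
    return (min(b[0] for b in cluster), min(b[1] for b in cluster),
            max(b[2] for b in cluster), max(b[3] for b in cluster))


def expand_small_box(box: Box, img_w: float, img_h: float, min_w: float,
                     min_h: float, preset_w: float, preset_h: float) -> Box:
    """Replace a too-small frame by a preset-size frame, clipped to the image."""
    if not (box[2] - box[0] < min_w and box[3] - box[1] < min_h):
        return box                        # ASSUMPTION: larger frames are kept as-is
    cx, cy = center(box)
    return (max(0.0, cx - preset_w / 2), max(0.0, cy - preset_h / 2),
            min(float(img_w), cx + preset_w / 2), min(float(img_h), cy + preset_h / 2))
```

For each image, the new coordinate frames would be obtained by applying `merge_cluster` and then `expand_small_box` to every cluster returned by `cluster_boxes`.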
CN202011077906.5A 2020-10-10 2020-10-10 Abnormal behavior identification method and device, electronic equipment and storage medium Pending CN112200081A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011077906.5A CN112200081A (en) 2020-10-10 2020-10-10 Abnormal behavior identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011077906.5A CN112200081A (en) 2020-10-10 2020-10-10 Abnormal behavior identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112200081A true CN112200081A (en) 2021-01-08

Family

ID=74013254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011077906.5A Pending CN112200081A (en) 2020-10-10 2020-10-10 Abnormal behavior identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112200081A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination