CN108108688A - Physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling - Google Patents
- Publication number: CN108108688A (application CN201711366304.XA)
- Authority: CN (China)
- Prior art keywords: video, word, pixel, topic, document
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V20/40 — Scenes; scene-specific elements in video content
- G06V20/44 — Event detection
- G06V20/46 — Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V40/20 — Movements or behaviour, e.g. gesture recognition
- G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, corners; connectivity analysis
- G06F18/23213 — Non-hierarchical clustering using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
- G06F18/28 — Determining representative reference patterns, e.g. by averaging or distorting; generating dictionaries
- G06T7/254 — Analysis of motion involving subtraction of images
- G06T7/269 — Analysis of motion using gradient-based methods
- G06T2207/10016 — Video; image sequence
- G06T2207/20224 — Image subtraction
- G06T2207/30232 — Surveillance
Abstract
The invention belongs to the technical field of video surveillance and relates to a physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling. The detection proceeds by first defining a vocabulary, then quantizing the pixel positions of objects, describing the size of foreground targets in the scene, and determining the motion of foreground pixels; after these steps the vocabulary and the corpus are built, and physical conflict behavior is judged by the computation described above. The method combines low-dimensional data feature representation with model-based analysis of complex scenes: using the changes of human body position during an action, it learns a global motion model independent of body parts, analyzes that model, and compares the detection results against the model parameters to judge the human motion state. Compared with the prior art, the method has an ingenious design concept, a scientific detection principle, a simple detection procedure, and high detection accuracy, and has great market prospects.
Description
Technical field:
The invention belongs to the technical field of video surveillance and relates to a physical conflict behavior detection method, in particular to a physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling.
Background technology:
In recent years, with the increase of safety incidents of all kinds, the rise of public safety awareness, and the accompanying spread of artificial-intelligence ideas and the steady maturing of artificial-intelligence technology, intelligent surveillance has attracted more and more attention. Traditional surveillance systems manage the safety of public places mainly through manual monitoring, and so lack real-time response and initiative. In many cases, unattended video surveillance serves only as a video backup and fails in its supervisory duty. Moreover, with the widespread deployment of surveillance cameras, the traditional manual monitoring mode can no longer meet modern surveillance needs. To solve this problem, much effort has been devoted to finding solutions that replace manual operation. At present, with the continuous development of video-surveillance technology and information science, notable progress has been made in fields such as video surveillance, human-computer interaction, and video search, and automatic monitoring is becoming a research topic with broad application prospects. Abnormal-behavior detection is an important part of automatic monitoring. Compared with ordinary human action recognition, which concentrates on identifying a person's conventional actions, abnormal behavior is usually highly sudden and short-lived, which makes its behavioral features difficult to capture.
In recent years researchers have proposed various methods for detecting abnormal behavior. Early work focused on describing human behavior with simple geometric models, such as two-dimensional silhouette models and three-dimensional cylinder models. Beyond static geometric models, researchers have also modeled behavior with features that describe human motion, such as shape, angle, position, speed, direction of motion, and trajectory, and have applied subspace methods, including principal component analysis and independent component analysis, to reduce the dimensionality of and screen the extracted features before behavior analysis. Existing inventions for abnormal-behavior detection fail to grasp the intrinsic characteristics of abnormal behavior, so existing detection models cannot fully reflect its essence, and the detection accuracy obtained with them falls short of the ideal. It is therefore desirable to design a physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling whose computation is exact and whose detection results are accurate.
Summary of the invention:
The object of the invention is to overcome the defects of the prior art by designing a physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling, whose computation is simple and accurate and which can detect physical conflict behavior quickly and precisely and give timely warning.
To achieve this object, the physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling of the present invention comprises the following processing steps:
S1. Definition of the vocabulary
First, a semantic understanding that accords with human cognition is extracted from the raw surveillance-video data, and the algorithm designed in the invention automatically analyzes and understands the video data; the analysis is divided into foreground-target extraction, target-feature representation, and behavior-analysis classification. For abnormal human-behavior detection in video surveillance the method is based on the LDMA model: the pixel positions of every object in the video are described, and for each pixel a feature vector is extracted that contains the pixel's position, its speed and direction of motion, and the size of the target object it belongs to. These ultimately form the vocabulary and documents of visual information, and an effective vocabulary is defined as the dictionary in which every pixel covered by the surveillance video can be looked up;
S2. Quantizing the pixel positions of objects
In video obtained from surveillance, a behavior is essentially characterized by the position of the person performing it; the invention therefore takes position information into account when building the vocabulary. The pixel positions of objects in the video are quantized into non-overlapping 10*10 cells, so that a video frame of size M × N yields M/10 × N/10 cells;
S3. Describing the size of foreground targets in the scene
To represent foreground targets in the video accurately, the invention associates each foreground pixel with the kind of foreground target the pixel belongs to. In video data obtained from surveillance, the observed foreground boxes can be divided into two classes by size: small foreground boxes, mainly single pedestrians, and large foreground boxes, mainly vehicles or groups of pedestrians. The invention therefore classifies the sizes of the foreground boxes by K-means clustering, taking the cluster number k = 2 in K-means, thereby obtaining the foreground target each pixel belongs to, and finally describes the target size in the scene with the cluster labels 1 and 2, where 1 denotes a small target and 2 a large target;
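The k = 2 clustering of step S3 reduces to a one-dimensional 2-means on box areas. A library K-means (e.g. scikit-learn's `KMeans`) would serve; the self-contained sketch below, with made-up areas and an extreme-point initialization, is only an illustration of the idea.

```python
import numpy as np

# Sketch of step S3: split foreground-box areas into "small" (label 1)
# and "large" (label 2) with a 1-D 2-means. Initialization and data are
# illustrative assumptions, not from the patent.

def two_means_labels(areas, iters=20):
    areas = np.asarray(areas, dtype=float)
    c = np.array([areas.min(), areas.max()])      # init centroids at extremes
    for _ in range(iters):
        labels = np.abs(areas[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = areas[labels == k].mean()  # recompute centroids
    order = np.argsort(c)                         # ensure 1 = small, 2 = large
    remap = {order[0]: 1, order[1]: 2}
    return np.array([remap[l] for l in labels])

# Pedestrian-sized boxes vs vehicle/group-sized boxes (areas in pixels):
print(two_means_labels([800, 950, 1100, 9000, 12000]))  # -> [1 1 1 2 2]
```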
S4. Determining the motion of foreground pixels
For a surveillance scene the analysis concerns foreground targets, so background subtraction must first be performed to obtain the foreground pixels. For each foreground pixel obtained, the optical flow of the pixel is computed with the Lucas-Kanade optical-flow algorithm, and a threshold on the magnitude of the flow vector separates static foreground pixels (the label static) from dynamic pixels. The dynamic pixels are then quantized into motion states described by four motion-direction description words. A detected foreground pixel therefore has five possible motion-description words, the four motion directions plus static, and these determine the pixel's motion state;
S5. Defining video sequences and pixels
The video sequence of the surveillance scene, denoted V, is divided into several clips v_1, ..., v_M, where v_m is the m-th clip of the segmentation. The video sequence V is regarded as the current corpus, and each clip corresponds to a document (document) in the corpus. Within a clip v_m, each pixel is defined as a word (word), and each word corresponds to a topic (topic). As time t passes, each word's topic in v_m either transfers to another topic or self-transitions; by the property of MCMC (Markov Chain Monte Carlo), this process reaches a stationary distribution after a period of time has passed;
S6. Building the vocabulary
According to the steps above, for a video frame of size M × N the position of each pixel has M/10 × N/10 possible expressions, the motion form has 5 descriptions, and the target size has two statements, large and small, so the words could be expressed in M/10 × N/10 × 5 × 2 forms; that is, a given foreground pixel has that many possible descriptions. At any one moment, however, a pixel's motion information and the target it belongs to are independent, i.e., for a video clip, the different topics formed as time t passes should each be obtained independently. Each position (location) can therefore be represented by the joint feature (motion, size): the motion and size features are concatenated rather than crossed, giving the word set of each cell, denoted V_c. This means that when a video clip is built, a pixel simultaneously contributes two feature words to its position, one for its motion and one for the size of the target it belongs to, so the final vocabulary can be expressed in M/10 × N/10 × (5+2) forms. The feature word of a pixel can thus be defined as w_{c,a}, where c is the cell position and a is the joint motion/size feature;
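The vocabulary bookkeeping in S6 can be sketched numerically, assuming a 320x240 frame. The counts follow the M/10 × N/10 × (5+2) formula in the text; the function and constant names are illustrative.

```python
# Sketch of step S6: compose a pixel's visual word w_{c,a} from its cell
# position c and an attribute a. With motion (5 values) and size (2 values)
# concatenated rather than crossed, each cell contributes 5 + 2 = 7 words.

M, N, CELL = 320, 240, 10
MOTION_WORDS = 5                    # 4 motion directions + static
SIZE_WORDS = 2                      # small / large target

cells = (M // CELL) * (N // CELL)
vocab_size = cells * (MOTION_WORDS + SIZE_WORDS)   # M/10 * N/10 * (5+2)

def word_id(cell, attr):
    """attr in [0, 5) -> motion word; attr in [5, 7) -> size word."""
    return cell * (MOTION_WORDS + SIZE_WORDS) + attr

print(vocab_size)       # 768 cells * 7 words each = 5376
print(word_id(66, 4))   # one w_{c,a}: cell c = 66, motion attribute a = 4
```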
S7. Building the corpus
The surveillance video is divided into several short clips, and each clip serves as a document; the pixels changing with time t within a clip are expressed as the words occurring in the document, together with the topic content that this series of words represents. Taking the vocabulary generated from the pixels as the basis, let the total word count of the corpus be N, and among all N words let the frequency of occurrence of each word v_i be n_i, so that

\sum_{i=1}^{V} n_i = N

The probability of each corpus configuration is then the multinomial

P(\vec{n}) = \binom{N}{n_1 \cdots n_V} \prod_{k=1}^{V} p_k^{n_k}

where P(\vec{n}) is the probability of the occurrence counts with which each word appears in the corpus. Then, for each specific topic \vec{z}, the vocabulary of the corpus is generated from that topic with probability p(\vec{w} \mid \vec{z}), and the final generation probability of the corpus is the sum, over every topic \vec{z}, of the vocabulary probabilities generated on it:

p(\vec{w}) = \sum_{\vec{z}} p(\vec{w} \mid \vec{z}) \, p(\vec{z})

In the corpus, \mathcal{W} obeys a multinomial distribution with parameter \vec{p}, and the topic obeys a probability distribution p(\vec{p}); this distribution becomes the prior distribution of the parameter \vec{p}, and for the prior the conjugate distribution of the multinomial, the Dirichlet distribution, is chosen. By the distribution law of the Dirichlet, the generation probability of the text corpus is computed as

p(\mathcal{W} \mid \vec{\alpha}) = \int p(\mathcal{W} \mid \vec{p}) \, \mathrm{Dir}(\vec{p} \mid \vec{\alpha}) \, d\vec{p} = \frac{\Delta(\vec{n} + \vec{\alpha})}{\Delta(\vec{\alpha})}

where \vec{\alpha} denotes the parameter of the Dirichlet prior and \Delta(\cdot) its normalization constant. The text corpus is assembled into the corpus from the document set: a video sequence is regarded as a document (document), a document is a mixture of several topics (topics), each topic is a probability distribution over the vocabulary, and each word represented by each pixel in the video sequence is generated by one fixed topic. This process is exactly the process of document modeling, a bag-of-words model. Suppose there are V topic-words, denoted v_1, ..., v_V, and each topic corresponds to a probability distribution \vec{\varphi}_k over the word vector. For a corpus C = (d_1, d_2, ..., d_M) containing M documents, every document d_m has a specific doc-topic distribution; that is, the topic-vector probability distribution corresponding to document d_m is \vec{\theta}_m. The generation probability of each word in the m-th document d_m is then

p(w \mid d_m) = \sum_{z=1}^{K} p(w \mid z) \, p(z \mid d_m)

and the generation probability of the whole document is

p(\vec{w} \mid d_m) = \prod_{i=1}^{n_m} \sum_{z=1}^{K} p(w_i \mid z) \, p(z \mid d_m)

Since documents are mutually independent, the generation probability of the whole corpus can be written out from the formula above; this generates the topic model, whose locally optimal solution is then computed with the EM algorithm;
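The per-word generation probability of step S7, p(w | d_m) = sum over z of p(w | z) p(z | d_m), can be checked numerically. Both distributions below (2 topics, 3 words) are made up for the demonstration; they are not parameters from the patent.

```python
import numpy as np

# Toy numeric check of the topic-mixture factorization in step S7:
# p(w | d_m) = sum_z p(w | z) p(z | d_m).

phi = np.array([[0.7, 0.2, 0.1],    # p(word | topic 0)
                [0.1, 0.3, 0.6]])   # p(word | topic 1)
theta_m = np.array([0.4, 0.6])      # p(topic | document m)

p_w_given_d = theta_m @ phi         # mixture over topics
print(np.round(p_w_given_d, 2))     # a proper distribution over the 3 words

doc = [0, 2, 2, 1]                  # word ids observed in document m
print(float(np.prod(p_w_given_d[doc])))  # generation probability of the doc
```

Because each p(w | d_m) is a convex combination of rows of phi, the result sums to 1, as a word distribution must.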
S8. Judging physical conflict behavior
The physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling combines low-dimensional data feature representation with model-based analysis of complex scenes, and analyzes the video sequence accordingly. From the human-body positions detected in the video, using the changes of body-position information during an action, a global motion model independent of body parts is learned; by analyzing the global motion model and comparing the detection results with the parameters in the model, the human motion state is judged. In the invention each behavior corresponds to a topic distribution; with a trained model, if physical conflict occurs in a tested video clip, the behavior concentrates its distribution in one topic, and from that topic it is determined that the behavior belongs to the state of physical conflict.
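One plausible reading of the S8 decision rule is: flag a test clip when its inferred topic distribution concentrates on the conflict topic. The topic index and the 0.6 concentration threshold below are assumptions for the demonstration, not values from the patent.

```python
import numpy as np

# Sketch of the step S8 decision: a clip is flagged as physical conflict
# when its topic distribution concentrates on the (assumed) conflict topic.

CONFLICT_TOPIC, THRESH = 2, 0.6     # both values are illustrative assumptions

def is_conflict(theta):
    """theta: inferred p(topic | video clip)."""
    theta = np.asarray(theta)
    return bool(theta.argmax() == CONFLICT_TOPIC and theta.max() > THRESH)

print(is_conflict([0.1, 0.1, 0.8]))   # True: mass concentrated on topic 2
print(is_conflict([0.3, 0.4, 0.3]))   # False: no concentrated topic
```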
Compared with the prior art, the present invention has the following advantages: the spectral features of the image are used to extract the contour of the main moving region accurately, so that the contour edges of moving targets can be seen clearly for behavior-feature analysis; the method applies not only to physical conflict behaviors such as fighting but equally to the detection of other behaviors, such as rapid movement. Its design concept is ingenious, its detection principle scientific, its detection procedure simple and its accuracy high, its application environment friendly, and it has great market prospects.
Description of the drawings:
Fig. 1 shows the foreground-detection results for different video frames in the video stream of the present invention.
Fig. 2 is the process flow diagram of the physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling of the present invention.
Specific embodiment:
The present invention is further described below by way of example and in conjunction with the accompanying drawings.
Embodiment:
To achieve the above object, the physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling described in this embodiment comprises the following processing steps:
S1. Definition of the vocabulary
First, a semantic understanding that accords with human cognition is extracted from the raw surveillance-video data, and the algorithm designed in this embodiment automatically analyzes and understands the video data; the analysis is divided into foreground-target extraction, target-feature representation, and behavior-analysis classification. For abnormal human-behavior detection in video surveillance the method is based on the LDMA model: the pixel positions of every object in the video are described, and for each pixel a feature vector is extracted that contains the pixel's position, its speed and direction of motion, and the size of the target object it belongs to. These ultimately form the vocabulary and documents of visual information, and an effective vocabulary is defined as the dictionary in which every pixel covered by the surveillance video can be looked up;
S2. Quantizing the pixel positions of objects
In video obtained from surveillance, a behavior is essentially characterized by the position of the person performing it; this embodiment therefore takes position information into account when building the vocabulary. The pixel positions of objects in the video are quantized into non-overlapping 10*10 cells, so that a video frame of size M × N yields M/10 × N/10 cells;
S3. Describing the size of foreground targets in the scene
To represent foreground targets in the video accurately, this embodiment associates each foreground pixel with the kind of foreground target the pixel belongs to. In video data obtained from surveillance, the observed foreground boxes can be divided into two classes by size: small foreground boxes, mainly single pedestrians, and large foreground boxes, mainly vehicles or groups of pedestrians. This embodiment therefore classifies the sizes of the foreground boxes by K-means clustering, taking the cluster number k = 2 in K-means, thereby obtaining the foreground target each pixel belongs to, and finally describes the target size in the scene with the cluster labels 1 and 2, where 1 denotes a small target and 2 a large target;
S4. Determining the motion of foreground pixels
For a surveillance scene the analysis concerns foreground targets, so background subtraction must first be performed to obtain the foreground pixels. For each foreground pixel obtained, the optical flow of the pixel is computed with the Lucas-Kanade optical-flow algorithm, and a threshold on the magnitude of the flow vector separates static foreground pixels (the label static) from dynamic pixels. The dynamic pixels are then quantized into motion states described by four motion-direction description words. A detected foreground pixel therefore has five possible motion-description words, the four motion directions plus static, and these determine the pixel's motion state;
S5. Defining video sequences and pixels
The video sequence of the surveillance scene, denoted V, is divided into several clips v_1, ..., v_M, where v_m is the m-th clip of the segmentation. The video sequence V is regarded as the current corpus, and each clip corresponds to a document (document) in the corpus. Within a clip v_m, each pixel is defined as a word (word), and each word corresponds to a topic (topic). As time t passes, each word's topic in v_m either transfers to another topic or self-transitions; by the property of MCMC (Markov Chain Monte Carlo), this process reaches a stationary distribution after a period of time has passed;
S6. Building the vocabulary
According to the steps above, for a video frame of size M × N the position of each pixel has M/10 × N/10 possible expressions, the motion form has 5 descriptions, and the target size has two statements, large and small, so the words could be expressed in M/10 × N/10 × 5 × 2 forms; that is, a given foreground pixel has that many possible descriptions. At any one moment, however, a pixel's motion information and the target it belongs to are independent, i.e., for a video clip, the different topics formed as time t passes should each be obtained independently. Each position (location) can therefore be represented by the joint feature (motion, size): the motion and size features are concatenated rather than crossed, giving the word set of each cell, denoted V_c. This means that when a video clip is built, a pixel simultaneously contributes two feature words to its position, one for its motion and one for the size of the target it belongs to, so the final vocabulary can be expressed in M/10 × N/10 × (5+2) forms. The feature word of a pixel can thus be defined as w_{c,a}, where c is the cell position and a is the joint motion/size feature;
S7. Building the corpus
The surveillance video is divided into several short clips, and each clip serves as a document; the pixels changing with time t within a clip are expressed as the words occurring in the document, together with the topic content that this series of words represents. Taking the vocabulary generated from the pixels as the basis, let the total word count of the corpus be N, and among all N words let the frequency of occurrence of each word v_i be n_i, so that

\sum_{i=1}^{V} n_i = N

The probability of each corpus configuration is then the multinomial

P(\vec{n}) = \binom{N}{n_1 \cdots n_V} \prod_{k=1}^{V} p_k^{n_k}

where P(\vec{n}) is the probability of the occurrence counts with which each word appears in the corpus. Then, for each specific topic \vec{z}, the vocabulary of the corpus is generated from that topic with probability p(\vec{w} \mid \vec{z}), and the final generation probability of the corpus is the sum, over every topic \vec{z}, of the vocabulary probabilities generated on it:

p(\vec{w}) = \sum_{\vec{z}} p(\vec{w} \mid \vec{z}) \, p(\vec{z})

In the corpus, \mathcal{W} obeys a multinomial distribution with parameter \vec{p}, and the topic obeys a probability distribution p(\vec{p}); this distribution becomes the prior distribution of the parameter \vec{p}, and for the prior the conjugate distribution of the multinomial, the Dirichlet distribution, is chosen. By the distribution law of the Dirichlet, the generation probability of the text corpus is computed as

p(\mathcal{W} \mid \vec{\alpha}) = \int p(\mathcal{W} \mid \vec{p}) \, \mathrm{Dir}(\vec{p} \mid \vec{\alpha}) \, d\vec{p} = \frac{\Delta(\vec{n} + \vec{\alpha})}{\Delta(\vec{\alpha})}

where \vec{\alpha} denotes the parameter of the Dirichlet prior and \Delta(\cdot) its normalization constant. The text corpus is assembled into the corpus from the document set: a video sequence is regarded as a document (document), a document is a mixture of several topics (topics), each topic is a probability distribution over the vocabulary, and each word represented by each pixel in the video sequence is generated by one fixed topic. This process is exactly the process of document modeling, a bag-of-words model. Suppose there are V topic-words, denoted v_1, ..., v_V, and each topic corresponds to a probability distribution \vec{\varphi}_k over the word vector. For a corpus C = (d_1, d_2, ..., d_M) containing M documents, every document d_m has a specific doc-topic distribution; that is, the topic-vector probability distribution corresponding to document d_m is \vec{\theta}_m. The generation probability of each word in the m-th document d_m is then

p(w \mid d_m) = \sum_{z=1}^{K} p(w \mid z) \, p(z \mid d_m)

and the generation probability of the whole document is

p(\vec{w} \mid d_m) = \prod_{i=1}^{n_m} \sum_{z=1}^{K} p(w_i \mid z) \, p(z \mid d_m)

Since documents are mutually independent, the generation probability of the whole corpus can be written out from the formula above; this generates the topic model, whose locally optimal solution is then computed with the EM algorithm;
S8. Judging physical conflict behavior
The physical conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling combines low-dimensional data feature representation with model-based analysis of complex scenes, and analyzes the video sequence accordingly. From the human-body positions detected in the video, using the changes of body-position information during an action, a global motion model independent of body parts is learned; by analyzing the global motion model and comparing the detection results with the parameters in the model, the human motion state is judged. In this embodiment each behavior corresponds to a topic distribution; with a trained model, if physical conflict occurs in a tested video clip, the behavior concentrates its distribution in one topic, and from that topic it is determined that the behavior belongs to the state of physical conflict.
Claims (1)
1. a kind of limbs conflict behavior detection method based on the extraction of low-dimensional space-time characteristic with theme modeling, it is characterised in that specific
Detection method carries out in accordance with the following steps:
The definition of S1, word sheet
First go out to meet the semantic understanding of human cognitive from original monitor video extracting data, be designed by the algorithm of the present invention
It automatically analyzes and understands video data, analytic process is divided into the extraction of foreground target, target signature represents and behavioural analysis is sorted out, should
Method is based on LDMA models for human body unusual checking in video monitoring, and the location of pixels of each object in video is carried out
Description, to each pixel decimation feature vector, position of this feature vector comprising each pixel, the speed of movement and direction, person in servitude
Belong to the size of target object, ultimately form visual information word sheet and document, and define an effective word sheet, supervised as covering
The dictionary that pixel in control video can inquire about;
S2, the location of pixels for quantifying object
In the video obtained in video monitoring, behavior is substantially characterized by the position of behavior hair survivor, and therefore, the present invention will
Location information is considered in the structure of word sheet, the location of pixels of object in video is quantized into the cell member of nonoverlapping 10*10
In, for the object video of M × N, therefore M/10 × N/10 cell tuple can be obtained;
S3, description scene in foreground target size
In order to accurately represent foreground target in object video, which kind of prospect each foreground pixel and the pixel are belonged to by the present invention
Target connects, in the video data obtained in video monitoring, the prospect frame observed based on they be sized to divide
For two classes, one kind is small prospect frame, mainly pedestrian, and one kind is big prospect frame, mainly includes vehicle or a group pedestrian;
Therefore, the present invention clusters to classify the size of prospect frame using K-means, so as to obtain the foreground target that each pixel is subordinate to,
The cluster numbers k=2 in K-means is taken, final to describe the size of the target in scene using cluster label 1 and 2, i.e., 1 is small
Target, 2 be big target;
S4, the motion conditions for determining foreground pixel
For a surveillance scene, the analysis concerns foreground targets, so background subtraction must first be performed to obtain the foreground pixels. For each foreground pixel obtained, its optical flow is computed according to the Lucas-Kanade optical flow algorithm, and a threshold set on the magnitude of the flow vector distinguishes static foreground pixels (the static label) from dynamic ones. Dynamic pixels are then quantized into motion states described by four kinds of motion-description words (motion direction, trajectory, position and speed); each detected foreground pixel therefore has five possible motion-description words, these four plus static, which determine the motion of the foreground pixel;
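The static/dynamic split by a flow-magnitude threshold follows the text directly; for the dynamic case, the sketch below bins the flow vector into four compass directions, which is one common concrete quantization (as in Markov-clustering topic models) and is used here as an illustrative assumption:

```python
import math

def motion_word(u, v, thresh=1.0):
    """Classify a foreground pixel's optical-flow vector (u, v) as
    'static' when its magnitude is below the threshold (step S4),
    otherwise as one of four coarse motion directions. The four-way
    direction binning is an assumed quantization, not the patent's."""
    if math.hypot(u, v) < thresh:
        return "static"
    ang = math.atan2(v, u)  # angle in (-pi, pi]
    if -math.pi / 4 <= ang < math.pi / 4:
        return "right"
    if math.pi / 4 <= ang < 3 * math.pi / 4:
        return "up"
    if -3 * math.pi / 4 <= ang < -math.pi / 4:
        return "down"
    return "left"
```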
S5, defining video sequences and pixels
The video sequence of the scene under video surveillance is denoted V, and V is divided into several video segments {v_1, v_2, ..., v_m, ...}, where v_m is the m-th video segment of the division. The video sequence V is regarded as the current corpus, and each video segment v_m then corresponds to a document in that corpus. Within a video segment v_m, each pixel is defined as a word, and each word corresponds to a topic. Then, as time t varies, the topic of each word in v_m either transfers to another topic or transfers to itself; by the properties of MCMC (Markov chain Monte Carlo), this process reaches a stationary distribution after a period of time has passed;
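The convergence to a stationary distribution asserted above can be illustrated by iterating a small topic-transition matrix; the matrix values and topic count are illustrative only:

```python
import numpy as np

# Illustrative 3-topic transition matrix (each row sums to 1):
# entry P[i, j] is the probability that a word's topic i transfers
# to topic j in the next time step (self-transfer on the diagonal).
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])

pi = np.array([1.0, 0.0, 0.0])  # start fully concentrated in topic 0
for _ in range(200):
    pi = pi @ P                 # repeated transitions of the chain

# after many steps pi is (numerically) stationary: pi @ P == pi
```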
S6, establishing the vocabulary
According to the above steps, for a video of size M × N the position of each pixel has M/10 × N/10 possible expressions, the motion form has 5 descriptions, and the target size has 2 statements (large target and small target), so the obtainable words take M/10 × N/10 × 5 × 2 forms; that is, a given foreground pixel admits 5 × 2 = 10 possible describing modes at its location. At any given moment, however, the motion information of a pixel and the target it is subordinate to are independent, i.e., for a video segment, the different topics formed as time t varies should each be obtained independently. Therefore, each position (location) is represented by the joint feature (motion, size), i.e., the motion and size features are concatenated, and the resulting word set of each cell is denoted V_c. This means that when a video segment is built, a pixel simultaneously supplies two kinds of feature words to its position, its motion and the size of the target it is subordinate to, so the final vocabulary can be expressed in the form M/10 × N/10 × (5 + 2). The feature word of a pixel can therefore be defined as w_{c,a}, where c is the cell position and a is the joint feature of motion form and size;
S7, building the corpus
The surveillance video is divided into several short video segments, each segment serving as a document; the pixels varying with time t in a segment are expressed as the words occurring in the document, and a series of such words expresses the topic content of the document. Taking the words generated by each pixel as the basis, let the total word count of the corpus be N; among all N words, let the occurrence frequency of each word v_i be n_i, so that

Σ_i n_i = N.

The probability of the corpus is then the multinomial probability

P(n) = (N! / (n_1! n_2! ··· n_V!)) Π_i p_i^{n_i},

where P(n) is the probability of the frequency counts with which each word occurs in the corpus, and p_i is the generation probability of word v_i. Then, for each specific topic z_k, with φ_k the probability distribution over the vocabulary generated by that topic, the final generation probability of the corpus is the cumulative sum, over every topic z_k, of the vocabulary probabilities generated under it:

p(w) = Σ_k p(w | z_k) p(z_k).

In the corpus, the words W obey a multinomial distribution with parameter p, and the topics obey a probability distribution; this distribution becomes the prior distribution of the parameter, and as prior distribution the conjugate of the multinomial, the Dirichlet distribution, is selected. According to the law of the Dirichlet distribution, the generation probability of the text corpus is computed as

p(W | α) = ∫ p(W | p) p(p | α) dp,

where α denotes the parameter of the Dirichlet prior distribution. The text corpus is composed of a collection of documents.
The video sequence is regarded as a document, a document is a mixture of multiple topics, and each topic is a probability distribution over the vocabulary; each word that each pixel represents in the video sequence is generated by one fixed topic. This process is exactly the process of document modeling, a bag-of-words model: if there are K topics, denoted z_1, ..., z_K, each topic z_k corresponds to a probability distribution φ_k over the term vector. For a corpus C = (d_1, d_2, ..., d_M) containing M documents, every document d_m has its own specific topic-vector probability distribution θ_m. The generation probability of each word in the m-th document d_m is then

p(w | d_m) = Σ_k p(w | z_k) p(z_k | d_m),

and the generation probability of the entire document is

p(d_m) = Π_{w ∈ d_m} p(w | d_m).

Since the documents are mutually independent, the generation probability of the entire corpus is written as the product of the above over all documents; this generates the topic model, and a locally optimal solution is then obtained with the EM algorithm.
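The EM fit to a local optimum can be sketched with a minimal pLSA-style implementation on a document-word count matrix; the code, variable names, and the two toy "video documents" are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def plsa_em(counts, K=2, iters=100, seed=0):
    """Minimal pLSA fitted by EM: factorize the D x W document-word
    count matrix into p(z|d) (per-document topic mix) and p(w|z)
    (per-topic word distribution), converging to a local optimum."""
    rng = np.random.default_rng(seed)
    D, W = counts.shape
    p_z_d = rng.random((D, K)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((K, W)); p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(iters):
        # E-step: responsibilities p(z | d, w)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]   # D x K x W
        resp = joint / joint.sum(1, keepdims=True).clip(1e-12)
        # M-step: re-estimate both factors from expected counts
        exp_counts = counts[:, None, :] * resp          # D x K x W
        p_w_z = exp_counts.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True)
        p_z_d = exp_counts.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True)
    return p_z_d, p_w_z

# two toy "video documents" over a 4-word vocabulary
counts = np.array([[10.0, 8.0, 0.0, 0.0],
                   [0.0, 0.0, 9.0, 11.0]])
p_z_d, p_w_z = plsa_em(counts)
```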
S8, judging limb conflict behavior
The limb conflict behavior detection method based on low-dimensional spatio-temporal feature extraction and topic modeling combines a low-dimensional representation of the data with model-based analysis of the complex scene, and analyzes the video sequence on this basis. Human body positions are detected in the video, and from the variation of body-position information within an action, a global motion model independent of body parts is learned; the detection results are compared with the parameters in the global motion model to judge the human motion state. In the present invention, each behavior corresponds to a topic distribution; under the trained model, if limb conflict occurs in a tested video segment, this behavior will be concentrated in one topic, and according to that topic the behavior is determined to belong to the state of limb conflict.
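The final decision rule, that a segment is flagged when its inferred topic distribution concentrates on the conflict-associated topic, can be sketched as follows; the concentration threshold and the way the conflict topic is identified are illustrative assumptions:

```python
def is_limb_conflict(topic_dist, conflict_topic, conc=0.6):
    """Flag a tested video segment as limb conflict when its inferred
    topic distribution concentrates on the topic associated with
    conflict behavior (step S8). Threshold 0.6 is an assumed value."""
    k = max(range(len(topic_dist)), key=lambda i: topic_dist[i])
    return k == conflict_topic and topic_dist[k] >= conc

# a segment dominated by the conflict topic vs. a diffuse segment
flag_a = is_limb_conflict([0.1, 0.8, 0.1], conflict_topic=1)
flag_b = is_limb_conflict([0.4, 0.3, 0.3], conflict_topic=1)
```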
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711366304.XA CN108108688B (en) | 2017-12-18 | 2017-12-18 | Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108108688A true CN108108688A (en) | 2018-06-01 |
CN108108688B CN108108688B (en) | 2021-11-23 |
Family
ID=62209950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711366304.XA Active CN108108688B (en) | 2017-12-18 | 2017-12-18 | Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108688B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109242826A (en) * | 2018-08-07 | 2019-01-18 | 高龑 | Mobile device stick-shaped object root counting method and system based on target detection |
CN110659363A (en) * | 2019-07-30 | 2020-01-07 | 浙江工业大学 | Web service mixed evolution clustering method based on membrane computing |
CN111160170A (en) * | 2019-12-19 | 2020-05-15 | 青岛联合创智科技有限公司 | Self-learning human behavior identification and anomaly detection method |
CN111707375A (en) * | 2020-06-10 | 2020-09-25 | 青岛联合创智科技有限公司 | Electronic class card with intelligent temperature measurement attendance and abnormal behavior detection functions |
CN113705274A (en) * | 2020-05-20 | 2021-11-26 | 杭州海康威视数字技术股份有限公司 | Climbing behavior detection method and device, electronic equipment and storage medium |
CN117372969A (en) * | 2023-12-08 | 2024-01-09 | 暗物智能科技(广州)有限公司 | Monitoring scene-oriented abnormal event detection method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103268495A (en) * | 2013-05-31 | 2013-08-28 | 公安部第三研究所 | Human body behavioral modeling identification method based on priori knowledge cluster in computer system |
CN103530603A (en) * | 2013-09-24 | 2014-01-22 | 杭州电子科技大学 | Video abnormality detection method based on causal loop diagram model |
CN103995915A (en) * | 2014-03-21 | 2014-08-20 | 中山大学 | Crowd evacuation simulation system based on composite potential energy field |
CN104268546A (en) * | 2014-05-28 | 2015-01-07 | 苏州大学 | Dynamic scene classification method based on topic model |
Non-Patent Citations (5)
Title |
---|
TIMOTHY HOSPEDALES ET AL.: "A Markov Clustering Topic Model for Mining Behaviour in Video", 2009 IEEE 12th International Conference on Computer Vision * |
Hu Yuan et al.: "Pedestrian abnormal behavior recognition based on trajectory analysis", Computer Engineering and Science * |
Zhao Chunhui et al.: "Analysis of Moving Targets in Video Images", 30 June 2011, National Defense Industry Press * |
Zhao Liang et al.: "Application of topic models in video abnormal behavior detection", Computer Science * |
Huang Xianping: "Research on topic semantic feature extraction and behavior analysis of crowd motion", China Doctoral Dissertations Full-text Database, Information Science and Technology * |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| CB02 | Change of applicant information | Address after: 266200 Household No. 8, Qingda Third Road, Laoshan District, Qingdao City, Shandong Province; Applicant after: QINGDAO LIANHE CHUANGZHI TECHNOLOGY Co.,Ltd. Address before: Room 1204, No. 40, Hong Kong Middle Road, Shinan District, Qingdao, Shandong 266200; Applicant before: QINGDAO LIANHE CHUANGZHI TECHNOLOGY Co.,Ltd. |
| GR01 | Patent grant | |