CN108108688A - Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling - Google Patents

Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling

Info

Publication number
CN108108688A
Authority
CN
China
Prior art keywords
video
word
pixel
theme
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711366304.XA
Other languages
Chinese (zh)
Other versions
CN108108688B (en)
Inventor
纪刚
周粉粉
周萌萌
安帅
商胜楠
于腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Powerise Technology Co Ltd
Original Assignee
Qingdao Powerise Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Powerise Technology Co Ltd filed Critical Qingdao Powerise Technology Co Ltd
Priority to CN201711366304.XA priority Critical patent/CN108108688B/en
Publication of CN108108688A publication Critical patent/CN108108688A/en
Application granted granted Critical
Publication of CN108108688B publication Critical patent/CN108108688B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/28Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20224Image subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/44Event detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of video surveillance and relates to a limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling. The detection steps are to first define a vocabulary, then quantize object pixel locations, describe the size of foreground targets in the scene, and determine the motion of foreground pixels; after these steps, the construction of the vocabulary and of the corpus is completed, and limb conflict behavior is judged by the above computation. The method combines low-dimensional data feature representation with model-based complex-scene analysis. Using the changes in human body position information during an action, it learns a global motion model independent of body parts; by analyzing the global motion model and comparing the detection results with the model parameters, it judges the human motion state. Compared with the prior art, the method has an ingenious design concept, a scientific detection principle, a simple detection procedure, and high detection accuracy, and has great market prospects.

Description

Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling
Technical field:
The invention belongs to the technical field of video surveillance and relates to a limb conflict behavior detection method, in particular to a limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling.
Background art:
In recent years, with the increase of safety incidents, the rise of public safety awareness, the spread of artificial intelligence concepts, and the continuous maturing of artificial intelligence technology, intelligent surveillance has attracted more and more attention. Traditional surveillance systems manage the safety of public places mainly through manual monitoring and lack real-time capability and initiative. In many cases, unattended video surveillance only serves as video backup and fails to fulfill its supervisory duty. Moreover, with the popularization and wide deployment of surveillance cameras, traditional manual monitoring can no longer meet modern surveillance needs. To solve this problem, researchers have sought automated solutions to replace manual operation. At present, with the continuous development of video surveillance technology and information science, significant progress has been made in fields such as video surveillance, human-computer interaction, and video search, and automatic monitoring is increasingly becoming a research topic with broad application prospects. Abnormal behavior detection is an important part of automatic monitoring. Compared with ordinary human action recognition, which concentrates on identifying people's routine actions, abnormal behavior is typically highly sudden and short-lived, which makes behavioral features difficult to obtain.
In recent years, researchers have proposed various methods for abnormal behavior detection. Early work focused mainly on describing human behavior with simple geometric models, such as two-dimensional silhouette models and three-dimensional cylinder models. Besides static geometric models, researchers have also modeled features that describe human motion, such as shape, angle, position, movement speed, motion direction, and motion trajectory, for behavior description and discrimination, and have applied subspace methods including principal component analysis and independent component analysis to reduce the dimensionality of and screen the extracted features for behavior analysis. Existing inventions for abnormal behavior detection fail to capture the inherent characteristics of abnormal behavior, so existing detection models cannot fully reflect the essence of abnormal behavior, and the detection accuracy obtained with them falls short of the ideal. Therefore, a limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling is designed, with accurate computation and accurate detection results.
Summary of the invention:
The object of the invention is to overcome the defects of the prior art and to design a limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling that is simple to compute, highly accurate, able to detect limb conflict behavior quickly and accurately, and capable of timely early warning.
To achieve the above object, the limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling of the present invention specifically comprises the following processing steps:
S1. Defining the vocabulary
Semantic understanding consistent with human cognition is first extracted from the raw surveillance video data, and the algorithm designed in the present invention automatically analyzes and understands the video data. The analysis is divided into foreground target extraction, target feature representation, and behavior analysis and classification. The method uses an LDMA model for human abnormal behavior detection in video surveillance: the pixel location of each object in the video is described, and a feature vector is extracted for each pixel, containing the pixel's position, its motion speed and direction, and the size of the target object it belongs to. A visual-information vocabulary and documents are finally formed, and an effective vocabulary is defined as a dictionary in which every pixel covered by the surveillance video can be looked up;
S2. Quantizing object pixel locations
In video obtained from surveillance, a behavior is essentially characterized by the position of its actor. The present invention therefore incorporates location information into the construction of the vocabulary: the pixel locations of objects in the video are quantized into non-overlapping 10*10 cells, so a video of size M × N yields M/10 × N/10 cells;
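A minimal sketch of this cell quantization, assuming Python (the 10-pixel cell size follows the text; the frame size and example pixel are illustrative):

```python
CELL = 10  # non-overlapping 10x10 pixel cells, per step S2

def cell_index(x, y, width, height):
    """Map pixel (x, y) in a width x height frame to a flat cell index."""
    cols, rows = width // CELL, height // CELL
    cx = min(x // CELL, cols - 1)  # clamp pixels falling in a ragged border
    cy = min(y // CELL, rows - 1)
    return cy * cols + cx

# A 320x240 frame yields 32 * 24 = 768 cells; pixel (315, 118) maps to
print(cell_index(315, 118, 320, 240))  # 11 * 32 + 31 = 383
```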
S3. Describing foreground target size in the scene
To represent foreground targets in the video accurately, the present invention associates each foreground pixel with the foreground target it belongs to. In video data obtained from surveillance, the observed foreground boxes can be divided into two classes by size: small foreground boxes, which are mainly pedestrians, and large foreground boxes, which mainly contain vehicles or groups of pedestrians. The present invention therefore uses K-means clustering to classify foreground-box sizes and thus obtain the foreground target each pixel belongs to; with the cluster number k=2 in K-means, cluster labels 1 and 2 finally describe target size in the scene, where 1 is a small target and 2 is a large target;
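This two-way size split can be sketched as follows, assuming Python with OpenCV and scikit-learn (the patent fixes k = 2 but names no library; relabeling the clusters so that 1 is small and 2 is large follows the text):

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def size_labels(fg_mask):
    """Cluster foreground-blob areas into small (1) / large (2) with k-means, k=2.

    fg_mask: 8-bit binary foreground mask from background subtraction.
    Returns {blob_id: 1 or 2} for each connected foreground component.
    """
    n, labels, stats, _ = cv2.connectedComponentsWithStats(fg_mask)
    areas = stats[1:, cv2.CC_STAT_AREA].astype(float).reshape(-1, 1)  # row 0 is background
    if len(areas) < 2:
        return {i + 1: 1 for i in range(len(areas))}
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(areas)
    small = int(np.argmin(km.cluster_centers_.ravel()))  # cluster with smaller mean area
    return {i + 1: (1 if km.labels_[i] == small else 2) for i in range(len(areas))}
```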
S4. Determining the motion of foreground pixels
For a surveillance scene, the content analyzed concerns foreground targets, so background subtraction must first be performed to obtain foreground pixels. For each foreground pixel obtained, the pixel's optical-flow information is solved with the Lucas-Kanade optical flow algorithm, and a threshold on the optical-flow vector magnitude distinguishes static foreground pixels (static label) from dynamic pixels. Dynamic pixels are then quantized into motion states described by four motion-descriptor words: motion direction, trajectory, position, and speed. A detected foreground pixel thus has five possible motion-descriptor words (motion direction, trajectory, position, speed, and static), which determine the pixel's motion;
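A sketch of the per-pixel flow step, assuming Python with OpenCV. The patent names Lucas-Kanade and a magnitude threshold but gives no threshold value, and quantizing dynamic pixels into four direction bins is one reading of the four motion-descriptor words, so both the threshold and the binning are assumptions here:

```python
import cv2
import numpy as np

FLOW_THRESH = 1.0  # assumed magnitude threshold; the patent sets one but gives no value

def motion_words(prev_gray, gray, fg_points):
    """Assign each foreground pixel one of 5 motion words: 0=static, 1..4=direction bin.

    fg_points: (N, 2) array of (x, y) foreground pixel coordinates.
    Uses sparse Lucas-Kanade flow (cv2.calcOpticalFlowPyrLK) at those pixels.
    """
    pts = fg_points.reshape(-1, 1, 2).astype(np.float32)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    flow = (nxt - pts).reshape(-1, 2)
    mag = np.linalg.norm(flow, axis=1)
    ang = np.arctan2(flow[:, 1], flow[:, 0])               # direction in (-pi, pi]
    bins = ((ang + np.pi) / (np.pi / 2)).astype(int) % 4   # four direction bins
    return np.where((mag > FLOW_THRESH) & (status.ravel() == 1), bins + 1, 0)
```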
S5. Defining video sequences and pixels
The video sequence of the surveillance scene is denoted $\mathcal{V}$ and divided into several segments $\{v_1, v_2, \dots, v_M\}$, where $v_m$ is the m-th video segment. The sequence $\mathcal{V}$ is regarded as the current corpus, and each segment $v_m$ corresponds to a document in the corpus. Within a segment $v_m$, a pixel is defined as a word, and each word corresponds to a topic; as time t varies within $v_m$, each word's topic transfers to another topic or to itself, and by the properties of MCMC (Markov Chain Monte Carlo) this process reaches a stationary distribution after a period of time;
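Splitting the sequence into clip-documents is then straightforward; a sketch in Python, with the clip length an assumed parameter since the text does not fix one:

```python
def split_into_clips(n_frames, clip_len=100):
    """Split a surveillance video of n_frames frames into consecutive short
    clips, each treated as one document; clip_len = 100 is an assumption."""
    return [(start, min(start + clip_len, n_frames))
            for start in range(0, n_frames, clip_len)]

# e.g. a 450-frame video -> [(0, 100), (100, 200), (200, 300), (300, 400), (400, 450)]
```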
S6. Building the vocabulary
Per the above steps, for a video of size M × N each pixel's position has M/10 × N/10 expressions, its motion form has 5 descriptions, and large and small targets have 2 expressions, so the resulting words can be expressed in M/10 × N/10 × 5 × 2 forms; that is, a given foreground pixel has $\frac{M}{10} \times \frac{N}{10} \times 5 \times 2$ possible descriptions. At any given moment, however, a pixel's motion information and the target it belongs to are independent: for a video segment, the different topics formed as time t varies should be obtained independently. Each position (location) can therefore be represented by the joint feature (motion, size); cascading the motion and size features yields the word set of each cell, denoted $V_c$. This means that when a video segment is built, a pixel contributes two feature words at its position, namely its motion and the size of the target it belongs to, so the final vocabulary can be expressed in M/10 × N/10 × (5+2) forms. The feature word of a pixel can therefore be defined as $w_{c,a}$, where c is the cell location and a is the joint feature of motion form and size;
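One possible indexing of this M/10 × N/10 × (5+2) vocabulary in Python; the layout (five motion words first, then two size words, per cell) is an assumption for illustration:

```python
N_MOTION, N_SIZE = 5, 2  # five motion words and two size words per cell (step S6)
WORDS_PER_CELL = N_MOTION + N_SIZE

def word_ids(cell, motion, size):
    """Return the two word indices a pixel contributes at cell `cell`:
    its motion word (motion in 0..4) and its size word (size label 1 or 2).
    Total vocabulary = n_cells * (5 + 2), matching M/10 x N/10 x (5+2)."""
    base = cell * WORDS_PER_CELL
    return base + motion, base + N_MOTION + (size - 1)
```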
S7. Building the corpus
The surveillance video is divided into several short video segments, each serving as a document. The pixels varying with time t within a segment are expressed as the words occurring in the document together with the topic content this series of words represents. Taking the vocabulary generated by the pixels as the basis, let the total word count of the corpus be N; among all N words, consider the occurrence frequency $n_i$ of each word $v_i$. Then the frequency vector $\vec{n} = (n_1, n_2, \dots, n_V)$ satisfies $\sum_{i=1}^{V} n_i = N$, where V is the vocabulary size.
The probability of the corpus is then
$$p(\vec{n}) = \mathrm{Mult}(\vec{n}\,|\,\vec{p}, N) = \binom{N}{\vec{n}} \prod_{k=1}^{V} p_k^{n_k},$$
where $p(\vec{n})$ is the probability of the frequency counts with which each word occurs in the corpus;
Then, for each specific topic $z$ and the probability $p(w\,|\,z)$ with which that topic generates the vocabulary of the corpus, the final generation probability of the corpus is the cumulative sum, over every topic $z$, of the vocabulary probabilities generated on it:
$$p(w) = \sum_{z} p(w\,|\,z)\, p(z).$$
In the corpus $\mathcal{W}$, the word-frequency vector $\vec{n}$ obeys a multinomial distribution with parameter $\vec{p}$, and $\vec{p}$ itself obeys a probability distribution $p(\vec{p})$, which serves as the prior distribution of the parameter $\vec{p}$; the conjugate distribution of the multinomial, the Dirichlet distribution, is chosen as this prior. From the form of the Dirichlet distribution, the generation probability of the text corpus is computed as
$$p(\mathcal{W}\,|\,\vec{\alpha}) = \int p(\mathcal{W}\,|\,\vec{p})\, p(\vec{p}\,|\,\vec{\alpha})\, d\vec{p} = \frac{\Delta(\vec{n}+\vec{\alpha})}{\Delta(\vec{\alpha})},$$
where $\vec{\alpha}$ denotes the parameter of the Dirichlet prior and $\Delta(\cdot)$ is the Dirichlet normalizing constant; the text corpus is composed of the set of documents.
A video sequence is regarded as a document, a document is a mixture of multiple topics, and each topic is a probability distribution over the vocabulary. Each word that each pixel represents in the video sequence is generated by one fixed topic; this process is exactly the process of document modeling, a bag-of-words model. Suppose there are K topic-word distributions, denoted $\vec{\varphi}_1, \vec{\varphi}_2, \dots, \vec{\varphi}_K$, each topic corresponding to the probability distribution of one word vector. For every document $d_m$ in a corpus $C = (d_1, d_2, \cdots, d_M)$ containing M documents there is a specific doc-topic distribution; that is, the topic-vector probability distribution corresponding to document $d_m$ is $\vec{\theta}_m$. The generation probability of each word $w$ in the m-th document $d_m$ is then
$$p(w\,|\,d_m) = \sum_{z=1}^{K} p(w\,|\,z)\, p(z\,|\,d_m) = \sum_{z=1}^{K} \varphi_{z,w}\, \theta_{m,z},$$
and the generation probability of the entire document is
$$p(\vec{w}\,|\,d_m) = \prod_{i=1}^{n_m} \sum_{z=1}^{K} \varphi_{z,w_i}\, \theta_{m,z},$$
where $n_m$ is the number of words in document $d_m$.
Since the documents are independent of one another, the generation probability of the entire corpus follows from the above formula, which yields the topic model; a locally optimal solution is then computed with the EM algorithm;
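As a sketch of the model-fitting step in Python: scikit-learn's LatentDirichletAllocation, which fits by variational EM, stands in for the EM solver the text names; the topic count, Dirichlet priors, and count matrix below are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# X: (n_clips, n_words) count matrix, one row per short video clip (document),
# one column per visual word from step S6; random counts stand in for real data.
rng = np.random.default_rng(0)
X = rng.poisson(0.3, size=(200, 768 * 7))

lda = LatentDirichletAllocation(n_components=10,       # number of topics (assumed)
                                doc_topic_prior=0.5,   # Dirichlet alpha (assumed)
                                topic_word_prior=0.1,  # Dirichlet beta (assumed)
                                max_iter=50, random_state=0)
doc_topic = lda.fit_transform(X)   # theta_m: per-clip topic mixture
topic_word = lda.components_       # unnormalized phi_z: per-topic word weights
```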
S8. Judging limb conflict behavior
The limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling combines low-dimensional data feature representation with model-based complex-scene analysis to analyze the video sequence. Based on the human body positions detected in the video and the changes in body-position information during an action, a global motion model independent of body parts is learned. By analyzing the global motion model, the detection results are compared with the model parameters to judge the human motion state. In the present invention each behavior corresponds to one topic distribution; with a trained model, if limb conflict occurs in a tested video segment, this behavior will be concentrated in one topic, and from that topic the behavior is determined to be a state in which limb conflict occurs.
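The final judgment can then be sketched as a concentration test on the per-clip topic mixture (Python; the conflict-topic index and threshold are assumptions, since the patent states the criterion only qualitatively):

```python
CONFLICT_TOPIC = 3  # index of the topic that conflict clips concentrate on (assumed)
CONC_THRESH = 0.6   # concentration threshold (assumed; the patent gives none)

def is_conflict(doc_topic_row):
    """Flag a clip whose topic mixture concentrates on the conflict topic,
    mirroring the text's 'concentrated in one topic' criterion."""
    return doc_topic_row[CONFLICT_TOPIC] > CONC_THRESH

# With doc_topic from the previous sketch: flags = [is_conflict(r) for r in doc_topic]
```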
Compared with the prior art, the present invention has the following beneficial effects: it accurately extracts the contours of moving regions mainly from the spectral features of the image, so the contour edges of moving targets can be seen clearly and analyzed for behavioral features; it applies not only to limb conflict behaviors such as fighting but equally to the detection of other behaviors, such as fast movement. The method's design concept is ingenious, its detection principle scientific, its detection procedure simple, its detection accuracy high, and its application environment friendly, with great market prospects.
Description of the drawings:
Fig. 1 shows foreground detection results for different video frames in a video stream according to the present invention.
Fig. 2 is a process flow diagram of the limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling of the present invention.
Specific embodiment:
The present invention is described further below by way of example and with reference to the accompanying drawings.
Embodiment:
To achieve the above object, the limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling described in this embodiment specifically comprises the following processing steps:

S1. Defining the vocabulary

Semantic understanding consistent with human cognition is first extracted from the raw surveillance video data, and the algorithm designed in this embodiment automatically analyzes and understands the video data. The analysis is divided into foreground target extraction, target feature representation, and behavior analysis and classification. The method uses an LDMA model for human abnormal behavior detection in video surveillance: the pixel location of each object in the video is described, and a feature vector is extracted for each pixel, containing the pixel's position, its motion speed and direction, and the size of the target object it belongs to. A visual-information vocabulary and documents are finally formed, and an effective vocabulary is defined as a dictionary in which every pixel covered by the surveillance video can be looked up;

S2. Quantizing object pixel locations

In video obtained from surveillance, a behavior is essentially characterized by the position of its actor. This embodiment therefore incorporates location information into the construction of the vocabulary: the pixel locations of objects in the video are quantized into non-overlapping 10*10 cells, so a video of size M × N yields M/10 × N/10 cells;

S3. Describing foreground target size in the scene

To represent foreground targets in the video accurately, this embodiment associates each foreground pixel with the foreground target it belongs to. In video data obtained from surveillance, the observed foreground boxes can be divided into two classes by size: small foreground boxes, which are mainly pedestrians, and large foreground boxes, which mainly contain vehicles or groups of pedestrians. This embodiment therefore uses K-means clustering to classify foreground-box sizes and thus obtain the foreground target each pixel belongs to; with the cluster number k=2 in K-means, cluster labels 1 and 2 finally describe target size in the scene, where 1 is a small target and 2 is a large target;

S4. Determining the motion of foreground pixels

For a surveillance scene, the content analyzed concerns foreground targets, so background subtraction must first be performed to obtain foreground pixels. For each foreground pixel obtained, the pixel's optical-flow information is solved with the Lucas-Kanade optical flow algorithm, and a threshold on the optical-flow vector magnitude distinguishes static foreground pixels (static label) from dynamic pixels. Dynamic pixels are then quantized into motion states described by four motion-descriptor words: motion direction, trajectory, position, and speed. A detected foreground pixel thus has five possible motion-descriptor words (motion direction, trajectory, position, speed, and static), which determine the pixel's motion;

S5. Defining video sequences and pixels

The video sequence of the surveillance scene is denoted $\mathcal{V}$ and divided into several segments $\{v_1, v_2, \dots, v_M\}$, where $v_m$ is the m-th video segment. The sequence $\mathcal{V}$ is regarded as the current corpus, and each segment $v_m$ corresponds to a document in the corpus. Within a segment $v_m$, a pixel is defined as a word, and each word corresponds to a topic; as time t varies within $v_m$, each word's topic transfers to another topic or to itself, and by the properties of MCMC (Markov Chain Monte Carlo) this process reaches a stationary distribution after a period of time;

S6. Building the vocabulary

Per the above steps, for a video of size M × N each pixel's position has M/10 × N/10 expressions, its motion form has 5 descriptions, and large and small targets have 2 expressions, so the resulting words can be expressed in M/10 × N/10 × 5 × 2 forms; that is, a given foreground pixel has $\frac{M}{10} \times \frac{N}{10} \times 5 \times 2$ possible descriptions. At any given moment, however, a pixel's motion information and the target it belongs to are independent: for a video segment, the different topics formed as time t varies should be obtained independently. Each position (location) can therefore be represented by the joint feature (motion, size); cascading the motion and size features yields the word set of each cell, denoted $V_c$. This means that when a video segment is built, a pixel contributes two feature words at its position, namely its motion and the size of the target it belongs to, so the final vocabulary can be expressed in M/10 × N/10 × (5+2) forms. The feature word of a pixel can therefore be defined as $w_{c,a}$, where c is the cell location and a is the joint feature of motion form and size;

S7. Building the corpus

The surveillance video is divided into several short video segments, each serving as a document. The pixels varying with time t within a segment are expressed as the words occurring in the document together with the topic content this series of words represents. Taking the vocabulary generated by the pixels as the basis, let the total word count of the corpus be N; among all N words, consider the occurrence frequency $n_i$ of each word $v_i$. Then the frequency vector $\vec{n} = (n_1, n_2, \dots, n_V)$ satisfies $\sum_{i=1}^{V} n_i = N$, where V is the vocabulary size.

The probability of the corpus is then
$$p(\vec{n}) = \mathrm{Mult}(\vec{n}\,|\,\vec{p}, N) = \binom{N}{\vec{n}} \prod_{k=1}^{V} p_k^{n_k},$$
where $p(\vec{n})$ is the probability of the frequency counts with which each word occurs in the corpus;

Then, for each specific topic $z$ and the probability $p(w\,|\,z)$ with which that topic generates the vocabulary of the corpus, the final generation probability of the corpus is the cumulative sum, over every topic $z$, of the vocabulary probabilities generated on it:
$$p(w) = \sum_{z} p(w\,|\,z)\, p(z).$$

In the corpus $\mathcal{W}$, the word-frequency vector $\vec{n}$ obeys a multinomial distribution with parameter $\vec{p}$, and $\vec{p}$ itself obeys a probability distribution $p(\vec{p})$, which serves as the prior distribution of the parameter $\vec{p}$; the conjugate distribution of the multinomial, the Dirichlet distribution, is chosen as this prior. From the form of the Dirichlet distribution, the generation probability of the text corpus is computed as
$$p(\mathcal{W}\,|\,\vec{\alpha}) = \int p(\mathcal{W}\,|\,\vec{p})\, p(\vec{p}\,|\,\vec{\alpha})\, d\vec{p} = \frac{\Delta(\vec{n}+\vec{\alpha})}{\Delta(\vec{\alpha})},$$
where $\vec{\alpha}$ denotes the parameter of the Dirichlet prior and $\Delta(\cdot)$ is the Dirichlet normalizing constant; the text corpus is composed of the set of documents.

A video sequence is regarded as a document, a document is a mixture of multiple topics, and each topic is a probability distribution over the vocabulary. Each word that each pixel represents in the video sequence is generated by one fixed topic; this process is exactly the process of document modeling, a bag-of-words model. Suppose there are K topic-word distributions, denoted $\vec{\varphi}_1, \vec{\varphi}_2, \dots, \vec{\varphi}_K$, each topic corresponding to the probability distribution of one word vector. For every document $d_m$ in a corpus $C = (d_1, d_2, \cdots, d_M)$ containing M documents there is a specific doc-topic distribution; that is, the topic-vector probability distribution corresponding to document $d_m$ is $\vec{\theta}_m$. The generation probability of each word $w$ in the m-th document $d_m$ is then
$$p(w\,|\,d_m) = \sum_{z=1}^{K} p(w\,|\,z)\, p(z\,|\,d_m) = \sum_{z=1}^{K} \varphi_{z,w}\, \theta_{m,z},$$
and the generation probability of the entire document is
$$p(\vec{w}\,|\,d_m) = \prod_{i=1}^{n_m} \sum_{z=1}^{K} \varphi_{z,w_i}\, \theta_{m,z},$$
where $n_m$ is the number of words in document $d_m$.

Since the documents are independent of one another, the generation probability of the entire corpus follows from the above formula, which yields the topic model; a locally optimal solution is then computed with the EM algorithm;

S8. Judging limb conflict behavior

The limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling combines low-dimensional data feature representation with model-based complex-scene analysis to analyze the video sequence. Based on the human body positions detected in the video and the changes in body-position information during an action, a global motion model independent of body parts is learned. By analyzing the global motion model, the detection results are compared with the model parameters to judge the human motion state. In this embodiment each behavior corresponds to one topic distribution; with a trained model, if limb conflict occurs in a tested video segment, this behavior will be concentrated in one topic, and from that topic the behavior is determined to be a state in which limb conflict occurs.

Claims (1)

1. A limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling, characterized in that the detection method is carried out according to the following steps:

S1. Defining the vocabulary

Semantic understanding consistent with human cognition is first extracted from the raw surveillance video data, and the algorithm designed in the invention automatically analyzes and understands the video data. The analysis is divided into foreground target extraction, target feature representation, and behavior analysis and classification. The method uses an LDMA model for human abnormal behavior detection in video surveillance: the pixel location of each object in the video is described, and a feature vector is extracted for each pixel, containing the pixel's position, its motion speed and direction, and the size of the target object it belongs to. A visual-information vocabulary and documents are finally formed, and an effective vocabulary is defined as a dictionary in which every pixel covered by the surveillance video can be looked up;

S2. Quantizing object pixel locations

In video obtained from surveillance, a behavior is essentially characterized by the position of its actor. The invention therefore incorporates location information into the construction of the vocabulary: the pixel locations of objects in the video are quantized into non-overlapping 10*10 cells, so a video of size M × N yields M/10 × N/10 cells;

S3. Describing foreground target size in the scene

To represent foreground targets in the video accurately, the invention associates each foreground pixel with the foreground target it belongs to. In video data obtained from surveillance, the observed foreground boxes can be divided into two classes by size: small foreground boxes, which are mainly pedestrians, and large foreground boxes, which mainly contain vehicles or groups of pedestrians. The invention therefore uses K-means clustering to classify foreground-box sizes and thus obtain the foreground target each pixel belongs to; with the cluster number k=2 in K-means, cluster labels 1 and 2 finally describe target size in the scene, where 1 is a small target and 2 is a large target;

S4. Determining the motion of foreground pixels

For a surveillance scene, the content analyzed concerns foreground targets, so background subtraction must first be performed to obtain foreground pixels. For each foreground pixel obtained, the pixel's optical-flow information is solved with the Lucas-Kanade optical flow algorithm, and a threshold on the optical-flow vector magnitude distinguishes static foreground pixels (static label) from dynamic pixels. Dynamic pixels are then quantized into motion states described by four motion-descriptor words: motion direction, trajectory, position, and speed. A detected foreground pixel thus has five possible motion-descriptor words (motion direction, trajectory, position, speed, and static), which determine the pixel's motion;

S5. Defining video sequences and pixels

The video sequence of the surveillance scene is denoted $\mathcal{V}$ and divided into several segments $\{v_1, v_2, \dots, v_M\}$, where $v_m$ is the m-th video segment. The sequence $\mathcal{V}$ is regarded as the current corpus, and each segment $v_m$ corresponds to a document in the corpus. Within a segment $v_m$, a pixel is defined as a word, and each word corresponds to a topic; as time t varies within $v_m$, each word's topic transfers to another topic or to itself, and by the properties of MCMC (Markov Chain Monte Carlo) this process reaches a stationary distribution after a period of time;

S6. Building the vocabulary

Per the above steps, for a video of size M × N each pixel's position has M/10 × N/10 expressions, its motion form has 5 descriptions, and large and small targets have 2 expressions, so the resulting words can be expressed in M/10 × N/10 × 5 × 2 forms; that is, a given foreground pixel has $\frac{M}{10} \times \frac{N}{10} \times 5 \times 2$ possible descriptions. At any given moment, however, a pixel's motion information and the target it belongs to are independent: for a video segment, the different topics formed as time t varies should be obtained independently. Each position (location) can therefore be represented by the joint feature (motion, size); cascading the motion and size features yields the word set of each cell, denoted $V_c$. This means that when a video segment is built, a pixel contributes two feature words at its position, namely its motion and the size of the target it belongs to, so the final vocabulary can be expressed in M/10 × N/10 × (5+2) forms. The feature word of a pixel can therefore be defined as $w_{c,a}$, where c is the cell location and a is the joint feature of motion form and size;

S7. Building the corpus

The surveillance video is divided into several short video segments, each serving as a document. The pixels varying with time t within a segment are expressed as the words occurring in the document together with the topic content this series of words represents. Taking the vocabulary generated by the pixels as the basis, let the total word count of the corpus be N; among all N words, consider the occurrence frequency $n_i$ of each word $v_i$. Then the frequency vector $\vec{n} = (n_1, n_2, \dots, n_V)$ satisfies $\sum_{i=1}^{V} n_i = N$, where V is the vocabulary size.

The probability of the corpus is then
$$p(\vec{n}) = \mathrm{Mult}(\vec{n}\,|\,\vec{p}, N) = \binom{N}{\vec{n}} \prod_{k=1}^{V} p_k^{n_k},$$
where $p(\vec{n})$ is the probability of the frequency counts with which each word occurs in the corpus;

Then, for each specific topic $z$ and the probability $p(w\,|\,z)$ with which that topic generates the vocabulary of the corpus, the final generation probability of the corpus is the cumulative sum, over every topic $z$, of the vocabulary probabilities generated on it:
$$p(w) = \sum_{z} p(w\,|\,z)\, p(z).$$

In the corpus $\mathcal{W}$, the word-frequency vector $\vec{n}$ obeys a multinomial distribution with parameter $\vec{p}$, and $\vec{p}$ itself obeys a probability distribution $p(\vec{p})$, which serves as the prior distribution of the parameter $\vec{p}$; the conjugate distribution of the multinomial, the Dirichlet distribution, is chosen as this prior. From the form of the Dirichlet distribution, the generation probability of the text corpus is computed as
$$p(\mathcal{W}\,|\,\vec{\alpha}) = \int p(\mathcal{W}\,|\,\vec{p})\, p(\vec{p}\,|\,\vec{\alpha})\, d\vec{p} = \frac{\Delta(\vec{n}+\vec{\alpha})}{\Delta(\vec{\alpha})},$$
where $\vec{\alpha}$ denotes the parameter of the Dirichlet prior and $\Delta(\cdot)$ is the Dirichlet normalizing constant; the text corpus is composed of the set of documents.

A video sequence is regarded as a document, a document is a mixture of multiple topics, and each topic is a probability distribution over the vocabulary. Each word that each pixel represents in the video sequence is generated by one fixed topic; this process is exactly the process of document modeling, a bag-of-words model. Suppose there are K topic-word distributions, denoted $\vec{\varphi}_1, \vec{\varphi}_2, \dots, \vec{\varphi}_K$, each topic corresponding to the probability distribution of one word vector. For every document $d_m$ in a corpus $C = (d_1, d_2, \cdots, d_M)$ containing M documents there is a specific doc-topic distribution; that is, the topic-vector probability distribution corresponding to document $d_m$ is $\vec{\theta}_m$. The generation probability of each word $w$ in the m-th document $d_m$ is then
$$p(w\,|\,d_m) = \sum_{z=1}^{K} p(w\,|\,z)\, p(z\,|\,d_m) = \sum_{z=1}^{K} \varphi_{z,w}\, \theta_{m,z},$$
and the generation probability of the entire document is
$$p(\vec{w}\,|\,d_m) = \prod_{i=1}^{n_m} \sum_{z=1}^{K} \varphi_{z,w_i}\, \theta_{m,z},$$
where $n_m$ is the number of words in document $d_m$.

Since the documents are independent of one another, the generation probability of the entire corpus follows from the above formula, which yields the topic model; a locally optimal solution is then computed with the EM algorithm;

S8. Judging limb conflict behavior

The limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling combines low-dimensional data feature representation with model-based complex-scene analysis to analyze the video sequence. Based on the human body positions detected in the video and the changes in body-position information during an action, a global motion model independent of body parts is learned. By analyzing the global motion model, the detection results are compared with the model parameters to judge the human motion state. In the invention each behavior corresponds to one topic distribution; with a trained model, if limb conflict occurs in a tested video segment, this behavior will be concentrated in one topic, and from that topic the behavior is determined to be a state in which limb conflict occurs.
CN201711366304.XA 2017-12-18 2017-12-18 Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling Active CN108108688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711366304.XA CN108108688B (en) 2017-12-18 2017-12-18 Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711366304.XA CN108108688B (en) 2017-12-18 2017-12-18 Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling

Publications (2)

Publication Number Publication Date
CN108108688A true CN108108688A (en) 2018-06-01
CN108108688B CN108108688B (en) 2021-11-23

Family

ID=62209950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711366304.XA Active CN108108688B (en) 2017-12-18 2017-12-18 Limb conflict behavior detection method based on low-dimensional space-time feature extraction and topic modeling

Country Status (1)

Country Link
CN (1) CN108108688B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242826A (en) * 2018-08-07 2019-01-18 高龑 Mobile equipment end stick-shaped object root counting method and system based on target detection
CN110659363A (en) * 2019-07-30 2020-01-07 浙江工业大学 Web service mixed evolution clustering method based on membrane computing
CN111160170A (en) * 2019-12-19 2020-05-15 青岛联合创智科技有限公司 Self-learning human behavior identification and anomaly detection method
CN111707375A (en) * 2020-06-10 2020-09-25 青岛联合创智科技有限公司 Electronic class card with intelligent temperature measurement attendance and abnormal behavior detection functions
CN113705274A (en) * 2020-05-20 2021-11-26 杭州海康威视数字技术股份有限公司 Climbing behavior detection method and device, electronic equipment and storage medium
CN117372969A (en) * 2023-12-08 2024-01-09 暗物智能科技(广州)有限公司 Monitoring scene-oriented abnormal event detection method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268495A (en) * 2013-05-31 2013-08-28 公安部第三研究所 Human body behavioral modeling identification method based on priori knowledge cluster in computer system
CN103530603A (en) * 2013-09-24 2014-01-22 杭州电子科技大学 Video abnormality detection method based on causal loop diagram model
CN103995915A (en) * 2014-03-21 2014-08-20 中山大学 Crowd evacuation simulation system based on composite potential energy field
CN104268546A (en) * 2014-05-28 2015-01-07 苏州大学 Dynamic scene classification method based on topic model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268495A (en) * 2013-05-31 2013-08-28 公安部第三研究所 Human body behavioral modeling identification method based on priori knowledge cluster in computer system
CN103530603A (en) * 2013-09-24 2014-01-22 杭州电子科技大学 Video abnormality detection method based on causal loop diagram model
CN103995915A (en) * 2014-03-21 2014-08-20 中山大学 Crowd evacuation simulation system based on composite potential energy field
CN104268546A (en) * 2014-05-28 2015-01-07 苏州大学 Dynamic scene classification method based on topic model

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
TIMOTHY HOSPEDALES ET AL.: "A Markov Clustering Topic Model for Mining Behaviour in Video", 《2009 IEEE 12TH INTERNATIONAL CONFERENCE ON COMPUTER VISION》 *
HU YUAN ET AL.: "Pedestrian abnormal behavior recognition based on trajectory analysis", 《Computer Engineering and Science》 *
ZHAO CHUNHUI ET AL.: 《Analysis of Moving Targets in Video Images》, 30 June 2011, National Defense Industry Press *
ZHAO LIANG ET AL.: "Application of topic models in video abnormal behavior detection", 《Computer Science》 *
HUANG XIANPING: "Research on semantic topic feature extraction and behavior analysis of crowd motion", 《China Doctoral Dissertations Full-text Database, Information Science and Technology》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242826A (en) * 2018-08-07 2019-01-18 高龑 Mobile device end label shape object radical method of counting and system based on target detection
CN109242826B (en) * 2018-08-07 2022-02-22 高龑 Mobile equipment end stick-shaped object root counting method and system based on target detection
CN110659363A (en) * 2019-07-30 2020-01-07 浙江工业大学 Web service mixed evolution clustering method based on membrane computing
CN110659363B (en) * 2019-07-30 2021-11-23 浙江工业大学 Web service mixed evolution clustering method based on membrane computing
CN111160170A (en) * 2019-12-19 2020-05-15 青岛联合创智科技有限公司 Self-learning human behavior identification and anomaly detection method
CN111160170B (en) * 2019-12-19 2023-04-21 青岛联合创智科技有限公司 Self-learning human behavior recognition and anomaly detection method
CN113705274A (en) * 2020-05-20 2021-11-26 杭州海康威视数字技术股份有限公司 Climbing behavior detection method and device, electronic equipment and storage medium
CN113705274B (en) * 2020-05-20 2023-09-05 杭州海康威视数字技术股份有限公司 Climbing behavior detection method and device, electronic equipment and storage medium
CN111707375A (en) * 2020-06-10 2020-09-25 青岛联合创智科技有限公司 Electronic class card with intelligent temperature measurement attendance and abnormal behavior detection functions
CN111707375B (en) * 2020-06-10 2021-07-09 青岛联合创智科技有限公司 Electronic class card with intelligent temperature measurement attendance and abnormal behavior detection functions
CN117372969A (en) * 2023-12-08 2024-01-09 暗物智能科技(广州)有限公司 Monitoring scene-oriented abnormal event detection method
CN117372969B (en) * 2023-12-08 2024-05-10 暗物智能科技(广州)有限公司 Monitoring scene-oriented abnormal event detection method

Also Published As

Publication number Publication date
CN108108688B (en) 2021-11-23

Similar Documents

Publication Publication Date Title
CN108108688A (en) A kind of limbs conflict behavior detection method based on the extraction of low-dimensional space-time characteristic with theme modeling
Nawaratne et al. Spatiotemporal anomaly detection using deep learning for real-time video surveillance
Eyjolfsdottir et al. Detecting social actions of fruit flies
CN101894276B (en) Training method of human action recognition and recognition method
Hu et al. A weakly supervised framework for abnormal behavior detection and localization in crowded scenes
Nair et al. A review on Indian sign language recognition
CN104281853A (en) Behavior identification method based on 3D convolution neural network
Serpush et al. Complex human action recognition using a hierarchical feature reduction and deep learning-based method
Mittelman et al. Weakly supervised learning of mid-level features with Beta-Bernoulli process restricted Boltzmann machines
CN103500456A (en) Object tracking method and equipment based on dynamic Bayes model network
Morozov et al. Development of the logic programming approach to the intelligent monitoring of anomalous human behaviour
Kumar et al. Bird species classification from images using deep learning
Hajji et al. Incidents prediction in road junctions using artificial neural networks
Hao et al. Human behavior analysis based on attention mechanism and LSTM neural network
Zhao et al. A unified framework with a benchmark dataset for surveillance event detection
Khokher et al. Crowd behavior recognition using dense trajectories
Bhaltilak et al. Human motion analysis with the help of video surveillance: a review
Pagariya et al. Facial emotion recognition in videos using hmm
Fajar et al. Real time human activity recognition using convolutional neural network and deep gated recurrent unit
Jasmine et al. Study on recent approaches for human action recognition in real time
Bisoi et al. Human Activity Recognition Using CTAL Model
Zhang Spatial-Temporal Behavior Understanding
Loungani et al. Vision Based Vehicle-Pedestrian Detection and Warning System
Gao et al. The use of optimised SVM method in human abnormal behaviour detection
Liu et al. Abnormal behavior recognition based on improved Gaussian mixture model and hierarchical detectors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 266200 Household No. 8, Qingda Third Road, Laoshan District, Qingdao City, Shandong Province

Applicant after: QINGDAO LIANHE CHUANGZHI TECHNOLOGY Co.,Ltd.

Address before: Room 1204, No. 40, Hong Kong Middle Road, Shinan District, Qingdao, Shandong 266200

Applicant before: QINGDAO LIANHE CHUANGZHI TECHNOLOGY Co.,Ltd.

GR01 Patent grant