CN107341815B - Violent motion detection method based on multi-view stereoscopic vision scene flow - Google Patents
- Publication number
- CN107341815B CN107341815B CN201710404056.7A CN201710404056A CN107341815B CN 107341815 B CN107341815 B CN 107341815B CN 201710404056 A CN201710404056 A CN 201710404056A CN 107341815 B CN107341815 B CN 107341815B
- Authority
- CN
- China
- Prior art keywords
- motion
- scene
- flow
- dimensional
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/215—Motion-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/251—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/292—Multi-camera tracking
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a violent motion detection method based on multi-view stereoscopic vision scene flow. Step one: acquire multiple groups of image sequences with a calibrated multi-view camera. Step two: preprocess the image sequences. Step three: design the data term of the scene flow energy functional. Step four: design the smoothing term of the scene flow energy functional. Step five: optimize and solve the energy functional, computing with the model from the lowest-resolution image of the image pyramid obtained in step two up to the full-resolution image. Step six: cluster the scene flow motion regions. Step seven: construct a motion direction dispersion evaluation model and judge whether the motion is violent. Step eight: construct a kinetic energy evaluation model for the motion region. Step nine: set thresholds and trigger an alarm when n consecutive frames satisfy the evaluation conditions. The invention adopts scene flow estimation based on multi-view stereo vision, acquiring multiple groups of image sequences of the same scene with a calibrated multi-view camera. Violent motion can be detected effectively using the 3-dimensional scene flow.
Description
Technical Field
The invention relates to a method for detecting violent motion, and in particular to a violent motion detection method based on multi-view stereoscopic vision scene flow.
Background
With the rapid development of information technology, and in particular the breakthroughs in computer vision and artificial intelligence, much work that once had to be done manually can now be done by a computer. In video surveillance, for example, the most common mode of operation is for a person to watch the monitor display and then react to any abnormal event. Because a person cannot stay focused for long on all the events occurring in a video, missed and false alarms inevitably occur. It is therefore very important to use a computer to process the video frames and determine whether an abnormal event has occurred.
A video surveillance camera is typically fixed in position, i.e. object detection takes place against a static background. The classical methods for object detection in a static background are background subtraction, inter-frame differencing and optical flow. Background subtraction has a small computational cost and can update the background model as the dynamic background changes, but it is strongly affected by background variation. Inter-frame differencing also has a small computational cost but performs poorly in stability and robustness. These two methods can hardly achieve satisfactory results for violent motion detection. The optical flow method computes an optical flow field from two adjacent frames; the computed flow field is 2-dimensional, i.e. it retains only planar motion information while depth information is lost. Without depth information, violent motion is difficult to evaluate and judge, and false alarms are easily triggered.
The scene flow contains 3-dimensional motion information and 3-dimensional surface depth information; in contrast to the three common methods described above, it describes the true motion of object surfaces. The scene flow thus provides enough information to judge whether a motion is violent, i.e. it can effectively solve the problem of violent motion judgment.
Disclosure of Invention
The invention aims to provide a violent motion detection method based on multi-view stereoscopic vision scene flow with strong detection adaptability.
The purpose of the invention is realized as follows:
step one: acquiring multiple groups of image sequences with a calibrated multi-view camera;
step two: preprocessing the image sequences, performing multi-resolution down-sampling on the image sequences with an image pyramid, converting coordinate systems according to the camera's intrinsic and extrinsic parameters, and establishing the relation between the image coordinate system and the camera coordinate system;
step three: designing the data term of the scene flow energy functional, directly fusing the 3-dimensional scene flow information and the 3-dimensional surface depth information, designing the data term on the basis of the structure tensor constancy assumption, and introducing a robust penalty function at the same time;
step four: designing the smoothing term of the scene flow energy functional, the smoothing term adopting flow-driven anisotropic smoothing that simultaneously constrains the 3-dimensional flow field V(u, v, w) and the 3-dimensional surface depth Z, and likewise introducing a robust penalty function;
step five: optimizing and solving the energy functional, minimizing the energy functional to obtain the Euler-Lagrange equations and then solving them, computing with the model from the lowest-resolution image of the image pyramid obtained in step two up to the full-resolution image;
step six: clustering the scene flow motion regions with a clustering algorithm, separating the motion regions from the background region, and removing the background region;
step seven: constructing a motion direction dispersion evaluation model and judging whether the motion is violent;
step eight: constructing a kinetic energy evaluation model for the motion region;
step nine: setting thresholds and triggering an alarm when n consecutive frames satisfy the evaluation conditions.
The present invention may further comprise:
1. In step one, multiple groups of image sequences are acquired with the calibrated multi-view camera, from which the scene flow V(u, v, w) and the depth information Z are subsequently obtained.
2. In step two, in establishing the relation between the image coordinate system and the camera coordinate system, the relation between the 2-dimensional optical flow and the 3-dimensional scene flow is established by differentiating the perspective projection, giving u_2D = (f·u − (x − u_0)·w)/Z and v_2D = (f·v − (y − v_0)·w)/Z, where (u_2D, v_2D) is the 2-dimensional optical flow, f is the focal length, and (u_0, v_0) are the optical center coordinates.
3. The design of the data term described in step three specifically includes using the constancy assumption based on the structure tensor:
the structure tensor constancy assumption of the N cameras between times t and t+1 is defined as E_t(V, Z) = Σ_{i=0..N−1} o_i(x, y)·ψ(|I_T(p_i', t+1) − I_T(p_i, t)|²);
the structure tensor constancy assumption between the reference camera C_0 and the other N−1 cameras at time t is defined as E_s,t(Z) = Σ_{i=1..N−1} o_i(x, y)·ψ(|I_T(p_i, t) − I_T(p_0, t)|²);
the structure tensor constancy assumption between the reference camera C_0 and the other N−1 cameras at time t+1 is defined as E_s,t+1(V, Z) = Σ_{i=1..N−1} o_i(x, y)·ψ(|I_T(p_i', t+1) − I_T(p_0', t+1)|²);
in the above data term formulas ψ(s²) = √(s² + ε²) with ε = 0.0001 is a robust penalty function, making the penalty approximate the L1 norm; o(x, y) is a binary occlusion mask obtained by an occlusion boundary region detection technique for stereo images, with o(x, y) = 0 when the pixel is an occlusion point and o(x, y) = 1 for non-occluded points; I_T is the local structure tensor of the 2-dimensional image, formed from the products of the spatial derivatives of the image; p_i denotes the projection of the moving point into camera C_i at time t and p_i' its projection at time t+1.
4. The design of the scene flow energy functional smoothing term described in step four specifically includes directly regularizing the 3-dimensional flow field and the depth information and designing a flow-driven anisotropic smoothness assumption, with S_m(V) and S_d(Z) constraining the 3-dimensional flow field and the depth information respectively; the smoothing terms are designed as:
S_m(V) = ψ(|u_x|²) + ψ(|u_y|²) + ψ(|v_x|²) + ψ(|v_y|²) + ψ(|w_x|²) + ψ(|w_y|²)
S_d(Z) = ψ(|Z_x|²) + ψ(|Z_y|²)
where the subscripts x and y denote partial derivatives of the fields u(x, y), v(x, y), w(x, y) and Z(x, y); the overall scene flow estimation energy functional is
E(Z, V) = ∫ [E_t(V, Z) + E_s,t(Z) + E_s,t+1(V, Z) + α·S_m(V) + β·S_d(Z)] dx dy
with α and β weighting the smoothing terms.
5. The clustering of the scene flow motion regions in step six specifically includes clustering the scene flow V(u, v, w) obtained in step five with a clustering algorithm and separating the background from the motion regions; the feature information of the scene flow specifically includes: the three components u, v, w of the scene flow at each point; the scene flow magnitude at each point, |V| = √(u² + v² + w²); and the included angles θ_x, θ_y, θ_z between the scene flow at each point and the xoy, xoz and yoz planes; each point is represented by a 7-dimensional feature vector V_{i,j} = (u, v, w, |V|, θ_x, θ_y, θ_z);
the specific process is as follows: the input is the similarity matrix S_{N×N} formed by the pairwise similarities of all N data points; the initial stage treats all samples as potential cluster centers; then, to find a suitable cluster center x_k, the attraction degree r(i, k) and the reliability degree a(i, k) are continuously collected from the data samples, and the update formulas
r(i, k) ← s(i, k) − max_{k'≠k} {a(i, k') + s(i, k')}
a(i, k) ← min{0, r(k, k) + Σ_{i'∉{i,k}} max{0, r(i', k)}},  a(k, k) ← Σ_{i'≠k} max{0, r(i', k)}
are iterated continuously to update the attraction and reliability until m cluster center points are generated, where r(i, k) describes the degree to which point k is suitable as the cluster center of data point i, and a(i, k) describes the degree to which point i should select point k as its cluster center;
a flag is set for each region: flag = 1 for a motion region and flag = 0 for the background region; the number of pixels in the motion region is counted as count, and the motion region is denoted as the spatial neighborhood Ω.
6. The construction of the motion direction dispersion evaluation model described in step seven specifically includes defining the Z axis of the camera coordinate system as the reference vector direction and calculating the included angle φ_{i,j}(t) between each motion vector and the reference direction:
φ_{i,j}(t) = arccos( w_{i,j}(t) / √(u_{i,j}(t)² + v_{i,j}(t)² + w_{i,j}(t)²) )
and then calculating the variance D(φ_{i,j}(t)) of φ_{i,j}(t) over the pixel points of each frame's motion region,
D(φ_{i,j}(t)) = (1/count)·Σ_{(i,j)∈Ω} (φ_{i,j}(t) − φ̄(t))²
where φ̄(t) is the average of all the included angles.
7. The kinetic energy of each frame's motion region is calculated from the computed scene flow, treating each pixel as unit mass:
W(t) = ½·Σ_{(i,j)∈Ω} (u_{i,j}(t)² + v_{i,j}(t)² + w_{i,j}(t)²)
and the average kinetic energy of the motion region is calculated from the total kinetic energy of each frame's motion region as W̄(t) = W(t)/count.
8. In step nine, an angle variance threshold φ_th and a kinetic energy threshold W_th are set; when D(φ_{i,j}(t)) > φ_th and W̄(t) > W_th hold for n consecutive frames, the motion is judged to be violent and an alarm is triggered.
The invention adopts scene flow estimation based on multi-view stereo vision, acquiring multiple groups of image sequences of the same scene with a calibrated multi-view camera. The invention obtains the scene flow information of the multi-view scene sequences together with the 3-dimensional surface depth information of the scene, and can effectively detect violent motion using the 3-dimensional scene flow.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 shows a stereo correspondence relationship between image sequences acquired by the multi-view camera.
FIG. 3 is a flow chart of an algorithm for solving a scene flow.
Detailed Description
With reference to Fig. 1, the violent motion detection based on multi-view stereoscopic vision scene flow of the present invention mainly includes the following steps:
s1, a plurality of groups of image sequences are obtained by using a calibrated multi-view camera.
S2, preprocessing the input images: the image sequences are down-sampled at multiple resolutions with an image pyramid, the coordinate systems are converted according to the camera's intrinsic and extrinsic parameters, and the relation between the image coordinate system and the camera coordinate system is established.
S3, designing the data term of the scene flow energy functional. Unlike most previous approaches, which constrain a combination of optical flow and disparity, the method directly fuses the 3-dimensional scene flow information and the 3-dimensional surface depth information. The data term is designed using the constancy assumption based on the structure tensor, and a robust penalty function is introduced at the same time.
S4, designing the smoothing term of the scene flow energy functional. The smoothing term adopts flow-driven anisotropic smoothing that simultaneously constrains the 3-dimensional flow field V(u, v, w) and the 3-dimensional surface depth Z, and the robust penalty function is likewise introduced into the smoothing term.
S5, optimization solution of the energy functional. To solve for the 3-dimensional motion V(u, v, w) and the 3-dimensional surface depth Z, the energy functional must be minimized to obtain the Euler-Lagrange equations, which are then solved. A coarse-to-fine multi-resolution computation scheme is introduced to handle the large displacements present in the scene flow. Computation with the model starts from the lowest-resolution image of the image pyramid obtained in S2 and proceeds until the full-resolution image is reached.
S6, clustering the scene flow motion regions. The motion regions are clustered with a clustering algorithm, separated from the background region, and the background region is removed, which facilitates the construction of the subsequent violent motion judgment models.
S7, compared with a steady motion state, the motion directions of the 3-dimensional scene flow obtained in the target region in a violent motion state are disordered. On this basis, a motion direction dispersion evaluation model can be constructed to judge whether the motion is violent.
S8, compared with a steady motion state, the 3-dimensional scene flow magnitudes obtained in the target region in a violent motion state are larger. On this basis, a kinetic energy evaluation model of the motion region can be constructed.
S9, the corresponding thresholds are set manually, and an alarm is triggered when n consecutive frames satisfy the evaluation conditions.
The invention will now be described in more detail by way of example with reference to the accompanying drawings.
S1, as shown in Fig. 2, image sequences are acquired with a calibrated multi-view camera. A point in the real scene moves from position P at time t to position P' at time t+1; the corresponding points in the imaging plane of each camera C_i are p_i at time t and p_i' at time t+1. V(u, v, w) is the real-world 3-dimensional motion vector, where u is the instantaneous velocity in the real-world horizontal direction, v the instantaneous velocity in the vertical direction, and w the instantaneous velocity in the depth direction. Mapping V(u, v, w) to 2 dimensions gives the optical flow.
S2, the method directly estimates the 3-dimensional scene flow from multi-view stereo vision, constraining the real-world 3-dimensional motion flow field V(u, v, w) and the 3-dimensional surface depth Z directly in the energy functional. The scene flow energy functional is based on 2-dimensional plane images, so the 3-dimensional space must be mapped to the 2-dimensional space by a perspective projection transformation, establishing the mapping relation between the 2-dimensional optical flow and the 3-dimensional scene flow. I(x_i, y_i, t) is the pixel of camera C_i's image sequence at time t, and M_i is the projection matrix of camera C_i. The real coordinate P(X, Y, Z)^T in the camera coordinate system at time t is mapped to the image sequences by the relational expression:
(x_i, y_i)^T = [M_i]_{1,2}·(X, Y, Z, 1)^T / ([M_i]_3·(X, Y, Z, 1)^T)   (1)
where M_i is a 3×4 projection matrix, [M_i]_{1,2} denotes the first two rows of the matrix and [M_i]_3 its third row. The projection matrix is given by equation (2), M_i = C·[R T], where C is the camera intrinsic parameter matrix, which depends only on the internal structure of the camera, and [R T] is the camera extrinsic parameter matrix, determined by the orientation of the camera relative to the world coordinate system.
The scene flow solving energy functional obtained from this relation has 6 unknowns, P(X, Y, Z)^T and V(u, v, w). As shown in equation (3), the relation between X, Y and Z can be established, reducing the 6 unknowns to 4:
X = (x − o_x)·Z/f,  Y = (y − o_y)·Z/f   (3)
Z and V are then solved from the N image-sequence pairs, where (o_x, o_y) is the camera principal point and f the focal length.
The relation between the 2-dimensional optical flow (u_2D, v_2D) and the 3-dimensional scene flow V(u, v, w) is shown in equation (4):
u_2D = (f·u − (x − o_x)·w)/Z,  v_2D = (f·v − (y − o_y)·w)/Z   (4)
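By way of a non-limiting illustration, the mapping of equation (4) can be checked numerically by projecting a point before and after its 3-dimensional motion; the following Python sketch assumes a simple pinhole model with focal length f and principal point (ox, oy), and its function names are illustrative rather than part of the method:

```python
import numpy as np

def project(P, f, ox, oy):
    """Perspective projection of a camera-coordinate point P = (X, Y, Z)
    onto the image plane: x = f*X/Z + ox, y = f*Y/Z + oy."""
    X, Y, Z = P
    return np.array([f * X / Z + ox, f * Y / Z + oy])

def scene_flow_to_optical_flow(P, V, f, ox, oy):
    """2-D optical flow induced by the 3-D scene flow V = (u, v, w) at P:
    project the point before and after the motion and take the difference."""
    return project(P + V, f, ox, oy) - project(P, f, ox, oy)

# example: a point 5 m deep moving sideways and toward the camera
p = np.array([0.2, 0.1, 5.0])
flow3d = np.array([0.02, 0.0, -0.5])
print(scene_flow_to_optical_flow(p, flow3d, f=800.0, ox=320.0, oy=240.0))
```

For small displacements the printed 2-dimensional flow agrees with the differential relation of equation (4).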
and performing image pyramid on the obtained N image sequences, performing multi-resolution down-sampling on the images, wherein the sampling factor eta is 0.9, and performing Gaussian filtering on the images obtained in each layer to filter partial noise.
S3, designing the data term of the scene flow energy functional. The structure tensor constancy assumption is used in both the spatial and the temporal domain. From the structure tensor constancy assumption the following equations can be derived:
the structure tensor of the N cameras at the time t and t +1 is constantly defined as the formula (5):
reference camera C0The structure tensor constancy hypothesis definition at time t for a camera and other N-1 cameras is shown in equation (6):
reference camera C0The structure tensor constancy assumption definition at time t +1 for a camera and other N-1 cameras is shown in equation (7):
In the above data term formulas, ψ(s²) = √(s² + ε²) with ε = 0.0001 is a robust penalty function, making the penalty approximate the TV-L1 norm in order to reduce the influence of outliers on the solution of the functional. I_T is the local structure tensor of the 2-dimensional image, formed from the products of its spatial derivatives, as shown in equation (8).
o(x, y) is a binary occlusion mask whose role is to ignore occlusion-point pixels: o(x, y) = 0 when the pixel is an occlusion point and o(x, y) = 1 for non-occluded points. It is computed by an occlusion boundary region detection technique for stereo images, adopting an occlusion boundary detection algorithm based on a confidence map, which can effectively detect the occlusion regions between the reference camera C_0 and the other cameras.
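The ingredients of the data term can be sketched as follows: the local structure tensor of a 2-dimensional image, the robust penalty ψ with ε = 0.0001, and the occlusion-masked comparison. The unsmoothed gradient-product form of the tensor is an assumption of this sketch, since equation (8) is not reproduced here:

```python
import numpy as np

EPS = 1e-4  # the 0.0001 constant from the data-term design

def robust_penalty(s2):
    """psi(s^2) = sqrt(s^2 + eps^2): approximates the TV-L1 norm and
    reduces the influence of outliers on the functional."""
    return np.sqrt(s2 + EPS ** 2)

def structure_tensor(img):
    """Per-pixel local structure tensor of a 2-D image: the outer product
    of the spatial gradient, stored as (Ixx, Ixy, Iyy)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.stack([gx * gx, gx * gy, gy * gy], axis=-1)

def masked_data_cost(img_a, img_b, occ):
    """Structure-tensor constancy cost between two views or frames; the
    binary occlusion mask occ (0 at occlusion points) ignores those pixels."""
    diff2 = np.sum((structure_tensor(img_a) - structure_tensor(img_b)) ** 2, axis=-1)
    return float(np.sum(occ * robust_penalty(diff2)))
```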
S4, designing the smoothing term of the scene flow energy functional. It is assumed that, for the reference camera C_0, the depth information Z and the 3-dimensional flow field V(u, v, w) are piecewise smooth. The smoothing term directly regularizes the 3-dimensional flow field and the depth information; the flow field is smooth in 3-dimensional space, and a flow-driven anisotropic smoothness assumption is designed to guarantee the smoothness of the scene flow:
S_m(V) = ψ(|u_x|²) + ψ(|u_y|²) + ψ(|v_x|²) + ψ(|v_y|²) + ψ(|w_x|²) + ψ(|w_y|²)   (9)
S_d(Z) = ψ(|Z_x|²) + ψ(|Z_y|²)   (10)
S_m(V) and S_d(Z) constrain the 3-dimensional flow field and the depth information respectively, adopting anisotropic constraint smoothing based on 3-dimensional flow driving for the flow field. The entire energy functional can be written as shown in equation (11):
E(Z, V) = ∫ [E_t(V, Z) + E_s,t(Z) + E_s,t+1(V, Z) + α·S_m(V) + β·S_d(Z)] dx dy   (11)
with α and β weighting the smoothing terms.
s5, a solution scheme of the scene flow is adopted, namely the values of Z and V when the energy functional is minimized to the maximum extent are found. The common approach is to minimize the energy functional to obtain the Euler-Lagrange equation and then solve the Euler-Lagrange equation. The euler-lagrange equation after minimization of the energy functional can be written as:
before minimizing the energy functional, the structural tensor constancy assumption in the data item is abbreviated for simplicity as follows:
Δi=IT(p0,t)-IT(pi,t) (14)
according to the variation principle, the energy functional E (Z, V) is minimized, the partial derivatives of u and Z are respectively calculated and are equal to 0, and the following Euler-Lagrangian equation can be obtained:
for v, w minimizes the energy functional to obtain an Eulerian-Lagrangian equation similar to equation (16). The nonlinear problem exists in data items and smoothing items, and the most critical in the process of solving the scene stream is how to avoid trapping in a local minimum to obtain a global optimal solution.
Since violent motion is being detected, large-displacement motion will be present. To cope with the large displacements, a coarse-to-fine multi-resolution computation scheme is adopted. Using the image pyramid already obtained in S2, the initial value of the scene flow is set to 0, computation starts at the lowest resolution, and the result of each level is used as the initialization of the next resolution level until the full-resolution image is reached. This effectively eliminates the computational inaccuracy caused by large displacements. The specific solution procedure is shown in Fig. 3, where L is the image pyramid layer index: only V(u, v, w) is computed when L ≥ K, and both the scene flow V(u, v, w) and Z are computed when 0 < L < K.
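A sketch of this coarse-to-fine scheme follows; solve_level stands in for the per-level Euler-Lagrange solver and is a placeholder, as is the 4-channel state layout (u, v, w, Z):

```python
import cv2
import numpy as np

def coarse_to_fine_scene_flow(pyramids, K, solve_level):
    """Coarse-to-fine scheme of Fig. 3: start at the lowest resolution with
    the scene flow initialized to 0, solve at each level, and upsample the
    result as the initialization of the next (finer) level. Per the
    flowchart, only V(u, v, w) is solved for layers L >= K, while V and Z
    are solved for 0 < L < K."""
    n_layers = len(pyramids[0])
    state = None  # 4 channels: u, v, w, Z
    for level in range(n_layers - 1, -1, -1):
        images = [pyr[level] for pyr in pyramids]  # one image per camera
        h, w = images[0].shape[:2]
        if state is None:
            state = np.zeros((h, w, 4), dtype=np.float64)
        else:
            state = cv2.resize(state, (w, h), interpolation=cv2.INTER_LINEAR)
        state = solve_level(images, state, solve_depth=(level < K))
    return state
```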
S6, clustering of the 3-dimensional scene flow motion V(u, v, w). Because of noise and error, the scene flow computed in S5 is not necessarily zero-valued in the background region, and a non-zero background scene flow would affect the subsequent violent motion judgment. The motion regions are therefore clustered with a clustering algorithm, separating the background from the motion regions and excluding the background region, so that violent motion can be evaluated effectively.
The aim of the clustering algorithm is to find an optimal set of class-representative points such that the sum of the similarities of all data points to their nearest representative point is maximal. Briefly, the input to the algorithm is the similarity matrix S_{N×N} formed by the pairwise similarities of all N data points, and at start-up the algorithm considers all samples as potential cluster centers. For each sample point, the messages establishing its attraction to other sample points are defined as follows:
Attraction degree: r(i, k) describes the extent to which point k is suitable as the cluster center of data point i.
Reliability degree: a(i, k) describes the extent to which point i should select point k as its cluster center.
To find suitable cluster centers x_k, the algorithm continuously gathers the evidence r(i, k) and a(i, k) from the data samples. The iterative formulas for r(i, k) and a(i, k) are equations (18) and (19):
r(i, k) ← s(i, k) − max_{k'≠k} {a(i, k') + s(i, k')}   (18)
a(i, k) ← min{0, r(k, k) + Σ_{i'∉{i,k}} max{0, r(i', k)}},  a(k, k) ← Σ_{i'≠k} max{0, r(i', k)}   (19)
The algorithm iterates equations (18) and (19) to update the attraction and reliability until m high-quality cluster center points are generated, while the remaining data points are assigned to the corresponding clusters.
The computed scene flow comprises the scene flow of the background region and that of the moving objects, which differ significantly: the scene flow at each point differs in both magnitude and direction. The algorithm therefore takes the direction information and the magnitude information of the scene flow at each point as that point's features, forms the point's feature vector, and feeds it to the clustering algorithm for classification.
The feature information of the scene flow specifically includes: the three components u, v, w of the scene flow at each point; the scene flow magnitude at each point, |V| = √(u² + v² + w²); and the included angles θ_x, θ_y, θ_z between the scene flow at each point and the xoy, xoz and yoz planes. Each point is represented by the 7-dimensional feature vector V_{i,j} = (u, v, w, |V|, θ_x, θ_y, θ_z). The scene flow is clustered with these 7-dimensional feature vectors, and the resulting cluster regions comprise a background region and motion regions. In general, with a stationary camera, the cluster whose motion vectors are close to 0 is judged to belong to the background region, and the other cluster regions are motion regions.
After the motion regions and the background region are separated, a flag is set: flag = 1 for a motion region and flag = 0 for the background region. The number of pixels in the motion region is counted as count, and the motion region is denoted as the spatial neighborhood Ω.
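The feature assembly and the motion/background flagging can be sketched as follows; the arcsin angle convention and the near-zero tolerance zero_tol are assumptions of this illustration:

```python
import numpy as np

def scene_flow_features(u, v, w):
    """Per-pixel 7-D feature V_ij = (u, v, w, |V|, theta_x, theta_y, theta_z):
    flow components, magnitude, and included angles with the xoy, xoz and
    yoz planes (one plausible angle convention)."""
    mag = np.sqrt(u ** 2 + v ** 2 + w ** 2)
    safe = np.where(mag > 0, mag, 1.0)
    theta_x = np.arcsin(np.clip(w / safe, -1, 1))  # angle to the xoy plane
    theta_y = np.arcsin(np.clip(v / safe, -1, 1))  # angle to the xoz plane
    theta_z = np.arcsin(np.clip(u / safe, -1, 1))  # angle to the yoz plane
    return np.stack([u, v, w, mag, theta_x, theta_y, theta_z], axis=-1)

def motion_flag(u, v, w, labels, zero_tol=1e-2):
    """flag = 1 for motion clusters, flag = 0 for background: with a static
    camera, clusters whose mean flow magnitude is near zero are background."""
    mag = np.sqrt(u ** 2 + v ** 2 + w ** 2)
    flag = np.zeros(mag.shape, dtype=np.uint8)
    for lbl in np.unique(labels):
        if mag[labels == lbl].mean() > zero_tol:
            flag[labels == lbl] = 1
    return flag, int(flag.sum())  # the flag map and the pixel count
```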
S7, according to the motion region scene flow V(u, v, w) obtained in S6, suitable evaluation models are established to evaluate whether the motion is violent.
A motion direction evaluation model is established from the directional behavior of the scene flow. The motion region of the moving object in each frame, in the camera coordinate system, was separated in S6. For normal, steady motion, analysis of the motion-vector directions of the motion region shows that they concentrate mainly in one direction, whereas the direction distribution of violent motion is comparatively random. If a histogram of the motion-vector directions of the motion points is constructed, the histogram formed by violent motion is relatively dispersed while the histogram formed by steady motion is relatively concentrated.
To evaluate the direction of each motion vector quantitatively, the Z axis of the camera coordinate system is defined as the reference vector direction, and the direction of a motion vector is characterized by the included angle between each motion point of the motion region and the reference direction. From S5, each pixel point of the n-th frame has, in the camera coordinate system, horizontal velocity u_{i,j}(t), vertical velocity v_{i,j}(t) and depth-direction velocity w_{i,j}(t). Its included angle φ_{i,j}(t) with the reference vector is shown in equation (20):
φ_{i,j}(t) = arccos( w_{i,j}(t) / √(u_{i,j}(t)² + v_{i,j}(t)² + w_{i,j}(t)²) )   (20)
To determine whether the motion is violent, the variance D(φ_{i,j}(t)) of φ_{i,j}(t) over all motion points is calculated as shown in equation (21), where φ̄(t) is the mean of all included angles:
D(φ_{i,j}(t)) = (1/count)·Σ_{(i,j)∈Ω} (φ_{i,j}(t) − φ̄(t))²   (21)
S8, an evaluation model of the kinetic energy of the motion is established from the motion energy of the motion region. Treating each pixel as unit mass, the kinetic energy of each frame's motion region of the scene flow is calculated as:
W(t) = ½·Σ_{(i,j)∈Ω} (u_{i,j}(t)² + v_{i,j}(t)² + w_{i,j}(t)²)
From the total kinetic energy of each frame's motion region, the average kinetic energy of each pixel point in the motion region is calculated as W̄(t) = W(t)/count.
S9, the angle variance threshold φ_th and the per-pixel kinetic energy threshold W_th are set manually. From S7 and S8, when D(φ_{i,j}(t)) > φ_th and W̄(t) > W_th hold for n consecutive frames, the motion is judged to be abnormal, violent motion and an alarm is triggered.
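The n-consecutive-frame alarm logic can be sketched as follows; the class name and interface are illustrative:

```python
from collections import deque

class ViolentMotionAlarm:
    """Raise the alarm once n consecutive frames exceed both the angle
    variance threshold phi_th and the average kinetic energy threshold w_th."""

    def __init__(self, phi_th, w_th, n):
        self.phi_th, self.w_th = phi_th, w_th
        self.recent = deque(maxlen=n)

    def update(self, d_phi, mean_energy):
        """Feed one frame's D(phi) and average kinetic energy; return True
        when the alarm condition is met."""
        self.recent.append(d_phi > self.phi_th and mean_energy > self.w_th)
        return len(self.recent) == self.recent.maxlen and all(self.recent)
```

An instance is fed D(φ) and W̄ once per frame; the deque keeps only the last n decisions, so a single frame below either threshold resets the alarm condition.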
The invention uses the 3-dimensional scene flow for violent motion detection for the first time, and realizes an effective violent motion detection and alarm function.
Claims (8)
1. A violent motion detection method based on multi-view stereoscopic vision scene flow is characterized by comprising the following steps:
step one: acquiring multiple groups of image sequences with a calibrated multi-view camera;
step two: preprocessing the image sequences, performing multi-resolution down-sampling on the image sequences with an image pyramid, converting coordinate systems according to the camera's intrinsic and extrinsic parameters, and establishing the relation between the image coordinate system and the camera coordinate system;
step three: designing the data term of the scene flow energy functional, directly fusing the 3-dimensional scene flow information and the 3-dimensional surface depth information, designing the data term on the basis of the structure tensor constancy assumption, and introducing a robust penalty function at the same time;
step four: designing the smoothing term of the scene flow energy functional, the smoothing term adopting flow-driven anisotropic smoothing that simultaneously constrains the 3-dimensional flow field V(u, v, w) and the 3-dimensional surface depth Z, and likewise introducing a robust penalty function;
step five: optimizing and solving the energy functional, minimizing the energy functional to obtain the Euler-Lagrange equations and then solving them, computing with the model from the lowest-resolution image of the image pyramid obtained in step two up to the full-resolution image;
step six: clustering the scene flow motion regions with a clustering algorithm, separating the motion regions from the background region, and removing the background region;
step seven: constructing a motion direction dispersion evaluation model and judging whether the motion is violent;
step eight: constructing a kinetic energy evaluation model for the motion region;
step nine: setting thresholds and triggering an alarm when n consecutive frames satisfy the evaluation conditions.
2. The method of claim 1, wherein: in step two, in establishing the relation between the image coordinate system and the camera coordinate system, the relation between the 2-dimensional optical flow and the 3-dimensional scene flow is established by differentiating the perspective projection, giving u_2D = (f·u − (x − u_0)·w)/Z and v_2D = (f·v − (y − v_0)·w)/Z, where (u_2D, v_2D) is the 2-dimensional optical flow, f is the focal length, and (u_0, v_0) are the optical center coordinates.
3. The method of claim 1, wherein the design of the data term described in step three specifically includes using the constancy assumption based on the structure tensor:
the structure tensor constancy assumption of the N cameras between times t and t+1 is defined as E_t(V, Z) = Σ_{i=0..N−1} o_i(x, y)·ψ(|I_T(p_i', t+1) − I_T(p_i, t)|²);
the structure tensor constancy assumption between the reference camera C_0 and the other N−1 cameras at time t is defined as E_s,t(Z) = Σ_{i=1..N−1} o_i(x, y)·ψ(|I_T(p_i, t) − I_T(p_0, t)|²);
the structure tensor constancy assumption between the reference camera C_0 and the other N−1 cameras at time t+1 is defined as E_s,t+1(V, Z) = Σ_{i=1..N−1} o_i(x, y)·ψ(|I_T(p_i', t+1) − I_T(p_0', t+1)|²);
in the above data term formulas ψ(s²) = √(s² + ε²) is a robust penalty function, making the penalty approximate the L1 norm; o(x, y) is a binary occlusion mask obtained by an occlusion boundary region detection technique for stereo images, with o(x, y) = 0 when the pixel is an occlusion point and o(x, y) = 1 for non-occluded points; and I_T is the local structure tensor of the 2-dimensional image.
4. The method of claim 1, wherein the design of the scene flow energy functional smoothing term described in step four specifically includes directly regularizing the 3-dimensional flow field and the depth information and designing a flow-driven anisotropic smoothness assumption, with S_m(V) and S_d(Z) constraining the 3-dimensional flow field and the depth information respectively, the smoothing terms being designed as:
S_m(V) = ψ(|u_x|²) + ψ(|u_y|²) + ψ(|v_x|²) + ψ(|v_y|²) + ψ(|w_x|²) + ψ(|w_y|²)
S_d(Z) = ψ(|Z_x|²) + ψ(|Z_y|²)
and the overall scene flow estimation energy functional being
E(Z, V) = ∫ [E_t(V, Z) + E_s,t(Z) + E_s,t+1(V, Z) + α·S_m(V) + β·S_d(Z)] dx dy.
5. The method of claim 1, wherein the clustering of the scene flow motion regions in step six specifically includes clustering the scene flow V(u, v, w) obtained in step five with a clustering algorithm and separating the background from the motion regions, the feature information of the scene flow specifically including: the three components u, v, w of the scene flow at each point; the scene flow magnitude at each point, |V| = √(u² + v² + w²); and the included angles θ_x, θ_y, θ_z between the scene flow at each point and the xoy, xoz and yoz planes; each point being represented by a 7-dimensional feature vector V_{i,j} = (u, v, w, |V|, θ_x, θ_y, θ_z);
the specific process is as follows: the input is the similarity matrix S_{N×N} formed by the pairwise similarities of all N data points; the initial stage treats all samples as potential cluster centers; then, to find a suitable cluster center x_k, the attraction degree r(i, k) and the reliability degree a(i, k) are continuously collected from the data samples, and the formulas r(i, k) ← s(i, k) − max_{k'≠k} {a(i, k') + s(i, k')} and a(i, k) ← min{0, r(k, k) + Σ_{i'∉{i,k}} max{0, r(i', k)}}, a(k, k) ← Σ_{i'≠k} max{0, r(i', k)} are iterated continuously to update the attraction and reliability until m cluster center points are generated, where r(i, k) describes the degree to which point k is suitable as the cluster center of data point i, and a(i, k) describes the degree to which point i should select point k as its cluster center.
6. The method of claim 1, wherein the construction of the motion direction dispersion evaluation model described in step seven specifically includes defining the Z axis of the camera coordinate system as the reference vector direction and calculating the included angle φ_{i,j}(t) between each motion vector and the reference direction as
φ_{i,j}(t) = arccos( w_{i,j}(t) / √(u_{i,j}(t)² + v_{i,j}(t)² + w_{i,j}(t)²) )
and then calculating the variance D(φ_{i,j}(t)) of φ_{i,j}(t) over the pixel points of each frame's motion region,
D(φ_{i,j}(t)) = (1/count)·Σ_{(i,j)∈Ω} (φ_{i,j}(t) − φ̄(t))²
where φ̄(t) is the average of all the included angles.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201710404056.7A | 2017-06-01 | 2017-06-01 | Violent motion detection method based on multi-view stereoscopic vision scene flow
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341815A CN107341815A (en) | 2017-11-10 |
CN107341815B true CN107341815B (en) | 2020-10-16 |
Family
ID=60221390
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710404056.7A Active CN107341815B (en) | 2017-06-01 | 2017-06-01 | Violent motion detection method based on multi-view stereoscopic vision scene stream |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341815B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102707596B1 (en) * | 2018-08-07 | 2024-09-19 | 삼성전자주식회사 | Device and method to estimate ego motion |
CN109726718B (en) * | 2019-01-03 | 2022-09-16 | 电子科技大学 | Visual scene graph generation system and method based on relation regularization |
CN109978968B (en) * | 2019-04-10 | 2023-06-20 | 广州虎牙信息科技有限公司 | Video drawing method, device and equipment of moving object and storage medium |
CN112015170A (en) * | 2019-05-29 | 2020-12-01 | 北京市商汤科技开发有限公司 | Moving object detection and intelligent driving control method, device, medium and equipment |
CN112581494B (en) * | 2020-12-30 | 2023-05-02 | 南昌航空大学 | Binocular scene flow calculation method based on pyramid block matching |
CN112614151B (en) * | 2021-03-08 | 2021-08-31 | 浙江大华技术股份有限公司 | Motion event detection method, electronic device and computer-readable storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9659372B2 (en) * | 2012-05-17 | 2017-05-23 | The Regents Of The University Of California | Video disparity estimate space-time refinement method and codec |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104680544A (en) * | 2015-03-18 | 2015-06-03 | 哈尔滨工程大学 | Method for estimating variational scene flow based on three-dimensional flow field regularization |
CN106485675A (en) * | 2016-09-27 | 2017-03-08 | 哈尔滨工程大学 | A kind of scene flows method of estimation guiding anisotropy to smooth based on 3D local stiffness and depth map |
CN106504202A (en) * | 2016-09-27 | 2017-03-15 | 哈尔滨工程大学 | A kind of based on the non local smooth 3D scene flows methods of estimation of self adaptation |
Non-Patent Citations (2)
Title |
---|
A Variational Method for Scene Flow Estimation from Stereo Sequences; Frederic Huguet et al.; 2007 IEEE 11th International Conference on Computer Vision; 2007-10-21; pp. 1-17 *
Moving object detection and tracking based on binocular scene flow (基于双目场景流的运动目标检测与跟踪); Yang Wenkang; China Master's Theses Full-text Database, Information Science and Technology; 2017-02-15; I138-3458 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |