CN114638808A - Multi-scene video jitter detection method based on video monitoring - Google Patents

Multi-scene video jitter detection method based on video monitoring

Info

Publication number
CN114638808A
CN114638808A (application CN202210284410.8A)
Authority
CN
China
Prior art keywords
value
formula
image
video
corner
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210284410.8A
Other languages
Chinese (zh)
Inventor
马丕明
刘学孔
左修洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202210284410.8A priority Critical patent/CN114638808A/en
Publication of CN114638808A publication Critical patent/CN114638808A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00Diagnosis, testing or measuring for television systems or their details
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20164Salient point detection; Corner detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a multi-scene video jitter detection method based on video monitoring, which comprises the following steps: acquiring two adjacent frames of a conference monitoring picture; searching for and matching feature points between the two acquired frames; if the number of matched feature points is smaller than a set threshold, judging that the meeting place has been switched and skipping picture jitter detection for this pair of frames; if the number of matched feature points is larger than the set threshold, judging that the meeting place has not been switched, calculating the row projection and the column projection of the two frames, performing correlation operations on the row projections and on the column projections respectively, taking the extreme value in the correlation value curve to obtain the displacement generated between the two frames, and taking the absolute value of the displacement vector as its magnitude; when the displacement is larger than a set threshold, judging that the video shakes. The invention can accurately and efficiently judge whether a video-conference picture shakes. The method adapts to multi-scene monitoring situations, avoids jitter-detection misjudgment caused by scene switching, and has a small calculation amount, a good detection effect and a wide range of use scenes.

Description

Multi-scene video jitter detection method based on video monitoring
Technical Field
The invention relates to a multi-scene video jitter detection method based on video monitoring, and belongs to the technical field of computer vision.
Background
Monitoring systems have become an indispensable part of daily life, and their importance in banks, shopping malls, companies, schools, residential areas, public transportation and other fields is self-evident; they play an increasingly important role as cities are built up and companies develop. In particular, in recent years the outbreak of the COVID-19 epidemic has required epidemic-prevention measures that avoid large gatherings of people in a single area, so multi-meeting-place monitoring systems have been applied more and more widely. A multi-meeting-place monitoring system transmits the monitoring pictures of several sub-meeting-places into a main monitoring picture in turn, so as to meet the polling requirement of each meeting place. To guarantee the quality of the monitoring video, a series of problem detections needs to be carried out on the polled video, and video picture jitter detection is a very critical link.
Chinese patent document CN106385580B discloses a video jitter detection method based on image gray distribution characteristics, in which the video frame jitter detection proceeds as follows: step 1: intercept two adjacent frames of images in the video; step 2: convert the two intercepted frames into gray-scale space; step 3: count the gray value of each row and each column of the previous frame and calculate the row gray mean, row gray variance, column gray mean and column gray variance; step 4: count the gray value of each row and each column of the next frame and calculate the row gray mean, row gray variance, column gray mean and column gray variance; step 5: perform a row hypothesis test on the row gray means and row gray variances of the two frames obtained in steps 3 and 4 to obtain a row test factor; step 6: perform a column hypothesis test on the column gray means and column gray variances obtained in steps 3 and 4 to obtain a column test factor; step 7: compare the row test factor and the column test factor calculated in steps 5 and 6 with a given threshold, and if either exceeds the threshold, judge the previous frame to be a jittered frame; step 8: count the proportion of jittered frames in the whole video, and if the proportion exceeds a set jitter threshold, judge that the video jitters. The method mainly utilizes the gray features of the image and judges whether the video picture shakes by testing the row and column gray means and variances of the two frames, so it has a small calculation amount and good real-time performance. However, the method is highly limited and is not suitable for the current multi-meeting-place monitoring scenario: scene switching causes the gray features of the image to change drastically, which directly leads to misjudgment by the algorithm.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a multi-scene video jitter detection method based on video monitoring, so as to solve the problem that the jitter detection is mostly applied to a single scene at present.
The technical scheme of the invention is as follows:
a video jitter detection method for multiple meeting places based on video monitoring comprises the following steps:
step 1: detecting characteristic points of the two acquired frames of images by a Shi-Tomasi corner detection method;
step 2: judging whether meeting place switching occurs or not through corner matching; the method comprises the following steps: if the number of the matched angular points is less than a set threshold value, judging that the meeting place is switched, and not carrying out picture jitter detection at the moment; if the number of the matched angular points is larger than a set threshold value, judging that the meeting place is not switched, and entering the step 3;
and step 3: judging whether the picture shakes or not according to the gray characteristic of the image; the method comprises the following steps: respectively calculating the row projection and the column projection of two frames of images, respectively carrying out correlation operation on the row projection and the column projection, taking an extreme value in a correlation value curve as displacement generated between the two frames, and taking a vector absolute value as the magnitude of the displacement; and when the displacement distance is larger than the set threshold value, judging that the video shakes.
According to the invention, the step 1 is preferably realized by the following steps:
step 1.1: calculating the pixel value variation E (u, v) inside a window when the window moves towards x and y directions simultaneously in a Cartesian rectangular coordinate system with the upper left corner of the captured video frame image as an origin, the right side as an x axis and the downward side as a y axis;
step 1.2: for each window, respectively calculating a corresponding angular point response function R;
step 1.3, setting a threshold value threshold, carrying out threshold value judgment on the calculated angular point response function, if R is larger than the threshold value, indicating that the window corresponds to an angular point characteristic, and the pixel corresponds to an angular point, otherwise, neglecting the pixel point, and detecting the next pixel point.
Further preferably, the step 1.1 specifically comprises the following steps:
enabling the center of a window to be located at any position (x, y) of a gray level image of any frame image in a video acquired by video monitoring, wherein the gray level value of a pixel at the position is I (x, y), if the window moves towards the x direction and the y direction by small displacement u and v respectively, the window moves to a new position (x + u, y + v), the gray level value of the pixel at the position is I (x + u, y + v), and I (x + u, y + v) -I (x, y) refers to the change value of the gray level value caused by window movement;
defining ω(x, y) as a window function at position (x, y) that represents the weight of each pixel within the window; the weight of all pixels within the window may be set to 1, or ω(x, y) may be set to a Gaussian distribution centered on the window center; if the pixel at the center point of the window is the corner point, the weight coefficient of the center point is set to 1, indicating that it contributes most to the gray-scale change; the farther a point is from the window center, i.e. from the corner point, the smaller its gray-scale change, so its weight coefficient is set between 0 and 1 and approaches 0 with increasing distance from the corner point, indicating a smaller contribution to the gray-scale change;
the formula for calculating the variation E (u, v) of the pixel value inside the window is shown in formula (I):
E(u, v) ≈ ∑(x,y) ω(x, y)(uIx + vIy)² = [u v]·M·[u v]ᵀ (I)
in formula (I), Ix and Iy are the partial derivatives of I, i.e. the gradient maps of the image in the x and y directions:
Ix = ∂I/∂x, Iy = ∂I/∂y
the matrix M is:
M = ∑(x,y) ω(x, y)[Ix² IxIy; IxIy Iy²] → R⁻¹[λ1 0; 0 λ2]R (matrix rows separated by semicolons)
the right side of the arrow is the result of diagonalizing this real symmetric matrix, where R is a rotation factor that does not affect the variation components in the two orthogonal directions;
after diagonalization, the variation components in the two orthogonal directions are extracted, namely λ1 and λ2, which are the two eigenvalues of the matrix M.
Further preferably, the step 1.2 specifically comprises the following steps:
directly using the smaller eigenvalue as the corner response function R, as shown in formula (II):
R=min(λ1,λ2) (II)
in formula (II), λ1 and λ2 are the eigenvalues of the matrix M.
Further preferably, in step 1.3, threshold is 15.
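As an illustration only, the following Python sketch (assuming the OpenCV and NumPy libraries are available) detects Shi-Tomasi corner points with cv2.goodFeaturesToTrack; the corner count, quality level and minimum distance are illustrative assumptions rather than values fixed by the method, and the relative quality level plays the role of the fixed threshold of step 1.3:

import cv2
import numpy as np

def detect_corners(frame_bgr, max_corners=200, quality=0.01, min_dist=10):
    # step 1: Shi-Tomasi corners; each pixel is scored with R = min(lambda1, lambda2)
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                      qualityLevel=quality,
                                      minDistance=min_dist, blockSize=3)
    if corners is None:
        return np.empty((0, 2), dtype=np.float32)
    return corners.reshape(-1, 2)  # (N, 2) array of (x, y) corner coordinates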
According to the invention, the step 2 is preferably realized by the following steps:
intercepting two adjacent frames of the video, the former frame being defined as the reference frame and the latter frame as the current frame; taking a corner point of the reference frame image and finding the two corner points in the current frame image with the smallest Euclidean distances to this reference feature point; if the Euclidean distance of the nearest of these two corner points divided by the Euclidean distance of the second-nearest corner point is less than a threshold T, the nearest corner point is successfully matched with the reference corner point, otherwise the match fails;
repeating the operation until all the corner points detected in step 1 have been subjected to corner point matching, taking T as 0.4-0.6;
if the matching degree of all the corner points of the two frames reaches more than 90%, the meeting place switching does not occur, otherwise, the meeting place switching is judged to occur.
Further preferably, the Euclidean distance ρ1 between a reference frame corner point and its nearest corner point in the current frame is calculated by formula (III):
ρ1 = √[(x1 - x2)² + (y1 - y2)²] (III)
in formula (III), x1, y1 are the row and column coordinates of the reference frame corner point, and x2, y2 are the row and column coordinates of its nearest corner point in the current frame;
the Euclidean distance ρ2 between the reference frame corner point and the second-nearest corner point in the current frame is shown in formula (IV):
ρ2 = √[(x1 - x3)² + (y1 - y3)²] (IV)
in formula (IV), x3, y3 are the row and column coordinates of the current frame corner point that is second-nearest to the reference frame corner point;
the formula for the ratio is shown in formula (V):
ratio = ρ1/ρ2 (V)
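A minimal Python sketch of the ratio test of formulas (III)-(V) is given below, assuming the corner points of the two frames are compared directly by their coordinates; the value T = 0.5 is one point in the 0.4-0.6 range stated above and is an illustrative choice:

import numpy as np

def match_corners(ref_pts, cur_pts, T=0.5):
    # count reference corners whose nearest/second-nearest distance ratio is below T
    matched = 0
    for p in ref_pts:
        d = np.linalg.norm(cur_pts - p, axis=1)  # Euclidean distances, formulas (III) and (IV)
        if d.size < 2:
            continue
        rho1, rho2 = np.sort(d)[:2]              # nearest and second-nearest corner distances
        if rho2 > 0 and rho1 / rho2 < T:         # ratio test, formula (V)
            matched += 1
    return matched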
according to the invention, the step 3 is preferably realized by the following steps:
step 3.1: calculating the total pixel value Rowk(i) of each row of the image, as shown in formula (VI):
Rowk(i) = ∑(j=1 to m) Ik(i, j) (VI)
in formula (VI), k represents the k-th frame image, i represents the i-th row, m represents the number of columns of the image, and Ik(i, j) is the gray value of the pixel in row i and column j;
step 3.2: calculating the row average gray value Rowk of the whole image, as shown in formula (VII):
Rowk = [∑Rowk(i)]/n (VII)
in formula (VII), n represents the total number of rows of the image;
step 3.3: subtracting the row average gray value Rowk of the image from the total pixel value Rowk(i) of each row to obtain the corrected row projection value Rowprojectk(i) of the k-th frame image, as shown in formula (VIII):
Rowprojectk(i) = Rowk(i) - Rowk (VIII)
step 3.4: calculating the total pixel value Colk(j) of each column of the image, as shown in formula (IX):
Colk(j) = ∑(i=1 to n) Ik(i, j) (IX)
in formula (IX), k denotes the k-th frame image, j denotes the j-th column, and n denotes the total number of rows of the image;
step 3.5: calculating the column average gray value Colk of the whole image, as shown in formula (X):
Colk = [∑Colk(j)]/m (X)
in formula (X), m represents the total number of columns of the image;
step 3.6: subtracting the column average gray value Colk of the image from the total pixel value Colk(j) of each column to obtain the corrected column projection value Colprojectk(j) of the k-th frame image, as shown in formula (XI):
Colprojectk(j) = Colk(j) - Colk (XI)
the row and column gray projection curves of the image are drawn from the corrected projection values calculated in step 3.3 and step 3.6;
step 3.7: after the row and column gray projection curves of the current frame and the reference frame have been calculated, cross-correlation operations are performed on the column projections and on the row projections of the two frames respectively; the extreme value in the correlation value curve determines the displacement of the image frame, and the absolute value of the displacement vector gives its magnitude; the cross-correlation operations are shown in formula (XII) and formula (XIII):
Rx(w) = ∑j [Colpre(j + w - 1) - Colref(p + j)]², 1 ≤ w ≤ 2p + 1 (XII)
Ry(v) = ∑i [Rowpre(i + v - 1) - Rowref(q + i)]², 1 ≤ v ≤ 2q + 1 (XIII)
in formulas (XII) and (XIII), Rx(w) and Ry(v) denote the correlation operations performed on the processed column and row projections respectively, Colpre(j) is the gray projection value of the j-th column of the current frame image, Colref(j) is the gray projection value of the j-th column of the reference frame, and p and q are the one-side search lengths of the current frame relative to the reference frame;
the values of w and v at which Rx(w) and Ry(v) reach their extreme values are denoted wmin and vmin; the displacement vectors of the current frame relative to the reference frame in the horizontal and vertical directions are then given by formula (XIV) and formula (XV):
dx = m + 1 - wmin (XIV)
dy = n + 1 - vmin (XV)
in formulas (XIV) and (XV), dx represents the displacement of the current frame relative to the reference frame in the horizontal direction, and dy represents the displacement of the current frame relative to the reference frame in the vertical direction;
when dx or dy is greater than the set threshold T1, it is judged that the video picture shakes; otherwise, it is judged that the video picture does not shake.
Further preferably, T1 is selected according to the applicable scene and takes a value of 5-30.
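The following Python sketch illustrates step 3 with NumPy. Because the exact indexing of formulas (XII) and (XIII) is only given in the patent figures, the sum-of-squared-difference search below is one common gray-projection form and an assumption rather than a transcription; the search half-width p, the threshold T1 and the use of a signed shift in place of the wmin/vmin offsets of formulas (XIV)-(XV) are illustrative choices:

import numpy as np

def projection(gray):
    # mean-corrected row and column gray projections, formulas (VI)-(XI)
    row = gray.sum(axis=1).astype(np.float64)   # Rowk(i)
    col = gray.sum(axis=0).astype(np.float64)   # Colk(j)
    return row - row.mean(), col - col.mean()   # Rowprojectk, Colprojectk

def shift_1d(cur, ref, p=20):
    # displacement of cur relative to ref along one axis, searched over [-p, p]
    n, best_w, best_cost = len(cur), 0, np.inf
    for w in range(-p, p + 1):
        a = cur[w:] if w >= 0 else cur[:n + w]
        b = ref[:n - w] if w >= 0 else ref[-w:]
        cost = np.mean((a - b) ** 2)             # correlation value curve
        if cost < best_cost:
            best_cost, best_w = cost, w          # extreme value gives the displacement
    return best_w

def is_shaking(ref_gray, cur_gray, T1=10, p=20):
    row_r, col_r = projection(ref_gray)
    row_c, col_c = projection(cur_gray)
    dy = shift_1d(row_c, row_r, p)               # vertical displacement dy
    dx = shift_1d(col_c, col_r, p)               # horizontal displacement dx
    return abs(dx) > T1 or abs(dy) > T1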
The beneficial effects of the invention are as follows:
the invention provides a multi-scene video jitter detection method based on video monitoring, which can effectively identify whether a video picture is converted or not through corner matching and detection, thereby being suitable for multi-scene monitoring situations, avoiding the situation of jitter detection misjudgment caused by scene conversion, and judging whether the video picture is jittered or not through the gray level characteristics of images. The method improves the traditional single scene video jitter detection mode, so that the method has wider application range, stronger anti-interference capability and better detection effect.
Drawings
FIG. 1 is a schematic flow chart of a video jitter detection method for multiple meeting places based on video surveillance according to the present invention;
FIG. 2 is a schematic diagram of corner point matching when no video scene is switched;
FIG. 3 is a schematic diagram of corner point matching when a video scene is switched;
FIG. 4(a) is a diagram illustrating a video scene I;
FIG. 4(b) is a schematic column projection of the image of FIG. 4 (a);
FIG. 4(c) is a schematic line projection of the image of FIG. 4 (a);
FIG. 5(a) is a schematic view of a video scene II;
FIG. 5(b) is a schematic column projection of the image of FIG. 5 (a);
fig. 5(c) is a schematic line projection of the image of fig. 5 (a).
Detailed Description
The invention is further described below, but not limited thereto, with reference to the drawings and examples of the specification.
Example 1
A video monitoring-based multi-meeting-place video jitter detection method detects feature points through a Shi-Tomasi corner detection method, judges whether meeting-place switching occurs or not through feature point matching, and finally judges whether a picture jitters or not through the gray scale features of the image, as shown in figure 1, the method comprises the following steps:
step 1: detecting characteristic points (angular points) of the two acquired frames of images by a Shi-Tomasi angular point detection method;
step 2: judging whether meeting place switching occurs or not through corner matching; the method comprises the following steps: if the number of the matched angular points is less than a set threshold value, judging that the meeting place is switched, and not carrying out picture jitter detection at the moment; if the number of the matched angular points is larger than a set threshold value, judging that the meeting place is not switched, and entering the step 3;
step 3: judging whether the picture shakes or not according to the gray characteristic of the image; the method comprises the following steps: respectively calculating the row projection and the column projection of the two frames of images, respectively carrying out correlation operations on the row projections and the column projections, taking the extreme value in the correlation value curve as the displacement generated between the two frames, and taking the absolute value of the displacement vector as the magnitude of the displacement; and when the displacement distance is greater than the set threshold value, judging that the video shakes.
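A minimal end-to-end Python sketch of this flow, built from the hypothetical helper functions detect_corners, match_corners and is_shaking sketched elsewhere in this text, could look as follows; the 90% matching degree follows the text, while the remaining parameters are illustrative assumptions:

import cv2

def check_frame_pair(ref_bgr, cur_bgr, match_degree=0.9, T1=10):
    ref_pts = detect_corners(ref_bgr)            # step 1: Shi-Tomasi corner detection
    cur_pts = detect_corners(cur_bgr)
    if len(ref_pts) == 0:
        return "no corners detected"
    matched = match_corners(ref_pts, cur_pts)    # step 2: corner matching
    if matched / len(ref_pts) < match_degree:    # meeting place switched
        return "scene switch: jitter detection skipped"
    ref_gray = cv2.cvtColor(ref_bgr, cv2.COLOR_BGR2GRAY)
    cur_gray = cv2.cvtColor(cur_bgr, cv2.COLOR_BGR2GRAY)
    # step 3: gray projection correlation and displacement threshold
    return "jitter" if is_shaking(ref_gray, cur_gray, T1=T1) else "stable"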
Example 2
The method for detecting the video jitter of multiple meeting places based on video monitoring in the embodiment 1 is characterized in that:
when one window moves in the image smooth area, the image gray scale is not changed; the window moves in the direction of the edge, and the image gray scale is not changed; the window moves at the corner points, causing a significant change in the image grey scale. The Shi-Tomas corner detection utilizes the intuitive physical phenomenon, and judges whether the corner is a corner or not according to the change degree of the window in each direction. The concrete implementation steps of the step 1 comprise:
step 1.1: calculating the pixel value variation E (u, v) inside a window when the window moves towards x and y directions simultaneously in a Cartesian rectangular coordinate system with the upper left corner of the captured video frame image as an origin, the right side as an x axis and the downward side as a y axis; the method specifically comprises the following steps:
enabling the center of a window to be located at any position (x, y) of a gray level image of any frame image in a video acquired by video monitoring, wherein the gray level value of a pixel at the position is I (x, y), if the window moves towards the x direction and the y direction by small displacement u and v respectively, the window moves to a new position (x + u, y + v), the gray level value of the pixel at the position is I (x + u, y + v), and I (x + u, y + v) -I (x, y) refers to the change value of the gray level value caused by window movement;
defining ω(x, y) as a window function at position (x, y) that represents the weight of each pixel within the window; the weight of all pixels within the window may be set to 1, or ω(x, y) may be set to a Gaussian distribution (a binary normal distribution) centered on the window center; if the pixel at the center point of the window is a corner point, the gray value at the window center changes strongly before and after the window moves, so the weight coefficient of the center point is set to 1, indicating that it contributes most to the gray-scale change; the farther a point is from the window center, i.e. from the corner point, the smaller its gray-scale change, so its weight coefficient is set between 0 and 1 and approaches 0 with increasing distance from the corner point, indicating a smaller contribution to the gray-scale change;
the amount of change in the gray value of the pixel caused by the window moving in each direction (u, v) is as follows:
E(u, v) = ∑(x,y) ω(x, y)[I(x + u, y + v) - I(x, y)]²
For a corner point, E(u, v) will be very large. Thus, this function can be maximized to obtain the corner points in the image. However, computing E(u, v) directly in this form is slow, so a first-order Taylor expansion is used to obtain an approximate form of the formula.
The first-order Taylor expansion in two dimensions is:
T(x, y) ≈ f(u, v) + (x - u)fx(u, v) + (y - v)fy(u, v) + …
Applying this expansion to I(x + u, y + v) gives:
I(x + u, y + v) ≈ I(x, y) + uIx + vIy
where Ix and Iy are the partial derivatives of I, which in the image are the gradient maps in the x and y directions.
Ix = ∂I/∂x, Iy = ∂I/∂y
The derivation continues as follows:
E(u, v) ≈ ∑(x,y) ω(x, y)(uIx + vIy)²
Factoring u and v out of the sum gives the final form; the formula for calculating the variation E(u, v) of the pixel values inside the window is shown in formula (I):
E(u, v) ≈ ∑(x,y) ω(x, y)(uIx + vIy)² = [u v]·M·[u v]ᵀ (I)
in formula (I), Ix and Iy are the partial derivatives of I, i.e. the gradient maps of the image in the x and y directions,
Ix = ∂I/∂x, Iy = ∂I/∂y
the matrix M is:
M = ∑(x,y) ω(x, y)[Ix² IxIy; IxIy Iy²] → R⁻¹[λ1 0; 0 λ2]R (matrix rows separated by semicolons)
the right side of the arrow is the result of diagonalizing this real symmetric matrix, where R is a rotation factor that does not affect the variation components in the two orthogonal directions;
after diagonalization, the variation components in the two orthogonal directions are extracted, namely λ1 and λ2, which are the two eigenvalues of the matrix M;
step 1.2: for each window, respectively calculating a corresponding angular point response function R;
having obtained the final form of E (u, v) from the derivation of step 1.1, it is then necessary to use the eigenvalues to find those windows that will cause large changes in the grey value. The stability of the corner is related to the smaller eigenvalue of the matrix M, and then the smaller eigenvalue is directly used as the corner response function R, as shown in equation (II):
R=min(λ1,λ2) (II)
in formula (II), λ1 and λ2 are the eigenvalues of the matrix M;
step 1.3: the result of Shi-Tomasi corner detection is a gray image of corner response values R; a threshold is set (threshold = 15 in this method) and the calculated corner response function is compared against it; if R > threshold, the window corresponds to a corner feature and the pixel is a corner point; otherwise, the pixel is ignored and the next pixel is examined.
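The corner response of formulas (I) and (II) can also be computed directly. The Python sketch below is an illustration only: it assumes Sobel gradients, a uniform window weight ω(x, y) = 1 and an arbitrary threshold, builds the entries of M over the window and takes the smaller eigenvalue as R:

import cv2
import numpy as np

def min_eigenvalue_response(gray, win=3):
    g = gray.astype(np.float32)
    Ix = cv2.Sobel(g, cv2.CV_32F, 1, 0, ksize=3)            # gradient map in x
    Iy = cv2.Sobel(g, cv2.CV_32F, 0, 1, ksize=3)            # gradient map in y
    # entries of M summed over the window with uniform weights
    Sxx = cv2.boxFilter(Ix * Ix, cv2.CV_32F, (win, win), normalize=False)
    Syy = cv2.boxFilter(Iy * Iy, cv2.CV_32F, (win, win), normalize=False)
    Sxy = cv2.boxFilter(Ix * Iy, cv2.CV_32F, (win, win), normalize=False)
    # eigenvalues of the 2x2 symmetric matrix [[Sxx, Sxy], [Sxy, Syy]]
    half_trace = (Sxx + Syy) / 2.0
    delta = np.sqrt(((Sxx - Syy) / 2.0) ** 2 + Sxy ** 2)
    return half_trace - delta                                # R = min(lambda1, lambda2)

# corner pixels could then be taken as np.argwhere(min_eigenvalue_response(gray) > threshold)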
The concrete implementation steps of the step 2 comprise:
intercepting two adjacent frames of the video, the former frame being defined as the reference frame and the latter frame as the current frame; taking a corner point of the reference frame image and finding the two corner points in the current frame image with the smallest Euclidean distances to this reference feature point; if the Euclidean distance of the nearest of these two corner points divided by the Euclidean distance of the second-nearest corner point is less than a threshold T, the nearest corner point is successfully matched with the reference corner point, otherwise the match fails;
repeating the operation until all the corner points detected in step 1 have been subjected to corner point matching, taking T as 0.4-0.6;
if the matching degree of all the corner points of the two frames reaches more than 90%, the meeting place switching does not occur, otherwise, the meeting place switching is judged to occur.
The Euclidean distance ρ1 between a reference frame corner point and its nearest corner point in the current frame is calculated by formula (III):
ρ1 = √[(x1 - x2)² + (y1 - y2)²] (III)
in formula (III), x1, y1 are the row and column coordinates of the reference frame corner point, and x2, y2 are the row and column coordinates of its nearest corner point in the current frame;
the Euclidean distance ρ2 between the reference frame corner point and the second-nearest corner point in the current frame is shown in formula (IV):
ρ2 = √[(x1 - x3)² + (y1 - y3)²] (IV)
in formula (IV), x3, y3 are the row and column coordinates of the current frame corner point that is second-nearest to the reference frame corner point;
the formula for the ratio is shown in formula (V):
ratio = ρ1/ρ2 (V)
the most obvious characteristic when the video shakes is that the whole displacement can be generated between frames, and after the displacement is detected, whether the video shakes is judged through further logic, therefore, basically, the shaking of the video is carried out around how to detect the displacement, the invention calculates the row and column displacement quantity based on the gray characteristic of the image of the video picture, and further judges whether the video picture shakes, and the specific implementation step of the step 3 comprises the following steps:
step 3.1: calculating the total pixel value Rowk(i) of each row of the image, as shown in formula (VI):
Rowk(i) = ∑(j=1 to m) Ik(i, j) (VI)
in formula (VI), k represents the k-th frame image, i represents the i-th row, m represents the number of columns of the image, and Ik(i, j) is the gray value of the pixel in row i and column j;
step 3.2: calculating the row average gray value Rowk of the whole image, as shown in formula (VII):
Rowk = [∑Rowk(i)]/n (VII)
in formula (VII), n represents the total number of rows of the image;
step 3.3: subtracting the row average gray value Rowk of the image from the total pixel value Rowk(i) of each row to obtain the corrected row projection value Rowprojectk(i) of the k-th frame image, as shown in formula (VIII):
Rowprojectk(i) = Rowk(i) - Rowk (VIII)
step 3.4: calculating the total pixel value Colk(j) of each column of the image, as shown in formula (IX):
Colk(j) = ∑(i=1 to n) Ik(i, j) (IX)
in formula (IX), k denotes the k-th frame image, j denotes the j-th column, and n denotes the total number of rows of the image;
step 3.5: calculating the column average gray value Colk of the whole image, as shown in formula (X):
Colk = [∑Colk(j)]/m (X)
in formula (X), m represents the total number of columns of the image;
step 3.6: subtracting the column average gray value Colk of the image from the total pixel value Colk(j) of each column to obtain the corrected column projection value Colprojectk(j) of the k-th frame image, as shown in formula (XI):
Colprojectk(j) = Colk(j) - Colk (XI)
the row and column gray projection curves of the image are drawn from the corrected projection values calculated in step 3.3 and step 3.6;
step 3.7: after the row and column gray projection curves of the current frame and the reference frame have been calculated, cross-correlation operations are performed on the column projections and on the row projections of the two frames respectively; the extreme value in the correlation value curve determines the displacement of the image frame, and the absolute value of the displacement vector gives its magnitude; the cross-correlation operations are shown in formula (XII) and formula (XIII):
Rx(w) = ∑j [Colpre(j + w - 1) - Colref(p + j)]², 1 ≤ w ≤ 2p + 1 (XII)
Ry(v) = ∑i [Rowpre(i + v - 1) - Rowref(q + i)]², 1 ≤ v ≤ 2q + 1 (XIII)
in formulas (XII) and (XIII), Rx(w) and Ry(v) denote the correlation operations performed on the processed column and row projections respectively, Colpre(j) is the gray projection value of the j-th column of the current frame image, Colref(j) is the gray projection value of the j-th column of the reference frame, and p and q are the one-side search lengths of the current frame relative to the reference frame;
the values of w and v at which Rx(w) and Ry(v) reach their extreme values are denoted wmin and vmin; the displacement vectors of the current frame relative to the reference frame in the horizontal and vertical directions are then given by formula (XIV) and formula (XV):
dx = m + 1 - wmin (XIV)
dy = n + 1 - vmin (XV)
in formulas (XIV) and (XV), dx represents the displacement of the current frame relative to the reference frame in the horizontal direction, and dy represents the displacement of the current frame relative to the reference frame in the vertical direction;
when dx or dy is greater than the set threshold T1, it is judged that the video picture shakes; otherwise, it is judged that it does not shake. T1 is selected according to the applicable scene and takes a value of 5-30.
FIG. 2 is a schematic diagram of corner point matching when no video scene is switched; FIG. 3 is a schematic diagram of corner matching when a video scene is switched; as can be seen from fig. 2 and 3, when the monitored scene is not changed, the corners of the two frames of images captured from the video are substantially completely matched, and when the monitored scene is changed, the corners of the two frames of images captured from the video are only partially matched, and whether the video scene is switched or not is determined according to the matching degree of the corners of the two frames of images.
FIG. 4(a) is a diagram illustrating a video scene I; FIG. 4(b) is a schematic column projection of the image of FIG. 4(a); FIG. 4(c) is a schematic line projection of the image of FIG. 4(a); FIG. 5(a) is a schematic view of a video scene II; FIG. 5(b) is a schematic column projection of the image of FIG. 5(a); FIG. 5(c) is a schematic line projection of the image of FIG. 5(a). After the row and column projections of the images are obtained, the formulas in step 3 are used to perform correlation operations on the row projections and on the column projections respectively; the extreme value in the correlation value curve gives the displacement generated between the two frames, and the absolute value of the displacement vector gives its magnitude. When the displacement is larger than the set threshold, it is judged that the video shakes.
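As a small usage note, row and column projection curves of the kind shown in FIG. 4(b)-(c) and FIG. 5(b)-(c) can be reproduced with the projection helper sketched earlier; the file name below is hypothetical and the matplotlib library is assumed to be available:

import cv2
import matplotlib.pyplot as plt

frame = cv2.imread("meeting_scene.png", cv2.IMREAD_GRAYSCALE)  # hypothetical test frame
row_proj, col_proj = projection(frame)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3))
ax1.plot(col_proj)
ax1.set_title("column gray projection")   # compare FIG. 4(b) / FIG. 5(b)
ax2.plot(row_proj)
ax2.set_title("row gray projection")      # compare FIG. 4(c) / FIG. 5(c)
plt.show()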

Claims (9)

1. A video jitter detection method for multiple meeting places based on video monitoring is characterized by comprising the following steps:
step 1: detecting characteristic points of the two acquired frames of images by a Shi-Tomasi corner detection method;
step 2: judging whether meeting place switching occurs or not through corner matching; the method comprises the following steps: if the number of the matched angular points is less than a set threshold value, judging that the meeting place is switched, and not carrying out picture jitter detection at the moment; if the number of the matched angular points is larger than a set threshold value, judging that the meeting place is not switched, and entering the step 3;
and step 3: judging whether the picture shakes or not according to the gray characteristics of the image; the method comprises the following steps: respectively calculating the row projection and the column projection of two frames of images, respectively carrying out correlation operation on the row projection and the column projection, taking an extreme value in a correlation value curve as displacement generated between the two frames, and taking a vector absolute value as the magnitude of the displacement; and when the displacement distance is larger than the set threshold value, judging that the video shakes.
2. The method for detecting the video jitter of multiple meeting places based on video surveillance as claimed in claim 1, wherein the specific implementation steps of step 1 include:
step 1.1: calculating the pixel value variation E (u, v) inside a window when the window moves towards x and y directions simultaneously in a Cartesian rectangular coordinate system with the upper left corner of the captured video frame image as an origin, the right side as an x axis and the downward side as a y axis;
step 1.2: for each window, respectively calculating a corresponding angular point response function R;
step 1.3, setting a threshold value threshold, carrying out threshold value judgment on the calculated angular point response function, if R is larger than the threshold value, indicating that the window corresponds to an angular point characteristic, and the pixel corresponds to an angular point, otherwise, neglecting the pixel point, and detecting the next pixel point.
3. The video shake detection method for multiple meeting places based on video surveillance as claimed in claim 2, wherein the step 1.1 includes the following steps;
enabling the center of a window to be located at any position (x, y) of a gray level image of any frame image in a video acquired by video monitoring, wherein the gray level value of a pixel at the position is I (x, y), if the window moves towards the x direction and the y direction by small displacement u and v respectively, the window moves to a new position (x + u, y + v), the gray level value of the pixel at the position is I (x + u, y + v), and I (x + u, y + v) -I (x, y) refers to the change value of the gray level value caused by window movement;
defining ω(x, y) as a window function at position (x, y) that represents the weight of each pixel within the window; the weight of all pixels within the window may be set to 1, or ω(x, y) may be set to a Gaussian distribution centered on the window center; if the pixel at the center point of the window is the corner point, the weight coefficient of the center point is set to 1, indicating that it contributes most to the gray-scale change; the farther a point is from the window center, i.e. from the corner point, the smaller its gray-scale change, so its weight coefficient is set between 0 and 1 and approaches 0 with increasing distance from the corner point, indicating a smaller contribution to the gray-scale change;
the formula for calculating the variation E (u, v) of the pixel value inside the window is shown in formula (I):
E(u, v) ≈ ∑(x,y) ω(x, y)(uIx + vIy)² = [u v]·M·[u v]ᵀ (I)
in formula (I), Ix and Iy are the partial derivatives of I, i.e. the gradient maps of the image in the x and y directions:
Ix = ∂I/∂x, Iy = ∂I/∂y
the matrix M is:
M = ∑(x,y) ω(x, y)[Ix² IxIy; IxIy Iy²] → R⁻¹[λ1 0; 0 λ2]R (matrix rows separated by semicolons)
the right side of the arrow is the result of diagonalizing this real symmetric matrix, where R is a rotation factor that does not affect the variation components in the two orthogonal directions;
after diagonalization, the variation components in the two orthogonal directions are extracted, namely λ1 and λ2, which are the two eigenvalues of the matrix M.
4. The video shake detection method for multiple meeting places based on video monitoring as claimed in claim 2, wherein the step 1.2 comprises the following steps:
directly using the smaller eigenvalue as the corner response function R, as shown in formula (II):
R=min(λ1,λ2) (II)
in formula (II), λ1 and λ2 are the eigenvalues of the matrix M.
5. The method for detecting multi-meeting-place video jitter based on video surveillance as claimed in claim 2, wherein in step 1.3, threshold is 15.
6. The method for detecting the video jitter of multiple meeting places based on video surveillance as claimed in claim 1, wherein the step 2 is implemented by the following steps:
intercepting two adjacent frames of the video, the former frame being defined as the reference frame and the latter frame as the current frame; taking a corner point of the reference frame image and finding the two corner points in the current frame image with the smallest Euclidean distances to this reference feature point; if the Euclidean distance of the nearest of these two corner points divided by the Euclidean distance of the second-nearest corner point is less than a threshold T, the nearest corner point is successfully matched with the reference corner point, otherwise the match fails;
repeating the operation until all the detected corner points in the step 1 are subjected to corner point matching, and taking T as 0.4-0.6;
if the matching degree of all the corner points of the two frames reaches more than 90%, the meeting place switching does not occur, otherwise, the meeting place switching is judged to occur.
7. The method as claimed in claim 6, wherein the Euclidean distance ρ1 between a reference frame corner point and its nearest corner point in the current frame is calculated by formula (III):
ρ1 = √[(x1 - x2)² + (y1 - y2)²] (III)
in formula (III), x1, y1 are the row and column coordinates of the reference frame corner point, and x2, y2 are the row and column coordinates of its nearest corner point in the current frame;
the Euclidean distance ρ2 between the reference frame corner point and the second-nearest corner point in the current frame is shown in formula (IV):
ρ2 = √[(x1 - x3)² + (y1 - y3)²] (IV)
in formula (IV), x3, y3 are the row and column coordinates of the current frame corner point that is second-nearest to the reference frame corner point;
the formula for the ratio is shown in formula (V):
ratio = ρ1/ρ2 (V)
8. the video shake detection method for multiple meeting places based on video surveillance as claimed in any one of claims 1-7, wherein the step 3 is implemented by the following steps:
step 3.1: calculating the total pixel value Rowk(i) of each row of the image, as shown in formula (VI):
Rowk(i) = ∑(j=1 to m) Ik(i, j) (VI)
in formula (VI), k represents the k-th frame image, i represents the i-th row, m represents the number of columns of the image, and Ik(i, j) is the gray value of the pixel in row i and column j;
step 3.2: calculating the row average gray value Rowk of the whole image, as shown in formula (VII):
Rowk = [∑Rowk(i)]/n (VII)
in formula (VII), n represents the total number of rows of the image;
step 3.3: subtracting the row average gray value Rowk of the image from the total pixel value Rowk(i) of each row to obtain the corrected row projection value Rowprojectk(i) of the k-th frame image, as shown in formula (VIII):
Rowprojectk(i) = Rowk(i) - Rowk (VIII)
step 3.4: calculating the total pixel value Colk(j) of each column of the image, as shown in formula (IX):
Colk(j) = ∑(i=1 to n) Ik(i, j) (IX)
in formula (IX), k denotes the k-th frame image, j denotes the j-th column, and n denotes the total number of rows of the image;
step 3.5: calculating the column average gray value Colk of the whole image, as shown in formula (X):
Colk = [∑Colk(j)]/m (X)
in formula (X), m represents the total number of columns of the image;
step 3.6: subtracting the column average gray value Colk of the image from the total pixel value Colk(j) of each column to obtain the corrected column projection value Colprojectk(j) of the k-th frame image, as shown in formula (XI):
Colprojectk(j) = Colk(j) - Colk (XI)
the row and column gray projection curves of the image are drawn from the corrected projection values calculated in step 3.3 and step 3.6;
step 3.7: after the row and column gray projection curves of the current frame and the reference frame have been calculated, cross-correlation operations are performed on the column projections and on the row projections of the two frames respectively; the extreme value in the correlation value curve determines the displacement of the image frame, and the absolute value of the displacement vector gives its magnitude; the cross-correlation operations are shown in formula (XII) and formula (XIII):
Rx(w) = ∑j [Colpre(j + w - 1) - Colref(p + j)]², 1 ≤ w ≤ 2p + 1 (XII)
Ry(v) = ∑i [Rowpre(i + v - 1) - Rowref(q + i)]², 1 ≤ v ≤ 2q + 1 (XIII)
in formulas (XII) and (XIII), Rx(w) and Ry(v) denote the correlation operations performed on the processed column and row projections respectively, Colpre(j) is the gray projection value of the j-th column of the current frame image, Colref(j) is the gray projection value of the j-th column of the reference frame, and p and q are the one-side search lengths of the current frame relative to the reference frame;
the values of w and v at which Rx(w) and Ry(v) reach their extreme values are denoted wmin and vmin; the displacement vectors of the current frame relative to the reference frame in the horizontal and vertical directions are then given by formula (XIV) and formula (XV):
dx = m + 1 - wmin (XIV)
dy = n + 1 - vmin (XV)
in formulas (XIV) and (XV), dx represents the displacement of the current frame relative to the reference frame in the horizontal direction, and dy represents the displacement of the current frame relative to the reference frame in the vertical direction;
when dx or dy is greater than the set threshold T1, it is judged that the video picture shakes; otherwise, it is judged that the video picture does not shake.
9. The video monitoring-based multi-meeting-place video shake detection method according to claim 8, wherein T1 is selected according to the applicable scene and takes a value of 5-30.
CN202210284410.8A 2022-03-22 2022-03-22 Multi-scene video jitter detection method based on video monitoring Pending CN114638808A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210284410.8A CN114638808A (en) 2022-03-22 2022-03-22 Multi-scene video jitter detection method based on video monitoring

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210284410.8A CN114638808A (en) 2022-03-22 2022-03-22 Multi-scene video jitter detection method based on video monitoring

Publications (1)

Publication Number Publication Date
CN114638808A true CN114638808A (en) 2022-06-17

Family

ID=81949541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210284410.8A Pending CN114638808A (en) 2022-03-22 2022-03-22 Multi-scene video jitter detection method based on video monitoring

Country Status (1)

Country Link
CN (1) CN114638808A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114674415A (en) * 2022-05-25 2022-06-28 合肥安迅精密技术有限公司 Method and system for testing jitter of suction nozzle rod of XY motion platform
WO2024055762A1 (en) * 2022-09-14 2024-03-21 支付宝(杭州)信息技术有限公司 Video jitter detection method and apparatus, and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination