CN110602487A - Video image jitter detection method based on TSN (temporal segment network) - Google Patents

Video image jitter detection method based on TSN (temporal segment network) Download PDF

Info

Publication number
CN110602487A
CN110602487A
Authority
CN
China
Prior art keywords
video
optical flow
tsn
jitter
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910843031.6A
Other languages
Chinese (zh)
Other versions
CN110602487B (en)
Inventor
毛亮
王倩
李俊民
朱婷婷
王祥雪
谭焕新
侯玉清
刘双广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Electronic Science and Technology
Gosuncn Technology Group Co Ltd
Original Assignee
Xian University of Electronic Science and Technology
Gosuncn Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Electronic Science and Technology, Gosuncn Technology Group Co Ltd filed Critical Xian University of Electronic Science and Technology
Priority to CN201910843031.6A priority Critical patent/CN110602487B/en
Publication of CN110602487A publication Critical patent/CN110602487A/en
Application granted granted Critical
Publication of CN110602487B publication Critical patent/CN110602487B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00Diagnosis, testing or measuring for television systems or their details
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of video quality detection and specifically relates to a video image jitter detection method based on a TSN (temporal segment network), comprising the steps of: extracting a normal optical flow field and a distorted optical flow field with the TVL1 optical flow algorithm, following the TSN network structure; feeding the normal optical flow field and the distorted optical flow field into the TSN network; and determining, via the TSN network, whether the video jitters, and outputting the indices of the jittering frames. TSN-based video jitter detection overcomes the inability of traditional algorithms to cope with changing environments and long time ranges, and maintains high detection performance while reducing the amount of computation; the distorted optical flow introduced into the TSN network suppresses the interference of people or other objects moving in the video, making the jitter detection still more accurate.

Description

Video image jitter detection method based on TSN (temporal segment network)
Technical Field
The invention belongs to the technical field of video quality detection and specifically relates to a video image jitter detection method based on a TSN (temporal segment network).
Background
With the rapid development of safe-city security, monitoring systems are widely deployed in many fields, and the quality of the video they transmit is an important factor in whether they can fulfil their role; how to maintain monitoring systems efficiently is therefore an urgent problem in the field of video surveillance. Video image jitter is a fault that frequently occurs in monitoring systems: the video image shakes up-down and/or left-right because the camera is not firmly fixed or is disturbed by external forces or human actions, degrading the visual effect. Conventional video image jitter detection relies on traditional methods, most commonly the grey-level projection method and the optical flow method. The grey-level projection method simplifies the extraction of image distribution features: taking the pixel rows and columns of the two-dimensional image as units, it converts the image features into curves along the row and column coordinates, which makes the distribution features easier to compute; however, it is only suitable for jitter detection in fixed scenes under simple conditions, and its accuracy is relatively poor. The optical flow method first extracts feature points from the video and then tracks them with an optical flow algorithm, so it depends heavily on the quality of the feature point detection: if the current environment offers few corner points, the estimated displacement is wrong; obtaining a better result demands a larger amount of computation; and in real environments the optical flow easily produces wrong estimates on moving objects. Its robustness is poor, making it unsuitable for video jitter detection over long time ranges and in complex environments.
Existing techniques are basically based on traditional methods, but traditional algorithms generalise poorly: a given set of thresholds or rules usually applies only to specific scenes, and the accuracy of the algorithm drops or even fails entirely when the scene changes. Application scenarios and environments, however, are highly varied, which greatly increases the difficulty of implementing traditional algorithms. For example, Jiangei et al. proposed a video jitter detection algorithm based on forward-backward optical-flow point-matching motion entropy, which first extracts feature points from the video with the ORB (Oriented FAST and Rotated BRIEF) algorithm and then tracks them with a forward-backward optical flow algorithm. The algorithm has strong real-time processing capability but two shortcomings: first, the optical flow algorithm is sensitive to illumination changes and unsuitable for long-term tracking, so the algorithm cannot handle jitter detection in complex environments; second, it only judges whether the video jitters and does not output the frame numbers at which the jitter occurs.
Disclosure of Invention
To remedy these technical defects in the prior art, the invention provides a video image jitter detection method based on a TSN (temporal segment network).
The invention is realized by the following technical scheme:
A video image jitter detection method based on a TSN network comprises the steps of:
1) extracting a normal optical flow field and a distorted optical flow field with the TVL1 optical flow algorithm, following the TSN network structure;
2) feeding the normal optical flow field and the distorted optical flow field into the TSN network;
3) determining, via the TSN network, whether the video jitters, and outputting the indices of the jittering frames.
Further, the TSN network is composed of a spatial stream convolution network and a time stream convolution network.
Further, the normal optical flow field and the distorted optical flow field serve as the input for capturing motion information; when the live video contains many moving objects, the distorted optical flow field suppresses the object motion so that the network concentrates on the background motion in the video.
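The patent does not give a formula for the distorted optical flow field. As a loose illustration of the idea of suppressing independently moving objects so that only the background (camera) motion remains, one might clamp per-pixel flow vectors that deviate strongly from a robust global motion estimate. The function name, the median-based background estimate, and the 3x cutoff below are all assumptions for illustration, not the patent's actual method:

```python
import numpy as np

def distorted_flow(flow):
    """Suppress independent object motion in a dense optical-flow field.

    flow: (H, W, 2) array of per-pixel displacements.
    The dominant (background/camera) motion is estimated as the per-channel
    median; pixels whose flow deviates strongly from it (moving objects)
    are clamped back to the background estimate.
    """
    background = np.median(flow.reshape(-1, 2), axis=0)    # global motion estimate
    deviation = np.linalg.norm(flow - background, axis=2)  # per-pixel residual
    threshold = 3.0 * (np.median(deviation) + 1e-6)        # robust cutoff
    suppressed = flow.copy()
    suppressed[deviation > threshold] = background         # clamp outliers
    return suppressed
```

On a flow field that is mostly uniform camera motion with a small fast-moving patch, the patch is replaced by the global estimate while the background is left untouched.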
Further, in step 3), the TSN network decision proceeds as follows: given a video V, divide V at equal intervals into K segments {S1, S2, ..., SK}; the TSN network then models the sequence of snippets as:
TSN(T1, T2, ..., TK) = H(G(F(T1; W), F(T2; W), ..., F(TK; W)))
wherein (T1, T2, ..., TK) is the sequence of snippets, each snippet Tk sampled at random from its corresponding segment Sk; the function F(Tk; W) applies the convolutional network with parameters W to snippet Tk and returns Tk's score for the jitter category; G() is the segment consensus function; H() is the probability prediction function.
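The segment scheme above can be sketched as follows. Here `frame_scores` stands in for the ConvNet scores F(Tk; W) (in the real method a convolutional network scores an optical-flow snippet), mean aggregation plays the role of G, and a softmax the role of H; these stand-ins are assumptions consistent with the description, not the patent's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def tsn_predict(frame_scores, K):
    """TSN-style inference over one video.

    frame_scores: (N, C) per-frame class scores, standing in for F(T_k; W).
    The video is split into K equal segments S_1..S_K; one snippet T_k is
    sampled at random from each; G averages the K scores (segment consensus);
    H is a softmax giving the class probabilities.
    """
    N = frame_scores.shape[0]
    bounds = np.linspace(0, N, K + 1).astype(int)                  # segment borders
    picks = [rng.integers(bounds[k], bounds[k + 1]) for k in range(K)]
    G = frame_scores[picks].mean(axis=0)                           # consensus
    H = np.exp(G - G.max())                                        # stable softmax
    return H / H.sum()                                             # class probabilities
```

Whichever snippets are sampled, the sparse sampling touches only K frames of the video, which is the source of the computational saving claimed later in the description.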
Further, the segment consensus function G() combines the class scores of the individual snippets into a consensus on the class hypothesis; based on this consensus, the probability prediction function H() predicts the probability that the whole video belongs to the jitter class. Combined with the standard categorical cross-entropy loss, the final loss function over the segment consensus G takes the form:
L(y, G) = -Σ_{i=1..C} y_i (G_i - log Σ_{j=1..C} exp G_j)
wherein C is the total number of behavior classes; y_i is the ground truth of class i; G_i is the jitter class score inferred with the aggregation function g from the scores of the same class across all snippets.
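A minimal numeric sketch of this loss (the function name and NumPy formulation are illustrative; the equation itself is the standard cross-entropy over the consensus scores):

```python
import numpy as np

def tsn_loss(G, y):
    """Cross-entropy loss over the segment consensus scores G (shape (C,)).

    y: one-hot ground truth. Implements
        L(y, G) = -sum_i y_i * (G_i - log sum_j exp(G_j))
    """
    log_norm = np.log(np.sum(np.exp(G)))
    return float(-np.sum(y * (G - log_norm)))
```

Because the loss is taken over the consensus G rather than over any single snippet, gradients flow back through every sampled snippet, which is how the model parameters are learned from the whole video.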
Preferably, C = 1, the single class being jitter.
A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the TSN-network-based video image jitter detection method when executing the program.
A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the steps of the TSN-network-based video image jitter detection method.
Compared with the prior art, the invention has at least the following beneficial effects or advantages: TSN-based video jitter detection overcomes the inability of traditional algorithms to cope with changing environments and long time ranges, and maintains high detection performance while reducing the amount of computation; the distorted optical flow introduced into the TSN network suppresses the interference of people or other objects moving in the video, making the jitter detection still more accurate. The scheme detects video jitter with a TSN and makes full use of its advantages: it adapts to environmental changes in any scene, detects video over long time ranges in real time, suppresses the interference of other moving objects in the video, and has strong anti-interference capability; it achieves good detection results while coping with changes across different scenes.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings;
FIG. 1 is a block diagram of a TSN network architecture of the present invention;
FIG. 2 is a flow chart of video jitter detection according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a TSN-network-based video image jitter detection method. In one embodiment, the method detects whether a video jitters by applying the principle of behavior recognition: automatically analysing the ongoing action in an unknown video or image sequence. Simple behavior recognition is action classification: given a video containing a single action, the task is only to classify it correctly into one of several known action classes. More complex recognition must handle videos that contain not one but several action classes, and the system must automatically identify each action class together with its starting moment. Intuitively, video jitter is itself an action, so it can be detected with a behavior recognition approach. TSN is among the most accurate current behavior recognition algorithms. It consists of a spatial-stream convolutional network and a temporal-stream convolutional network: the input video is first divided into K segments, one snippet is obtained by random sampling from each, the class scores of the different snippets are fused by a consensus function into a segment consensus, and the predictions of all modalities are then fused into the final prediction. Model parameters are thus learned from the whole video rather than from a single short snippet, and the sparse temporal sampling strategy, in which the sampled snippets contain only a small fraction of the frames, greatly reduces the computational overhead. The network structure of the TSN is shown in Fig. 1.
The scheme trains a TSN network to learn video jitter characteristics and thereby detect whether the picture jitters. The concrete steps are as follows. Following the TSN network structure, the TVL1 optical flow algorithm is first used to extract the normal optical flow field and the distorted optical flow field. (TVL1 optical flow: assume two consecutive frames I0 and I1, and let x = (x, y) be a pixel of I0; the energy functional of the optical flow model is
E(u) = ∫_Ω [ λ|I1(x + u(x)) - I0(x)| + |∇u1| + |∇u2| ] dx
where u = (u1, u2) is the two-dimensional optical flow field, ∇u1 and ∇u2 are its two-dimensional gradients, and λ is the weight constant of the data term. The first term is the data constraint, the grey-value difference of the same pixel between the two frames; the second term is the motion regularisation constraint, i.e. the motion is assumed to be continuous. The TVL1 flow is computed by minimising this total-variation optical flow energy functional with a numerical scheme derived from total-variation image denoising.) The optical flow field as input focuses on capturing motion information, but when the captured video contains many moving objects their motion easily causes misjudgement, so the object motion is suppressed through the distorted optical flow field and attention is concentrated on the background motion in the video. The optical flow fields are then fed into the TSN network, which finally determines whether the video jitters and outputs the indices of the jittering frames. Concretely, given a video V, the TSN divides V at equal intervals into K segments {S1, S2, ..., SK} and models the sequence of snippets as follows:
TSN(T1,T2,...,TK)=H(G(F(T1;W),F(T2;W),...,F(TK;W)))
wherein (T1, T2, ..., TK) is the sequence of snippets, each snippet Tk sampled at random from its corresponding segment Sk; the function F(Tk; W) applies the convolutional network with parameters W to the short snippet Tk and returns Tk's score for the jitter category. The segment consensus function G combines the class score outputs of the short snippets into a consensus on the class hypothesis, based on which the prediction function H predicts the probability that the whole video belongs to the jitter class. Combined with the standard categorical cross-entropy loss, the final loss function over the segment consensus takes the form:
L(y, G) = -Σ_{i=1..C} y_i (G_i - log Σ_{j=1..C} exp G_j)
where C is the total number of behavior classes, here C = 1 with the single class jitter; y_i is the ground truth (calibrated true data) of class i; G_i is the jitter class score inferred with the aggregation function g from the scores of the same class across all snippets, the aggregation function g being uniform averaging. The specific flow is shown in Fig. 2.
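The end-to-end decision described in this paragraph — fuse snippet-level jitter probabilities into a video-level verdict and report the jittering frames — might be sketched as follows. The function name, the 0.5 threshold, and the mean fusion are assumptions for illustration, not values specified by the patent:

```python
import numpy as np

def detect_jitter(snippet_probs, frame_ids, threshold=0.5):
    """End-to-end decision sketch: given the jitter probability of each
    sampled snippet (standing in for the TSN output) and the frame index
    each snippet was drawn from, decide whether the video jitters and,
    if so, return the frame numbers of the jittering snippets."""
    probs = np.asarray(snippet_probs)
    jittery = probs.mean() > threshold                       # video-level consensus
    frames = [f for f, p in zip(frame_ids, probs) if p > threshold]
    return bool(jittery), frames if jittery else []
```

A video whose snippets mostly score high is flagged, and the frame indices of the high-scoring snippets are reported; a calm video returns no frames.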
In another embodiment, a computer readable storage medium is provided, on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of a video picture jitter detection method based on a TSN network.
In another embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the video picture jitter detection method based on the TSN network when executing the program.
The above-mentioned embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the invention are also within the protection scope of the invention.

Claims (8)

1. A video image jitter detection method based on a TSN network is characterized by comprising the following steps:
1) based on a TSN network structure, extracting a normal optical flow field and a distorted optical flow field by utilizing a TVL1 optical flow algorithm;
2) inputting the normal optical flow field and the distorted optical flow field into a TSN network;
3) determining, via the TSN network, whether the video jitters, and outputting the indices of the jittering frames.
2. The TSN network-based video picture jitter detection method of claim 1, wherein the TSN network is composed of a spatial stream convolutional network and a temporal stream convolutional network.
3. The method of claim 1, wherein the normal optical flow field and the distorted optical flow field are used as the input for capturing motion information, and when the video captured in real time contains too many moving objects, the distorted optical flow field is used to suppress the motion of the objects so that the network focuses on the background motion in the video.
4. The TSN-network-based video image jitter detection method of claim 1, wherein in said step 3), the TSN network decision comprises: given a video V, dividing V at equal intervals into K segments {S1, S2, ..., SK}; the TSN network then models the sequence of snippets as follows:
TSN(T1,T2,...,TK)=H(G(F(T1;W),F(T2;W),...,F(TK;W)))
wherein (T1, T2, ..., TK) is the sequence of snippets, each snippet Tk sampled at random from its corresponding segment Sk; the function F(Tk; W) applies the convolutional network with parameters W to snippet Tk and returns Tk's score for the jitter category; G() is the segment consensus function; H() is the probability prediction function.
5. The method of claim 4, wherein the segment consensus function G() combines the class scores of the individual snippets into a consensus on the class hypothesis, based on which the probability prediction function H() predicts the probability that the whole video belongs to the jitter class; combined with the standard categorical cross-entropy loss, the final loss function over the segment consensus takes the form:
L(y, G) = -Σ_{i=1..C} y_i (G_i - log Σ_{j=1..C} exp G_j)
wherein C is the total number of behavior classes; y_i is the ground truth of class i; G_i is the jitter class score inferred with the aggregation function g from the scores of the same class across all snippets.
6. The video image jitter detection method according to claim 5, wherein C = 1, the single class being jitter.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the TSN network based video picture jitter detection method according to any of claims 1-6.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for detecting video picture jitter according to any of claims 1-6 based on the TSN network.
CN201910843031.6A 2019-09-06 2019-09-06 Video image jitter detection method based on TSN (time delay network) Active CN110602487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910843031.6A CN110602487B (en) 2019-09-06 2019-09-06 Video image jitter detection method based on TSN (time delay network)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910843031.6A CN110602487B (en) 2019-09-06 2019-09-06 Video image jitter detection method based on TSN (time delay network)

Publications (2)

Publication Number Publication Date
CN110602487A true CN110602487A (en) 2019-12-20
CN110602487B CN110602487B (en) 2021-04-20

Family

ID=68858068

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910843031.6A Active CN110602487B (en) 2019-09-06 2019-09-06 Video image jitter detection method based on TSN (time delay network)

Country Status (1)

Country Link
CN (1) CN110602487B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116193231A (en) * 2022-10-24 2023-05-30 成都与睿创新科技有限公司 Method and system for handling minimally invasive surgical field anomalies

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001006181A (en) * 1999-05-07 2001-01-12 Sony Precision Eng Center Singapore Pte Ltd Apparatus for measuring jitter of optical disc
US20100157070A1 (en) * 2008-12-22 2010-06-24 Honeywell International Inc. Video stabilization in real-time using computationally efficient corner detection and correspondence
CN104135597A (en) * 2014-07-04 2014-11-05 上海交通大学 Automatic detection method of jitter of video
CN108492287A (en) * 2018-03-14 2018-09-04 罗普特(厦门)科技集团有限公司 A kind of video jitter detection method, terminal device and storage medium
CN110191320A (en) * 2019-05-29 2019-08-30 合肥学院 Video jitter based on pixel timing motion analysis and freeze detection method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001006181A (en) * 1999-05-07 2001-01-12 Sony Precision Eng Center Singapore Pte Ltd Apparatus for measuring jitter of optical disc
US20100157070A1 (en) * 2008-12-22 2010-06-24 Honeywell International Inc. Video stabilization in real-time using computationally efficient corner detection and correspondence
CN104135597A (en) * 2014-07-04 2014-11-05 上海交通大学 Automatic detection method of jitter of video
CN108492287A (en) * 2018-03-14 2018-09-04 罗普特(厦门)科技集团有限公司 A kind of video jitter detection method, terminal device and storage medium
CN110191320A (en) * 2019-05-29 2019-08-30 合肥学院 Video jitter based on pixel timing motion analysis and freeze detection method and device

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116193231A (en) * 2022-10-24 2023-05-30 成都与睿创新科技有限公司 Method and system for handling minimally invasive surgical field anomalies

Also Published As

Publication number Publication date
CN110602487B (en) 2021-04-20

Similar Documents

Publication Publication Date Title
US10769480B2 (en) Object detection method and system
Ruiz et al. Fine-grained head pose estimation without keypoints
WO2020173226A1 (en) Spatial-temporal behavior detection method
US20220417590A1 (en) Electronic device, contents searching system and searching method thereof
US9767570B2 (en) Systems and methods for computer vision background estimation using foreground-aware statistical models
US8218819B2 (en) Foreground object detection in a video surveillance system
US8218818B2 (en) Foreground object tracking
US20170213105A1 (en) Method and apparatus for event sampling of dynamic vision sensor on image formation
TWI482123B (en) Multi-state target tracking mehtod and system
CN106851049B (en) A kind of scene alteration detection method and device based on video analysis
US7982774B2 (en) Image processing apparatus and image processing method
KR102002812B1 (en) Image Analysis Method and Server Apparatus for Detecting Object
CN112561951B (en) Motion and brightness detection method based on frame difference absolute error and SAD
CN110633643A (en) Abnormal behavior detection method and system for smart community
CN110602487B (en) Video image jitter detection method based on TSN (time delay network)
CN108876807B (en) Real-time satellite-borne satellite image moving object detection tracking method
CN114049483A (en) Target detection network self-supervision training method and device based on event camera
Wu et al. A novel visual object detection and distance estimation method for hdr scenes based on event camera
CN111784750A (en) Method, device and equipment for tracking moving object in video image and storage medium
JPWO2018179119A1 (en) Video analysis device, video analysis method, and program
TWI783572B (en) Object tracking method and object tracking apparatus
CN115512263A (en) Dynamic visual monitoring method and device for falling object
CN114972840A (en) Momentum video target detection method based on time domain relation
WO2016136214A1 (en) Identifier learning device, remaining object detection system, identifier learning method, remaining object detection method, and program recording medium
Takahara et al. Making background subtraction robust to various illumination changes

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant