CN116647690B - Video concentration method based on space-time rotation - Google Patents

Video concentration method based on space-time rotation

Info

Publication number
CN116647690B
CN116647690B
Authority
CN
China
Prior art keywords
target
time
space
collision
rotation
Prior art date
Legal status
Active
Application number
CN202310626770.6A
Other languages
Chinese (zh)
Other versions
CN116647690A (en)
Inventor
张云佐
郭凯娜
朱鹏飞
张天
Current Assignee
Shijiazhuang Tiedao University
Original Assignee
Shijiazhuang Tiedao University
Priority date
Filing date
Publication date
Application filed by Shijiazhuang Tiedao University
Priority to CN202310626770.6A
Publication of CN116647690A
Application granted
Publication of CN116647690B


Classifications

    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals, characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals, characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N5/917: Television signal processing for bandwidth reduction
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G06N3/08: Learning methods
    • G06V10/82: Arrangements for image or video recognition or understanding using neural networks
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • Y02T10/40: Engine management systems

Abstract

The invention discloses a video concentration method based on space-time rotation, belonging to the field of computer vision and comprising the following steps: 1) according to a target duty-ratio threshold, apply dynamic time-domain translation or space-time rotation, respectively, to avoid pseudo collisions during video concentration; 2) define an angle critical threshold that divides space-time rotation into adaptive space-time rotation and critical space-time rotation, avoiding pseudo collisions while guaranteeing the compression rate; 3) design a time-sequence function for time-order judgment, avoiding temporal confusion among targets; 4) finally, stitch the target tubes to generate the condensed video. The method makes full use of the spatio-temporal information in the video, avoids pseudo collisions while maintaining a good frame concentration rate, and preserves the temporal order among targets, yielding a good visual effect.

Description

Video concentration method based on space-time rotation
Technical Field
The invention relates to a video concentration method based on space-time rotation, and belongs to the technical field of computer vision.
Background
In recent years, with rising living standards, people's awareness of security precautions has continuously strengthened, and surveillance equipment, one of the effective means of security precaution, is widely deployed in busy places such as banks and supermarkets. Large numbers of cameras record surveillance video 24 hours a day for security purposes, producing a vast amount of video data; although the captured video contains information useful for security, storing, managing and reviewing such large amounts of video data is becoming increasingly difficult. In surveillance video, there may be no activity at all, or activity may occur only in a small image region within a given period; finding the required information by watching the video manually consumes a great deal of manpower, material resources and time, and long periods of focused viewing easily fatigue staff, so that information is misjudged through mistaken or missed viewing. Video concentration technology provides an effective solution to these problems: it optimally rearranges the motion trajectories in the original picture to generate a condensed video suitable for efficient browsing and searching, greatly reducing the video duration without losing the information of the original video. Faced with massive surveillance video data, quickly browsing and querying the required content has become an urgent practical need, and the application of video concentration technology is particularly important given the video data being generated at every moment in today's society.
Furthermore, ghatak et al evaluate the performance of converting the tube optimization rearrangement problem into solving an energy function optimal solution problem, namely simulated annealing (Simulated Annealing, SA), cultural algorithm (Cultural Algorithm, CA), teaching-based optimization algorithm (Teaching-Learning-Based Optimization, TLBO), forest optimization algorithm (Forest Optimization Algorithm, FOA), gray wolf optimizer (Gray Wolf Optimizer, GWO), non-dominant ranking genetic algorithm II (Non-dominated Sorting Genetic Algorithm-II, NSGA-II), JAYA algorithm, elite-JAYA algorithm (elist-JAYA) and Adaptive Multi-Population-based JAYA algorithm (SAMP-JAYA). In a subsequent improvement, ghatak et al propose to improve the energy minimization process using a hybrid algorithm combining SA and JAYA. Yao et al use genetic algorithms (Genetic Algorithm, GA) to generate new minimisation energy function formulas. In addition, xu et al propose an optimization scheme based on GA to solve the problem of merging of target tubes in the process of concentrated video generation, which is superior to the SA-based method in terms of information loss and time consumption. Ghatak et al explored the concept of multi-frame and scaling and proposed a HGWOSA optimization algorithm that mixed GWO and SA together to obtain globally optimal results at low computational cost. Moussa et al optimized rearranging the target tubes using a particle swarm algorithm (Particle Swarm Optimization, PSO), reduced false collisions, maintained a time sequence, and calculated interactions between targets.
The most challenging task in object-based video concentration is obtaining the optimal rearrangement of the target tubes so as to show the most motion information in the shortest time span. During video concentration, two objects that never collide in the original video may collide in the condensed video; this is called a pseudo collision. To address problems such as loss of target interactivity and false collisions, many improved video concentration techniques have been proposed. Nie et al. proposed a video concentration technique that moves objects in both the time and space domains to generate a condensed video with fewer false collisions. Li et al. proposed a solution to the false-collision problem that minimizes the size of identified false-collision targets in the time domain; although this technically resolves the false collisions, a vehicle and a person close to each other in the same scene of the condensed video may end up the same size, which does not accord with reality. He et al. defined the collision states between moving objects as no collision, same-direction collision and reverse collision, further analyzed motion collisions, and also proposed a collision-graph-based optimization strategy that fills in target tubes deterministically, reducing computational complexity.
Unlike methods that optimally rearrange target tubes by solving an energy function, Huang et al. demonstrated the superiority of online optimization, which rearranges target tubes while targets are still being detected, without waiting for the optimization process to begin. The biggest problem with this method, however, is that it ignores false collisions entirely in order to improve time performance; another problem is that the optimization threshold is set manually rather than by a decision technique, which trades computation time against compression rate and lowers accuracy. Feng et al. introduced a background generation method that selects the most active video frames from the images with the greatest background variation, so that the resulting condensed-video background changes with time and motion. Hsia et al. proposed an optimized rearrangement method that selects target tubes using region trees, and introduced an efficient search technique for the moving-target database, reducing computational complexity.
Disclosure of Invention
In existing video concentration methods, severe pseudo collisions and disordered target timing make the generated condensed video a poor reflection of the original video, and existing methods based on solving the optimal solution of an energy function handle local pseudo-collision avoidance poorly. The invention therefore aims to provide a video concentration method that simultaneously preserves target temporal order, avoids pseudo collisions, and maintains a high video compression rate, achieving high-quality and efficient video concentration through a designed space-time rotation algorithm and time-sequence function.
To achieve the above object, an embodiment of the present invention provides a video concentration method based on space-time rotation, characterized by comprising the following steps:
1) Perform target detection and tracking on the input video using the YOLOv4 and DeepSort algorithms, and extract the target tubes;
2) Analyze the pseudo collisions generated among targets during video concentration, define a target duty-ratio threshold, and process the pseudo-collision targets accordingly;
3) Propose a dynamic time-domain translation method, applying dynamic time-domain translation to moving targets larger than the target duty-ratio threshold to avoid pseudo collisions, thereby ensuring the robustness of the space-time rotation algorithm on surveillance videos whose moving targets differ in size;
4) Apply space-time rotation to moving targets smaller than the target duty-ratio threshold, define an angle critical threshold, and divide the space-time rotation method into adaptive space-time rotation and critical space-time rotation;
5) For pseudo-collision targets whose angle critical threshold is less than 0, adopt adaptive space-time rotation to improve fidelity to the original video;
6) For pseudo-collision targets whose angle critical threshold is greater than 0, adopt critical space-time rotation to improve the compression rate of the condensed video.
A further technical solution is as follows: first, a target interactivity function is defined to analyze the interactivity among targets, which are divided into management sets for common processing; second, the pseudo collisions among targets are analyzed, and dynamic time-domain translation or space-time rotation is applied according to the target duty-ratio threshold to avoid them, with an angle critical threshold defined to divide space-time rotation into adaptive space-time rotation and critical space-time rotation, avoiding pseudo collisions while ensuring the compression rate; then, a time-sequence function is designed for time-order judgment, avoiding temporal confusion among targets; finally, the target tubes are stitched to generate the condensed video.
A further technical solution is as follows: assume the target tubes involved in the pseudo collision are T_i and T_j, where T_i has already been rearranged and belongs to the condensed video while T_j has not yet been rearranged. The collision overlap ratio of T_i and T_j at time t is defined as:
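The formula itself appears only as an image in the source and is not reproduced here. As a plausible stand-in, the sketch below treats the collision overlap ratio as the intersection-over-union of the two tubes' bounding boxes at time t; the tube representation (a dict from frame index to box) and the (x1, y1, x2, y2) box format are assumptions for illustration, not the patent's definitions.

```python
# Hypothetical sketch: collision overlap ratio of two target tubes at time t,
# assumed here to be the IoU of their bounding boxes (the patent's exact
# formula is not reproduced in this text).

def box_at(tube, t):
    """Bounding box of a tube at condensed time t, or None if absent."""
    return tube.get(t)  # tube: dict {frame index: (x1, y1, x2, y2)}

def overlap_ratio(tube_i, tube_j, t):
    bi, bj = box_at(tube_i, t), box_at(tube_j, t)
    if bi is None or bj is None:
        return 0.0  # one tube is absent at t, so no collision is possible
    ix1, iy1 = max(bi[0], bj[0]), max(bi[1], bj[1])
    ix2, iy2 = min(bi[2], bj[2]), min(bi[3], bj[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_i = (bi[2] - bi[0]) * (bi[3] - bi[1])
    area_j = (bj[2] - bj[0]) * (bj[3] - bj[1])
    union = area_i + area_j - inter
    return inter / union if union > 0 else 0.0
```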
the further technical proposal is that: defining a target duty ratio threshold, and respectively carrying out dynamic time domain translation and time-space domain rotation according to the threshold to avoid pseudo collision; and secondly, defining an angle critical threshold value, and dividing the time-space domain rotation into self-adaptive time-space rotation and critical time-space rotation.
A further technical solution is as follows: a target duty-ratio threshold is defined:
For a pseudo-collision target whose target-frame area is larger than γ, dynamic time-domain translation is adopted to delay target tube T_j, whose start-time label is modified as follows:
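Both formulas above are likewise images in the source. A minimal sketch of the dynamic time-domain translation idea, assuming the start-time label of T_j is simply delayed until no shared frame still overlaps; the patent's closed-form update involving the target speed v is not reproduced, so the incremental search is an assumption. It reuses overlap_ratio from the previous sketch.

```python
def translate_start_label(tube_i, tube_j, max_delay=300):
    """Hypothetical dynamic time-domain translation: delay T_j's start-time
    label frame by frame until it no longer pseudo-collides with the
    already-placed T_i. Returns the shifted tube and the delay applied."""
    for delay in range(max_delay + 1):
        shifted = {t + delay: box for t, box in tube_j.items()}
        if all(overlap_ratio(tube_i, shifted, t) == 0.0 for t in shifted):
            return shifted, delay
    return tube_j, 0  # budget exhausted: leave T_j for other handling
```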
the further technical proposal is that: for a target tube T with a target frame area smaller than gamma i And T j Avoiding pseudo collision by adopting a time-space domain rotation method, and defining an angle critical threshold value:
the further technical proposal is that: for the monitoring video with eta less than or equal to 0, a self-adaptive space-time rotation mode is provided to avoid pseudo collision, and a target tube T is arranged j The starting point is taken as the circle center to rotate so as to be far away from the collision point until two targets are just free from collision, and the dotted line is a target tube T j Space-time trajectory after rotation.
A further technical solution is as follows: assume target tube T_j enters the surveillance zone at a known coordinate. Target tube T_j is rotated to avoid the false collision, and its direction vector after rotation is defined as:
The rotation angle of target tube T_j is calculated as follows:
the further technical proposal is that: for the monitoring video with eta > 0, a critical space-time rotation method is provided to avoid pseudo collision on the premise of ensuring the compression rateTarget tube T j And rotating the target tubes by taking the starting point as the center of a circle to enable the target tubes to be far away from the collision point until two target tubes have no overlapping frames.
A further technical solution is as follows: the target tubes are classified according to their motion-position relation within the monitored area, defined by the following formula:
where the two barycentric coordinates are those at which the m-th target exits and enters the surveillance zone, respectively.
A further technical solution is as follows: with the starting point at which target tube T_j enters the monitored area as the circle center, when λ ≥ 0 the critical intersection of T_i and T_j lies at the endpoint of T_i, and T_i is then called the proximal segment; when λ < 0 the critical intersection lies at the endpoint of T_j, and T_j is then the proximal segment. When performing the optimal rearrangement of target tubes, different target tubes serving as the proximal segment must be treated differently to avoid collisions.
The further technical proposal is that: t (T) i For the near heart segment, T is j Rotate until it is equal to T i The two target tubes will not have a collision relationship,the specific definition of (2) is as follows:
wherein,respectively T i Coordinates and T of exit monitoring area j Into the coordinates of the monitored area. Known->Is T j Can calculate the rotation angle +.>
A further technical solution is as follows: when T_j is the proximal segment, T_j is rotated until the endpoint of T_j intersects T_i, after which the two target tubes have no collision relationship. Given the known conditions, and assuming the intersection of the rotated T_j with T_i has coordinates (μ, ζ), the rotated direction vector is specifically defined as:
wherein, (μ, ζ) is calculated according to known conditions:
the beneficial effects of adopting above-mentioned technical scheme to produce lie in: aiming at the problem of pseudo collision among targets in a concentrated video, the invention provides a high-efficiency concentration scheme of a monitoring video based on space-time rotation, and firstly, a target interactivity function is provided for carrying out interactive analysis on an extracted target pipe to determine a target with interactive behavior; secondly, a dynamic time domain translation and space-time rotation algorithm is provided on the problem of processing the pseudo collision, and the self-adaptive space-time rotation and critical space-time rotation are respectively carried out according to the collision mode of the target, so that the compression rate is ensured while the pseudo collision is avoided; and finally, filling a target tube by using a rotation search mode, and defining a time sequence function to ensure the consistency of the generated concentrated video and the original video time sequence. Experimental results show that the method avoids pseudo collision among targets on the premise of ensuring the compression rate, and can realize high-quality video concentration.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments, given with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a monitoring video concentration process based on space-time rotation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of dynamic time domain translation according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of adaptive spatio-temporal rotation according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of critical spatiotemporal rotation provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a pseudo collision according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of critical spatiotemporal rotation (a) according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of critical spatiotemporal rotation (b) according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention, but the invention may also be practiced in ways other than those described here; those skilled in the art will readily appreciate that the invention is not limited to the specific embodiments disclosed below.
As shown in Fig. 1, the overall flow of surveillance-video concentration based on space-time rotation provided by an embodiment of the present invention includes:
1) Perform target detection and tracking on the input video using the YOLOv4 and DeepSort algorithms, and extract the target tubes;
2) Analyze the pseudo collisions generated among targets during video concentration, define a target duty-ratio threshold, and process the pseudo-collision targets accordingly;
3) Propose a dynamic time-domain translation method, applying dynamic time-domain translation to moving targets larger than the target duty-ratio threshold to avoid pseudo collisions, thereby ensuring the robustness of the space-time rotation algorithm on surveillance videos whose moving targets differ in size;
4) Apply space-time rotation to moving targets smaller than the target duty-ratio threshold, define an angle critical threshold, and divide the space-time rotation method into adaptive space-time rotation and critical space-time rotation;
5) For pseudo-collision targets whose angle critical threshold is less than 0, adopt adaptive space-time rotation to improve fidelity to the original video;
6) For pseudo-collision targets whose angle critical threshold is greater than 0, adopt critical space-time rotation to improve the compression rate of the condensed video.
As shown in Fig. 2, to ensure the robustness of the space-time rotation algorithm on surveillance videos whose moving targets differ in size, the invention proposes three processing modes for the space-time rotation algorithm according to the different pseudo-collision modes. First, a target duty-ratio threshold is defined, and dynamic time-domain translation or time-space-domain rotation is applied according to this threshold to avoid pseudo collisions; second, an angle critical threshold is defined, dividing time-space-domain rotation into adaptive space-time rotation and critical space-time rotation. The target duty-ratio threshold γ is defined as:
where w and h are the width and height of the target frame, W and H are the width and height of the video frame, and f is the frame rate of the input video. For a pseudo-collision target whose target-frame area is larger than γ, dynamic time-domain translation is adopted to delay T_j: assuming the moving speed of target j is v, the start-time label of T_j is modified as follows:
Fig. 2 shows the video-concentration result when the target-frame area is greater than γ: two targets that do not intersect in the original video appear at the same spatial location at the same time, creating a false collision. The dynamic time-domain translation method modifies target tube T_j; it can be seen that the condensed video after dynamic time-domain translation contains no pseudo collision, and because the target tube need not be translated over its whole duration, the influence on the compression rate is very small.
As shown in Fig. 3, to ensure a good visual effect in the condensed video, an adaptive space-time rotation mode is proposed for surveillance video with η ≤ 0 to avoid pseudo collisions. After the two target tubes are translated along the time axis, a small number of video frames overlap, and targets that are collision-free in the original video produce a pseudo collision after concentration. As shown in Fig. 3, target tube T_j is rotated about its starting point, taken as the circle center, away from the collision point until the two targets are just collision-free; the dotted line is the space-time trajectory of T_j after rotation.
To demonstrate the adaptive spatio-temporal rotation method more intuitively, Fig. 3 shows a two-dimensional schematic of target tubes T_i and T_j producing the maximum collision area at time t, with the motion direction vectors of T_i and T_j drawn for each. With the starting point at which T_j enters the monitored area as the circle center, T_j is rotated until the false collision disappears; as shown in Fig. 3, the dotted line indicates the direction of movement after rotation, given by the rotated direction vector, and the marked angle is the rotation angle of the target tube.
To improve the compression rate of the condensed video: for surveillance video with η > 0, a large number of video frames produce pseudo collisions, and delaying the start-time label of the target tube would greatly sacrifice the video compression rate; the critical space-time rotation method is therefore proposed to avoid pseudo collisions while guaranteeing the compression rate.
As shown in Fig. 4, after the two target tubes are translated along the time axis, a large number of overlapping video frames remain, and targets that are collision-free in the original video exhibit pseudo collisions after concentration. Target tube T_j is rotated about its starting point, taken as the circle center, away from the collision point until the two target tubes have no overlapping frames; as shown in Fig. 4, the dotted line is the space-time trajectory of T_j after rotation.
To illustrate the critical spatiotemporal rotation method more intuitively, Fig. 5 shows a two-dimensional schematic of the motion direction vectors of target tubes T_i and T_j at time t. The tubes are classified according to their motion-position relation within the monitored area, defined by the following formula:
as shown in FIG. 5, the target tube T j When lambda is more than or equal to 0, T is the center of a starting point entering a monitoring area i And T j Is T at critical intersection point i Is called T at this time i Is a proximal segment; lambda < 0, T i And T j Is T at critical intersection point j Is called T at this time j Is a proximal segment, as shown in fig. 5. In performing the target tube optimization rearrangement, different target tubes as proximal segments need to be treated differently to avoid collisions, as will be described separately belowFor T i And T j Introduction of rotation angle as proximal segmentA corresponding calculation method.
As shown in Fig. 6, T_i is the proximal segment. T_j is rotated until it no longer collides with T_i, and its rotated direction vector is specifically defined as:
as shown in FIG. 7, T j Is a proximal segment. Will T j Rotate until T j Endpoint and T of (2) i Intersecting, there will be no collision relationship between the two target tubes.
The rotation angle is the angle between the direction vectors of T_j before and after rotation. Given the known conditions, and assuming the intersection of the rotated T_j with T_i has coordinates (μ, ζ), the rotated direction vector is specifically defined as:
where (μ, ζ) is calculated from the known conditions:
from this, the target tube T can be obtained j Is of the rotation angle of (a)And (3) avoiding pseudo collision between target pipes, and generating a concentrated video which accords with the visual effect of human eyes and restores the relation between moving targets.
To verify the validity of the above embodiment, experimental comparisons were made on a public dataset and on surveillance video captured in real scenes. The experimental environment was a Windows 10 system with an Intel(R) Core(TM) i5-8265U CPU, an NVIDIA GeForce MX graphics card, and 16 GB of memory. The proposed method was experimentally verified on 12 segments of surveillance video.
Tables 1 and 2 show the experimental results for the frame concentration rate (FR) and the collision overlap rate (OR), respectively, on the 12 video segments. To compare performance between the different methods more intuitively, the average of the experimental results over all videos is also used for comparison in each table.
Table 1. Comparison of FR results with the CE, PSO, IV and CF methods
Table 2. Comparison of OR results with the CE, PSO, IV and CF methods
As can be seen from Table 2, the average OR of the proposed method is significantly better than those of the CE, PSO and IV methods: those methods merely translate the target tube along the time axis, whereas the proposed method considers both the spatial and temporal dimensions and can better avoid false collisions, yielding a superior OR. The average ORs of the proposed method and the CF method are close (0.0680 versus 0.0633): the CF method shifts the target tube along the time axis while also changing the size and moving speed of targets, and although this attains an OR close to the proposed method's, its visual quality is poor and its fidelity to the original video content needs improvement. Generally speaking, an improvement in FR comes at the cost of an increase in OR. From the combined analysis of Tables 1 and 2: compared with the CE method, the proposed method greatly improves FR while maintaining a good OR value; compared with the PSO method, it effectively reduces the OR of the condensed video and optimizes FR; and compared with the IV method, it is clearly superior in OR, and in FR it is close to yet still better than the IV method.
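The table bodies are images in the source. For concreteness, a sketch of plausible metric definitions, assuming FR is the ratio of condensed to original frame counts and OR is the fraction of condensed frames containing any pairwise tube overlap; both readings are assumptions, not the paper's formal definitions, and overlap_ratio is the earlier sketch.

```python
from itertools import combinations

def frame_concentration_rate(n_condensed, n_original):
    """Assumed FR: condensed length over original length."""
    return n_condensed / n_original

def collision_overlap_rate(tubes, n_condensed):
    """Assumed OR: fraction of condensed frames in which at least two
    rearranged tubes' boxes overlap."""
    colliding = set()
    for a, b in combinations(tubes, 2):
        for t in set(a) & set(b):
            if overlap_ratio(a, b, t) > 0.0:
                colliding.add(t)
    return len(colliding) / n_condensed
```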
The foregoing describes in detail specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the claims without affecting the spirit of the invention.

Claims (3)

1. A video concentration method based on space-time rotation, comprising the steps of:
1) Performing target detection and tracking on the input video using the YOLOv4 and DeepSort algorithms, and extracting the target tubes;
2) Analyzing the pseudo collisions generated among targets during video concentration, defining a target duty-ratio threshold γ, and processing the pseudo-collision targets accordingly, wherein the target duty-ratio threshold is calculated from the width W, height H and frame rate f of the input video and the width w and height h of the target frame, by the following formula:
3) Proposing a dynamic time-domain translation method, applying dynamic time-domain translation to moving targets larger than the target duty-ratio threshold to avoid pseudo collisions, thereby ensuring the robustness of the space-time rotation algorithm on surveillance videos whose moving targets differ in size;
4) Performing space-time rotation on moving targets smaller than the target duty-ratio threshold, defining an angle critical threshold η, and dividing the space-time rotation method into adaptive space-time rotation and critical space-time rotation, where the angle critical threshold is calculated by the following formula:
in which cos θ_{i,j} is the cosine of the included angle between the direction vectors of target tubes T_i and T_j;
5) For pseudo-collision targets whose angle critical threshold is less than 0, adopting adaptive space-time rotation to improve fidelity to the original video: assume target tube T_j enters the surveillance zone at a known coordinate; rotating target tube T_j avoids the false collision, and the direction vector after rotation is specifically defined as:
where the pre-rotation direction vector of target j is given, r is the distance of the target from its entry into the surveillance zone to its current position, and the barycentric coordinates of target tubes T_i and T_j at time t are given together with the lower-right coordinates of the target frames of T_i and T_j at time t; the rotation angle of target tube T_j is calculated as follows:
6) For pseudo-collision targets whose angle critical threshold is greater than 0, adopting critical space-time rotation to improve the compression rate of the condensed video: target tube T_j is rotated about its starting point, taken as the circle center, away from the collision point until the two target tubes have no overlapping frames, and the target tubes are classified according to their motion-position relation within the monitored area, defined by the following formula:
where the first coordinate denotes the barycenter at which target tube T_i exits the monitored area, and the other two denote the barycenters at which target tube T_j exits and enters the surveillance zone, respectively; with the starting point at which T_j enters the monitored area as the circle center, when λ ≥ 0 the critical intersection of T_i and T_j lies at the endpoint of T_i, and T_i is then called the proximal segment; when λ < 0 the critical intersection lies at the endpoint of T_j, and T_j is then the proximal segment; when the target tubes are optimally rearranged, different target tubes serving as the proximal segment must be treated differently to avoid collisions;
When T_i is the proximal segment, T_j is rotated until it no longer collides with T_i, and the rotated direction vector is specifically defined as:
Knowing the rotated direction vector of T_j, the rotation angle can be calculated;
When T_j is the proximal segment, T_j is rotated until the endpoint of T_j intersects T_i, after which the two target tubes have no collision relationship; given the known conditions, and assuming the intersection of the rotated T_j with T_i has coordinates (μ, ζ), the rotated direction vector is specifically defined as:
wherein (μ, ζ) is calculated according to known conditions:
where the remaining coordinate denotes the barycenter at which target tube T_i enters the surveillance area.
2. The video concentration method based on space-time rotation as claimed in claim 1, wherein the preprocessing operation on the input video is: first, moving targets in the input video are detected using the YOLOv4 target detection algorithm; then, the DeepSort target tracking algorithm is adopted to obtain the target motion trajectories and form the target tubes; finally, the collision relations among target tubes are analyzed and handled respectively according to the collision mode.
3. The video concentration method based on space-time rotation as claimed in claim 1, wherein for pseudo-collision targets whose target-frame area is larger than γ, dynamic time-domain translation is adopted to delay the start-time label of the target tube: assuming the moving speed of the target is v, and according to the intersection-over-union ratio of the target collision, the start-time label of the target tube is modified as follows:
CN202310626770.6A 2023-05-30 2023-05-30 Video concentration method based on space-time rotation Active CN116647690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310626770.6A CN116647690B (en) 2023-05-30 2023-05-30 Video concentration method based on space-time rotation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310626770.6A CN116647690B (en) 2023-05-30 2023-05-30 Video concentration method based on space-time rotation

Publications (2)

Publication Number Publication Date
CN116647690A CN116647690A (en) 2023-08-25
CN116647690B 2024-03-01

Family

ID=87615081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310626770.6A Active CN116647690B (en) 2023-05-30 2023-05-30 Video concentration method based on space-time rotation

Country Status (1)

Country Link
CN (1) CN116647690B (en)


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI511058B (en) * 2014-01-24 2015-12-01 Univ Nat Taiwan Science Tech A system and a method for condensing a video
US20160080835A1 (en) * 2014-02-24 2016-03-17 Lyve Minds, Inc. Synopsis video creation based on video metadata

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109511019A (en) * 2017-09-14 2019-03-22 中兴通讯股份有限公司 A kind of video summarization method, terminal and computer readable storage medium
CN111709972A (en) * 2020-06-11 2020-09-25 石家庄铁道大学 Space constraint-based method for quickly concentrating wide-area monitoring video
CN114374885A (en) * 2021-12-31 2022-04-19 北京百度网讯科技有限公司 Video key segment determination method and device, electronic equipment and readable storage medium
CN114926495A (en) * 2022-05-17 2022-08-19 中南大学 Data processing method, trajectory visualization method and analysis method of traffic video stream
CN116074642A (en) * 2023-03-28 2023-05-05 石家庄铁道大学 Monitoring video concentration method based on multi-target processing unit

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Unsupervised video summarization using deep non-local video summarization networks; Sha-Sha Zang, Hui Yu, Yan Song, Ru Zeng; Neurocomputing; full text *
Sketch-interaction-based video summarization method and cognitive analysis; 马翠霞, 刘永进, 付秋芳, 刘烨, 傅小兰, 戴国忠, 王宏安; 中国科学: 信息科学 (08); full text *

Also Published As

Publication number Publication date
CN116647690A (en) 2023-08-25

Similar Documents

Publication Publication Date Title
Chen et al. Anomaly detection in surveillance video based on bidirectional prediction
Feng et al. Online content-aware video condensation
Xu et al. Multiple human detection and tracking based on head detection for real-time video surveillance
Lamas et al. Human pose estimation for mitigating false negatives in weapon detection in video-surveillance
Lin et al. Social mil: Interaction-aware for crowd anomaly detection
Yang et al. Masked relation learning for deepfake detection
US20160029031A1 (en) Method for compressing a video and a system thereof
Yu et al. Remotenet: Efficient relevant motion event detection for large-scale home surveillance videos
CN107122751A (en) A kind of face tracking and facial image catching method alignd based on face
Zhao et al. Adversarial deep tracking
Ni et al. An improved ssd-like deep network-based object detection method for indoor scenes
Jiang et al. An efficient attention module for 3d convolutional neural networks in action recognition
Jiang et al. An Approach for Crowd Density and Crowd Size Estimation.
Zhang et al. Exploiting Offset-guided Network for Pose Estimation and Tracking.
Yuan et al. Structural target-aware model for thermal infrared tracking
CN112884808A (en) Video concentrator set partitioning method for reserving target real interaction behavior
Tao et al. An adaptive frame selection network with enhanced dilated convolution for video smoke recognition
Yu et al. The multi-level classification and regression network for visual tracking via residual channel attention
Li et al. Human-related anomalous event detection via memory-augmented Wasserstein generative adversarial network with gradient penalty
CN116647690B (en) Video concentration method based on space-time rotation
Fan et al. the application of artificial intelligence in distribution network engineering field
CN108921147B (en) Black smoke vehicle identification method based on dynamic texture and transform domain space-time characteristics
Zhang et al. Learning target-aware background-suppressed correlation filters with dual regression for real-time UAV tracking
CN117011342A (en) Attention-enhanced space-time transducer vision single-target tracking method
Wu et al. Dss-net: Dynamic self-supervised network for video anomaly detection

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant