WO2015184768A1 - Video summary generation method and apparatus - Google Patents
Video summary generation method and apparatus
- Publication number
- WO2015184768A1 (PCT/CN2014/094701)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- view
- important
- views
- object trajectory
- optimal
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8549—Creating video summaries, e.g. movie trailer
Definitions
- the present invention relates to the field of image recognition, and in particular, to a video summary generation method and apparatus.
- Video summarization, also known as video condensation, is a generalization of video content: in an automatic or semi-automatic manner, moving targets are extracted through moving-target analysis, the motion trajectory of each target is analyzed, and the different targets are then spliced into a common background scene and combined in some manner.
- With the development of video technology, the role of video summaries in video analytics and content-based video retrieval is becoming increasingly important.
- In the field of public security, video surveillance systems have become an important part of maintaining social order and strengthening social management.
- However, surveillance recordings involve large volumes of stored data and long retention periods, and recordings must be searched to find clues and obtain evidence.
- The traditional practice requires a great deal of manpower, material resources, and time, and is so inefficient that the best opportunity to solve a case may be missed. No effective solution has yet been proposed for the prior-art problem that an optimal summary video cannot be found quickly in large-scale video data.
- To overcome these shortcomings of the prior art, an embodiment of the present invention provides a video summary generation method and apparatus.
- To solve the above technical problem, the embodiments of the present invention adopt the following technical solutions:
- A method for generating a video summary includes: dividing an original video into multiple views; dividing each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view; calculating the activity indicator of each view according to the activity level of the object trajectories in the view, and classifying each view as an important view or a secondary view according to whether its activity indicator exceeds a preset threshold; and processing the object trajectories in the important and secondary views in parallel, then merging the views obtained after parallel processing to generate a video summary.
- Dividing the original video into multiple views includes: determining the direction of the scene in the original video; and dividing the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
- Determining the direction of the scene in the original video includes: acquiring the initial and termination points of a plurality of object trajectories in the original video; performing a coordinate-difference calculation on the initial and termination points of each trajectory to determine the trajectory's direction; and judging the direction of the scene in the original video from the direction of the majority of the object trajectories, the direction of the scene being consistent with the direction of the majority of the trajectories.
- Dividing each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view, includes: acquiring the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; acquiring the start and end point coordinates of an object trajectory and calculating the proximity of the trajectory to each view; dividing each object trajectory contained in the original video into the view to which it is closest, according to the proximity; and updating the line-segment feature of that closest view according to the start and end point coordinates of the trajectory.
- Calculating the activity indicator of a view according to the activity level of the object trajectories in the view, and dividing each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold, includes: the activity level is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory; the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; each view is then divided into an important view or a secondary view according to whether its activity indicator exceeds the preset threshold.
- Processing the object trajectories in the important and secondary views in parallel, and merging the views obtained after parallel processing to generate a video summary, includes: if the multiple views are all important views, using a first preset function to solve for the optimal solution of the object-trajectory combination of each view and determining the optimal object-trajectory combination corresponding to that solution; and generating a video summary from the optimal object-trajectory combinations of all views.
- Processing the object trajectories in the important and secondary views in parallel, and merging the views obtained after parallel processing to generate a video summary, includes: if the multiple views are all secondary views, using a second preset function to solve for the optimal solution of the object-trajectory combination of each view and determining the optimal object-trajectory combination corresponding to that solution; and generating a video summary from the optimal object-trajectory combinations of all views.
- Processing the object trajectories in the important and secondary views in parallel, and merging the views obtained after parallel processing to generate a video summary, includes: if the multiple views include both important views and secondary views, and two important views are adjacent, merging the two into one important view and using the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, using the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determining the corresponding optimal combination; using the second preset function to solve for the optimal solution of the object-trajectory combination of each secondary view and determining the corresponding optimal combination; and generating a video summary from the optimal object-trajectory combinations of all views.
- Processing the object trajectories in the important and secondary views in parallel, and merging the views obtained after parallel processing to generate a video summary, includes: if the multiple views include both important views and secondary views, and two important views are adjacent, merging the two into one important view and using the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, using the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determining the corresponding optimal combination; copying the object trajectories in the secondary views into the background image as they appear in the original video; and merging the views according to the processing results to generate a video summary.
- A video summary generation apparatus includes: a first dividing module, configured to divide an original video into multiple views; a categorization module, configured to divide each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view; a second dividing module, configured to calculate the activity indicator of each view according to the activity level of the object trajectories in the view, and to divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold; and a merge processing module, configured to process the object trajectories in the important and secondary views in parallel and to merge the views obtained after parallel processing to generate a video summary.
- The first dividing module includes: a first calculating unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
- The first calculating unit includes: a first acquiring unit, configured to acquire the initial and termination points of a plurality of object trajectories in the scene of the original video; a difference calculating unit, configured to perform a coordinate-difference calculation on the initial and termination points of each trajectory to determine the trajectory's direction; and a judging unit, configured to judge the direction of the scene in the original video from the direction of the majority of the object trajectories, the direction of the scene being consistent with the direction of the majority of the trajectories.
- The categorization module includes: a second acquiring unit, configured to acquire the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; a distance calculating unit, configured to acquire the start and termination points of an object trajectory and to calculate the proximity of the trajectory to each view; a first categorizing unit, configured to divide each object trajectory contained in the original video into the view to which it is closest, according to the proximity; and an updating unit, configured to update the line-segment feature of that closest view according to the start and end point coordinates of the trajectory.
- The second dividing module includes: an activity indicator calculating unit, in which the activity level of an object trajectory is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory, and the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; and a second dividing unit, configured to divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold.
- The merge processing module includes: a first merging unit, configured to, if the multiple views are all important views, use the first preset function to solve for the optimal solution of the object-trajectory combination of each view and determine the optimal object-trajectory combination corresponding to that solution; and a first processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
- The merge processing module includes: a second merging unit, configured to, if the multiple views are all secondary views, use the second preset function to solve for the optimal solution of the object-trajectory combination of each view and determine the optimal object-trajectory combination corresponding to that solution; and a second processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
- The merge processing module includes: a third merging unit, configured to, if the multiple views include both important views and secondary views and two important views are adjacent, merge the two into one important view and use the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, use the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determine the corresponding optimal combination, and use the second preset function to solve for the optimal solution of the object-trajectory combination of each secondary view and determine the corresponding optimal combination; and a third processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
- The merge processing module includes: a fourth merging unit, configured to, if the multiple views include both important views and secondary views and two important views are adjacent, merge the two into one important view and use the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, use the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determine the corresponding optimal combination, with the object trajectories in the secondary views copied into the background image as they appear in the original video; and a fourth processing unit, configured to merge the views according to the processing results to generate a video summary.
- Beneficial effects of the embodiments of the present invention: in the video summary generation method of the embodiments, parallel processing of the object trajectories in the important and secondary views reduces the computation required for trajectory combination and speeds up the operation, allowing users to focus more simply and clearly on the main targets in the important views.
- FIG. 1 is a flowchart of basic steps of a video summary generating method according to an embodiment of the present invention
- FIG. 2 is a first application diagram of a video summary generating method according to an embodiment of the present invention.
- FIG. 3 is a second application diagram of a video summary generating method according to an embodiment of the present invention.
- FIG. 4 is a third application diagram of a video summary generating method according to an embodiment of the present invention.
- FIG. 5 is a fourth application diagram of a video summary generating method according to an embodiment of the present invention.
- FIG. 6 is a schematic structural diagram of a video summary generating apparatus according to an embodiment of the present invention.
- Embodiment 1: as shown in FIG. 1 and FIG. 2, which are schematic diagrams of an embodiment of the present invention. As shown in FIG. 1, an embodiment of the present invention provides a video summary generation method, including:
- Step 101: Divide the original video into multiple views;
- Step 102: According to the proximity of each object trajectory to each view, divide each object trajectory contained in the original video into the view to which the trajectory is closest;
- Step 103: Calculate the activity indicator of each view according to the activity level of the object trajectories in the view, and divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold;
- Step 104: Process the object trajectories in the important and secondary views in parallel, and merge the views obtained after parallel processing to generate a video summary.
- In the video summary generation method of the present invention, parallel processing of the object trajectories in the important and secondary views reduces the computation required for trajectory combination, speeds up the operation, and allows users to focus more simply and clearly on the main targets in the important views.
- step 101 in the above embodiment of the present invention specifically includes:
- Determine the direction of the scene in the original video, then divide the original video into multiple views according to that direction, the directions of the multiple views being consistent with the direction of the scene.
- That is, the original video can be divided into k views according to actual needs, where k is a positive integer.
- To determine the scene direction, the initial and termination points of a plurality of object trajectories in the scene are first acquired. The plurality of trajectories may be all of the trajectories in the original video scene or only part of them.
- For example, if the original video scene contains 100 object trajectories, 20 of them, or all 100, may be used when calculating the scene direction.
- The coordinate difference between the initial and termination points of a trajectory is evaluated as follows: if the absolute value of the difference of the ordinates of the start and end points is greater than the absolute value of the difference of the abscissas, the trajectory's direction is judged to be vertical; if the absolute value of the ordinate difference is smaller than the absolute value of the abscissa difference, the trajectory's direction is judged to be horizontal.
- The direction of the majority of the object trajectories is the direction for which the number of trajectories is largest compared with every other direction. For example, if the majority of the object trajectories run horizontally (or vertically), the corresponding scene direction is horizontal (or vertical).
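- To make the endpoint-difference rule above concrete, the following Python sketch classifies each trajectory and takes the majority direction as the scene direction. It is a minimal sketch: the function names are ours, and the tie-break when the ordinate and abscissa differences are equal is an assumption the text does not specify.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def trajectory_direction(start: Point, end: Point) -> str:
    """Compare the absolute ordinate difference with the absolute abscissa
    difference of the trajectory's endpoints, as the text describes."""
    dx = abs(end[0] - start[0])
    dy = abs(end[1] - start[1])
    # The patent leaves the dx == dy case unspecified; horizontal is an
    # arbitrary tie-break chosen for this sketch.
    return "vertical" if dy > dx else "horizontal"

def scene_direction(tracks: List[Tuple[Point, Point]]) -> str:
    """The scene direction follows the direction of the majority of the
    sampled trajectories (all of them, or a subset such as 20 of 100)."""
    votes = [trajectory_direction(s, e) for s, e in tracks]
    return max(set(votes), key=votes.count)
```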
- Step 102 in the foregoing embodiment of the present invention includes:
- Acquiring the line-segment feature of each view, where the line-segment feature includes the start point and termination point of the view and the number of object trajectories contained in the view;
- The line-segment feature of a view includes, but is not limited to, the start and end point coordinates of the view and the number of object trajectories contained in the view.
- The proximity of an object trajectory to each view can be calculated according to a distance calculation formula.
- Each object trajectory contained in the original video is divided into the view to which the trajectory is closest.
- After a trajectory is added to a view, the line-segment feature of that view may be updated according to the start and end point coordinates of the trajectory.
- The initial start and termination points of a view may be taken from the start and termination points of the first object trajectory added to that view.
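- The distance calculation formula itself is not given in the text, so the following sketch assumes one plausible choice: the sum of Euclidean distances between the trajectory's endpoints and the view segment's endpoints. The ViewField structure and function names are illustrative.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class ViewField:
    start: Point       # start point of the view's line-segment model
    end: Point         # end point of the view's line-segment model
    n_tracks: int = 0  # number of object trajectories assigned to the view

def proximity(view: ViewField, t_start: Point, t_end: Point) -> float:
    """Assumed proximity measure: the patent says only that proximity is
    computed 'according to a distance calculation formula'."""
    return math.dist(view.start, t_start) + math.dist(view.end, t_end)

def closest_view(views: List[ViewField], t_start: Point, t_end: Point) -> int:
    """Index of the view the trajectory is closest to (step 102)."""
    return min(range(len(views)),
               key=lambda i: proximity(views[i], t_start, t_end))
```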
- Step 103 in the foregoing embodiment of the present invention includes:
- The activity level of an object trajectory is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory.
- The activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view.
- The object area of an object trajectory can be calculated from the height and width of the object itself.
- Each view is divided into an important view or a secondary view according to whether its activity indicator exceeds a preset threshold.
- To illustrate the division into important and secondary views: suppose the original video is divided into three views, the activity indicators of the three views are calculated, and each is compared with the preset threshold.
- If a view's activity indicator is greater than the preset threshold, that view is classified as an important view; if even the largest activity indicator among the views is still smaller than the preset threshold, all three views are secondary views.
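- As an illustration of the activity statistics above, the sketch below uses the product of object area and trajectory duration as the (assumed) positively correlated activity level, sums it per view, and applies the preset threshold.

```python
from typing import Dict, List, Tuple

def activity_level(area: float, duration: float) -> float:
    """The patent states only a positive correlation with object area and
    trajectory duration; the product used here is an assumed instance."""
    return area * duration

def classify_views(view_tracks: Dict[int, List[Tuple[float, float]]],
                   threshold: float) -> Dict[int, str]:
    """view_tracks maps view index -> list of (area, duration) pairs.
    Sum per-trajectory activity levels into a per-view activity indicator
    and label each view against the preset threshold."""
    labels = {}
    for view, tracks in view_tracks.items():
        indicator = sum(activity_level(area, dur) for area, dur in tracks)
        labels[view] = "important" if indicator > threshold else "secondary"
    return labels
```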
- Step 104 in the foregoing embodiment of the present invention includes:
- If the multiple views are all important views, the first preset function is used to solve for the optimal solution of the object-trajectory combination of each view, and the optimal object-trajectory combination corresponding to that solution is determined;
- A video summary is generated from the optimal object-trajectory combinations of all views.
- A function from the prior art can be used to solve for the optimal object-trajectory combination of each view; as a preferred implementation, the embodiments of the present invention further provide the following examples of the first preset function and the second preset function.
- The first preset function in this embodiment of the present invention uses a complex transfer mapping energy function to solve for the optimal object-trajectory combination of each view, which can be solved by the following formula:
- E(MAP) = E_a(BO) + α·E_tps(BO) + β·E_ntps(BO) + γ·E_tc(BO) + λ·E_tct(BO)
- where E(MAP) is the complex transfer mapping energy function; BO is the set of object trajectories in an important view; E_a(BO) is the activity energy cost, the penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relevant forward-order cost, the penalty incurred if a target is not added to the summary video in forward order; E_ntps(BO) is the relevant reverse-order cost, the penalty incurred when two objects that should be temporally related are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, the penalty incurred when two objects that do not collide in the original video have colliding trajectories in the summary result; E_tct(BO) is the true-collision cost, the penalty term for two objects that collide in the original video but do not collide in the summary result, and E_tct(BO) is negative; α, β, γ, λ are preset weight coefficients whose specific values can be set according to the conditions of the actual scene.
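- The five cost terms are defined only descriptively, so the sketch below treats them as caller-supplied penalty functions and evaluates the weighted sum E(MAP) above; choosing the combination that minimizes this energy yields the optimal object-trajectory combination.

```python
from typing import Callable, Sequence

Cost = Callable[[Sequence], float]

def complex_transfer_energy(BO: Sequence,
                            E_a: Cost, E_tps: Cost, E_ntps: Cost,
                            E_tc: Cost, E_tct: Cost,
                            alpha: float, beta: float,
                            gamma: float, lam: float) -> float:
    """E(MAP) = E_a + alpha*E_tps + beta*E_ntps + gamma*E_tc + lam*E_tct,
    evaluated on a candidate trajectory combination BO. The patent does
    not give closed forms for the cost terms, so they are passed in as
    callables here; E_tct is negative by definition."""
    return (E_a(BO) + alpha * E_tps(BO) + beta * E_ntps(BO)
            + gamma * E_tc(BO) + lam * E_tct(BO))

def best_combination(candidates: Sequence[Sequence], energy: Cost) -> Sequence:
    """The 'optimal solution' is the candidate combination minimizing the
    energy; how candidates are enumerated is left open by the patent."""
    return min(candidates, key=energy)
```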
- FIG. 2 is a first application diagram of a video summary generation method according to an embodiment of the present invention.
- This application is mainly used in complex motion scenes, where the moving targets are relatively large and relatively numerous.
- the application is implemented by the following steps:
- Step 201: Initialize the number of views.
- That is, the original video is divided into multiple views; how many views to use can be determined according to actual needs, for example 3 or 5.
- Step 202: Calculate the view directions.
- Specifically, the view direction is derived from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
- Step 203: Determine the view to which each object trajectory belongs.
- Specifically, the proximity of each object trajectory to each view can be calculated according to the distance calculation formula, and each object trajectory contained in the original video is divided into the view to which it is closest.
- Step 204: Update the view line-segment model.
- Specifically, after each object trajectory is added to a view, the line-segment feature of that view may be updated according to the trajectory's start and end point coordinates, in preparation for adding the next trajectory.
- Step 205: Calculate the view activity indicators.
- Specifically, the activity indicator of each view is computed from the activity level of the object trajectories in the view.
- Step 206: Compare the view activity indicators with the preset threshold.
- A view whose activity indicator is greater than the preset threshold is determined to be an important view, and a view whose indicator is less than the threshold a secondary view. When a view is determined to be important, step 207 is performed.
- Step 207: Process the object trajectories using the first preset function.
- Specifically, owing to the particular nature of the scene in this application, the computed views are all important views; the first preset function is therefore used to solve for the optimal object-trajectory combination of each view, the optimal combination corresponding to the optimal solution is determined, and a video summary is generated.
- Embodiment 2: as shown in FIG. 1 and FIG. 3.
- This embodiment of the present invention includes steps 101, 102, 103, and 104 of Embodiment 1; the difference is that step 104 is implemented differently than in Embodiment 1.
- The parts of this embodiment identical to Embodiment 1 are not repeated; only the differences are described below:
- step 104 in the embodiment of the present invention includes:
- If the multiple views are all secondary views, the second preset function is used to solve for the optimal solution of the object-trajectory combination of each view, and the optimal object-trajectory combination corresponding to that solution is determined;
- a video summary is generated based on the optimal object trajectory combination for all views.
- A function from the prior art can be used to solve for the optimal object-trajectory combination of each view.
- As a preferred implementation, the second preset function in this embodiment uses a simple transfer mapping energy function, denoted E(MAP)c, to solve for the optimal object-trajectory combination of each view ("simple" relative to the complex transfer mapping energy function of Embodiment 1);
- where b_m and b_b are two moving-object trajectories in the secondary view, and γ is a preset weight coefficient whose specific value can be set according to the conditions of the actual scene.
- FIG. 3 is a second application diagram of a video summary generation method according to an embodiment of the present invention.
- This application is mainly used in simple motion scenes, where the moving targets are relatively small and relatively few.
- the application is implemented by the following steps:
- Step 301: Initialize the number of views.
- That is, the original video is divided into multiple views; how many views to use can be determined according to actual needs, for example 3 or 5.
- Step 302: Calculate the view directions.
- Specifically, the view direction is derived from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
- Step 303: Determine the view to which each object trajectory belongs.
- Specifically, the proximity of each object trajectory to each view can be calculated according to the distance calculation formula, and each object trajectory contained in the original video is divided into the view to which it is closest.
- Step 304: Update the view line-segment model.
- Specifically, after each object trajectory is added to a view, the line-segment feature of that view may be updated according to the trajectory's start and end point coordinates, in preparation for adding the next trajectory.
- Step 305: Calculate the view activity indicators.
- Specifically, the activity indicator of each view is computed from the activity level of the object trajectories in the view.
- Step 306: Compare the view activity indicators with the preset threshold.
- A view whose activity indicator is greater than the preset threshold is determined to be an important view, and a view whose indicator is less than the threshold a secondary view. When a view is determined to be secondary, step 307 is performed.
- Step 307: Process the object trajectories using the second preset function.
- Specifically, owing to the particular nature of the scene in this application, the computed views are all secondary views; the second preset function is therefore used to solve for the optimal object-trajectory combination of each view, the optimal combination corresponding to the optimal solution is determined, and a video summary is generated.
- Embodiment 3: as shown in FIG. 1 and FIG. 4.
- This embodiment of the present invention includes steps 101, 102, 103, and 104 of Embodiment 1; the difference is that step 104 is implemented differently than in Embodiment 1.
- The parts of this embodiment identical to Embodiment 1 are not repeated; only the differences are described below:
- step 104 in the embodiment of the present invention includes:
- If the multiple views include both important views and secondary views: if two important views are adjacent, the two are merged into one important view, and the first preset function is used to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, the first preset function is used to solve for the optimal solution of the object-trajectory combination of each important view separately, and the corresponding optimal combination is determined; the second preset function is used to solve for the optimal solution of the object-trajectory combination of each secondary view, and the corresponding optimal combination is determined;
- A video summary is generated from the optimal object-trajectory combinations of all views.
- The first preset function can be used to solve for the optimal solution of the object-trajectory combination of each important view separately, determining the optimal combination corresponding to the optimal solution; a function from the prior art can be used for this purpose.
- As a preferred implementation, the first preset function in this embodiment uses a complex transfer mapping energy function to solve for the optimal object-trajectory combination of each view, which can be solved by the following formula:
- E(MAP) = E_a(BO) + α·E_tps(BO) + β·E_ntps(BO) + γ·E_tc(BO) + λ·E_tct(BO)
- where E(MAP) is the complex transfer mapping energy function; BO is the set of object trajectories in an important view; E_a(BO) is the activity energy cost, the penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relevant forward-order cost, the penalty incurred if a target is not added to the summary video in forward order; E_ntps(BO) is the relevant reverse-order cost, the penalty incurred when two objects that should be temporally related are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, the penalty incurred when two objects that do not collide in the original video have colliding trajectories in the summary result; E_tct(BO) is the true-collision cost, the penalty term for two objects that collide in the original video but do not collide in the summary result, and E_tct(BO) is negative; α, β, γ, λ are preset weight coefficients whose specific values can be set according to the conditions of the actual scene.
- The second preset function can be used to solve for the optimal solution of the object-trajectory combination of each secondary view separately, determining the optimal combination corresponding to the optimal solution; a function from the prior art can be used for this purpose.
- As a preferred implementation, the second preset function in this embodiment uses a simple transfer mapping energy function, E(MAP)c, to solve for the optimal object-trajectory combination of each view ("simple" relative to the complex transfer mapping energy function of Embodiment 1), where b_m and b_b are two moving-object trajectories in the secondary view and γ is a preset weight coefficient whose specific value can be set according to the conditions of the actual scene.
- FIG. 4 is a third application diagram of a video summary generation method according to an embodiment of the present invention.
- This application is mainly used in motion scenes with complex structure, where target motion is irregular; for example, targets in some regions move simply and are few in number, while targets in other regions move in relatively complex ways. As shown in FIG. 4, the application is implemented by the following steps:
- Step 401: Initialize the number of views.
- That is, the original video is divided into multiple views; how many views to use can be determined according to actual needs, for example 3 or 5.
- Step 402: Calculate the view directions.
- Specifically, the view direction is derived from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
- Step 403: Determine the view to which each object trajectory belongs.
- Specifically, the proximity of each object trajectory to each view can be calculated according to the distance calculation formula, and each object trajectory contained in the original video is divided into the view to which it is closest.
- Step 404: Update the view line-segment model.
- Specifically, after each object trajectory is added to a view, the line-segment feature of that view may be updated according to the trajectory's start and end point coordinates, in preparation for adding the next trajectory.
- Step 405: Calculate the view activity indicators.
- Specifically, the activity indicator of each view is computed from the activity level of the object trajectories in the view.
- Step 406: Compare the view activity indicators with the preset threshold.
- A view whose activity indicator is greater than the preset threshold is determined to be an important view, and a view whose indicator is less than the threshold a secondary view. When a view is determined to be important, step 407 is performed; when it is determined to be secondary, step 410 is performed.
- Step 407: Determine whether two important views are adjacent to each other.
- If two important views are adjacent to each other, continue to step 408; otherwise go directly to step 409.
- Step 408: Merge. That is, merge the two adjacent important views (see the sketch after this list).
- Step 409: Process the object trajectories in the important views using the first preset function;
- Step 410: Process the object trajectories in the secondary views using the second preset function;
- Finally, a video summary is generated from the optimal object-trajectory combinations of all views.
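- The adjacency test and merge of steps 407-408 can be implemented as a single pass over the views in spatial order, as in the following sketch (names are illustrative); for example, labels of important, important, secondary, important yield the groups [0, 1], [2], and [3].

```python
from typing import List, Tuple

def group_views(labels: List[str]) -> List[Tuple[str, List[int]]]:
    """labels holds 'important' / 'secondary' per view index, in spatial
    order. Runs of adjacent important views are merged into one group
    (steps 407-408); every secondary view stays on its own. Each important
    group then goes to the first preset function and each secondary view
    to the second (steps 409-410)."""
    groups: List[Tuple[str, List[int]]] = []
    for i, kind in enumerate(labels):
        if (kind == "important" and groups and groups[-1][0] == "important"
                and groups[-1][1][-1] == i - 1):
            groups[-1][1].append(i)  # extend the adjacent important run
        else:
            groups.append((kind, [i]))
    return groups
```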
- Embodiment 4: as shown in FIG. 1 and FIG. 5.
- This embodiment of the present invention includes steps 101, 102, 103, and 104 of Embodiment 1; the difference is that step 104 is implemented differently than in Embodiment 1.
- The parts of this embodiment identical to Embodiment 1 are not repeated; only the differences are described below:
- step 104 in the embodiment of the present invention includes:
- If the multiple views include both important views and secondary views: if two important views are adjacent, the two are merged into one important view, and the first preset function is used to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, the first preset function is used to solve for the optimal solution of the object-trajectory combination of each important view separately, and the corresponding optimal combination is determined; the object trajectories in the secondary views are copied into the background image as they appear in the original video;
- A video summary is generated from the optimal object-trajectory combinations of all views.
- The first preset function can be used to solve for the optimal solution of the object-trajectory combination of each important view separately, determining the optimal combination corresponding to the optimal solution; a function from the prior art can be used for this purpose.
- As a preferred implementation, the first preset function in this embodiment uses a complex transfer mapping energy function to solve for the optimal object-trajectory combination of each view, which can be solved by the following formula:
- E(MAP) = E_a(BO) + α·E_tps(BO) + β·E_ntps(BO) + γ·E_tc(BO) + λ·E_tct(BO)
- where E(MAP) is the complex transfer mapping energy function; BO is the set of object trajectories in an important view; E_a(BO) is the activity energy cost, the penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relevant forward-order cost, the penalty incurred if a target is not added to the summary video in forward order; E_ntps(BO) is the relevant reverse-order cost, the penalty incurred when two objects that should be temporally related are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, the penalty incurred when two objects that do not collide in the original video have colliding trajectories in the summary result; E_tct(BO) is the true-collision cost, the penalty term for two objects that collide in the original video but do not collide in the summary result, and E_tct(BO) is negative; α, β, γ, λ are preset weight coefficients whose specific values can be set according to the conditions of the actual scene.
- The object trajectories in the secondary views are copied into the background image as they appear in the original video, and finally a video summary is generated.
- FIG. 5 is a fourth application diagram of a video summary generation method according to an embodiment of the present invention.
- This application is mainly used in motion scenes with complex structure, where target motion is irregular; for example, targets in some regions move simply and are few in number, while targets in other regions move in relatively complex ways. As shown in FIG. 5, the application is implemented by the following steps:
- Step 501: Initialize the number of views.
- That is, the original video is divided into multiple views; how many views to use can be determined according to actual needs, for example 3 or 5.
- Step 502: Calculate the view directions.
- Specifically, the view direction is derived from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
- Step 503: Determine the view to which each object trajectory belongs.
- Specifically, the proximity of each object trajectory to each view can be calculated according to the distance calculation formula, and each object trajectory contained in the original video is divided into the view to which it is closest.
- Step 504: Update the view line-segment model.
- Specifically, after each object trajectory is added to a view, the line-segment feature of that view may be updated according to the trajectory's start and end point coordinates, in preparation for adding the next trajectory.
- Step 505: Calculate the view activity indicators.
- Specifically, the activity indicator of each view is computed from the activity level of the object trajectories in the view.
- Step 506: Compare the view activity indicators with the preset threshold.
- A view whose activity indicator is greater than the preset threshold is determined to be an important view, and a view whose indicator is less than the threshold a secondary view. When a view is determined to be important, step 507 is performed; when it is determined to be secondary, step 510 is performed.
- Step 507: Determine whether two important views are adjacent to each other.
- If two important views are adjacent to each other, continue to step 508; otherwise go directly to step 509.
- Step 508: Merge. That is, merge the two adjacent important views.
- Step 509: Process the object trajectories in the important views using the first preset function;
- Step 510: Copy the object trajectories into the background image as they appear in the original video (see the sketch after this list).
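- Step 510 can be implemented as simple compositing, as in the hedged sketch below: each object's pixel patch is pasted back into the background at its original position. The detection format (patch, x, y) is an assumption for illustration.

```python
import numpy as np

def paste_secondary_tracks(background: np.ndarray, detections) -> np.ndarray:
    """Copy a secondary view's object trajectories into the background
    image as they appear in the original video (step 510). `detections`
    is assumed to yield (patch, x, y) tuples, where patch is the object's
    pixel region and (x, y) its top-left position; this data layout is
    illustrative, not from the patent."""
    out = background.copy()
    for patch, x, y in detections:
        h, w = patch.shape[:2]
        out[y:y + h, x:x + w] = patch
    return out
```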
- As shown in FIG. 6, an embodiment of the present invention further provides a video summary generation apparatus, where the apparatus 60 includes:
- the first dividing module 61, configured to divide the original video into multiple views;
- the categorization module 62, configured to divide each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view;
- the second dividing module 63, configured to calculate the activity indicator of each view according to the activity level of the object trajectories in the view, and to divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold;
- the merge processing module 64, configured to process the object trajectories in the important and secondary views in parallel, and to merge the views obtained after parallel processing to generate a video summary.
- The first dividing module 61 includes: a first calculating unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
- The first calculating unit includes: a first acquiring unit, configured to acquire the initial and termination points of a plurality of object trajectories in the scene of the original video; a difference calculating unit, configured to perform a coordinate-difference calculation on the initial and termination points of each trajectory to determine the trajectory's direction; and a judging unit, configured to judge the direction of the scene in the original video from the direction of the majority of the object trajectories, the direction of the scene being consistent with the direction of the majority of the trajectories.
- The categorization module 62 includes: a second acquiring unit, configured to acquire the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; a distance calculating unit, configured to acquire the start and termination points of an object trajectory and to calculate the proximity of the trajectory to each view; a first categorizing unit, configured to divide each object trajectory contained in the original video into the view to which it is closest, according to the proximity; and an updating unit, configured to update the line-segment feature of that closest view according to the start and end point coordinates of the trajectory.
- The second dividing module 63 includes: an activity indicator calculating unit, in which the activity level of an object trajectory is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory, and the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; and a second dividing unit, configured to divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold.
- The merge processing module 64 includes: a first merging unit, configured to, if the multiple views are all important views, use the first preset function to solve for the optimal solution of the object-trajectory combination of each view and determine the optimal object-trajectory combination corresponding to that solution; and a first processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
- The merge processing module 64 includes: a second merging unit, configured to, if the multiple views are all secondary views, use the second preset function to solve for the optimal solution of the object-trajectory combination of each view and determine the optimal object-trajectory combination corresponding to that solution; and a second processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
- The merge processing module 64 includes: a third merging unit, configured to, if the multiple views include both important views and secondary views and two important views are adjacent, merge the two into one important view and use the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, use the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determine the corresponding optimal combination, and use the second preset function to solve for the optimal solution of the object-trajectory combination of each secondary view and determine the corresponding optimal combination; and a third processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
- The merge processing module 64 includes: a fourth merging unit, configured to, if the multiple views include both important views and secondary views and two important views are adjacent, merge the two into one important view and use the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, use the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determine the corresponding optimal combination, with the object trajectories in the secondary views copied into the background image as they appear in the original video; and a fourth processing unit, configured to merge the views according to the processing results to generate a video summary.
- In summary, in the above technical solution provided by the embodiments of the present invention, parallel processing of the object trajectories in the important and secondary views reduces the computation required for trajectory combination and speeds up the operation, so that users can focus more simply and clearly on the main targets in the important views.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
The present invention provides a video summary generation method and apparatus. The method includes: dividing an original video into multiple views; dividing each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view; calculating the activity indicator of each view according to the activity level of the object trajectories in the view, and dividing each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold; and processing the object trajectories in the important and secondary views in parallel, then merging the views obtained after parallel processing to generate a video summary. In the video summary generation method of the present invention, parallel processing of the object trajectories in the important and secondary views reduces the computation required for trajectory combination and speeds up the operation, allowing users to focus more simply and clearly on the main targets in the important views.
Description
The present invention relates to the field of image recognition, and in particular to a video summary generation method and apparatus.
Video summarization, also known as video condensation, is a generalization of video content: in an automatic or semi-automatic manner, moving targets are extracted through moving-target analysis, the motion trajectory of each target is analyzed, and the different targets are then spliced into a common background scene and combined in some manner. With the development of video technology, video summaries play an increasingly important role in video analytics and content-based video retrieval.
In the field of public security, video surveillance systems have become an important part of maintaining social order and strengthening social management. However, surveillance recordings involve large volumes of stored data and long retention periods; finding clues and obtaining evidence from recordings by traditional means consumes substantial manpower, material resources, and time, and is so inefficient that the best opportunity to solve a case may be missed.
No effective solution has yet been proposed for the prior-art problem that an optimal summary video cannot be found quickly in large-scale video data.
Summary of the Invention
To overcome the above shortcomings of the prior art, the embodiments of the present invention provide a video summary generation method and apparatus.
To solve the above technical problem, the embodiments of the present invention adopt the following technical solutions:
According to one aspect of the embodiments of the present invention, a video summary generation method is provided, including: dividing an original video into multiple views; dividing each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view; calculating the activity indicator of each view according to the activity level of the object trajectories in the view, and dividing each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold; and processing the object trajectories in the important and secondary views in parallel, then merging the views obtained after parallel processing to generate a video summary.
Dividing the original video into multiple views includes: determining the direction of the scene in the original video; and dividing the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
Determining the direction of the scene in the original video includes: acquiring the initial and termination points of a plurality of object trajectories in the scene of the original video; performing a coordinate-difference calculation on the initial and termination points of each trajectory to determine the trajectory's direction; and judging the direction of the scene in the original video from the direction of the majority of the object trajectories, the direction of the scene being consistent with the direction of the majority of the trajectories.
Dividing each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view, includes: acquiring the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; acquiring the start and end point coordinates of an object trajectory and calculating the proximity of the trajectory to each view; dividing each object trajectory contained in the original video into the view to which it is closest, according to the proximity; and updating the line-segment feature of that closest view according to the start and end point coordinates of the trajectory.
Calculating the activity indicator of a view according to the activity level of the object trajectories in the view, and dividing each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold, includes: the activity level is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory, and the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; each view is divided into an important view or a secondary view according to whether its activity indicator exceeds the preset threshold.
Optionally, processing the object trajectories in the important and secondary views in parallel and merging the views obtained after parallel processing to generate a video summary includes: if the multiple views are all important views, using a first preset function to solve for the optimal solution of the object-trajectory combination of each view and determining the optimal object-trajectory combination corresponding to that solution; and generating a video summary from the optimal object-trajectory combinations of all views.
Optionally, processing the object trajectories in the important and secondary views in parallel and merging the views obtained after parallel processing to generate a video summary includes: if the multiple views are all secondary views, using a second preset function to solve for the optimal solution of the object-trajectory combination of each view and determining the optimal object-trajectory combination corresponding to that solution; and generating a video summary from the optimal object-trajectory combinations of all views.
Optionally, processing the object trajectories in the important and secondary views in parallel and merging the views obtained after parallel processing to generate a video summary includes: if the multiple views include both important views and secondary views, and two important views are adjacent, merging the two into one important view and using the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, using the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determining the corresponding optimal combination; using the second preset function to solve for the optimal solution of the object-trajectory combination of each secondary view and determining the corresponding optimal combination; and generating a video summary from the optimal object-trajectory combinations of all views.
Optionally, processing the object trajectories in the important and secondary views in parallel and merging the views obtained after parallel processing to generate a video summary includes: if the multiple views include both important views and secondary views, and two important views are adjacent, merging the two into one important view and using the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, using the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determining the corresponding optimal combination; copying the object trajectories in the secondary views into the background image as they appear in the original video; and merging the views according to the processing results to generate a video summary.
According to another aspect of the embodiments of the present invention, a video summary generation apparatus is further provided, including: a first dividing module, configured to divide an original video into multiple views; a categorization module, configured to divide each object trajectory contained in the original video into the view to which the trajectory is closest, according to the proximity of the trajectory to each view; a second dividing module, configured to calculate the activity indicator of each view according to the activity level of the object trajectories in the view, and to divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold; and a merge processing module, configured to process the object trajectories in the important and secondary views in parallel and to merge the views obtained after parallel processing to generate a video summary.
The first dividing module includes: a first calculating unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
The first calculating unit includes: a first acquiring unit, configured to acquire the initial and termination points of a plurality of object trajectories in the scene of the original video; a difference calculating unit, configured to perform a coordinate-difference calculation on the initial and termination points of each trajectory to determine the trajectory's direction; and a judging unit, configured to judge the direction of the scene in the original video from the direction of the majority of the object trajectories, the direction of the scene being consistent with the direction of the majority of the trajectories.
The categorization module includes: a second acquiring unit, configured to acquire the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; a distance calculating unit, configured to acquire the start and termination points of an object trajectory and to calculate the proximity of the trajectory to each view; a first categorizing unit, configured to divide each object trajectory contained in the original video into the view to which it is closest, according to the proximity; and an updating unit, configured to update the line-segment feature of that closest view according to the start and end point coordinates of the trajectory.
The second dividing module includes: an activity indicator calculating unit, in which the activity level of an object trajectory is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory, and the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; and a second dividing unit, configured to divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold.
Optionally, the merge processing module includes: a first merging unit, configured to, if the multiple views are all important views, use the first preset function to solve for the optimal solution of the object-trajectory combination of each view and determine the optimal object-trajectory combination corresponding to that solution; and a first processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
Optionally, the merge processing module includes: a second merging unit, configured to, if the multiple views are all secondary views, use the second preset function to solve for the optimal solution of the object-trajectory combination of each view and determine the optimal object-trajectory combination corresponding to that solution; and a second processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
Optionally, the merge processing module includes: a third merging unit, configured to, if the multiple views include both important views and secondary views and two important views are adjacent, merge the two into one important view and use the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, use the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determine the corresponding optimal combination, and use the second preset function to solve for the optimal solution of the object-trajectory combination of each secondary view and determine the corresponding optimal combination; and a third processing unit, configured to generate a video summary from the optimal object-trajectory combinations of all views.
Optionally, the merge processing module includes: a fourth merging unit, configured to, if the multiple views include both important views and secondary views and two important views are adjacent, merge the two into one important view and use the first preset function to solve for the optimal solution of the object-trajectory combination of the merged view; if the important views are not adjacent to one another, use the first preset function to solve for the optimal solution of the object-trajectory combination of each important view separately and determine the corresponding optimal combination, with the object trajectories in the secondary views copied into the background image as they appear in the original video; and a fourth processing unit, configured to merge the views according to the processing results to generate a video summary.
Beneficial effects of the embodiments of the present invention: in the video summary generation method of the embodiments, parallel processing of the object trajectories in the important and secondary views reduces the computation required for trajectory combination and speeds up the operation, allowing users to focus more simply and clearly on the main targets in the important views.
FIG. 1 is a flowchart of the basic steps of a video summary generation method according to an embodiment of the present invention;
FIG. 2 is a first application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 3 is a second application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 4 is a third application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 5 is a fourth application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a video summary generation apparatus according to an embodiment of the present invention.
To make the technical problem to be solved by the present invention, its technical solutions, and its advantages clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
Embodiment 1
As shown in FIG. 1 and FIG. 2, which are schematic diagrams of an embodiment of the present invention: as shown in FIG. 1, an embodiment of the present invention provides a video summary generation method, including:
Step 101: Divide the original video into multiple views;
Step 102: According to the proximity of each object trajectory to each view, divide each object trajectory contained in the original video into the view to which the trajectory is closest;
Step 103: Calculate the activity indicator of each view according to the activity level of the object trajectories in the view, and divide each view into an important view or a secondary view according to whether the activity indicator exceeds a preset threshold;
Step 104: Process the object trajectories in the important and secondary views in parallel, and merge the views obtained after parallel processing to generate a video summary.
In the video summary generation method of the present invention, parallel processing of the object trajectories in the important and secondary views reduces the computation required for trajectory combination, speeds up the operation, and allows users to focus more simply and clearly on the main targets in the important views. A sketch of this parallel step follows.
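Because each important or secondary view (or merged group of important views) carries an independent set of object trajectories, the per-view optimizations of step 104 can be dispatched to parallel workers. The following Python sketch illustrates this; the function names and grouping structure are illustrative, not from the patent.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, List, Sequence, Tuple

def process_views_in_parallel(
        groups: List[Tuple[str, Sequence]],
        solve_important: Callable[[Sequence], object],
        solve_secondary: Callable[[Sequence], object]) -> List[object]:
    """Each entry of `groups` is ('important' | 'secondary', trajectories).
    Because the per-view optimizations are independent, they can run in
    parallel worker processes; the merged summary is then assembled from
    the collected results."""
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(
                       solve_important if kind == "important" else solve_secondary,
                       tracks)
                   for kind, tracks in groups]
        return [f.result() for f in futures]
```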
Further, step 101 in the above embodiment of the present invention specifically includes:
determining the direction of the scene in the original video;
dividing the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
That is, the original video may be divided into k views according to actual needs, where k is a positive integer.
The direction of the scene in the original video in the above embodiment may be calculated as follows:
First, acquire the initial and termination points of a plurality of object trajectories in the scene of the original video;
The plurality of trajectories may be all of the trajectories in the original video scene or only part of them; for example, if the scene contains 100 object trajectories, 20 of them, or all 100, may be used when calculating the scene direction.
Next, perform a coordinate-difference calculation on the initial and termination points of each trajectory to determine the trajectory's direction;
If the coordinate difference between a trajectory's initial and termination points is such that the absolute value of the difference of the ordinates of the start and end points is greater than the absolute value of the difference of the abscissas, the trajectory's direction is judged to be vertical; if the absolute value of the ordinate difference is smaller than the absolute value of the abscissa difference, the trajectory's direction is judged to be horizontal.
The direction of the scene in the original video is judged from the direction of the majority of the object trajectories, and the direction of the scene is consistent with the direction of the majority of the trajectories.
That is, the direction of the majority of the object trajectories is the direction for which the number of trajectories is largest compared with every other direction; for example, if the majority of the object trajectories run horizontally (or vertically), the corresponding scene direction is horizontal (or vertical).
Specifically, step 102 in the above embodiment of the present invention includes:
acquiring the line-segment feature of each view, the line-segment feature including the start point and termination point of the view and the number of object trajectories contained in the view;
The line-segment feature of a view includes, but is not limited to, the start and end point coordinates of the view and the number of object trajectories it contains.
acquiring the start and end point coordinates of an object trajectory and calculating the proximity of the trajectory to each view;
The proximity of an object trajectory to each view may be calculated according to a distance calculation formula.
According to the proximity, each object trajectory contained in the original video is divided into the view to which it is closest.
In a preferred embodiment of the present invention, after each object trajectory is added to a view, the line-segment feature of that view may additionally be updated according to the trajectory's start and end point coordinates. Specifically, the update formulas include n_k = n_k + 1, where n_k is the number of trajectory objects the view contained before the trajectory was added and n_k + 1 is the number after;
here x′_s and y′_s are the abscissa and ordinate of the trajectory's start point, x′_z and y′_z are the abscissa and ordinate of its termination point, and the view's own start- and end-point abscissas and ordinates are updated from them. In this embodiment of the present invention, the initial start and termination points of a view may be taken from the start and termination points of the first object trajectory added to the view. A sketch of this update follows.
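The coordinate-update equations themselves are rendered as images in the source, so the sketch below reconstructs the update as a running average of the endpoints of the trajectories assigned so far; that choice, and all names, are assumptions consistent with the stated n_k = n_k + 1.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]

@dataclass
class ViewField:
    start: Point
    end: Point
    n_tracks: int = 0  # n_k: trajectories assigned so far

def update_line_feature(view: ViewField, t_start: Point, t_end: Point) -> None:
    """Apply n_k = n_k + 1 and refresh the view's start/end coordinates
    from the new trajectory's endpoints (x'_s, y'_s) and (x'_z, y'_z).
    The running average below is an assumed reconstruction; note that when
    n_k == 0 it reduces to taking the first trajectory's endpoints as the
    view's initial start and end points, as the text describes."""
    n = view.n_tracks
    view.start = tuple((n * v + t) / (n + 1) for v, t in zip(view.start, t_start))
    view.end = tuple((n * v + t) / (n + 1) for v, t in zip(view.end, t_end))
    view.n_tracks = n + 1  # n_k = n_k + 1
```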
Specifically, step 103 in the above embodiment of the present invention includes:
The activity level of an object trajectory is positively correlated with the object area corresponding to the trajectory and with the duration of the trajectory; the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view;
The object area of an object trajectory may be calculated from the height and width of the object itself.
Each view is divided into an important view or a secondary view according to whether its activity indicator exceeds a preset threshold.
The division into important and secondary views can be explained as follows: in a practical scene, suppose the original video is divided into 3 views, the activity indicators of the 3 views are calculated, and each is compared with the preset threshold. If a view's activity indicator is greater than the preset threshold, that view is classified as an important view; if even the largest of the activity indicators is still smaller than the preset threshold, all 3 views are secondary views.
Specifically, step 104 in the above embodiment of the present invention includes:
if the multiple views are all important views, using the first preset function to solve for the optimal solution of the object-trajectory combination of each view, and determining the optimal object-trajectory combination corresponding to that solution;
generating a video summary from the optimal object-trajectory combinations of all views.
A function from the prior art may be used to solve for the optimal object-trajectory combination of each view. As a preferred implementation, the embodiments of the present invention further provide the following examples of the first and second preset functions. The first preset function in this embodiment of the present invention uses a complex transfer mapping energy function to solve for the optimal object-trajectory combination of each view, which can be solved by the following formula:
E(MAP) = E_a(BO) + α·E_tps(BO) + β·E_ntps(BO) + γ·E_tc(BO) + λ·E_tct(BO)
where E(MAP) is the complex transfer mapping energy function; BO is the set of object trajectories in an important view; E_a(BO) is the activity energy cost, the penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relevant forward-order cost, the penalty incurred if a target is not added to the summary video in forward order; E_ntps(BO) is the relevant reverse-order cost, the penalty incurred when two objects that should be temporally related are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, the penalty incurred when two objects that do not collide in the original video have colliding trajectories in the summary result; E_tct(BO) is the true-collision cost, the penalty term for two objects that collide in the original video but do not collide in the summary result, and E_tct(BO) is negative; α, β, γ, λ are preset weight coefficients whose specific values may be set according to the conditions of the actual scene.
FIG. 2 is a first application diagram of the video summary generation method of an embodiment of the present invention; this application is mainly used in complex motion scenes, where moving targets are relatively large and relatively numerous. As shown in FIG. 2, the application is implemented through the following steps:
Step 201: Initialize the number of views.
That is, the original video is divided into multiple views; how many views to use can be determined according to actual needs, for example 3 or 5.
Step 202: Calculate the view directions.
Specifically, the view direction is derived from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
Step 203: Determine the view to which each object trajectory belongs.
Specifically, the proximity of each object trajectory to each view may be calculated according to the distance calculation formula, and each object trajectory contained in the original video is divided into the view to which it is closest.
Step 204: Update the view line-segment model.
Specifically, after each object trajectory is added to a view, the line-segment feature of that view may be updated according to the trajectory's start and end point coordinates, in preparation for adding the next trajectory.
Step 205: Calculate the view activity indicators.
Specifically, the activity indicator of each view is computed from the activity level of the object trajectories in the view.
Step 206: Compare the view activity indicators with the preset threshold.
A view whose activity indicator is greater than the preset threshold is determined to be an important view, and a view whose indicator is less than the threshold a secondary view; when a view is determined to be important, step 207 is performed.
Step 207: Process the object trajectories using the first preset function.
Specifically, owing to the particular nature of the scene in this application, the computed views are all important views; the first preset function is therefore used to solve for the optimal object-trajectory combination of each view, the optimal combination corresponding to the optimal solution is determined, and a video summary is generated.
Embodiment 2
As shown in FIG. 1 and FIG. 3, which are schematic diagrams of an embodiment of the present invention, this embodiment includes steps 101, 102, 103 and 104 of Embodiment 1; the difference lies in how step 104 is implemented. The parts identical to Embodiment 1 are not repeated here, and only the differences are described below:
Specifically, step 104 in this embodiment of the present invention includes:
if the multiple views are all secondary views, solving the optimal solution of the object trajectory combination of each view separately using a second preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution;
generating the video summary according to the optimal object trajectory combinations of all views.
A function from the prior art may be used to solve the optimal object trajectory combination of each view; as a preferred implementation, the second preset function in this embodiment uses a simple transfer-mapping energy function to solve the optimal object trajectory combination of each view, where "simple" is relative to the complex transfer-mapping energy function of Embodiment 1.
In that function, E(MAP)_c denotes the simple transfer-mapping energy function for solving the optimal object trajectory combination of each view, b_m and b_b are two moving object trajectories within a secondary view, and γ is a preset weight coefficient whose specific value may be set according to the needs of the actual scene.
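The source does not reproduce the formula for E(MAP)_c itself (it appears only as an image in the original filing). Based on the variables described above, a purely illustrative reconstruction is a γ-weighted pairwise collision cost over the trajectories of a secondary view; this is an assumption, not the patent's formula.

```python
# Hedged sketch: a plausible simple transfer-mapping energy, assuming it reduces
# to a gamma-weighted sum of pairwise collision costs over trajectories b_m, b_b.
from itertools import combinations

def e_map_simple(trajectories, collision_cost, gamma=1.0):
    return gamma * sum(collision_cost(bm, bb)
                       for bm, bb in combinations(trajectories, 2))

# Toy collision cost: 1.0 if two trajectories share a label, else 0.0.
print(e_map_simple(["a", "a", "b"], lambda x, y: float(x == y)))  # -> 1.0
```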
FIG. 3 is the second application diagram of the video summary generation method according to an embodiment of the present invention. This application is mainly intended for simple motion scenes in which the moving targets are relatively small and relatively few. As shown in FIG. 3, the application is implemented through the following steps:
Step 301: initialize the number of views.
That is, divide the original video into multiple views; the specific number may be chosen according to actual needs, for example 3 or 5 views.
Step 302: compute the view direction.
Specifically, the view direction is computed from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
Step 303: compute the view to which each object trajectory belongs.
Specifically, the closeness between each object trajectory and each view may be computed with a distance formula, and each object trajectory contained in the original video is assigned to the view it is closest to.
Step 304: update the view line model.
Specifically, each time an object trajectory is added to a view, the line-segment feature of the view may also be updated according to the start and end point coordinates of that trajectory, ready for the next object trajectory to be added.
Step 305: compute the view activity indicators.
Specifically, the activity indicator of each view is computed according to the activity level of the object trajectories in the view.
Step 306: compare the view activity indicators with the preset threshold.
A view whose activity indicator is above or below the preset threshold is judged to be an important view or a secondary view, respectively; when a view is judged to be secondary, step 307 is executed.
Step 307: process the object trajectories with the second preset function.
Specifically, owing to the particularity of the scene in this application, all the computed views are secondary views, so the second preset function is used to solve the optimal solution of the object trajectory combination of each view separately, the optimal object trajectory combination corresponding to that optimal solution is determined, and the video summary is generated.
Embodiment 3
As shown in FIG. 1 and FIG. 4, which are schematic diagrams of an embodiment of the present invention, this embodiment includes steps 101, 102, 103 and 104 of Embodiment 1; the difference lies in how step 104 is implemented. The parts identical to Embodiment 1 are not repeated here, and only the differences are described below:
Specifically, step 104 in this embodiment of the present invention includes:
if the multiple views include both important views and secondary views: if two important views are adjacent, merging the two important views into one important view and solving the optimal solution of the object trajectory combination for the merged important view using the first preset function; if the important views are not adjacent to each other, solving the optimal solution of the object trajectory combination of each important view separately using the first preset function and then determining the optimal object trajectory combination corresponding to that optimal solution; and solving the optimal solution of the object trajectory combination of each secondary view separately using the second preset function and then determining the optimal object trajectory combination corresponding to that optimal solution;
generating the video summary according to the optimal object trajectory combinations of all views. The merging rule for adjacent important views is sketched below.
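The following sketch illustrates the merging rule just described, modeling view adjacency by index order; this spatial model and the function name are assumptions.

```python
# Hedged sketch of the mixed case: merge runs of adjacent important views
# before solving each group with the first preset function.

def merge_adjacent_important(view_labels):
    """view_labels: list of 'important'/'secondary' in spatial order.
    Returns groups of view indices; adjacent important views are merged."""
    groups, run = [], []
    for i, label in enumerate(view_labels):
        if label == "important":
            run.append(i)
        else:
            if run:
                groups.append(("important", run))
                run = []
            groups.append(("secondary", [i]))
    if run:
        groups.append(("important", run))
    return groups

print(merge_adjacent_important(
    ["important", "important", "secondary", "important"]))
# -> [('important', [0, 1]), ('secondary', [2]), ('important', [3])]
```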
The first preset function may be used to solve the optimal solution of the object trajectory combination of each important view separately, and the optimal object trajectory combination corresponding to that optimal solution is then determined; a prior-art function may be used to solve the optimal object trajectory combination within an important view. As a preferred implementation, the first preset function in this embodiment uses the complex transfer-mapping energy function to solve the optimal object trajectory combination of each view, which may be solved with the following formula:
E(MAP) = E_a(BO) + α·E_tps(BO) + β·E_ntps(BO) + γ·E_tc(BO) + λ·E_tct(BO)
where E(MAP) is the complex transfer-mapping energy function; BO is the set of object trajectories in the important view; E_a(BO) is the activity energy cost, a penalty incurred if a target does not appear in the summary video; E_tps(BO) is the forward-order cost, a penalty incurred if a target is not added to the summary video in its original order; E_ntps(BO) is the reverse-order cost, a penalty incurred when two objects that should be temporally related are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, a penalty for two objects that do not collide in the original video colliding in the summary result; and E_tct(BO) is the true-collision cost, a penalty term for two objects that collide in the original video failing to collide in the summary result, which takes a negative value. α, β, γ and λ are preset weight coefficients whose specific values may be set according to the needs of the actual scene.
The second preset function may be used to solve the optimal solution of the object trajectory combination of each secondary view separately, and the optimal object trajectory combination corresponding to that optimal solution is then determined; a prior-art function may be used to solve the optimal object trajectory combination within a secondary view. As a preferred implementation, the second preset function in this embodiment uses the simple transfer-mapping energy function to solve the optimal object trajectory combination of each view, where "simple" is relative to the complex transfer-mapping energy function of Embodiment 1. In that function, E(MAP)_c denotes the simple transfer-mapping energy function for solving the optimal object trajectory combination of each view, b_m and b_b are two moving object trajectories within a secondary view, and γ is a preset weight coefficient whose specific value may be set according to the needs of the actual scene.
FIG. 4 is the third application diagram of the video summary generation method according to an embodiment of the present invention. This application is mainly intended for structurally complex motion scenes in which the targets move irregularly, for example target motion is simple and targets are few in some regions while target motion is relatively complex in others. As shown in FIG. 4, the application is implemented through the following steps:
Step 401: initialize the number of views.
That is, divide the original video into multiple views; the specific number may be chosen according to actual needs, for example 3 or 5 views.
Step 402: compute the view direction.
Specifically, the view direction is computed from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
Step 403: compute the view to which each object trajectory belongs.
Specifically, the closeness between each object trajectory and each view may be computed with a distance formula, and each object trajectory contained in the original video is assigned to the view it is closest to.
Step 404: update the view line model.
Specifically, each time an object trajectory is added to a view, the line-segment feature of the view may also be updated according to the start and end point coordinates of that trajectory, ready for the next object trajectory to be added.
Step 405: compute the view activity indicators.
Specifically, the activity indicator of each view is computed according to the activity level of the object trajectories in the view.
Step 406: compare the view activity indicators with the preset threshold.
A view whose activity indicator is above or below the preset threshold is judged to be an important view or a secondary view, respectively; when a view is judged to be important, step 407 is executed, and when it is judged to be secondary, step 410 is executed.
Step 407: determine whether two important views are adjacent to each other.
If two important views are adjacent to each other, continue with step 408; otherwise go directly to step 409.
Step 408: merge, that is, merge the two adjacent important views.
Step 409: process the object trajectories in the important views with the first preset function.
Step 410: process the object trajectories in the secondary views with the second preset function.
Finally, the video summary is generated according to the optimal object trajectory combinations of all views.
Embodiment 4
As shown in FIG. 1 and FIG. 5, which are schematic diagrams of an embodiment of the present invention, this embodiment includes steps 101, 102, 103 and 104 of Embodiment 1; the difference lies in how step 104 is implemented. The parts identical to Embodiment 1 are not repeated here, and only the differences are described below:
Specifically, step 104 in this embodiment of the present invention includes:
if the multiple views include both important views and secondary views: if two important views are adjacent, merging the two important views into one important view and solving the optimal solution of the object trajectory combination for the merged important view using the first preset function; if the important views are not adjacent to each other, solving the optimal solution of the object trajectory combination of each important view separately using the first preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution; and copying the object trajectories in the secondary views into the background image as they appear in the original video;
generating the video summary according to the optimal object trajectory combinations of all views.
The first preset function may be used to solve the optimal solution of the object trajectory combination of each important view separately, and the optimal object trajectory combination corresponding to that optimal solution is then determined; a prior-art function may be used to solve the optimal object trajectory combination within an important view. As a preferred implementation, the first preset function in this embodiment uses the complex transfer-mapping energy function to solve the optimal object trajectory combination of each view, which may be solved with the following formula:
E(MAP) = E_a(BO) + α·E_tps(BO) + β·E_ntps(BO) + γ·E_tc(BO) + λ·E_tct(BO)
where E(MAP) is the complex transfer-mapping energy function; BO is the set of object trajectories in the important view; E_a(BO) is the activity energy cost, a penalty incurred if a target does not appear in the summary video; E_tps(BO) is the forward-order cost, a penalty incurred if a target is not added to the summary video in its original order; E_ntps(BO) is the reverse-order cost, a penalty incurred when two objects that should be temporally related are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, a penalty for two objects that do not collide in the original video colliding in the summary result; and E_tct(BO) is the true-collision cost, a penalty term for two objects that collide in the original video failing to collide in the summary result, which takes a negative value. α, β, γ and λ are preset weight coefficients whose specific values may be set according to the needs of the actual scene.
The object trajectories in the secondary views are copied into the background image as they appear in the original video, and the video summary is finally generated; a sketch of this copy step follows.
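The sketch below illustrates this secondary-view handling, assuming detections are available as per-frame object patches and using NumPy arrays as a stand-in for video frames; no trajectory recombination is solved, the objects are pasted at their original positions and timestamps.

```python
# Hedged sketch of embodiment 4's secondary-view step: paste object patches
# onto the background exactly as they appear in the original video.
import numpy as np

def copy_to_background(background, detections):
    """detections: list of (frame_idx, x, y, patch) in original-video order."""
    frames = {}
    for frame_idx, x, y, patch in detections:
        canvas = frames.setdefault(frame_idx, background.copy())
        h, w = patch.shape[:2]
        canvas[y:y + h, x:x + w] = patch  # paste the object as-is
    return frames

bg = np.zeros((120, 160, 3), dtype=np.uint8)
obj = np.full((10, 10, 3), 255, dtype=np.uint8)
out = copy_to_background(bg, [(0, 5, 5, obj), (1, 20, 5, obj)])
print(sorted(out.keys()))  # -> [0, 1]: frames 0 and 1, each with the object pasted
```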
FIG. 5 is the fourth application diagram of the video summary generation method according to an embodiment of the present invention. This application is mainly intended for structurally complex motion scenes in which the targets move irregularly, for example target motion is simple and targets are few in some regions while target motion is relatively complex in others. As shown in FIG. 5, the application is implemented through the following steps:
Step 501: initialize the number of views.
That is, divide the original video into multiple views; the specific number may be chosen according to actual needs, for example 3 or 5 views.
Step 502: compute the view direction.
Specifically, the view direction is computed from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding view direction is horizontal or vertical.
Step 503: compute the view to which each object trajectory belongs.
Specifically, the closeness between each object trajectory and each view may be computed with a distance formula, and each object trajectory contained in the original video is assigned to the view it is closest to.
Step 504: update the view line model.
Specifically, each time an object trajectory is added to a view, the line-segment feature of the view may also be updated according to the start and end point coordinates of that trajectory, ready for the next object trajectory to be added.
Step 505: compute the view activity indicators.
Specifically, the activity indicator of each view is computed according to the activity level of the object trajectories in the view.
Step 506: compare the view activity indicators with the preset threshold.
A view whose activity indicator is above or below the preset threshold is judged to be an important view or a secondary view, respectively; when a view is judged to be important, step 507 is executed, and when it is judged to be secondary, step 510 is executed.
Step 507: determine whether two important views are adjacent to each other.
If two important views are adjacent to each other, continue with step 508; otherwise go directly to step 509.
Step 508: merge, that is, merge the two adjacent important views.
Step 509: process the object trajectories in the important views with the first preset function.
Step 510: copy the object trajectories into the background image as they appear in the original video.
Finally, the video summary is generated according to the optimal object trajectory combinations of all views.
Embodiment 5
As shown in FIG. 6, an embodiment of the present invention further provides a video summary generation apparatus. The apparatus 60 includes:
a first dividing module 61, configured to divide the original video into multiple views;
a classification module 62, configured to assign each object trajectory contained in the original video to the view it is closest to, according to how close the object trajectory is to each view;
a second dividing module 63, configured to compute an activity indicator for each view according to the activity level of the object trajectories in the view, and to classify each view as an important view or a secondary view according to whether the activity indicator exceeds a preset threshold;
a merge processing module 64, configured to process the object trajectories in each important view and each secondary view in parallel, and to merge the views obtained after the parallel processing to generate a video summary.
The first dividing module 61 includes: a first computing unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
The first computing unit includes: a first acquiring unit, configured to acquire the initial points and end points of multiple object trajectories in the scene of the original video; a difference computing unit, configured to compute the coordinate differences between the initial point and end point of each object trajectory to determine the trajectory's direction; and a judging unit, configured to judge the direction of the scene in the original video according to the direction of the majority of the multiple object trajectories, the scene direction being consistent with the direction of that majority.
The classification module 62 includes: a second acquiring unit, configured to acquire the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; a distance computing unit, configured to acquire the start and end point coordinates of an object trajectory and to compute how close the trajectory is to each view; a first classification unit, configured to assign each object trajectory contained in the original video to the view it is closest to according to the computed closeness;
and an updating unit, configured to update the line-segment feature of the closest view according to the start and end point coordinates of the object trajectory.
The second dividing module 63 includes: an activity indicator computing unit, where the activity level of an object trajectory is positively correlated with the object area corresponding to the trajectory and with the trajectory's duration, and the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; and a second dividing unit, configured to classify each view as an important view or a secondary view according to whether the activity indicator exceeds the preset threshold.
Optionally, the merge processing module 64 includes: a first merging unit, configured to, when the multiple views are all important views, solve the optimal solution of the object trajectory combination of each view separately using the first preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and a first processing unit, configured to generate the video summary according to the optimal object trajectory combinations of all views.
Optionally, the merge processing module 64 includes: a second merging unit, configured to, when the multiple views are all secondary views, solve the optimal solution of the object trajectory combination of each view separately using the second preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and a second processing unit, configured to generate the video summary according to the optimal object trajectory combinations of all views.
Optionally, the merge processing module 64 includes: a third merging unit, configured to, when the multiple views include both important views and secondary views, merge two important views into one important view if they are adjacent and solve the optimal solution of the object trajectory combination for the merged important view using the first preset function; if the important views are not adjacent to each other, solve the optimal solution of the object trajectory combination of each important view separately using the first preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and solve the optimal solution of the object trajectory combination of each secondary view separately using the second preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and a third processing unit, configured to generate the video summary according to the optimal object trajectory combinations of all views.
Optionally, the merge processing module 64 includes: a fourth merging unit, configured to, when the multiple views include both important views and secondary views, merge two important views into one important view if they are adjacent and solve the optimal solution of the object trajectory combination for the merged important view using the first preset function; if the important views are not adjacent to each other, solve the optimal solution of the object trajectory combination of each important view separately using the first preset function and then determine the optimal object trajectory combination corresponding to that optimal solution, the object trajectories in the secondary views being copied into the background image as they appear in the original video; and a fourth processing unit, configured to merge the views according to the processing results to generate the video summary.
In the video summary generation method of the embodiments of the present invention, parallel processing of the object trajectories in the important views and the secondary views reduces the computation required for trajectory combination, speeds up the computation, and lets the user focus more simply and clearly on the main targets in the important views.
The above are preferred implementations of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles described herein, and such improvements and refinements shall also fall within the protection scope of the present invention.
Based on the above technical solutions provided by the embodiments of the present invention, parallel processing of the object trajectories in the important views and the secondary views reduces the computation required for trajectory combination, speeds up the computation, and lets the user focus more simply and clearly on the main targets in the important views.
Claims (18)
- A video summary generation method, comprising: dividing an original video into multiple views; according to how close each object trajectory is to each view, assigning each object trajectory contained in the original video to the view it is closest to; computing an activity indicator for each view according to the activity level of the object trajectories in the view, and classifying each view as an important view or a secondary view according to whether the activity indicator exceeds a preset threshold; and processing the object trajectories in each important view and each secondary view in parallel, and merging the views obtained after the parallel processing to generate a video summary.
- The method according to claim 1, wherein dividing the original video into multiple views comprises: determining the direction of the scene in the original video; and dividing the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
- The method according to claim 2, wherein determining the direction of the scene in the original video comprises: acquiring the initial points and end points of multiple object trajectories in the scene of the original video; computing the coordinate differences between the initial point and end point of each object trajectory to determine the direction of the object trajectory; and judging the direction of the scene in the original video according to the direction of the majority of the multiple object trajectories, the direction of the scene being consistent with the direction of the majority of the multiple object trajectories.
- The method according to claim 1, wherein assigning each object trajectory contained in the original video to the view it is closest to, according to how close the object trajectory is to each view, comprises: acquiring the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; acquiring the start and end point coordinates of an object trajectory and computing how close the object trajectory is to each view; assigning each object trajectory contained in the original video to the view it is closest to according to the computed closeness; and updating the line-segment feature of the closest view according to the start and end point coordinates of the object trajectory.
- The method according to claim 1, wherein computing an activity indicator for each view according to the activity level of the object trajectories in the view, and classifying each view as an important view or a secondary view according to whether the activity indicator exceeds a preset threshold, comprises: the activity level being positively correlated with the object area corresponding to the object trajectory and with the duration of the object trajectory, and the activity indicator of a view being obtained by summing the activity levels of all object trajectories in the view; and classifying each view as an important view or a secondary view according to whether the activity indicator exceeds the preset threshold.
- The method according to claim 1, wherein processing the object trajectories in each important view and each secondary view in parallel and merging the views obtained after the parallel processing to generate a video summary comprises: if the multiple views are all important views, solving the optimal solution of the object trajectory combination of each view separately using a first preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution; and generating the video summary according to the optimal object trajectory combinations of all views.
- The method according to claim 1, wherein processing the object trajectories in each important view and each secondary view in parallel and merging the views obtained after the parallel processing to generate a video summary comprises: if the multiple views are all secondary views, solving the optimal solution of the object trajectory combination of each view separately using a second preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution; and generating the video summary according to the optimal object trajectory combinations of all views.
- The method according to claim 1, wherein processing the object trajectories in each important view and each secondary view in parallel and merging the views obtained after the parallel processing to generate a video summary comprises: if the multiple views include both important views and secondary views: if two important views are adjacent, merging the two important views into one important view and solving the optimal solution of the object trajectory combination for the merged important view using a first preset function; if the important views are not adjacent to each other, solving the optimal solution of the object trajectory combination of each important view separately using the first preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution; solving the optimal solution of the object trajectory combination of each secondary view separately using a second preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution; and generating the video summary according to the optimal object trajectory combinations of all views.
- The method according to claim 1, wherein processing the object trajectories in each important view and each secondary view in parallel and merging the views obtained after the parallel processing to generate a video summary comprises: if the multiple views include both important views and secondary views: if two important views are adjacent, merging the two important views into one important view and solving the optimal solution of the object trajectory combination for the merged important view using a first preset function; if the important views are not adjacent to each other, solving the optimal solution of the object trajectory combination of each important view separately using the first preset function, and then determining the optimal object trajectory combination corresponding to that optimal solution; copying the object trajectories in the secondary views into the background image as they appear in the original video; and merging the views according to the processing results to generate the video summary.
- A video summary generation apparatus, comprising: a first dividing module, configured to divide an original video into multiple views; a classification module, configured to assign each object trajectory contained in the original video to the view it is closest to, according to how close the object trajectory is to each view; a second dividing module, configured to compute an activity indicator for each view according to the activity level of the object trajectories in the view, and to classify each view as an important view or a secondary view according to whether the activity indicator exceeds a preset threshold; and a merge processing module, configured to process the object trajectories in each important view and each secondary view in parallel, and to merge the views obtained after the parallel processing to generate a video summary.
- The apparatus according to claim 10, wherein the first dividing module comprises: a first computing unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into multiple views according to the direction of the scene, the directions of the multiple views being consistent with the direction of the scene.
- The apparatus according to claim 11, wherein the first computing unit comprises: a first acquiring unit, configured to acquire the initial points and end points of multiple object trajectories in the scene of the original video; a difference computing unit, configured to compute the coordinate differences between the initial point and end point of each object trajectory to determine the direction of the object trajectory; and a judging unit, configured to judge the direction of the scene in the original video according to the direction of the majority of the multiple object trajectories, the direction of the scene being consistent with the direction of the majority of the multiple object trajectories.
- The apparatus according to claim 10, wherein the classification module comprises: a second acquiring unit, configured to acquire the line-segment feature of each view, the line-segment feature including the start and end point coordinates of the view and the number of object trajectories contained in the view; a distance computing unit, configured to acquire the start and end point coordinates of an object trajectory and to compute how close the object trajectory is to each view; a first classification unit, configured to assign each object trajectory contained in the original video to the view it is closest to according to the computed closeness; and an updating unit, configured to update the line-segment feature of the closest view according to the start and end point coordinates of the object trajectory.
- The apparatus according to claim 10, wherein the second dividing module comprises: an activity indicator computing unit, configured to compute the activity indicator of each view, wherein the activity level is positively correlated with the object area corresponding to the object trajectory and with the duration of the object trajectory, and the activity indicator of a view is obtained by summing the activity levels of all object trajectories in the view; and a second dividing unit, configured to classify each view as an important view or a secondary view according to whether the activity indicator exceeds the preset threshold.
- The apparatus according to claim 10, wherein the merge processing module comprises: a first merging unit, configured to, when the multiple views are all important views, solve the optimal solution of the object trajectory combination of each view separately using a first preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and a first processing unit, configured to generate the video summary according to the optimal object trajectory combinations of all views.
- The apparatus according to claim 10, wherein the merge processing module comprises: a second merging unit, configured to, when the multiple views are all secondary views, solve the optimal solution of the object trajectory combination of each view separately using a second preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and a second processing unit, configured to generate the video summary according to the optimal object trajectory combinations of all views.
- The apparatus according to claim 10, wherein the merge processing module comprises: a third merging unit, configured to, when the multiple views include both important views and secondary views, merge two important views into one important view if they are adjacent and solve the optimal solution of the object trajectory combination for the merged important view using a first preset function; if the important views are not adjacent to each other, solve the optimal solution of the object trajectory combination of each important view separately using the first preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and solve the optimal solution of the object trajectory combination of each secondary view separately using a second preset function and then determine the optimal object trajectory combination corresponding to that optimal solution; and a third processing unit, configured to generate the video summary according to the optimal object trajectory combinations of all views.
- The apparatus according to claim 10, wherein the merge processing module comprises: a fourth merging unit, configured to, when the multiple views include both important views and secondary views, merge two important views into one important view if they are adjacent and solve the optimal solution of the object trajectory combination for the merged important view using a first preset function; if the important views are not adjacent to each other, solve the optimal solution of the object trajectory combination of each important view separately using the first preset function and then determine the optimal object trajectory combination corresponding to that optimal solution, the object trajectories in the secondary views being copied into the background image as they appear in the original video; and a fourth processing unit, configured to merge the views according to the processing results to generate the video summary.
Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410570690.4 | 2014-10-23 | | |
| CN201410570690.4A (granted as CN105530554B) | 2014-10-23 | 2014-10-23 | Video summary generation method and apparatus |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2015184768A1 | 2015-12-10 |
Family ID: 54766027
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2014/094701 (WO2015184768A1) | Video summary generation method and apparatus | 2014-10-23 | 2014-12-23 |
Country Status (2)

| Country | Link |
|---|---|
| CN (1) | CN105530554B |
| WO (1) | WO2015184768A1 |
Families Citing this family (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110505534B | 2019-08-26 | 2022-03-08 | 腾讯科技(深圳)有限公司 | Surveillance video processing method, apparatus and storage medium |
| CN111526434B | 2020-04-24 | 2021-05-18 | 西北工业大学 | Transformer-based video summarization method |
| CN112884808B | 2021-01-26 | 2022-04-22 | 石家庄铁道大学 | Video synopsis tube-set partition method preserving real target interaction behavior |
Family Cites Families (6)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5600040B2 | 2010-07-07 | 2014-10-01 | 日本電信電話株式会社 | Video summarization apparatus, video summarization method and video summarization program |
| CN102375816B | 2010-08-10 | 2016-04-20 | 中国科学院自动化研究所 | Online video condensation apparatus, system and method |
| CN102256065B | 2011-07-25 | 2012-12-12 | 中国科学院自动化研究所 | Automatic video condensation method based on a video surveillance network |
| CN103092925B | 2012-12-30 | 2016-02-17 | 信帧电子技术(北京)有限公司 | Video summary generation method and apparatus |
| CN103092963A | 2013-01-21 | 2013-05-08 | 信帧电子技术(北京)有限公司 | Video summary generation method and apparatus |
| CN103686453A | 2013-12-23 | 2014-03-26 | 苏州千视通信科技有限公司 | Method for improving video summary accuracy by dividing regions and setting different granularities |
- 2014-10-23: CN application CN201410570690.4A filed; granted as CN105530554B (status: Active)
- 2014-12-23: PCT application PCT/CN2014/094701 filed (WO2015184768A1, Application Filing)
Patent Citations (5)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2007120716A2 | 2006-04-12 | 2007-10-25 | Google, Inc. | Method and apparatus for automatically summarizing video |
| US20090007202A1 | 2007-06-29 | 2009-01-01 | Microsoft Corporation | Forming a Representation of a Video Item and Use Thereof |
| CN102906816A | 2010-05-25 | 2013-01-30 | Eastman Kodak Company | Video summary method |
| CN103200463A | 2013-03-27 | 2013-07-10 | 天脉聚源(北京)传媒科技有限公司 | Video summary generation method and apparatus |
| CN103345764A | 2013-07-12 | 2013-10-09 | 西安电子科技大学 | Object-content-based two-layer surveillance video summary generation method |
Cited By (7)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106227759A | 2016-07-14 | 2016-12-14 | 中用科技有限公司 | Method and apparatus for dynamically generating a video summary |
| CN106227759B | 2016-07-14 | 2019-09-13 | 中用科技有限公司 | Method and apparatus for dynamically generating a video summary |
| CN108959312A | 2017-05-23 | 2018-12-07 | 华为技术有限公司 | Multi-document summary generation method, apparatus and terminal |
| CN108959312B | 2017-05-23 | 2021-01-29 | 华为技术有限公司 | Multi-document summary generation method, apparatus and terminal |
| US10929452B2 | | 2021-02-23 | Huawei Technologies Co., Ltd. | Multi-document summary generation method and apparatus, and terminal |
| CN107995535A | 2017-11-28 | 2018-05-04 | 百度在线网络技术(北京)有限公司 | Method, apparatus, device and computer storage medium for displaying a video |
| CN107995535B | 2017-11-28 | 2019-11-26 | 百度在线网络技术(北京)有限公司 | Method, apparatus, device and computer storage medium for displaying a video |
Also Published As

| Publication Number | Publication Date |
|---|---|
| CN105530554A | 2016-04-27 |
| CN105530554B | 2020-08-07 |
Legal Events

| Code | Title | Description |
|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14894107; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 14894107; Country of ref document: EP; Kind code of ref document: A1 |