CN102495907B - Video summary with depth information - Google Patents


Info

Publication number
CN102495907B
CN102495907B (application CN201110437761A; publication CN102495907A)
Authority
CN
China
Prior art keywords
moving object
scene
cut-out
animation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110437761
Other languages
Chinese (zh)
Other versions
CN102495907A (en)
Inventor
胡大鹏
李志前
周晓
麦振文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hong Kong Applied Science and Technology Research Institute ASTRI
Original Assignee
Hong Kong Applied Science and Technology Research Institute ASTRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hong Kong Applied Science and Technology Research Institute ASTRI filed Critical Hong Kong Applied Science and Technology Research Institute ASTRI
Priority to CN 201110437761 priority Critical patent/CN102495907B/en
Publication of CN102495907A publication Critical patent/CN102495907A/en
Application granted granted Critical
Publication of CN102495907B publication Critical patent/CN102495907B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention relates to a video summary with depth information. A computer-implemented method for creating a summary video with depth information comprises the following steps: identifying moving objects in an input source video; generating an animated moving-object cut-out for each identified moving object by copying and stacking the consecutive frames of the input source video that contain the moving object's image; constructing a scene background using the structure of the scene in the input source video and estimating any missing parts; building a three-dimensional scene using the depth information of foreground objects and of the scene background in the input source video, and overlaying the animated moving-object cut-outs onto the three-dimensional scene at their respective horizontal, vertical, and depth positions to present a dynamic 3D scene; and synthesizing the summary video from the dynamic 3D scene.

Description

Video summary with depth information
Technical field
The present invention relates generally to video analysis, indexing, and retrieval in video surveillance. In particular, the present invention relates to a method and system for analyzing and summarizing video to aid the search for, and identification of, expected content.
Background technology
Trying to locate particular content or events in a video clip is a dull and time-consuming process. The viewer must carefully review the entire video clip, whose individual frames may or may not contain the scene of interest. The problem is worse still when reviewing scenes captured continuously over medium or long terms in video surveillance. Moreover, commercial and public-safety surveillance typically involves networks of hundreds of surveillance video cameras capturing multiple unending video data streams. Billions of surveillance video cameras are installed around the world; in Shenzhen, a city in southern China, alone, more than one million cameras are estimated to be in place.
There is therefore a need for a method of summarizing or condensing a video clip so that only the portions likely to contain the expected content are shown. Some traditional video summarization techniques condense object activity in time and present the result as a conventional two-dimensional motion picture. However, such condensed two-dimensional motion pictures can crowd moving objects together and can be difficult for human visual comprehension to digest. Other traditional video summarization techniques simply delete static frames from the source video clip, which does not achieve the best summarization.
Summary of the invention
An object of the present invention is to provide a method of producing a summary video by condensing object activity in time and presenting the result in a three-dimensional scene. The additional depth dimension exploits cues that humans naturally and effectively perceive: parallax and disparity aid the visual comprehension of moving objects' locations over time. Because the video summarization method generates a summary video carrying three-dimensional information, novel views of the captured scene can subsequently be created by a virtual camera.
A further object of the present invention is to provide a summary video comprising two viewing areas: an appeared-objects list and a scene review section. The appeared-objects list excludes background information, so that the user can focus on the objects alone, and displays the animated moving-object cut-outs. The scene review section shows the three-dimensional scene view with the object cut-outs.
Description of drawings
Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings, in which:
Fig. 1 shows an embodiment of the summary video, comprising an appeared-objects list and a scene review section;
Fig. 2 shows an exemplary computer-system application user interface that overlays the summary video with relevance classification and ordering features; and
Fig. 3 shows the original view of a three-dimensional scene and a new view captured from a different virtual-camera viewpoint.
Embodiment
In the following description, methods and systems for video summarization with a depth dimension are set forth as preferred embodiments. It will be apparent to those of ordinary skill in the art that modifications, including additions and/or substitutions, may be made without departing from the scope and spirit of the invention. Specific details may be omitted so as not to obscure the invention; however, the disclosure is written to enable one of ordinary skill in the art to practice the teachings herein without undue experimentation.
The invention provides a computer-implementable method for summarizing video that first identifies the moving objects in an input source video and then synthesizes a summary video using the three-dimensional information of the scene. Object identification can be based on selection criteria such as the object's shape and texture, color, approximate appearance, and pattern of spatial movement.
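For illustration only, and not as part of the claimed method, the moving-object identification described above can be sketched as simple frame differencing against a background model; the function names, threshold, and minimum-pixel count here are assumptions:

```python
def diff_mask(frame, background, threshold=30):
    """Per-pixel foreground mask: True where a grayscale pixel differs from
    the background model by more than the threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def has_moving_object(frame, background, threshold=30, min_pixels=4):
    """Flag a frame as containing a moving object when enough pixels
    deviate from the background."""
    mask = diff_mask(frame, background, threshold)
    return sum(v for row in mask for v in row) >= min_pixels
```

A real system would typically use a statistical background model and the shape, color, and motion criteria named above; this sketch only flags frames whose pixels differ sufficiently from a fixed background.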
Referring to Fig. 1, each moving-object cut-out 122 is animated by copying and stacking the consecutive input source-video frames that contain the moving object's image and discarding the frame-image pixels surrounding the moving object. Each moving-object cut-out is thus a group of consecutive video frames with a fixed time order. In this way, each motion sequence of an animated moving-object cut-out is retained together with its spatial-movement horizontal, vertical, and depth position data within the scene. The frame groups of the animated moving-object cut-outs are then saved in a persistent-memory data store. The background 121 of a scene is constructed using the structure of each scene in the input source video. Missing parts can be estimated automatically. The background can also be supplied by the user.
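The cut-out extraction just described, masking out the pixels around the object while preserving the frame order and the object's 3D position, might be sketched as follows (an illustration under assumed data structures, not the claimed implementation):

```python
def extract_cutout(frames, masks, position, start_time):
    """Build an animated moving-object cut-out: keep only the masked
    (moving-object) pixels of each consecutive frame, discarding the
    surrounding pixels, and record the object's 3D position and start time."""
    clips = []
    for frame, mask in zip(frames, masks):
        clips.append([[p if m else None for p, m in zip(frow, mrow)]
                      for frow, mrow in zip(frame, mask)])
    # The dict stands in for a record saved to the persistent data store.
    return {"frames": clips, "position": position, "start": start_time}
```

Here `None` marks discarded pixels; a production system would more likely store an alpha channel alongside the color data.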
Referring again to Fig. 1, the computer-implementable video summarization method of the present invention synthesizes the output summary video from a dynamic 3D scene that shows both the static background 121 and the temporally shuffled animated moving-object cut-outs 122. The depth information of the background and of foreground objects is used to create the dynamic 3D scene. The frame groups of the animated moving-object cut-outs retrieved from the persistent-memory data store are then overlaid onto the three-dimensional scene at each moving object's horizontal, vertical, and depth position in the scene to present the dynamic 3D scene. Generation of the dynamic 3D scene can be described generally by the following steps:
1. The depth information of the background scene is known.
2. The 3D position of each moving object in every frame is known.
3. Map the automatically estimated or user-supplied background image texture onto the depths of the 3D scene.
4. The user can select a 3D representation for each type of object. For example, a generic human 3D model can represent people, and a generic vehicle 3D model can represent vehicles.
5. Each object cut-out is assigned a designated 3D representation.
6. For each moving object, the frames of its object cut-out are treated as textures to be mapped onto the selected 3D object (the object of step 4) to impose its individual appearance.
7. Place the texture-mapped 3D representations into the 3D scene.
8. As time passes, update the position of each texture-mapped 3D representation (according to the 3D position of the moving object at each instant).
9. Simultaneously, update the appearance (i.e., the texture) of each 3D representation with the next frame of that object's cut-out.
10. Repeat steps (8) and (9) until all moving objects have been shown and then disappear.
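The playback loop of steps 7 through 10 above can be sketched as follows, for illustration only; the `place` and `draw_frame` callbacks stand in for a real 3D renderer and are assumptions:

```python
def play_synopsis(objects, place, draw_frame):
    """At each time step, place every still-live cut-out (its current texture
    frame at its current 3D position), draw the composite scene frame, and
    advance until every cut-out has fully played out (steps 7-10)."""
    t = 0
    while True:
        live = [o for o in objects if t < len(o["frames"])]
        if not live:
            return t  # number of synopsis frames produced
        for obj in live:
            place(obj["path"][t], obj["frames"][t])  # steps 7-9
        draw_frame(t)
        t += 1
```

Each object record holds a `frames` list (one texture per time step) and a `path` list (one 3D position per time step), matching the per-frame positions assumed known in step 2.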
According to one embodiment of the present invention, the time order of the animated moving-object cut-outs can be altered. For example, two objects that appeared in the scene at two different times can be made to appear together at the same time in the dynamic 3D scene by overlaying the frame groups of their animated cut-outs onto the three-dimensional scene at each object's own position in the scene. The length of the summary video can thus be shortened substantially by showing many moving objects together simultaneously in the dynamic 3D scene, even though those objects appeared individually at different periods in the input source video. The user can configure how many moving-object cut-outs may appear simultaneously and which moving-object cut-outs will appear in the dynamic 3D scene.
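As an illustrative sketch (not the claimed method), reassigning start times so that objects recorded at different times play back concurrently can be done greedily; the fixed number of concurrent display slots is an assumption standing in for the user-configured limit:

```python
def compress_schedule(events, max_concurrent=2):
    """Greedily pack cut-outs into display slots: each cut-out starts as soon
    as the earliest-free slot allows, so objects recorded at different times
    overlap in the synopsis. events: (name, duration) pairs in original
    appearance order. Returns the start-time map and the synopsis length."""
    slots = [0] * max_concurrent  # next free time in each display slot
    schedule = {}
    for name, duration in events:
        slot = min(range(max_concurrent), key=slots.__getitem__)
        schedule[name] = slots[slot]
        slots[slot] += duration
    return schedule, max(slots)
```

With two slots, four cut-outs of durations 5, 3, 4, and 2 compress from 14 sequential time units to 7.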
According to another embodiment, one frame can be selected from the frame group of an animated moving-object cut-out and, after the frame group has been fully played back in the dynamic 3D scene, overlaid onto the three-dimensional scene at its own position in the scene. The frame can be selected from the object cut-out's frame group based on user-input selection criteria, or on a particular time order or position within the frame group. It then serves as a position marker for the object in the scene while the frame groups of other animated moving-object cut-outs are still being played back.
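A minimal sketch of picking the position-marker frame; the `"middle"` default is an assumption, since the embodiment allows user criteria or any particular time order or position within the frame group:

```python
def marker_frame(cutout_frames, when="middle"):
    """Select one frame of an animated cut-out to leave in the scene as a
    position marker after its playback finishes."""
    if when == "first":
        return cutout_frames[0]
    if when == "last":
        return cutout_frames[-1]
    return cutout_frames[len(cutout_frames) // 2]
```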
According to yet another embodiment, the dynamic 3D scene can be viewed from different angles using a virtual camera. In this way, snapshot images and videos of novel viewing angles of the three-dimensional scene can be generated. For example, Fig. 3 shows on the left the original view 301 of a three-dimensional scene, and on the right a new view 302 captured from a viewpoint tilted slightly to the right, or counterclockwise, from the original viewpoint.
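For illustration, a new viewpoint such as view 302 can be produced by rotating scene points about the vertical axis before a pinhole projection; the focal length and principal point used here are arbitrary assumptions:

```python
import math

def project(point, yaw_deg, fx=100.0, cx=50.0, cy=50.0):
    """Project a 3D scene point (x, y, z) into a virtual camera panned by
    yaw_deg about the vertical axis (simple pinhole model)."""
    x, y, z = point
    a = math.radians(yaw_deg)
    # Rotate the scene about the Y axis to simulate panning the camera.
    xr = x * math.cos(a) + z * math.sin(a)
    zr = -x * math.sin(a) + z * math.cos(a)
    # Perspective divide, then shift to image coordinates.
    return (fx * xr / zr + cx, fx * y / zr + cy)
```

Because the summary video carries depth, every cut-out pixel has a 3D position, so re-rendering the scene through this projection with a nonzero yaw yields the novel view.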
According to another embodiment, as shown in Fig. 1, the output summary video consists of the following two viewing areas: the appeared-objects list 101 and the scene review section 102. As shown in Fig. 1, the appeared-objects list 101 shows a snapshot or animation 111 of each moving object whose corresponding animated moving-object cut-out 122 currently appears in the dynamic 3D scene. If a moving object is shown in the appeared-objects list 101 as a snapshot, one frame of the frame group of its animated cut-out is used as the snapshot. If a moving object is shown in the appeared-objects list 101 as an animation, the frames of the frame group of its animated cut-out are played back without regard to spatial movement. The scene review section shows the virtual-camera view of the dynamic 3D scene with the animated moving-object cut-outs. The appeared-objects list 101 can be placed vertically or horizontally at any position in the display. The appeared-objects list 101 can also overlap the scene review section 102; in that case, the appeared-objects list 101 is rendered semi-transparently so as not to occlude the scene review section 102.
According to one embodiment, the top-to-bottom order in which the moving objects 111 appear in the appeared-objects list 101 is the same as the time order in which the corresponding animated moving-object cut-outs 122 appear in the dynamic 3D scene of the scene review section 102, with the object at the top of the appeared-objects list 101 corresponding to the most recently appearing animated moving-object cut-out.
Referring to Fig. 2, according to various embodiments of the present invention, the summary video using the dynamic 3D scene can be incorporated into a computer-system application user interface. One embodiment of the user interface comprises a user-input criteria selection window 201 for object relevance criteria such as shape, color, object type, the spatial motion or motion direction of a moving object, and, where the moving object is a vehicle, its license plate number. Each animated moving-object cut-out is assigned a relevance number based on how closely it matches the selected relevance criteria. The moving objects in the appeared-objects list 202 are then labeled with their respective relevance numbers. The snapshots or animations of the animated moving-object cut-outs in the appeared-objects list are sorted by their respective relevance rankings. In one embodiment, the relevance of the animated moving-object cut-outs can also be used to determine the time order in which they appear in the dynamic 3D scene.
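An illustrative sketch of the relevance numbering and ranking just described; the attribute names and equal default weights are assumptions, not part of the claimed interface:

```python
def relevance(obj, criteria, weights=None):
    """Relevance number: weighted count of user-selected criteria
    (e.g. shape, color, object type) that the object's attributes match."""
    weights = weights or {}
    return sum(weights.get(k, 1.0)
               for k, wanted in criteria.items()
               if obj.get(k) == wanted)

def rank(objects, criteria):
    """Sort objects for the appeared-objects list, most relevant first."""
    return sorted(objects, key=lambda o: relevance(o, criteria), reverse=True)
```

The same scores could also drive the time order in which cut-outs enter the dynamic 3D scene, as the embodiment above allows.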
Referring again to Fig. 2, in one embodiment the computer-system application user interface comprises a virtual-camera controller 203 for adjusting the viewing angle of the dynamic 3D scene.
The embodiments disclosed herein may be implemented using general-purpose or specialized computing devices, computer processors, or electronic circuitry including, but not limited to, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and other programmable logic devices (PLDs) configured or programmed according to the teachings of the present disclosure. Computer instructions or software code running in general-purpose or specialized computing devices, computer processors, or programmable logic devices can readily be prepared by practitioners skilled in the software or electronic arts based on the teachings of the present disclosure.
In some embodiments, the present invention includes computer storage media having computer instructions or software code stored therein which can be used to program computers or microprocessors to perform any of the processes of the present invention. The storage media can include, but are not limited to, floppy disks, optical discs, Blu-ray Discs, DVDs, CD-ROMs, magneto-optical disks, ROM, RAM, flash memory devices, or any type of media or device suitable for storing instructions, code, and/or data.
The foregoing description of the present invention has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations will be apparent to practitioners of ordinary skill in the art.
The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others of ordinary skill in the art to understand the invention through its various embodiments and with the various modifications suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.

Claims (10)

1. A computer-implemented method for creating a summary video with depth information, comprising:
receiving an input source video by a computer processor;
identifying moving objects from the input source video by the computer processor, wherein said moving-object identification is based on selection criteria comprising object shape and texture, color, approximate appearance, and pattern of spatial movement;
generating, by the computer processor, an animated moving-object cut-out for each identified moving object by copying and stacking the consecutive frames of said input source video that contain the image of each said moving object and discarding the frame-image pixels surrounding each said moving object;
constructing, by the computer processor, a scene background using the scene in the input source video and estimating any missing parts;
generating a dynamic 3D scene by the computer processor through the following steps:
the depth information of the background scene being known; the 3D position of each moving object in every frame being known; mapping the automatically estimated or user-supplied background image texture onto the depths of the 3D scene; allowing the user to select a 3D representation for each type of object; assigning each object cut-out a designated 3D representation; for each moving object, treating the frames of its object cut-out as textures to be mapped onto the selected 3D object; placing the texture-mapped 3D representations into the 3D scene; and, as time passes, continuously updating the positions of the texture-mapped 3D representations while simultaneously updating the appearance of each 3D representation with the next frame of the respective object's cut-out, repeating these position and appearance updates until all moving objects have been shown and then disappear; and
synthesizing the summary video from the dynamic 3D scene by the computer processor.
2. The method according to claim 1, wherein the dynamic 3D scene comprises a virtual camera for viewing the three-dimensional scene from various angles.
3. The method according to claim 1, wherein said scene background is constructed automatically by the computer processor or supplied by the user.
4. The method according to claim 1, wherein the time order in which said animated moving-object cut-outs appear in the dynamic 3D scene is configurable.
5. The method according to claim 4, wherein two or more moving objects that appeared at different times are made to appear simultaneously together in the dynamic 3D scene by overlaying each said animated moving-object cut-out onto the three-dimensional scene at its respective position in the three-dimensional scene.
6. The method according to claim 1, further comprising:
synthesizing, by the computer processor, the summary video from the dynamic 3D scene, said summary video comprising two viewing areas: an appeared-objects list and a scene review section;
wherein said appeared-objects list shows snapshots or animations of the animated moving-object cut-outs currently appearing in the dynamic 3D scene; and
wherein said scene review section shows the virtual-camera view of the dynamic 3D scene with said animated moving-object cut-outs.
7. The method according to claim 6, wherein the order in which the snapshots or animations of the animated moving-object cut-outs appear in the appeared-objects list is the same as the time order in which the corresponding animated moving-object cut-outs appear in the dynamic 3D scene in the scene review section.
8. The method according to claim 6, wherein each snapshot or animation of an animated moving-object cut-out in the appeared-objects list is labeled with a relevance number, wherein the relevance number indicates the degree to which said animated moving-object cut-out matches a set of user-selectable relevance criteria, said user-selectable relevance criteria comprising shape, color, object type, the spatial motion or motion direction of the moving object, and, where the moving object is a vehicle, its license plate number.
9. The method according to claim 8, wherein said animated moving-object cut-outs are ordered according to their respective relevance; and wherein the snapshots or animations of the animated moving-object cut-outs in the appeared-objects list are sorted by the respective relevance rankings of the animated moving-object cut-outs.
10. The method according to claim 8, wherein said animated moving-object cut-outs are ordered according to their respective relevance; and wherein the time order in which said animated moving-object cut-outs appear in the dynamic 3D scene is determined by the relevance rankings of the animated moving-object cut-outs.
CN 201110437761 2011-12-23 2011-12-23 Video summary with depth information Active CN102495907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110437761 CN102495907B (en) 2011-12-23 2011-12-23 Video summary with depth information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110437761 CN102495907B (en) 2011-12-23 2011-12-23 Video summary with depth information

Publications (2)

Publication Number Publication Date
CN102495907A CN102495907A (en) 2012-06-13
CN102495907B true CN102495907B (en) 2013-07-03

Family

ID=46187732

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110437761 Active CN102495907B (en) 2011-12-23 2011-12-23 Video summary with depth information

Country Status (1)

Country Link
CN (1) CN102495907B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107493441B (en) * 2016-06-12 2020-03-06 杭州海康威视数字技术股份有限公司 Abstract video generation method and device
CN107547804A (en) * 2017-09-21 2018-01-05 北京奇虎科技有限公司 Realize the video data handling procedure and device, computing device of scene rendering

Citations (6)

Publication number Priority date Publication date Assignee Title
CN1822645A (en) * 2005-02-15 2006-08-23 乐金电子(中国)研究开发中心有限公司 Mobile communication terminal capable of briefly offering activity video and its abstract offering method
CN101064825A (en) * 2006-04-24 2007-10-31 中国科学院自动化研究所 Mobile equipment based sport video personalized customization method and apparatus thereof
CN101262568A (en) * 2008-04-21 2008-09-10 中国科学院计算技术研究所 A method and system for generating video outline
CN101366027A (en) * 2005-11-15 2009-02-11 耶路撒冷希伯来大学伊森姆研究发展公司 Method and system for producing a video synopsis
CN101640809A (en) * 2009-08-17 2010-02-03 浙江大学 Depth extraction method of merging motion information and geometric information
CN102289490A (en) * 2011-08-11 2011-12-21 杭州华三通信技术有限公司 Video summary generating method and equipment

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
WO2010087751A1 (en) * 2009-01-27 2010-08-05 Telefonaktiebolaget Lm Ericsson (Publ) Depth and video co-processing

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN1822645A (en) * 2005-02-15 2006-08-23 乐金电子(中国)研究开发中心有限公司 Mobile communication terminal capable of briefly offering activity video and its abstract offering method
CN101366027A (en) * 2005-11-15 2009-02-11 耶路撒冷希伯来大学伊森姆研究发展公司 Method and system for producing a video synopsis
CN101064825A (en) * 2006-04-24 2007-10-31 中国科学院自动化研究所 Mobile equipment based sport video personalized customization method and apparatus thereof
CN101262568A (en) * 2008-04-21 2008-09-10 中国科学院计算技术研究所 A method and system for generating video outline
CN101640809A (en) * 2009-08-17 2010-02-03 浙江大学 Depth extraction method of merging motion information and geometric information
CN102289490A (en) * 2011-08-11 2011-12-21 杭州华三通信技术有限公司 Video summary generating method and equipment

Also Published As

Publication number Publication date
CN102495907A (en) 2012-06-13

Similar Documents

Publication Publication Date Title
US8719687B2 (en) Method for summarizing video and displaying the summary in three-dimensional scenes
CN106355153B (en) A kind of virtual objects display methods, device and system based on augmented reality
CN101657839B (en) System and method for region classification of 2D images for 2D-to-3D conversion
CN101479765B (en) Methods and systems for converting 2d motion pictures for stereoscopic 3d exhibition
JP7128708B2 (en) Systems and methods using augmented reality for efficient collection of training data for machine learning
CN101542536A (en) System and method for compositing 3D images
CN104349155B (en) Method and equipment for displaying simulated three-dimensional image
US20140181630A1 (en) Method and apparatus for adding annotations to an image
CA2668941A1 (en) System and method for model fitting and registration of objects for 2d-to-3d conversion
CN106648098B (en) AR projection method and system for user-defined scene
JP2006507609A (en) Image capture and display system and method for generating composite image
KR20140082610A (en) Method and apaaratus for augmented exhibition contents in portable terminal
US20130188862A1 (en) Method and arrangement for censoring content in images
CN102170578A (en) Method and apparatus for processing stereoscopic video images
CN102474636A (en) Adjusting perspective and disparity in stereoscopic image pairs
US9589385B1 (en) Method of annotation across different locations
CN104486584A (en) City video map method based on augmented reality
CN111862866B (en) Image display method, device, equipment and computer readable storage medium
EP2936442A1 (en) Method and apparatus for adding annotations to a plenoptic light field
CN105611267A (en) Depth and chroma information based coalescence of real world and virtual world images
CN107636728A (en) For the method and apparatus for the depth map for determining image
JP2023172882A (en) Three-dimensional representation method and representation apparatus
WO2007048197A1 (en) Systems for providing a 3d image
CN102495907B (en) Video summary with depth information
CN113382224B (en) Interactive handle display method and device based on holographic sand table

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant