CN103997687A - Techniques for adding interactive features to videos - Google Patents

Publication number
CN103997687A
Authority
CN
China
Prior art keywords
frame
objects
video
media
framing
Legal status
Granted
Application number
CN201410055610.1A
Other languages
Chinese (zh)
Other versions
CN103997687B (en)
Inventor
D. C. Middleton
O. Nestares
L. B. Ainsworth
Current Assignee
Intel Corp
Original Assignee
Intel Corp
Priority claimed from US 14/106,136 (US9330718B2)
Application filed by Intel Corp
Publication of CN103997687A
Application granted
Publication of CN103997687B
Legal status: Active


Abstract

Techniques are disclosed for adding interactive features to videos to enable users to create new media using a dynamic blend of motion and still imagery. The interactive techniques can include allowing a user to change the starting time of one or more subjects in a given video frame, or to animate/play only a portion of a given frame scene. The techniques may include segmenting each frame of a video to identify one or more subjects within each frame, selecting (or receiving selections of) one or more subjects within the given frame scene, tracking the selected subject(s) from frame to frame, and alpha-matting to play/animate only the selected subject(s). In some instances, segmentation, selection, and/or tracking may be improved and/or enhanced using pixel depth information (e.g., using a depth map).

Description

Techniques for adding interactive features to videos
Background
Still images and videos can both depict events over time, and each has its benefits and limitations. Both also impose fixed constraints on how viewers interact with the media. In general, they are engaging for the creator but offer little participation to the audience. For example, once a video has been created, a user typically can only navigate passively through its frames as originally intended by the creator (e.g., play, rewind, fast-forward, pause, and stop); the user has no opportunity to interact with the video. Similar limitations apply to still images. In this sense, videos and still images do not invite user input.
Brief description of the drawings
Figs. 1a-c illustrate three methods exemplifying techniques for adding interactive features to videos, in accordance with one or more embodiments of the present disclosure.
Figs. 2a-g' show example images illustrating the techniques of Figs. 1a-c, in accordance with some embodiments.
Figs. 3a-b show screenshots of an example user interface for interacting with media that includes interactive features as described herein, in accordance with one or more embodiments.
Fig. 4 shows an example system that can implement the techniques described herein for adding interactive features to videos, in accordance with one or more embodiments.
Fig. 5 shows an embodiment of a small form factor device in which the system of Fig. 4 may be embodied.
Detailed Description
Techniques are disclosed for adding interactive features to videos, enabling users to create new media using a dynamic blend of motion and still imagery. The interactive techniques can include allowing a user to change the starting time of one or more objects in a given video frame, or to animate/play only a portion of a given frame scene. The techniques may include segmenting each frame of a video to identify one or more objects within each frame, selecting (or receiving a selection of) one or more objects within the given frame scene, tracking the selected object(s) from frame to frame, and alpha-matting to play/animate only the selected object(s). In some instances, segmentation, selection, and/or tracking may be improved and/or enhanced using pixel depth information (e.g., using a depth map). Numerous variations will be apparent in light of this disclosure.
Overview
As previously explained, still images and videos have fixed limitations, and both are typically engaging for the creator but lack participation for the audience/viewer. At present, watching a video generally includes only the ability to play, rewind, fast-forward, pause, and stop all of the visual content at once. There is currently no simple and intuitive technique for interacting with a video so as to play only part of a video scene at a time, or to change the time/position of part of the video, such that new visual media can be created in which part of the scene is out of sequence with the remainder of the scene.
Thus, and in accordance with one or more embodiments of the present disclosure, techniques are disclosed for adding interactive features to videos. A video, as referred to herein, includes a sequence of at least two still images/frames, such as a movie or a group of photos captured using a burst mode. The entirety of a single frame is referred to herein as a "scene," and an object or region of interest within a frame scene (e.g., a person, an animal, various items, the background or portions of the background, etc.) is referred to herein as an "object." The interactive features produced by the techniques described herein make it possible to create new media from a video, including: 1) a new still image having one or more objects taken from a moment in the video different from the remainder of the scene (or from a different frame); 2) a new video artifact having one or more objects that do not start in sequence; and 3) a new visual media artifact in which one or more objects are played while the remainder of the frame scene remains motionless (similar to a cinemagraph). Thus, in one or more embodiments, the interactive features include generating a dynamic blend of motion and still imagery within the played scene. The new media can be saved and/or shared in a dynamic form (e.g., where further interaction is possible) or a static form (where further interaction is not possible), as discussed in more detail below.
In some embodiments, the techniques described herein for adding interactive features include at least the following: segmentation, selection, tracking, and alpha-matting. As will be appreciated in light of this disclosure, the order of these functions can vary. Segmentation can include dividing each frame of the video into its semantic components, e.g., using an unsupervised graph cut method or another suitable method, to identify one or more objects in each frame scene based on corresponding pixel groups. In some instances, segmentation can be completed fully automatically; in other instances, it can be performed semi-automatically or manually. Selection can include clicking (e.g., in the case of mouse input) or touching/tapping (e.g., in the case of touch-sensitive input) one or more objects in a presented frame of the video. In some embodiments, pixel depth information for each frame of the video (e.g., a depth map) can be used to improve segmentation, selection, and/or tracking. In some such embodiments, the depth information may be produced by a stereo or array camera, as discussed in more detail below. Note that, in some embodiments, selection can occur before segmentation, which can help improve and/or refine the segmentation process.
Tracking can include following the selected object(s) from frame to frame of the video to identify the corresponding pixel group containing the selected object(s) in each frame. Alpha-matting can be performed in several ways. One such example method includes forming a transparent matte that matches the shape of the one or more selected objects in a given frame scene, to allow the video to be played through the one or more holes created by the transparent matte, where the shape of the one or more holes in the given scene is updated for each frame of the video to match the shape of the one or more selected objects in the frame being played. Another example method includes forming a transparent matte around the one or more selected objects in each frame, to allow the video to be played by copying the one or more selected objects from the frame being played onto the given frame scene. Other suitable alpha-matting methods will be apparent in light of this disclosure.
As previously mentioned, the interactive features added to a video using the techniques described herein can be used to create new visual media artifacts in which one or more objects are played while the remainder of the frame scene remains motionless. In achieving animation in only a portion of a given frame scene while keeping the remainder of the given frame scene constant and still, this example new media type is similar to a cinemagraph. However, the interactive features added to a video using the techniques described herein provide multiple benefits over conventional cinemagraph creation methods. First, the interactive features described herein allow dynamic changes to the scene, whereas a cinemagraph is a non-interactive, constant video loop. Second, the interactive features described herein can be added by fully or semi-automatic techniques, whereas cinemagraph creation is primarily a manual process. Third, cinemagraphs use coarse boundaries, resulting in unexpected visual artifacts, which the segmentation, tracking, and alpha-matting techniques described herein can avoid or eliminate. Other benefits over conventional cinemagraph creation methods will be apparent in light of this disclosure.
In accordance with some embodiments, use of the disclosed techniques may be detected, for example, by visual inspection/evaluation of media that includes the interactive features described herein (e.g., the ability to play only a portion of a video). Use of the techniques disclosed herein can also be detected based on the resulting visual media produced. For example, the techniques variously described herein for adding interactive features to videos can produce an image in which only part of the scene is animated, or a video whose objects do not start in sequence. Numerous variations and configurations will be apparent in light of this disclosure.
Example Methods and Applications
Figs. 1a-c illustrate methods 100a-c, respectively, exemplifying techniques for adding interactive features to videos in accordance with one or more embodiments of the present disclosure. Figs. 2a-g' show example images illustrating the techniques of Figs. 1a-c, in accordance with some embodiments. As previously mentioned, the techniques are discussed herein primarily in the context of adding interactive features to a video having multiple frames; however, the techniques need not be so limited. For example, the techniques shown in methods 100a-c can be used to add interactive features to a group of still images, or to any other visual media comprising a sequence of at least two still images/frames, as will be apparent in light of this disclosure. Methods 100a-c each include segmentation 102, selection 104, tracking 106, and alpha-matting 108, each of which is discussed in more detail below.
Fig. 2a shows an image/frame 210 of a person 214 standing in front of a waterfall 216, in accordance with an example embodiment. As can be seen, the person 214 is making a waving motion in this example frame. Frame 210 also shows a foreground 212 and a sky 218. Fig. 2b shows the example image after segmentation 102 has been performed to identify the person 214. Segmentation 102 can include dividing a frame of the video into its semantic components to identify one or more objects in each frame. Segmentation 102 can be performed using any known segmentation method, such as graph cut methods, clustering methods, thresholding, compression-based methods, histogram-based methods, edge detection, region-growing methods, split-and-merge methods, partial differential equation (PDE)-based methods, watershed methods, or any other suitable method apparent in light of this disclosure. In one example embodiment, segmentation 102 is performed using an unsupervised graph cut method. Depending on the configuration or method used, segmentation 102 may be fully automatic, semi-automatic, or manual.
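By way of illustration only, the following minimal Python sketch shows graph-cut-based segmentation of a frame using OpenCV's GrabCut routine (an interactive graph cut variant chosen here for illustration; the disclosure does not prescribe a particular library or implementation). The file name and rectangle coordinates are hypothetical.

import cv2
import numpy as np

def segment_frame(frame, rect):
    """Separate the object inside rect (x, y, w, h) from the rest of the
    scene using GrabCut, a graph-cut-based segmentation method."""
    mask = np.zeros(frame.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)  # internal background model state
    fgd_model = np.zeros((1, 65), np.float64)  # internal foreground model state
    cv2.grabCut(frame, mask, rect, bgd_model, fgd_model, 5,
                cv2.GC_INIT_WITH_RECT)
    # Definite + probable foreground pixels form the object's pixel group.
    fg = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    return fg.astype(np.uint8)

frame = cv2.imread("frame_210.png")               # hypothetical frame image
person_mask = segment_frame(frame, (180, 60, 140, 320))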
In some embodiments, objects can be segmented based on their corresponding pixel groups. For example, Fig. 2b shows the pixels (of frame 210) representing the person 214 carrying a backpack and waving, outlined by shape 215, and the pixels representing the sky 218, outlined by shape 219. Note that in this example only the person 214 and the sky 218 have been identified as objects by segmentation 102, e.g., using an unsupervised (automatic) graph cut method. Other objects in frame 210 could include the waterfall region 216, the foreground 212, or any other suitable object or region of interest. As previously mentioned, depending on the segmentation 102 process used, the one or more objects may be identified automatically, semi-automatically, or manually.
In some embodiments, depth information for the frames of the video can be used to improve or enhance segmentation 102. Depth data can be provided or generated, for example, using a depth map for a frame. In some instances, each pixel may include RGB-D data, where RGB relates to the color of each pixel (the red, green, blue color model) and D relates to the depth information of each pixel. In some embodiments, depth information can be collected by the specific device used to capture video for the techniques described herein. Such devices may include various stereo cameras, array cameras, light-field cameras, or other depth sensors or depth-sensing technologies. In a particular example, an infrared projector and a monochrome complementary metal-oxide-semiconductor (CMOS) sensor can be used to capture three-dimensional video data, even under low-light conditions. In some embodiments, depth information can be estimated for pre-existing video. For example, in some instances, motion information in pre-existing video can be used to estimate depth information. In some cases, spatial and temporal information from consecutive frames of monoscopic video can be used to estimate depth information. Depending on the configuration and methods used, depth maps can be generated or estimated by automatic, semi-automatic, or manual techniques.
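As one way such a depth refinement could work (a sketch under the assumption that a depth map registered to the frame is available; the disclosure does not fix a particular fusion rule), pixels whose depth differs markedly from the object's depth can be dropped from a color-based mask:

import numpy as np

def refine_mask_with_depth(mask, depth, tolerance=0.15):
    """Keep only mask pixels whose depth lies near the object's median
    depth, discarding background pixels that a color-only segmentation
    may have swept in. `depth` is an H x W array aligned with the frame."""
    object_depth = np.median(depth[mask == 1])
    close_enough = np.abs(depth - object_depth) < tolerance * object_depth
    return ((mask == 1) & close_enough).astype(np.uint8)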
Fig. 2c shows an example selection 104 of an object in a frame of the video, in accordance with an embodiment. More specifically, a hand 250 is shown selecting the person 214 in frame 210. Selection 104 can include selecting one or more objects in a given frame. Accordingly, methods 100a-c can be configured, for example, to receive selection 104 from a user. Selection 104 can be performed using various input devices, such as clicking the desired object using a mouse or trackpad, touching the desired object using a touch-sensitive device (e.g., using an appropriately placed tap on a device having a touch screen), or by any other suitable method, such as a gesture made by a person, or a person's sound or spoken words. Fig. 2d shows an example of frame 210 after the person 214 has been selected. As can be seen, the person 214 is highlighted as a result of the selection of Fig. 2c. Note that in this embodiment, the person's shape 215 has already been identified by the segmentation 102 process. However, in other embodiments, segmentation 102 may not be performed until after selection 104 has occurred, as discussed in more detail herein. In some embodiments, pixel depth information can be used to select objects automatically (e.g., automatically selecting the foreground or background of a given frame scene) or to enhance selection (e.g., enhancing a user selection of a pixel group sharing the same or a similar depth).
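A minimal sketch of mapping a click or tap to a segmented object follows, assuming segmentation produced an integer label image (label 0 for background); the label-image representation is an assumption made for illustration:

import numpy as np

def select_object(labels, click_xy):
    """Return a binary mask for the segment under a click/tap.
    `labels` is an H x W integer image in which each segmented
    object carries a distinct label and 0 marks the background."""
    x, y = click_xy
    picked = labels[y, x]              # label under the pointer
    if picked == 0:
        return None                    # background clicked: no object selected
    return (labels == picked).astype(np.uint8)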
Figs. 2e-f show an example of tracking 106 between a first frame 210 and a second frame 220, in accordance with an embodiment. Tracking 106 can include following the selected object(s) from frame to frame of the video. In this example embodiment, first frame 210 and second frame 220 are sequential frames from the video. Fig. 2e shows first frame 210, including the person 214 and the segmented outline 215 as previously described. Fig. 2f shows second frame 220, with numbering corresponding to that of first frame 210 (e.g., sky 218 for the first frame and sky 228 for the second frame, etc.). Second frame 220 in this example embodiment includes the same person 224 as first frame 210, except that the position of his hand has moved, since, as can be seen, the person 214, 224 is waving. Segmentation outline 225 shows the resulting new pixel group, representing the person 224 whose left hand has changed position mid-wave. Tracking 106 can include following the selected object(s) frame by frame to identify the correct pixel group in each frame. For example, after segmentation 102 has been performed, the identified person 214, 224 can be tracked from first frame 210 to the second frame as pixel groups 215 and 225. In some embodiments, pixel depth information can be used to enhance tracking (e.g., using the depth information to increase the effectiveness of identifying the object from frame to frame).
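A minimal sketch of one way to carry the selected pixel group from frame to frame, here using dense optical flow (one of several trackers that could serve; the disclosure does not mandate a specific tracking algorithm):

import cv2
import numpy as np

def propagate_mask(prev_gray, next_gray, mask):
    """Warp a binary object mask from one frame to the next along a
    dense optical flow field (Farneback's method)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = mask.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Approximate where each next-frame pixel came from by inverting
    # the forward flow, then sample the previous mask at that location.
    map_x = (grid_x - flow[..., 0]).astype(np.float32)
    map_y = (grid_y - flow[..., 1]).astype(np.float32)
    return cv2.remap(mask, map_x, map_y, cv2.INTER_NEAREST)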
As can be seen in Figs. 1a-c, segmentation 102 may be performed before or after selection 104. In example method 100a, segmentation 102 is performed, followed by selection 104 of one or more objects, followed by tracking 106 of the selected object(s). In this embodiment, performing segmentation 102 before selection 104 can reduce the delay between selection 104 and media playback. In another example method 100b, selection 104 of one or more objects is performed, followed by segmentation 102, followed by tracking 106. In this embodiment, the selection 104 information (e.g., selection coordinates) can be added as a refinement to the segmentation 102 process (e.g., adding the selection coordinates to an unsupervised graph cut algorithm). In another example method 100c, segmentation 102 is performed, followed by tracking 106, followed by selection 104 of one or more objects. In this embodiment, performing segmentation 102 and tracking 106 before selection 104 can reduce the delay between selection 104 and media playback. In example methods 100a-c, alpha-matting 108 is performed after segmentation 102, selection 104, and tracking 106 are complete; however, in light of this disclosure, this need not be the case. In other example embodiments, a method may include multiple segmentation 102 and selection 104 passes before tracking 106 and alpha-matting 108 are performed. For example, in such an embodiment, the method may include an automatic segmentation pass, a user selection, and then re-segmentation based on the selection input. This example sequence can be repeated until the user achieves the desired degree of fidelity.
Figs. 2g-g' show examples of alpha-matting 108 of frame 210, in accordance with some embodiments. Alpha-matting 108 can include isolating the selected object(s) frame by frame so that only the selected object(s) are animated. For example, alpha-matting 108 can include: 1) forming a transparent matte matching the shape of the selected object(s) in a given frame scene, to allow the video to be played through the hole created by the transparent matte, where the shape of the hole in the given scene is updated for each frame of the video to match the shape of the selected object(s) in the frame being played; or 2) forming a transparent matte around the selected object(s) in each frame, to allow the video to be played by copying the selected object(s) from the frame being played onto the given frame scene. In other words, in the first example alpha-matting 108 process, one or more holes representing the shape of the selected object(s) are cut in the initial/given frame of the video, and the initial frame (with its hole(s)) is stacked on top of each subsequent frame of the video so that the video plays through the hole(s), where the hole(s) in the initial frame are updated on a frame-by-frame basis to match the shape of the selected object(s) in the frame currently being played. In the other example alpha-matting 108 process, the initial/given frame is again the starting point, except that in this example process the selected object(s) are isolated from each subsequent frame (e.g., by cutting out the remaining scene of each subsequent frame, removing it, or making it transparent), and then, on a frame-by-frame basis, the selected object(s) from the frame currently being played are copied onto the initial frame to play the video.
Fig. 2g shows an example of the image produced by applying the first alpha-matting 108 method to original frame 210 (e.g., the frame where selection 104 occurred). As can be seen, a hole 217 matching the shape of the person 214 (the only pre-selected object) has been cut out of original frame 210. The video can then be played through hole 217: for each subsequent frame, the original frame 210 is reset and a new hole is created in it matching the selected object of the current frame (in this case, person 214). The original image with the newly cut hole matching the current frame can then be overlaid on the current frame to play that frame. This hole-cutting-and-overlay process can continue for each successive frame to play the video (since tracking 106 provides this information).
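A minimal sketch of this hole-and-overlay composite, assuming per-frame binary masks for the selected object are already available from segmentation 102 and tracking 106:

import numpy as np

def composite_hole_method(initial_frame, current_frame, current_mask):
    """Overlay the initial frame on the frame being played, with a hole
    cut in the shape of the selected object so the playing frame shows
    through; everything outside the hole stays frozen."""
    hole = current_mask[..., None].astype(bool)   # H x W x 1, True inside the object
    return np.where(hole, current_frame, initial_frame)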
Fig. 2g' alternatively shows an example image produced by applying the other alpha-matting 108 method to original frame 210. In this alternative alpha-matting 108 method, the scene around the person 224 has been cut away from subsequent frame 220 (the outline of frame 220 is shown at 230 for illustrative purposes). The image remaining after the scene 230 has been cut away from the shape of person 224 can then be copied onto original frame 210 to play the video. In this manner, only the selected object (in this case, person 224) is copied onto the original frame, so that only the selected object is animated when the video is played. This cutting away of the scene around the object and copying onto the original frame can continue for each successive frame to play the video (since tracking 106 provides this information). Note that although only one object (person 214, 224) is used in these example alpha-matting 108 processes, multiple objects may be used. For example, if the sky 218 were also selected as an additional object to animate, the sky 218 would be cut out in Fig. 2g, and the sky 218 would also be shown in Fig. 2g'. Also note that, in some embodiments, cutting out the selected object(s) (or the scene around the selected object(s)) can comprise setting the selected object(s) (or the scene around the selected object(s)) to be transparent.
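The alternative copy-over matte is the mirror image of the hole method; a sketch under the same per-frame-mask assumption, written for any number of selected objects by combining their masks:

import numpy as np

def composite_copy_method(initial_frame, current_frame, object_masks):
    """Copy only the selected objects from the frame being played onto
    the still initial frame; the rest of the scene stays frozen."""
    combined = np.zeros(initial_frame.shape[:2], dtype=bool)
    for mask in object_masks:          # one binary mask per selected object
        combined |= mask.astype(bool)
    return np.where(combined[..., None], current_frame, initial_frame)

Note that, per frame, both mattes produce the same composite image; they differ only in which layer carries the transparency.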
Example Media Creation
In accordance with one or more embodiments, adding interactive features to a video (using the techniques described herein) can be used to create multiple types of media. The media can include: 1) a new still image having one or more objects taken from a moment in the video different from the remainder of the scene (or from a different frame); 2) a new video artifact having one or more objects that do not start in sequence; and 3) a new visual media artifact in which one or more objects are played while the remainder of the frame scene remains motionless (similar to a cinemagraph). These three examples are provided for illustrative purposes, are not intended to limit the present disclosure, and are described in greater detail below.
A first example of new media obtainable using the interactive features added to a video by the techniques described herein includes creating a new still image having one or more objects taken from a moment in the video different from the remainder of the scene (or from a different frame). This can be achieved by selecting one or more objects in a given frame and animating or playing those objects while the remaining scene of the given frame stays unchanged. In some embodiments, the interactive features can then allow the one or more objects in the given frame to be animated/played and stopped at a different frame. In some such embodiments, the interactive features can then allow the user to animate/play and subsequently stop a different object or objects, such that at least two objects can be at different frame positions relative to the remaining given frame scene. Therefore, in such an embodiment, three different video times/frame positions can be represented in a single still image (e.g., two objects stopped at different frames, plus the unchanged remainder of the scene).
A second example of new media obtainable using the interactive features added to a video by the techniques described herein includes creating a new video artifact having one or more objects that do not start in sequence. This can be achieved by selecting one or more objects in a given frame, animating or playing them, and then causing the remainder of the scene to be played. In some embodiments, the interactive features can then allow the one or more objects in the given frame to be animated or played and stopped at a different frame. In some such embodiments, the interactive features can then allow the user to animate/play and subsequently stop a different object or objects, such that at least two objects can be at different frame positions relative to the remaining given frame scene. Therefore, in such an embodiment, the user can then play the entire media, in which two or more objects are out of sequence with each other and with the remainder of the frame.
A third example of new media obtainable using the interactive features added to a video by the techniques described herein includes a new visual media artifact in which one or more objects are played while the remainder of the frame scene remains motionless. This can be achieved by selecting one or more objects in a given frame and animating or playing them while the remainder of the given frame scene stays unchanged. In some embodiments, the interactive features can then allow the one or more objects in the given frame to be animated/played and stopped in sequence. In some such embodiments, the interactive features likewise allow the user to animate/play and stop a different object or objects in sequence. Therefore, in such an embodiment, the user can then play media in which two or more objects are out of sequence with each other and with the remainder of the frame, while the remainder of the original frame stays unchanged and still.
In achieving animation in only a portion of the given frame scene while keeping the remainder of the given frame scene constant and still, this third example of new media is similar to a cinemagraph. However, the interactive features added to a video using the techniques described herein provide multiple benefits over conventional cinemagraph creation methods. First, the interactive features described herein allow dynamic changes to the scene, whereas a cinemagraph is a non-interactive, constant video loop. Second, the interactive features described herein can be added by fully or semi-automatic techniques, whereas cinemagraph creation is primarily a manual process. Third, cinemagraphs use coarse boundaries, resulting in unexpected visual artifacts, which the segmentation, tracking, and alpha-matting techniques described herein can avoid or eliminate. Other benefits over conventional cinemagraph creation methods will be apparent in light of this disclosure.
Figs. 3a-b show screenshots illustrating an example user interface for interacting with media that includes interactive features as described herein, in accordance with one or more embodiments. As can be seen in Fig. 3a, a first screenshot 310 of a video is presented to the user, similar to frame 210 described herein. For example, the three aforementioned objects are again shown: person 314, waterfall 316, and sky 318. Person 314 is shown with a dash-dot outline, waterfall 316 is shown with a dashed outline, and sky 318 is shown with a dashed outline. In this example embodiment, the three objects 314, 316, and 318 have been segmented, selected, tracked, and alpha-matted, allowing the user to select one or more of them to play/animate the selected object(s) relative to the remainder of the frame shown in first screenshot 310. Instructions 311 are included in this example UI to notify the user to "Select the object(s) you want to play/animate," "Press and hold to select multiple objects," and "Select an animated object again to stop it." This example UI and the corresponding instructions are provided for illustrative purposes and are not intended to limit the present disclosure.
Fig. 3b shows the new media created after person 314 and sky 318 have been selected, with these two objects playing/animating relative to the scene around them. As can be seen in second screenshot 320, animating person 314 results in the new position of his waving hand, and animating the sky results in the appearance of cloud 328. For ease of discussion, the entire scene shown in second screenshot 320 has been stopped, which can be done by individually selecting objects 314 and 318 while they are animating, or by some other suitable command (e.g., using a stop-all button or pressing the spacebar). However, the user can animate and stop objects one at a time, such that objects can move and/or stop out of sequence with other objects and with the remainder of the scene. A continue button 321 is provided to allow the user to continue playing/animating previously selected objects. In some instances, the interactive features can be configured to allow the user to continue, play, and/or reset one or more previously selected objects that have been stopped. Various features can be included to notify the user as to which objects are available for selection, which objects are currently selected and playing/animating, which objects are currently selected and stopped/not animating, which objects are out of sequence (e.g., using a frame indicator to show the frame from which the current object is playing), or other information that can assist the user in using the interactive features variously described herein.
In some embodiments, new media created using the interactive features variously described herein can be saved and/or shared (exported, emailed, uploaded, etc.) in dynamic or static formats. Dynamic sharing can include sharing the particular media type, whether a still image, a created video artifact, or a cinemagraph-like artifact, in a manner that allows the recipient or later viewer of the media to further interact with it (e.g., by changing the initial sequence of one or more objects). Static sharing can include sharing the media as created. For example, a still image representing different naturally occurring moments in a video can be shared as a Joint Photographic Experts Group (JPEG) file or a Portable Network Graphics (PNG) file, to name two common formats. In the example case of a created video in which part of the video is out of sequence, the new media can be shared as a Moving Picture Experts Group (MPEG) file or an Audio Video Interleave (AVI) file, to name two common formats. In the example case of a new visual media artifact in which only a portion of the frame is animated/played, the new media can be shared as a Graphics Interchange Format (GIF) file, to name one common format. In the example shown in Fig. 3b, the new media can be saved as a dynamic or static file, or it can be shared (exported, emailed, uploaded, etc.) as a dynamic or static file by selecting the corresponding buttons 323, 325.
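A minimal sketch of exporting such a cinemagraph-like result as a static GIF file using the imageio library (one possible export path, not prescribed by the disclosure; argument names can vary across imageio versions):

import imageio.v2 as imageio
import numpy as np

def export_cinemagraph_gif(initial_frame, frames, masks, path="out.gif"):
    """Composite each playing frame onto the still initial frame and
    write the sequence out as a looping GIF."""
    composites = [
        np.where(mask[..., None].astype(bool), frame, initial_frame)
        for frame, mask in zip(frames, masks)
    ]
    # duration is seconds per frame; loop=0 repeats indefinitely.
    imageio.mimsave(path, composites, duration=0.08, loop=0)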
Example System
Fig. 4 shows an example system 400, in accordance with one or more embodiments, that can implement the techniques described herein for adding interactive features to videos. In some embodiments, system 400 may be a media system, although system 400 is not limited to this context. For example, system 400 may be incorporated into a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touchpad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet, or smart television), mobile internet device (MID), messaging device, data communication device, set-top box, game console, or other such computing environments capable of performing graphics rendering operations.
In some embodiments, system 400 comprises a platform 402 coupled to a display 420. Platform 402 may receive content from a content device, such as content services device(s) 430 or content delivery device(s) 440 or other similar content sources. A navigation controller 450 comprising one or more navigation features may be used to interact with, for example, platform 402 and/or display 420. Each of these example components is described in more detail below.
In some embodiments, platform 402 may comprise a chipset 405, processor 410, memory 412, storage 414, graphics subsystem 415, applications 416, and/or radio 418. Chipset 405 may provide intercommunication among processor 410, memory 412, storage 414, graphics subsystem 415, applications 416, and/or radio 418. For example, chipset 405 may include a storage adapter (not shown) capable of providing intercommunication with storage 414.
Processor 410 may be implemented, for example, as a Complex Instruction Set Computer (CISC) or Reduced Instruction Set Computer (RISC) processor, an x86 instruction set compatible processor, a multi-core processor, or any other microprocessor or central processing unit (CPU). In some embodiments, processor 410 may comprise a dual-core processor, a dual-core mobile processor, and so forth. Memory 412 may be implemented, for example, as a volatile memory device such as, but not limited to, a Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), or Static RAM (SRAM). Storage 414 may be implemented, for example, as a non-volatile storage device such as, but not limited to, a magnetic disk drive, optical disk drive, tape drive, an internal storage device, an attached storage device, flash memory, battery backed-up SDRAM (synchronous DRAM), and/or a network-accessible storage device. In some embodiments, storage 414 may comprise technology to increase the storage performance or enhanced protection for valuable digital media when multiple hard drives are included, for example.
Graphics subsystem 415 may perform processing of images such as still images or video for display. Graphics subsystem 415 may be, for example, a graphics processing unit (GPU) or a visual processing unit (VPU). An analog or digital interface may be used to communicatively couple graphics subsystem 415 and display 420. For example, the interface may be any of a High-Definition Multimedia Interface (HDMI), DisplayPort, wireless HDMI, and/or wireless HD compliant techniques. Graphics subsystem 415 could be integrated into processor 410 or chipset 405. Graphics subsystem 415 could be a stand-alone card communicatively coupled to chipset 405. The techniques for adding interactive features to videos as variously described herein may be implemented in various hardware architectures. For example, segmentation 102, selection 104, tracking 106, and alpha-matting 108 may all be performed by or received at a single module (e.g., a CPU), while in other instances the processing may be performed in separate modules (e.g., segmentation 102 performed in the cloud, selection 104 received from a touch screen input, tracking 106 and alpha-matting 108 performed locally on the user's computer, or other variations that will be apparent in light of this disclosure). In some embodiments, the techniques for adding interactive features to videos may be implemented by a discrete processor dedicated to that purpose, or by one or more general-purpose processors (including multi-core processors) that can access and execute software embodying the techniques. Further, in some embodiments, segmentation 102, selection 104, tracking 106, and alpha-matting 108 may be stored in one or more modules, including, for example, memory 412, storage 414, and/or applications 416. In one such example case, the techniques are encoded in an image processing application included in applications 416, where the application is executable on processor 410. Note that the image processing application may be loaded locally onto the user's computing system 400. Alternatively, the image processing application may be provided to the user's computing system 400 via a network such as network 460 (e.g., a local area network and the Internet) and a remote server, where the remote server is configured to host a service that includes or otherwise uses the image processing application provided herein. In some such embodiments, some portions of the image processing application may execute on the server, while other portions may be provided as executable modules to a browser of the user's computing system 400 and executed via processor 410, as will be apparent in light of this disclosure.
Radio 418 may include one or more radios capable of transmitting and receiving signals using various suitable wireless communications techniques. Such techniques may involve communications across one or more wireless networks (e.g., including network 460). Example wireless networks include, but are not limited to, wireless local area networks (WLANs), wireless personal area networks (WPANs), wireless metropolitan area networks (WMANs), cellular networks, and satellite networks. In communicating across such networks, radio 418 may operate in accordance with one or more applicable standards in any version.
In some embodiments, display 420 may comprise any television- or computer-type monitor or display. Display 420 may comprise, for example, a liquid crystal display (LCD) screen, an electrophoretic display (EPD) or liquid paper display, a flat panel display, a touch screen display, a television-like device, and/or a television. Display 420 may be digital and/or analog. In some embodiments, display 420 may be a holographic or three-dimensional display. Also, display 420 may be a transparent surface that may receive a visual projection. Such projections may convey various forms of information, images, and/or objects. For example, such projections may be a visual overlay for a mobile augmented reality (MAR) application. Under the control of one or more software applications 416, platform 402 may display a user interface 422 on display 420.
In some embodiments, content services device(s) 430 may be hosted by any national, international, and/or independent service (e.g., one or more remote servers configured to provide content such as video, still images, and/or an image processing application having the functionality provided herein), and thus may be accessible to platform 402 via the Internet and/or other network 460, for example. Content services device(s) 430 may be coupled to platform 402 and/or to display 420. Platform 402 and/or content services device(s) 430 may be coupled to network 460 to communicate (e.g., send and/or receive) media information to and from network 460. Content delivery device(s) 440 also may be coupled to platform 402 and/or to display 420. In some embodiments, content services device(s) 430 may comprise a cable television box, personal computer, network, telephone, Internet-enabled devices or appliances capable of delivering digital information and/or content, and any other similar device capable of unidirectionally or bidirectionally communicating content between content providers and platform 402 and/or display 420, via network 460 or directly. It will be appreciated that the content may be communicated unidirectionally and/or bidirectionally to and from any one of the components in system 400 and a content provider via network 460. Examples of content may include any media information, including, for example, video, music, graphics, text, medical and gaming content, and so forth.
Content services device(s) 430 receive content such as cable television programming including media information, digital information, and/or other online content (e.g., video, sequences of still images, etc.). Examples of content providers may include any cable or satellite television or radio or Internet content providers. In one such example embodiment, the user's computing system 400 may access an image processing application or service configured as provided herein via an Internet content provider accessible over network 460. As previously explained, such a service may provide for execution of the image processing application on the server side based on input received from the so-called client side (the user's computing system 400), such as a selection 104 or any other input that engages the service. Alternatively, the service may provide executable code comprising the entire image processing application to the client-side computing system 400. For example, the service may serve one or more web pages to a browser application, having a suitable user interface and code encoded therein, where the browser application runs on computing system 400 and is configured to effectively execute that code in conjunction with processor 410. The browser may be included, for example, in applications 416. In still other embodiments, some portions of the image processing application may execute on the server side and other portions on the client side. Numerous such client-server arrangements will be apparent. The examples provided are not intended to limit the present disclosure. In some embodiments, platform 402 may receive control signals from navigation controller 450 having one or more navigation features. The navigation features of controller 450 may be used to interact with user interface 422, for example. In some embodiments, navigation controller 450 may be a pointing device, which may be a computer hardware component (specifically, a human interface device) that allows a user to input spatial (e.g., continuous and multi-dimensional) data into a computer. Many systems such as graphical user interfaces (GUIs), televisions, and monitors allow the user to control the computer or television and provide data to it using physical gestures, or sounds or voice commands.
Movements of the navigation features of controller 450 may be echoed on a display (e.g., display 420) by movements of a pointer, cursor, focus ring, or other visual indicators displayed on the display. For example, under the control of software applications 416, the navigation features located on navigation controller 450 may be mapped to virtual navigation features displayed on user interface 422, for example. In some embodiments, controller 450 may not be a separate component but may be integrated into platform 402 and/or display 420. Embodiments, however, are not limited to the elements or the context shown or described herein.
In some embodiments, drivers (not shown) may comprise technology to enable users to instantly turn the platform 402 on and off, like a television, with the touch of a button after initial boot-up, for example. Program logic may allow platform 402 to stream content to media adaptors or other content services device(s) 430 or content delivery device(s) 440 when the platform is turned "off." In addition, chipset 405 may comprise hardware and/or software support for 5.1 surround sound audio and/or high-definition 7.1 surround sound audio, for example. Drivers may include a graphics driver for integrated graphics platforms. In some embodiments, the graphics driver may comprise a peripheral component interconnect (PCI) express graphics card.
In various embodiments, any one or more of the components shown in system 400 may be integrated. For example, platform 402 and content services device(s) 430 may be integrated, or platform 402 and content delivery device(s) 440 may be integrated, or platform 402, content services device(s) 430, and content delivery device(s) 440 may be integrated, for example. In various embodiments, platform 402 and display 420 may be an integrated unit. For example, display 420 and content services device(s) 430 may be integrated, or display 420 and content delivery device(s) 440 may be integrated. These examples are not meant to limit the present disclosure.
In various embodiments, system 400 may be implemented as a wireless system, a wired system, or a combination of both. When implemented as a wireless system, system 400 may include components and interfaces suitable for communicating over a wireless shared medium, such as one or more antennas 404, transmitters, receivers, transceivers, amplifiers, filters, control logic, and so forth. An example of a wireless shared medium may include portions of a wireless spectrum, such as the RF spectrum and so forth. When implemented as a wired system, system 400 may include components and interfaces suitable for communicating over wired communications media, such as input/output (I/O) adapters, physical connectors to connect the I/O adapter with a corresponding wired communications medium, a network interface card (NIC), disc controller, video controller, audio controller, and so forth. Examples of wired communications media may include a wire, cable, metal leads, printed circuit board (PCB), backplane, switch fabric, semiconductor material, twisted-pair wire, co-axial cable, fiber optics, and so forth.
Platform 402 may establish one or more logical or physical channels to communicate information. The information may include media information and control information. Media information may refer to any data representing content meant for a user. Examples of content may include, for example, data from a voice conversation, videoconference, streaming video, email or text messages, voice mail messages, alphanumeric symbols, graphics, images, video, text, and so forth. Control information may refer to any data representing commands, instructions, or control words meant for an automated system. For example, control information may be used to route media information through a system, or to instruct a node to process the media information in a predetermined manner (e.g., using the interactive features for video as described herein). The embodiments, however, are not limited to the elements or context shown or described in Fig. 4.
As described above, system 400 may be embodied in varying physical styles or form factors. Fig. 5 illustrates an embodiment of a small form factor device 500 in which system 400 may be embodied. In some embodiments, for example, device 500 may be implemented as a mobile computing device having wireless capabilities. A mobile computing device may refer to any device having a processing system and a mobile power source or supply, such as one or more batteries, for example.
As previously described, examples of a mobile computing device may include a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touchpad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet, or smart television), mobile internet device (MID), messaging device, data communication device, and so forth.
Examples of a mobile computing device also may include computers arranged to be worn by a person, such as wrist computers, finger computers, ring computers, eyeglass computers, belt-clip computers, arm-band computers, shoe computers, clothing computers, and other wearable computers. In some embodiments, for example, a mobile computing device may be implemented as a smart phone capable of executing computer applications as well as voice communications and/or data communications. Although some embodiments may be described with a mobile computing device implemented as a smart phone by way of example, it will be appreciated that other embodiments may be implemented using other wireless mobile computing devices as well. Embodiments are not limited in this context.
As shown in Fig. 5, device 500 may comprise a housing 502, a display 504, an input/output (I/O) device 506, and an antenna 508. Device 500 also may comprise navigation features 512. Display 504 may comprise any suitable display unit for displaying information appropriate for a mobile computing device. I/O device 506 may comprise any suitable I/O device for entering information into a mobile computing device. Examples of I/O device 506 may include an alphanumeric keyboard, a numeric keypad, a touchpad, input keys, buttons, switches, rocker switches, microphones, speakers, voice recognition devices and software, and so forth. Information also may be entered into device 500 by way of a microphone. Such information may be digitized by a voice recognition device. Embodiments are not limited in this context.
Various embodiments may be implemented using hardware elements, software elements, or a combination of both. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASICs), programmable logic devices (PLDs), digital signal processors (DSPs), field programmable gate arrays (FPGAs), logic gates, registers, semiconductor devices, chips, microchips, chipsets, and so forth. Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (APIs), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Whether hardware elements and/or software elements are used may vary from one embodiment to the next in accordance with any number of factors, such as the desired computational rate, power level, heat tolerance, processing cycle budget, input data rate, output data rate, memory resources, data bus speed, and other design or performance constraints.
Some embodiments may be implemented, for example, using a machine-readable medium or article or computer program product which may store an instruction or a set of instructions that, if executed by a machine, may cause the machine to perform a method and/or operations in accordance with an embodiment of the present disclosure. Such a machine may include, for example, any suitable processing platform, computing platform, computing device, processing device, computing system, processing system, computer, processor, or the like, and may be implemented using any suitable combination of hardware and software. The machine-readable medium or article or computer program product may include, for example, any suitable type of non-transitory memory unit, memory device, memory article, memory medium, storage device, storage article, storage medium, and/or storage unit, such as memory, removable or non-removable media, erasable or non-erasable media, writeable or re-writeable media, digital or analog media, hard disk, floppy disk, Compact Disk Read Only Memory (CD-ROM), Compact Disk Recordable (CD-R), Compact Disk Rewriteable (CD-RW), optical disk, magnetic media, magneto-optical media, removable memory cards or disks, various types of Digital Versatile Disks (DVDs), a tape, a cassette, or the like. The instructions may include any suitable type of executable code implemented using any suitable high-level, low-level, object-oriented, visual, compiled, and/or interpreted programming language. Some embodiments may be implemented in a computer program product that incorporates the functionality of the techniques for adding interactive features to videos as variously disclosed herein, and such a computer program product may include one or more machine-readable media.
Unless specifically stated otherwise, it will be appreciated that terms such as "processing," "computing," "calculating," "determining," or the like refer to the action and/or process of a computer or computing system, or similar electronic computing device, that manipulates and/or transforms data represented as physical quantities (e.g., electronic) within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers, or other such information storage, transmission, or display devices. The embodiments are not limited in this context.
further exemplary embodiment
The following examples pertain to further embodiments, from which numerous permutations and configurations will be apparent.
Example 1 is a kind of method, comprising: each frame of video is divided into its semantic component, and to identify the one or more objects in each frame scene based on corresponding pixel groups, wherein, video is a part for media; Receive the selection to the one or more objects in framing scene; Frame by frame is followed the trail of described one or more object from the frame of video, and to identify corresponding pixel groups, corresponding pixel groups comprises the described one or more objects in each frame; And media are carried out to the processing of Alpha-masking-out, isolate one or more selected objects with frame by frame.
Example 2 comprises the theme of example 1, wherein, media are carried out to the processing of Alpha-masking-out to be comprised: form the transparent masking-out matching with the shape of the described one or more selected objects to framing scene, to allow the one or more holes displaying video by being created by described transparent masking-out, wherein, upgrade the shape in one or more holes in given scenario for each frame of video, to mate the shape of the described one or more selected objects in the frame of just playing; Or form transparent masking-out around one or more selected objects described in each frame, to allow by the described one or more selected objects in the frame of just playing are copied to carrying out displaying video on framing scene.
Example 3 comprises the theme of example 1 or 2, wherein, uses without supervision figure cutting method and carries out cutting apart of each frame to video.
Example 4 comprises the theme of aforementioned any one example, further comprises pixel depth information, in order to cutting apart of the one or more objects in each frame of improvement identification.
Example 5 includes the subject matter of Example 4, further comprising generating the pixel depth information using a stereo or array camera.
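As a hedged illustration of Example 5, pixel depth information might be derived from a rectified stereo pair with OpenCV's semi-global matcher; the file names and camera parameters below are assumed placeholders, not values from the disclosure:

```python
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder stereo pair
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point
focal_px, baseline_m = 700.0, 0.10   # assumed focal length (pixels) and baseline (meters)
depth = np.zeros_like(disparity)
valid = disparity > 0
depth[valid] = focal_px * baseline_m / disparity[valid]  # depth map in meters
```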
Example 6 includes the subject matter of any of the preceding Examples, further comprising receiving the selection of the one or more subjects from a user.
Example 7 includes the subject matter of Example 6, wherein the user selection is received from a click or tap input performed on the one or more subjects within the given frame.
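A hypothetical click-to-select handler in the spirit of Example 7 might map a click or tap to the segmented pixel group under the cursor; `label_map` is an assumed per-pixel segment-ID array that would come from the segmentation step:

```python
import cv2
import numpy as np

label_map = np.zeros((720, 1280), np.int32)  # placeholder; real per-pixel segment
                                             # IDs would come from segmentation
selected = set()

def on_click(event, x, y, flags, param):
    # A left click (or a tap routed through the windowing layer) selects the
    # pixel group under the cursor.
    if event == cv2.EVENT_LBUTTONDOWN:
        selected.add(int(label_map[y, x]))

cv2.namedWindow("frame")
cv2.setMouseCallback("frame", on_click)
```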
Example 8 includes the subject matter of any of Examples 1-7, further comprising receiving the selection of the one or more subjects prior to segmenting each frame, wherein only the one or more selected subjects are segmented.
Example 9 includes the subject matter of any of Examples 1-7, further comprising tracking the one or more subjects prior to receiving the selection of the one or more tracked subjects.
Example 10 includes the subject matter of any of Examples 1-9, further comprising generating a still image, wherein the one or more selected subjects come from a frame other than the given frame.
Example 11 includes the subject matter of any of Examples 1-9, further comprising generating a video, wherein the one or more selected subjects start at a different time than the given frame.
Example 12 includes the subject matter of any of Examples 1-9, further comprising generating a visual media, wherein only the one or more selected subjects are played while the remainder of the given frame is kept still.
Example 13 includes the subject matter of any of Examples 1-9, further comprising generating a visual media, wherein one or more subjects in a particular frame of the video can be selected such that the one or more selected subjects are animated relative to the remainder of that particular frame.
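Tying Examples 10-13 together, the new-media variants can be illustrated purely as a sketch that reuses `frames`, `mattes`, and `alpha_matte_composite` from the sketches above; the 30-frame offset is an arbitrary assumed value:

```python
offset = 30  # assumed offset, in frames

# Example 10: a still image whose subject comes from a different frame.
remix_still = alpha_matte_composite(frames[0], frames[offset], mattes[offset])

# Example 11: a video in which the subject starts `offset` frames behind
# the rest of the scene.
restart_video = [
    alpha_matte_composite(frames[t], frames[t + offset], mattes[t + offset])
    for t in range(len(frames) - offset)
]

# Examples 12-13: only the selected subject plays; the remainder of the
# given frame stays still (a cinemagraph-style result).
cinemagraph = [
    alpha_matte_composite(frames[0], frames[t], mattes[t])
    for t in range(len(frames))
]
```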
Example 14 is a mobile computing system configured to perform the method of any of the preceding Examples.
Example 15 is a computing device, comprising: a processor; a memory accessible by the processor; and an application stored on the memory and executable by the processor, the application being configured to: segment each frame of a video into its semantic components to identify one or more subjects within each frame scene based on a corresponding group of pixels, wherein the video is part of a media; receive a selection of one or more subjects within a given frame scene; track the one or more subjects from video frame to frame to identify the corresponding group of pixels comprising the one or more subjects in each frame; and alpha-matte the media to isolate the one or more selected subjects on a frame-by-frame basis.
Example 16 includes the subject matter of Example 15, wherein alpha-matting the media comprises one of: forming a transparency mask that matches the shape of the one or more selected subjects of the given frame scene, to allow the video to play through one or more holes created by the transparency mask, wherein the shape of the one or more holes in the given scene is updated for each frame of the video to match the shape of the one or more selected subjects in the frame being played; or forming a transparency mask around the one or more selected subjects in each frame, to allow the video to play by copying the one or more selected subjects in the frame being played onto the given frame scene.
Example 17 includes the subject matter of Example 15 or 16, further comprising: a display operatively coupled to the processor; and at least one input device operatively coupled to the processor, wherein a user can select the one or more subjects within the given frame scene using the at least one input device.
Example 18 includes the subject matter of Example 15 or 16, further comprising a touch-screen display coupled to the processor, wherein the touch screen is configured to receive the selection of the one or more subjects from user input.
Example 19 is at least one computer program product encoded with instructions that, when executed by one or more processors, cause a process for adding interactive features to a video to be carried out, the process comprising: segmenting each frame of the video into its semantic components to identify one or more subjects within each frame scene based on a corresponding group of pixels, wherein the video is part of a media; receiving a selection of one or more subjects within a given frame scene; tracking the one or more subjects from video frame to frame to identify the corresponding group of pixels comprising the one or more subjects in each frame; and alpha-matting the media to isolate the one or more selected subjects on a frame-by-frame basis.
Example 20 includes the subject matter of Example 19, wherein alpha-matting the media comprises one of: forming a transparency mask that matches the shape of the one or more selected subjects of the given frame scene, to allow the video to play through one or more holes created by the transparency mask, wherein the shape of the one or more holes in the given scene is updated for each frame of the video to match the shape of the one or more selected subjects in the frame being played; or forming a transparency mask around the one or more selected subjects in each frame, to allow the video to play by copying the one or more selected subjects in the frame being played onto the given frame scene.
Example 21 includes the subject matter of Example 19 or 20, wherein segmenting each frame of the video is performed using an unsupervised graph-cut method.
Example 22 includes the subject matter of any of Examples 19-21, further comprising using pixel depth information to improve the segmentation used to identify the one or more subjects within each frame.
Example 23 includes the subject matter of Example 22, further comprising generating the pixel depth information using a stereo or array camera.
Example 24 includes the subject matter of any of Examples 19-23, further comprising receiving the selection of the one or more subjects from a user.
Example 25 includes the subject matter of Example 24, wherein the user selection is received from a click or tap input performed on the one or more subjects within the given frame.
Example 26 includes the subject matter of any of Examples 19-25, further comprising receiving the selection of the one or more subjects prior to segmenting each frame, wherein only the one or more selected subjects are segmented.
Example 27 includes the subject matter of any of Examples 19-25, further comprising tracking the one or more subjects prior to receiving the selection of the one or more tracked subjects.
Example 28 includes the subject matter of any of Examples 19-27, further comprising generating a still image, wherein the one or more selected subjects come from a frame other than the given frame.
Example 29 includes the subject matter of any of Examples 19-27, further comprising generating a video, wherein the one or more selected subjects start at a different time than the given frame.
Example 30 includes the subject matter of any of Examples 19-27, further comprising generating a visual media, wherein only the one or more selected subjects are played while the remainder of the given frame is kept still.
Example 31 includes the subject matter of any of Examples 19-27, further comprising generating a visual media, wherein one or more subjects in a particular frame of the video can be selected such that the one or more selected subjects are animated relative to the remainder of that particular frame.
Example 32 is a mobile computing system configured to run the at least one computer program product of any of Examples 19-31.
The foregoing description of the example embodiments has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the present disclosure to the precise forms disclosed. Many modifications and variations are possible in light of this disclosure. It is intended that the scope of the present disclosure be limited not by this detailed description, but rather by the claims appended hereto. Future-filed applications claiming priority to the present application may claim the disclosed subject matter in a different manner, and may generally include any set of one or more limitations as variously disclosed or otherwise demonstrated herein.

Claims (25)

1. A method, comprising:
segmenting each frame of a video into its semantic components to identify one or more subjects within each frame scene based on a corresponding group of pixels, wherein the video is part of a media;
receiving a selection of one or more subjects within a given frame scene;
tracking the one or more subjects from video frame to frame to identify the corresponding group of pixels, wherein the corresponding group of pixels comprises the one or more subjects in each frame; and
alpha-matting the media to isolate the one or more selected subjects on a frame-by-frame basis.
2. The method of claim 1, wherein alpha-matting the media comprises one of:
forming a transparency mask that matches the shape of the one or more selected subjects of the given frame scene, to allow the video to play through one or more holes created by the transparency mask, wherein the shape of the one or more holes in the given scene is updated for each frame of the video to match the shape of the one or more selected subjects in the frame being played; or
forming a transparency mask around the one or more selected subjects in each frame, to allow the video to play by copying the one or more selected subjects in the frame being played onto the given frame scene.
3. The method of claim 1, wherein segmenting each frame of the video is performed using an unsupervised graph-cut method.
4. The method of claim 1, further comprising using pixel depth information to improve the segmentation used to identify the one or more subjects within each frame.
5. The method of claim 4, further comprising generating the pixel depth information using a stereo camera or an array camera.
6. The method of claim 1, further comprising receiving the selection of the one or more subjects from a user.
7. The method of claim 6, further comprising receiving the user selection via a click or tap input performed on the one or more subjects within the given frame.
8. The method of claim 1, further comprising receiving the selection of the one or more subjects prior to segmenting each frame, wherein only the one or more selected subjects are segmented.
9. The method of claim 1, further comprising tracking the one or more subjects prior to receiving the selection of the one or more tracked subjects.
10. The method of any of claims 1-9, further comprising generating a still image, wherein the one or more selected subjects come from a frame other than the given frame.
11. The method of any of claims 1-9, further comprising generating a video, wherein the one or more selected subjects start at a different time than the given frame.
12. The method of any of claims 1-9, further comprising generating a visual media, wherein only the one or more selected subjects are played while the remainder of the given frame is kept still.
13. The method of any of claims 1-9, further comprising generating a visual media, wherein one or more subjects in a particular frame of the video can be selected such that the one or more selected subjects are animated relative to the remainder of that particular frame.
14. At least one computer program product encoded with instructions that, when executed by one or more processors, cause the method of any of claims 1-9 to be carried out.
15. A computing device, comprising:
a processor;
a memory accessible by the processor; and
an application stored on the memory and executable by the processor, the application being configured to:
segment each frame of a video into its semantic components to identify one or more subjects within each frame scene based on a corresponding group of pixels, wherein the video is part of a media;
receive a selection of one or more subjects within a given frame scene;
track the one or more subjects from video frame to frame to identify the corresponding group of pixels, wherein the corresponding group of pixels comprises the one or more subjects in each frame; and
alpha-matte the media to isolate the one or more selected subjects on a frame-by-frame basis.
16. The device of claim 15, wherein alpha-matting the media comprises one of:
forming a transparency mask that matches the shape of the one or more selected subjects of the given frame scene, to allow the video to play through one or more holes created by the transparency mask, wherein the shape of the one or more holes in the given scene is updated for each frame of the video to match the shape of the one or more selected subjects in the frame being played; or
forming a transparency mask around the one or more selected subjects in each frame, to allow the video to play by copying the one or more selected subjects in the frame being played onto the given frame scene.
17. The device of claim 15 or 16, further comprising a display operatively coupled to the processor and at least one input device operatively coupled to the processor, wherein a user can select the one or more subjects within the given frame scene using the at least one input device.
18. The device of claim 15 or 16, further comprising a touch-screen display coupled to the processor, wherein the touch screen is configured to receive the selection of the one or more subjects from user input.
19. A method, comprising:
segmenting each frame of a video into its semantic components to identify one or more subjects within each frame scene based on a corresponding group of pixels, wherein the video is part of a media;
receiving a selection of one or more subjects within a given frame scene;
tracking the one or more subjects from the frames of the video to identify the corresponding group of pixels, wherein the corresponding group of pixels comprises the one or more subjects in each frame;
alpha-matting the media to isolate the one or more selected subjects on a frame-by-frame basis; and
generating a new media using the one or more selected subjects.
20. The method of claim 19, wherein alpha-matting the media comprises one of:
forming a transparency mask that matches the shape of the one or more selected subjects of the given frame scene, to allow the video to play through one or more holes created by the transparency mask, wherein the shape of the one or more holes in the given scene is updated for each frame of the video to match the shape of the one or more selected subjects in the frame being played; or
forming a transparency mask around the one or more selected subjects in each frame, to allow the video to play by copying the one or more selected subjects in the frame being played onto the given frame scene.
21. The method of claim 19, wherein the new media comprises a still image, and wherein the one or more selected subjects come from a frame other than the given frame.
22. The method of claim 19, wherein the new media comprises a video, and wherein the one or more selected subjects start at a different time than the given frame.
23. The method of claim 19, wherein the new media comprises a visual media, wherein only the one or more selected subjects are played while the remainder of the given frame is kept still.
24. The method of claim 19, wherein the new media comprises a visual media, wherein one or more subjects in a particular frame of the video can be selected such that the one or more selected subjects are animated relative to the remainder of that particular frame.
25. At least one computer program product encoded with instructions that, when executed by one or more processors, cause the method of any of claims 19-24 to be carried out.
CN201410055610.1A 2013-02-20 2014-02-19 Method and apparatus for adding interactive features to videos Active CN103997687B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361766827P 2013-02-20 2013-02-20
US61/766,827 2013-02-20
US14/106,136 US9330718B2 (en) 2013-02-20 2013-12-13 Techniques for adding interactive features to videos
US14/106,136 2013-12-13

Publications (2)

Publication Number Publication Date
CN103997687A 2014-08-20
CN103997687B CN103997687B (en) 2017-07-28

Family

ID=51311671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410055610.1A Active CN103997687B (en) Method and apparatus for adding interactive features to videos

Country Status (1)

Country Link
CN (1) CN103997687B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101501776A (en) * 2005-07-01 2009-08-05 微软公司 Video object cut and paste
US7408591B2 (en) * 2005-07-29 2008-08-05 Mitsubishi Electric Research Laboratories, Inc. System and method for defocus difference matting
CN101430711A (en) * 2008-11-17 2009-05-13 中国科学技术大学 Method and apparatus for video data management
CN102388391A (en) * 2009-02-10 2012-03-21 汤姆森特许公司 Video matting based on foreground-background constraint propagation
US20110038536A1 (en) * 2009-08-14 2011-02-17 Genesis Group Inc. Real-time image and video matting
CN102289796A (en) * 2010-07-21 2011-12-21 微软公司 Interactive image matting

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106662920A (en) * 2014-10-22 2017-05-10 华为技术有限公司 Interactive video generation
CN106662920B (en) * 2014-10-22 2020-11-06 华为技术有限公司 Interactive video generation
CN106559678A (en) * 2015-09-28 2017-04-05 北京视连通科技有限公司 Method for structured processing of digital video
CN108701376A (en) * 2016-02-09 2018-10-23 英特尔公司 Recognition-based object segmentation of three-dimensional images
CN108701376B (en) * 2016-02-09 2023-07-18 英特尔公司 Recognition-based object segmentation of three-dimensional images
CN106657776A (en) * 2016-11-29 2017-05-10 努比亚技术有限公司 Shooting post-processing method, shooting post-processing device and mobile terminal
CN109561240B (en) * 2017-09-24 2023-02-17 福希特公司 System and method for generating media assets
CN109561240A (en) * 2017-09-24 2019-04-02 福希科有限公司 System and method for generating media assets
CN111787240A (en) * 2019-04-28 2020-10-16 北京京东尚科信息技术有限公司 Video generation method, device and computer readable storage medium
WO2021102772A1 (en) * 2019-11-28 2021-06-03 Qualcomm Incorporated Methods and apparatus to smooth edge portions of an irregularly-shaped display
CN111368732A (en) * 2020-03-04 2020-07-03 北京百度网讯科技有限公司 Method and device for detecting lane line
CN111368732B (en) * 2020-03-04 2023-09-01 阿波罗智联(北京)科技有限公司 Method and device for detecting lane lines
WO2022222842A1 (en) * 2021-04-19 2022-10-27 华为技术有限公司 Dynamic image encoding and decoding methods, apparatus and device and storage medium
CN113497973A (en) * 2021-09-06 2021-10-12 北京市商汤科技开发有限公司 Video processing method and device, computer readable storage medium and computer equipment
CN113497973B (en) * 2021-09-06 2021-12-10 北京市商汤科技开发有限公司 Video processing method and device, computer readable storage medium and computer equipment
CN113490050B (en) * 2021-09-07 2021-12-17 北京市商汤科技开发有限公司 Video processing method and device, computer readable storage medium and computer equipment
CN113490050A (en) * 2021-09-07 2021-10-08 北京市商汤科技开发有限公司 Video processing method and device, computer readable storage medium and computer equipment

Also Published As

Publication number Publication date
CN103997687B (en) 2017-07-28


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant