WO2022219202A1 - System and method for rendering key and fill video streams for video processing - Google Patents
- Publication number
- WO2022219202A1 (PCT/EP2022/060310)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/88—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/59—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Definitions
- the present disclosure generally relates to video editing and content creation, and, more particularly, to a system and method for rendering key and fill proxy streams that consume a single video decoder and can be used in a system with limited processing resources.
- Video editors and content creation systems typically want to use video elements with varying levels of transparency. For example, an editor may want to add an animated caption in a lower-third of the picture or a logo in the upper-right. To add these elements, they can be overlaid on top of a traditional video track. In practice, these elements are usually created in an application, such as Adobe’s After Effects, and exported as a sequence of still images that support an alpha-channel, for example, PNG files.
- alpha compositing or alpha blending is the process of combining one image with a background to create the appearance of partial or full transparency.
- with alpha compositing, it is often useful to render picture elements (e.g., pixels) in separate passes or layers and then combine the resulting 2D images into a single, final image called the composite.
- an “alpha channel” is the additional per-pixel value that is combined with the Red, Green and Blue values to specify an opacity level (e.g., opaque, partial or full transparency) to ultimately create the composite.
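The "over" operation described above can be illustrated with a minimal sketch; this is a generic statement of standard alpha compositing, not an implementation taken from the disclosure, and the function name is illustrative.

```python
# Illustrative sketch of "over" alpha compositing for a single pixel.
# alpha is the per-pixel opacity in [0, 1]: 0 = fully transparent, 1 = fully opaque.

def composite_over(fg_rgb, alpha, bg_rgb):
    """Blend a foreground pixel over a background pixel using its alpha value."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg_rgb, bg_rgb))

# A half-transparent white caption pixel over a black background:
print(composite_over((255, 255, 255), 0.5, (0, 0, 0)))  # (127.5, 127.5, 127.5)
```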
- a “key” stream that is a monochrome video track that defines the transparency of each part (e.g., each pixel value) of the image
- a “fill” stream that defines the content to which the transparency is applied.
- a graphics processing unit (GPU) will typically apply the transparency levels defined by the key stream to the fill stream to create the composite content that can then be rendered on a display.
- a system and method is needed that reduces total bandwidth consumption and minimizes the resource requirements for rendering key and fill proxy streams.
- a system and method is disclosed herein that provides for rendering key and fill proxy streams that consume a single video decoder and can be used in a system with limited processing resources.
- a system for rendering key and fill video streams for video processing.
- the system can include a database configured to store video data including at least one video stream comprising a sequence of images and alpha data defining transparency levels for each portion of each image in the sequence of images of the at least one video stream.
- the system includes a content provider with an encoder configured to encode each frame of the sequence of images to generate encoded composite frames that each contain a fill portion that corresponds to the video data and a key portion that is monochrome and defines transparency levels for each portion of the respective images, such that each encoded composite frame includes both the fill and key portions that are disposed horizontally and side-by-side with respect to each other.
- a client device including a video editing application includes a renderer having a single decoder and is configured to decode each encoded composite frame to extract pixel values from the fill portion and respective transparency levels from the key portion, and further configured to apply the respective transparency levels to the corresponding extracted pixel values to generate a proxy of the respective frame to be displayed by a video processing device.
- the fill portion comprises a higher horizontal resolution than a horizontal resolution of the key portion.
- in one aspect, the video processing device is a video editing device.
- the encoder can be configured to generate each encoded frame to allocate 75% of the frame to the fill portion and 25% of the frame to the key portion.
- the encoder is configured to generate the encoded composite frames based at least partially on source data of the video data that includes R, G, B data and the alpha data.
- the system includes a user interface configured to define a picture / transparency ratio between the fill portion and the key portion.
- the renderer includes a fragment shader configured to render the proxy by applying the respective transparency levels to the corresponding extracted pixel values.
- a system for rendering key and fill video streams for video processing.
- the system includes a database configured to store video data including a sequence of images and alpha data defining transparency levels for each portion of each image in the sequence of images; an encoder configured to encode each frame of the sequence of images to generate encoded composite frames that each contain a fill portion that corresponds to the video data and a key portion that is monochrome and defines transparency levels for each portion of the respective images, such that each encoded composite frame includes both the fill and key portions that are disposed horizontally and side-by-side with respect to each other; and a renderer configured to receive a stream of the encoded composite frames and having a single decoder that is configured to decode each encoded composite frame to extract pixel values from the fill portion and respective transparency levels from the key portion, and further configured to apply the respective transparency levels to the corresponding extracted pixel values to generate a proxy of the respective frame to be displayed by a video processing device.
- a system for rendering key and fill video streams for video processing.
- the system includes an encoder configured to encode each frame of a sequence of images to generate encoded composite frames that each contain a fill portion that corresponds to video data of the sequence of images and a key portion that is monochrome and defines respective transparency levels, such that each encoded composite frame includes both the fill portion and the key portion; and a renderer configured to receive a stream of the encoded composite frames and configured to decode each encoded composite frame to extract pixel values from the fill portion and respective transparency levels from the key portion, and further configured to apply the respective transparency levels to the corresponding extracted pixel values to generate an output video comprising modified frames to be displayed by a video processing device.
- FIG. 1 is a block diagram of a system for rendering key and fill proxy streams for video processing according to an exemplary aspect.
- FIG. 2 is a block diagram of components shown in FIG. 1 of the system for rendering key and fill proxy streams for video processing according to an exemplary aspect.
- FIG. 3 illustrates an example of an encoded image resulting from the method of rendering key and fill proxy streams for video processing according to an exemplary aspect.
- FIG. 4 is a flowchart illustrating a method for rendering key and fill proxy streams for video processing according to an exemplary aspect.
- FIG. 5 is a block diagram illustrating a computer system on which aspects of the systems and methods for rendering key and fill proxy streams for video processing may be implemented according to an exemplary aspect.
- processors include microprocessors, microcontrollers, graphics processing units (GPUs), central processing units (CPUs), application processors, digital signal processors (DSPs), reduced instruction set computing (RISC) processors, systems on a chip (SoC), baseband processors, field programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functionality described throughout this disclosure.
- processors in the processing system may execute software.
- Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software components, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
- the functions and algorithms described herein may be implemented in hardware, software, or any combination thereof. If implemented in software, the functions may be stored on or encoded as one or more instructions or code on a computer-readable medium.
- Computer-readable media may include transitory or non-transitory computer storage media for carrying or having computer-executable instructions or data structures stored thereon. Both transitory and non-transitory storage media may be any available media that can be accessed by a computer as part of the processing system.
- such computer-readable media can comprise a random-access memory (RAM), a read-only memory (ROM), an electrically erasable programmable ROM (EEPROM), optical disk storage, magnetic disk storage, other magnetic storage devices, combinations of the aforementioned types of computer-readable media, or any other medium that can be used to store computer executable code in the form of instructions or data structures that can be accessed by a computer.
- FIG. 1 is a block diagram of a system for rendering key and fill proxy streams for video processing according to an exemplary aspect.
- the system 100 is shown to be in the context of a media production environment for media broadcasting and content consumption.
- the disclosed system and method can be implemented in any type of video editing and content creation environments according to exemplary aspects.
- system 100 includes a plurality of content providing devices 102A and 102B.
- the plurality of content providing devices 102A and 102B can be configured for an A/V feed (e.g., media streams) across links via the network 110.
- the plurality of content providing devices 102A and 102B can also include, for example, remote cameras configured to capture live media content, such as the “talent” (e.g., news broadcasters, game commentators, or the like).
- the content providing devices 102A and 102B can include Esports (electronic sports) real-time content, or the like.
- the content providing devices 102A and 102B can be implemented as one or more remote media content servers, for example, that are configured to store media content (e.g., sequences of video images and content streams) and distribute this content through the network 110.
- the plurality of content providing devices 102A and 102B can be coupled to a communication network, such as the Internet 110, and/or hardware conducive to internet protocol (IP). That is, system 100 can be comprised of a network of servers and network devices configured to transmit and receive video and audio signals (and ancillary data) of various formats.
- the processing components of system 100 can be executed in part of a cloud computing environment, which can be coupled to network 110.
- the media production system 101 can be configured to access the video and audio signals and/or streams generated and/or provided by the content providing devices 102A and 102B, or information related to the various signals and content presented therein.
- cloud computing environments or cloud platforms are a virtualization and central management of data center resources as software-defined pools.
- Cloud computing provides the ability to apply abstracted compute, storage, and network resources to the work packages provided on a number of hardware nodes that are clustered together forming the cloud.
- the plurality of nodes each have their specialization, e.g., for running client micro services, storage, and backup.
- a management software layer for the application platform offered by the cloud will typically be provided on a hardware node and will include a virtual environment manager component that starts the virtual environments for the platform and can include micro-services and containers, for example.
- one or more of the components (or work packages) of system 100 that can be implemented in the cloud platform as described herein.
- system 100 can include one or more remote distribution node(s) 127, one or more processing node(s) 128, and one or more remote production switcher(s) 151.
- these components can be implemented as hardware components at various geographical locations or, in the alternative, as processing components as part of a cloud computing environment.
- the one or more distribution nodes 127 (e.g., electronic devices) are configured to distribute the production media content from the media production system 101 to one or more remote media devices, such as receivers 117A and 117B, which can be content consuming devices (e.g., televisions, computing devices, or the like), for example.
- the network can include any number of content consuming devices configured to receive and consume (e.g., playout) the media content, with such content consuming devices even being distributed across different countries or even different continents.
- the system 100 can be configured as a media network for real-time production and broadcasting of video and audio content.
- system 100 can include one or more content creation devices 115 (also considered to be video editors) for editing and processing video content as part of the media production environment.
- the content creation device 115 is communicatively coupled to network 110 and thus to the other components of system 100 through network 110.
- the content creation device 115 can be configured to receive media content (e.g., video streams and sequences of video images) from the one or more components, such as content providing devices 102A and 102B, and further configured to display this content within the confines of a video editing software (e.g., Adobe Premiere®) to perform editing decisions and functions on the content before it is then finalized for the media production.
- the content creation device 115 can be included as part of media production system 101 and/or communicatively coupled directly to media production system 101. Moreover, the content creation device 115 can be configured to decode and display the media content as a proxy signal on the video editing software to perform the video editing functions by the user of the content creation device 115.
- distribution node(s) 127 can be configured to distribute the finalized media content (that may be generated by media production system 101) throughout the distribution network to one or more processing node(s) 128, which may include a mix/effects engine, keyer or the like.
- remote distribution node(s) 127 can be configured to feed remote processing node(s) 128 via a direct link, or via Internet 103 connection. Examples of remote distribution node(s) 127 and processing node(s) 128 may include remote production switchers similar to remote production switcher 151 or remote signal processors.
- media content captured or otherwise generated/provided by content providing devices 102A and 102B will typically be captured in a high or broadcast quality encoding (e.g., HDR format). However, remote video editors in content creation and processing environments, such as system 100 shown in FIG. 1, often only require a lower or proxy quality encoding (e.g., at a lower resolution).
- the system and method disclosed herein takes advantage of the fact that the spatial quality of the “alpha” portion of the image (i.e., of the media content that is being edited) can be in a lower resolution than the visible portion of the image.
- FIG. 2 is a block diagram of components shown in FIG. 1 of the system for rendering key and fill proxy streams for video processing according to an exemplary aspect.
- the system 200 includes a content provider 210 and a content editor 230.
- the content provider 210 can correspond to any component of system 100 for providing content using the exemplary algorithms and techniques described herein.
- content provider 210 can be one or more of content providing devices 102A and 102B and/or media production system 101, as described above.
- content editor 230 can correspond to content creation device 115 of FIG. 1 and is configured to receive and decode content streams for video editing and content creation processing when the decoded content is rendered on a display for the video editing application.
- content provider 210 includes electronic memory 211 (e.g., a memory buffer or database of a sequence of image data) configured to receive media content (e.g., from content source 250) that is stored thereon.
- the media content can be stored as a plurality of source images 212 (e.g., a PNG file) that contains pixel values (e.g., RGB data 214), which is the fill data of the content, and alpha data 216, which is the monochrome video data that defines transparency levels for each portion of the respective source image 212.
- the alpha data 216 is data for each pixel value (e.g., RGB data 214) that is stored in an alpha channel with a value ranging from 0 to 1.
- a value of “0” means that the pixel is fully transparent and does not provide any coverage information (i.e., there is no occlusion at the image pixel window because the geometry did not overlap this pixel)
- a value of “1” means that the pixel is fully opaque because the geometry completely overlaps the pixel window.
- when the editing application is Adobe’s After Effects, for example, the user can composite visual layers together. Moreover, the layers can be fully opaque or partially transparent.
- anti-aliasing is usually performed on shape edges which will render some of those pixels partially transparent.
- the content provider 210 also includes encoder 218, transmitter 222, and optionally user interface and control 220.
- the encoder 218 can be implemented as a separate component and, for example, can be a software module implemented as a cloud service.
- Encoder 218 is configured to convert a sequence of source images (e.g., PNG images) to a compression format (e.g., H.264).
- the encoder 218 is configured to create a video stream in which the picture contains both the “fill” portion (e.g., RGB data 214) and the “key” portion (e.g., alpha data 216).
- both the key and fill portions will be squeezed horizontally (e.g., formatted) so they sit side-by-side as a single image or frame.
- the horizontal resolution of the “fill” portion will be higher than that of the “key” portion, since the “fill” portion is directly visible to the user as the RGB pixel values of the image.
- the “key” portion of the image will be monochrome, with fully black indicating full transparency, and white indicating full opacity according to an exemplary aspect.
- the encoder 218 is configured to separately take the RGB data 214 and alpha data 216 to generate an encoded composite image that includes both the fill portion and key portion.
- the encoder 218 creates the composite image dependent at least partially on the source data.
- the PNG data can be passed to a PNG decoder which will return a buffer that can contain 32 bits of data for each pixel (R, G, B, A), although the exact layout can vary.
- the encoder 218 can then be configured to duplicate that data and on one copy, the encoder 218 can resize the image for the fill portion, and copy out the RGB data. On the other copy, the encoder 218 can resize for the key portion and copy out the A data.
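The packing step described above can be sketched as follows; this is an illustrative approximation (nearest-neighbor resizing, Python lists in place of decoder buffers), not the disclosure's actual encoder, and all function names are hypothetical.

```python
# Illustrative sketch of packing decoded RGBA source pixels into a single
# side-by-side "fill + key" composite frame.
# Layout: left 75% of each row = fill (RGB), right 25% = monochrome key (alpha).

def resize_row(row, new_width):
    """Nearest-neighbor horizontal resize of one row of pixels."""
    src_width = len(row)
    return [row[i * src_width // new_width] for i in range(new_width)]

def pack_fill_and_key(rgba_rows, target_width, fill_ratio=0.75):
    """Build composite rows: squeezed RGB fill on the left, squeezed alpha key on the right."""
    fill_w = int(target_width * fill_ratio)
    key_w = target_width - fill_w
    out = []
    for row in rgba_rows:
        fill = resize_row([(r, g, b) for r, g, b, a in row], fill_w)
        # Key is monochrome: black (0) indicates full transparency, white (255) full opacity.
        key = resize_row([(int(a * 255),) * 3 for r, g, b, a in row], key_w)
        out.append(fill + key)
    return out

# A 4-pixel-wide source row packed into an 8-pixel composite row (6 fill + 2 key):
src = [[(10, 20, 30, 1.0), (40, 50, 60, 0.5), (70, 80, 90, 0.0), (100, 110, 120, 1.0)]]
frame = pack_fill_and_key(src, 8)
print(len(frame[0]))  # 8
```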
- a compression ratio can be defined between the fill portion and the key portion.
- this compression ratio can be predefined by a system administrator and set in advance of the video editing. For example, for a given source image 212 having a target video resolution of 512x288, the fill portion (e.g., RGB data 214) can be decoded and compressed to 75% of the target horizontal size (384x288). Moreover, the alpha data 216 can be decoded and compressed to 25% of the target size (128x288). That is, the left 75% of the composite image contains fill data and the right 25% of the composite image contains key data according to the exemplary aspect, although this ratio can be configured for particular system requirements as described herein.
- the encoder 218 can be configured to encode the sequence of these pictures to the H.264 compression format or a similar compression format, for example. It should be appreciated that while the exemplary aspect uses the H.264 compression format, the exemplary system and method can be configured to work with other compression formats including, for example, H.265, VP8, VP9, MPEG2 or the like.
- the compression ratio is set to a 75% to 25% ratio in an exemplary aspect, the exact relative sizes may need to vary slightly to ensure that the dividing line does not bisect a macroblock, for example.
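One way the dividing line could be kept off a macroblock boundary is to round the fill width to the nearest multiple of the macroblock size; this is a hedged sketch of that idea (the 16-pixel macroblock size is the standard H.264 value, and the function name is illustrative, not from the disclosure).

```python
# Sketch: align the fill/key dividing line to a macroblock boundary so the
# nominal 75%/25% split never bisects a 16-pixel H.264 macroblock.

def aligned_fill_width(target_width, fill_ratio=0.75, macroblock=16):
    """Round the ideal fill width to the nearest macroblock-aligned column."""
    ideal = target_width * fill_ratio
    return int(round(ideal / macroblock)) * macroblock

# For the 512x288 target resolution used in the example above:
fill_w = aligned_fill_width(512)   # 384, already a multiple of 16
key_w = 512 - fill_w               # 128
print(fill_w, key_w)
```

For a target width that does not divide cleanly (e.g., 500 pixels), the split shifts slightly (368/132 rather than exactly 75%/25%) so that the boundary stays macroblock-aligned.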
- the content provider 210 can optionally include a user interface and control 220 (e.g., a GUI) in an exemplary aspect.
- via the user interface and control 220, the defined ratio of picture/transparency (e.g., 75% to 25%) between the fill portion and the key portion can be configured; generally, the resolution assigned to the key portion (e.g., alpha data 216) should be lower than that of the fill portion (e.g., RGB data 214).
- the content provider 210 includes a transmitter 222 that is configured to transmit the encoded composite image to content editor 230.
- Content editor 230 includes receiver 232 configured to receive the encoded composite image and load the received content to decoder 234.
- both transmitter 222 and receiver 232 can include standard input/output (I/O) ports, such as network interface controllers, that are configured to transmit and receive media streams as would be appreciated to those skilled in the art.
- content editor 230 includes decoder 234 (i.e., a single decoder) that is configured to decode the encoded composite image received from content provider 210, which is in a compressed format such as the H.264 compression format.
- the decoded image is then passed to a renderer/compositor, which can be implemented by a graphics processing unit (GPU) 236, for example.
- GPUs are specialized electronic circuits that are constructed to rapidly manipulate and alter their memory contents to accelerate the creation of images in a frame buffer that is configured for output to a display device (e.g., the display screen 238 of the content editor 230).
- Existing GPUs are very efficient at manipulating computer graphics and image processing by providing a highly parallel structure that processes large blocks of data in parallel.
- the renderer/compositor executed by GPU 236 is configured to extract the pixel values from the key portion (e.g., alpha data 216) of the frame, and apply them as transparency values to the left part of the frame, i.e., the fill portion (e.g., RGB data 214).
- rendering can be performed using WebGL or OpenGL, so a simple fragment shader can be defined and executed by GPU 236 to perform this operation.
- a shader is a type of software code (i.e., a computer program) that, when executed by a processor and specifically GPU 236, is for shading in 3D scenes (i.e., the production of appropriate levels of light, darkness, and color in a rendered image).
- a fragment shader is the stage of the shader (also executed by GPU 236) that is configured to determine the RGBA values for each rendered pixel of the frame. It is also noted that the decoded output frame will typically be in YUV color-space, so additional shader code may be needed to convert from YUV color-space to RGB according to an exemplary aspect.
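The YUV-to-RGB step mentioned above can be sketched for a single pixel; the coefficients below are the standard full-range ITU-R BT.601 ones, chosen here as an assumption since the disclosure does not specify which matrix the decoder output uses.

```python
# Assumed full-range BT.601 YCbCr-to-RGB conversion, as the shader stage
# might perform before applying the key (transparency) values.

def yuv_to_rgb(y, u, v):
    """Convert one full-range BT.601 YCbCr pixel (0-255 per channel) to RGB (0-255)."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda c: max(0, min(255, round(c)))
    return clamp(r), clamp(g), clamp(b)

print(yuv_to_rgb(255, 128, 128))  # (255, 255, 255), peak white
print(yuv_to_rgb(0, 128, 128))    # (0, 0, 0), black
```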
- the content (i.e., the fill portion with transparency values applied thereto) is intended for proxy display (i.e., at the lower resolution) in the video editing and content creation environment, since the user of the video editing software has a reduced expectation and need for picture quality.
- This proxy video is displayed on display screen 238 and enables a user of content editor 230 to perform typical video editing and content creation processing using user interface and control 240.
- the resulting edited video content (e.g., video files for streaming content) can be stored in memory 242 of content editor 230, which can be included in content creation device 115 or as a separate memory device, such as another video server according to another exemplary aspect.
- the exemplary embodiments can be provided in other aspects of video production that utilize proxy signals.
- for example, the content (i.e., the fill with transparency levels applied) can be rendered and displayed as a source on a tile of a multiviewer display (i.e., as a proxy signal) according to an exemplary aspect.
- the edited video content can then be transmitted from content creation device 115 to media production system 101 according to an exemplary aspect.
- media production system 101 can be a remote production switcher or remote media production control center for finalizing the media production before distributing to content consuming devices, such as receiver 117A or 117B, for example.
- FIG. 3 illustrates an example of an encoded image resulting from the method of rendering key and fill proxy streams for video processing according to an exemplary aspect.
- image 310 illustrates a video frame or image (e.g., a PNG file) before encoding by encoder 218 using the techniques and algorithms described above.
- the resulting encoded composite image is shown as image 320 after encoding by encoder 218 as described above.
- the image 320 includes a visible portion on the left and transparency levels on the right.
- white indicates full opacity whereas black indicates full transparency.
- the renderer/compositor executed by GPU 236 is configured to render this encoded composite image 320 as also described above before the result image is displayed by the video editor, which can then be edited accordingly via user interface and control 240, for example.
- the compression ratio can be a predefined ratio for the fill and key portions (e.g., 75% to 25%).
- the fragment shader (i.e., executed by GPU 236) can be configured to use a hard-coded value of 0.75 as the dividing point between the fill and key portions of the encoded composite image.
- this compression can be made more flexible by making the division value a shader uniform that can be specified dynamically, for example, by user interface and control 220 of content provider 210.
- the result of applying this shader is that the transparency is correctly reassembled by GPU 236, and the render output looks similar to the first image (e.g., image 310), albeit with lower quality as a proxy signal.
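The reassembly that such a shader performs can be sketched on the CPU in NumPy. This is an illustrative approximation of the GLSL logic only (the patent does not provide shader source): it assumes the 0.75 dividing point and nearest-neighbor sampling, mapping each output column's fill sample into the left 75% of the composite and its key (alpha) sample into the right 25%.

```python
import numpy as np

def unpack_composite(composite, split=0.75):
    """Reassemble an RGBA frame from a side-by-side composite image:
    sample fill from the left `split` fraction of the width and the
    key (alpha, stored as grayscale) from the remaining right fraction."""
    h, w, _ = composite.shape
    u = (np.arange(w) + 0.5) / w                     # normalized x per output pixel
    fill_cols = (u * split * w).astype(int)          # columns in the fill portion
    key_cols = ((split + u * (1.0 - split)) * w).astype(int)  # columns in the key portion
    rgba = np.empty((h, w, 4), dtype=composite.dtype)
    rgba[..., :3] = composite[:, fill_cols]          # RGB from the fill half
    rgba[..., 3] = composite[:, key_cols, 0]         # alpha from the key half
    return rgba

# Toy 2x8 composite: left 6 columns are fill, columns 6-7 are the key,
# with the left half of the key white (opaque) and the right half black.
comp = np.zeros((2, 8, 3), dtype=np.uint8)
comp[:, :6] = 100   # fill portion (left 75%)
comp[:, 6] = 255    # key: full opacity
comp[:, 7] = 0      # key: full transparency
out = unpack_composite(comp)
```

Making `split` a parameter mirrors the shader-uniform variant described above, where the division value can be supplied dynamically instead of being hard-coded.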
- thus, a system for video editing and content creation is provided that renders key and fill proxy streams consuming only a single video decoder (e.g., decoder 234) and can be used in a system (e.g., content editor 230) with limited processing resources.
- the disclosed system and method significantly reduce resource requirements for decoding and rendering key and fill video elements by sacrificing spatial quality of the overall rendered image. This is achieved by combining both the key and fill portions into the same encoded video stream (i.e., a sequence of the encoded composite images described above) and using a fragment shader (e.g., a GLSL fragment shader) to render the frames correctly on a display screen for subsequent editing thereof.
- FIG. 4 is a flowchart illustrating a method for rendering key and fill proxy streams for video processing according to an exemplary aspect.
- the method 400 may be performed using one or more of the exemplary systems described above with respect to FIGS. 1 and 2.
- image data (e.g., RGB data 214) and alpha data (e.g., alpha data 216) are obtained for a sequence of video images.
- the image data can be generated by a production camera, a video server or any similar electronic component configured to generate or otherwise provide a sequence of video images.
- each image of the video sequence is encoded to include both the key portion from the alpha data 216 and the fill portion from the image data 214.
- An example of the encoded composite image is shown as image 320 in FIG. 3 in which the fill portion is on the left side and lined up side-by-side with the key portion on the right side of the image.
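For illustration only (the patent does not specify encoder-side code), the side-by-side packing of fill and key into one composite frame could be sketched in NumPy as follows, assuming the 75%/25% fill-to-key split discussed above and a simple nearest-neighbor horizontal resample of each portion:

```python
import numpy as np

def pack_composite(rgb, alpha, fill_ratio=0.75):
    """Pack an RGB fill image and its grayscale alpha matte side-by-side
    into one composite frame: fill on the left, key on the right
    (white = full opacity, black = full transparency)."""
    h, w, _ = rgb.shape
    fill_w = int(w * fill_ratio)
    key_w = w - fill_w
    # Nearest-neighbor horizontal resample into each portion's width.
    fill_cols = np.arange(fill_w) * w // fill_w
    key_cols = np.arange(key_w) * w // key_w
    composite = np.empty((h, w, 3), dtype=rgb.dtype)
    composite[:, :fill_w] = rgb[:, fill_cols]
    composite[:, fill_w:] = alpha[:, key_cols, None]  # broadcast gray to RGB
    return composite

# Toy 4x8 frame: uniform gray fill, opaque left half / transparent right half.
rgb = np.full((4, 8, 3), 200, dtype=np.uint8)
alpha = np.zeros((4, 8), dtype=np.uint8)
alpha[:, :4] = 255
comp = pack_composite(rgb, alpha)
```

The composite frame keeps the original width, so it can be fed to a standard encoder (e.g., H.264) as an ordinary video frame, which is what allows a single decoder to recover both portions downstream.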
- the sequence of encoded composite images is transmitted (or otherwise provided) to the video editor and content creator. That is, at the request of a client device to obtain media content for editing, for example, to prepare a video production, the content provider 210 can be configured to transmit the encoded images to the client device to be rendered by a video editor (e.g., Adobe Premiere).
- the client device (e.g., content creation device 115) receives the encoded content, which can be encoded using the H.264 compression format, and decodes it with a single decoder 234.
- a fragment shader executed by GPU 236 (e.g., a renderer or compositor) renders the content by applying the key portion (i.e., the transparency levels) to each portion (e.g., each pixel value) of the fill portion of the image.
- the resulting image (at a lower quality as a proxy signal) is rendered by the video editing interface at step 404.
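Applying the key (transparency levels) to the fill at render time amounts to standard linear alpha blending over whatever lies beneath it. The following NumPy sketch shows that per-pixel math; it is a generic illustration, not the editor's exact blend implementation, which the source does not specify:

```python
import numpy as np

def key_over_background(fill, alpha, background):
    """Blend a fill image over a background using the key signal as
    per-pixel opacity (white = fully opaque, black = fully transparent)."""
    a = alpha.astype(np.float32)[..., None] / 255.0
    out = fill.astype(np.float32) * a + background.astype(np.float32) * (1.0 - a)
    return out.astype(np.uint8)

# Toy 2x2 case: uniform fill over a darker background, with one fully
# opaque, one fully transparent, and one half-transparent key pixel.
fill = np.full((2, 2, 3), 200, dtype=np.uint8)
bg = np.full((2, 2, 3), 40, dtype=np.uint8)
alpha = np.array([[255, 0], [128, 255]], dtype=np.uint8)
out = key_over_background(fill, alpha, bg)
```

Where the key is white the fill shows through unchanged, and where it is black only the background remains, matching the opacity convention of image 320 described above.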
- the user of content creation device 115 can perform conventional editing and content creation processes on the rendered content by user interface and control 240, for example.
- the edited content can be stored in memory 242 and can subsequently be used as part of a media production at step 406 using conventional techniques as described above, as would be appreciated by one skilled in the art.
- both the “key” data and the “fill” data are included in a single composite image that can be decoded by a single decoder. This reduces the load on the decoding resources of the video editor and also reduces the bandwidth that conventional systems require by transmitting separate “key” streams and “fill” streams.
- FIG. 5 is a block diagram illustrating a computer system on which aspects of systems and methods for rendering key and fill proxy streams for video processing can be implemented according to an exemplary aspect.
- the computer system 20 can correspond to any computing system configured to execute the systems and methods described above, including the content provider 210 and/or content editor 230 (and specifically content creation device 115), for example.
- the computer system 20 can be in the form of multiple computing devices, or in the form of a single computing device, for example, a desktop computer, a notebook computer, a laptop computer, a mobile computing device, a smart phone, a tablet computer, a server, a mainframe, an embedded device, and other forms of computing devices.
- the computer system 20 includes a central processing unit (CPU) 21, a system memory 22, and a system bus 23 connecting the various system components, including the memory associated with the central processing unit 21.
- the system bus 23 may comprise a bus memory or bus memory controller, a peripheral bus, and a local bus that is able to interact with any other bus architecture. Examples of the buses may include PCI, ISA, PCI-Express, HyperTransport™, InfiniBand™, Serial ATA, I2C, and other suitable interconnects.
- the central processing unit 21 (also referred to as a processor) can include a single or multiple sets of processors having single or multiple cores.
- the processor 21 may execute one or more computer-executable codes implementing the techniques of the present disclosure.
- the system memory 22 may be any memory for storing data used herein and/or computer programs that are executable by the processor 21.
- the system memory 22 may include volatile memory such as a random access memory (RAM) 25 and non-volatile memory such as a read only memory (ROM) 24, flash memory, etc., or any combination thereof.
- the basic input/output system (BIOS) 26 may store the basic procedures for transfer of information between elements of the computer system 20, such as those at the time of loading the operating system with the use of the ROM 24.
- the computer system 20 may include one or more storage devices such as one or more removable storage devices 27, one or more non-removable storage devices 28, or a combination thereof.
- the one or more removable storage devices 27 and non-removable storage devices 28 are connected to the system bus 23 via a storage interface 32.
- the storage devices and the corresponding computer-readable storage media are power-independent modules for the storage of computer instructions, data structures, program modules, and other data of the computer system 20.
- the system memory 22, removable storage devices 27, and non-removable storage devices 28 may use a variety of computer-readable storage media.
- Examples of computer-readable storage media include machine memory such as cache, SRAM, DRAM, zero capacitor RAM, twin transistor RAM, eDRAM, EDO RAM, DDR RAM, EEPROM, NRAM, RRAM, SONOS, PRAM; flash memory or other memory technology such as in solid state drives (SSDs) or flash drives; magnetic cassettes, magnetic tape, and magnetic disk storage such as in hard disk drives or floppy disks; optical storage such as in compact disks (CD-ROM) or digital versatile disks (DVDs); and any other medium which may be used to store the desired data and which can be accessed by the computer system 20.
- the one or more removable storage devices 27 can correspond to scene script database 225, for example.
- the system memory 22, removable storage devices 27, and non-removable storage devices 28 of the computer system 20 may be used to store an operating system 35, additional program applications 37, other program modules 38, and program data 39.
- the computer system 20 may include a peripheral interface 46 for communicating data from input devices 40, such as a keyboard, mouse, stylus, game controller, voice input device, touch input device, or other peripheral devices, such as a printer or scanner via one or more I/O ports, such as a serial port, a parallel port, a universal serial bus (USB), or other peripheral interface.
- a display device 47 such as one or more monitors, projectors, or integrated display, may also be connected to the system bus 23 across an output interface 48, such as a video adapter, and can be configured to generate user interface 205, for example.
- the computer system 20 may be equipped with other peripheral output devices (not shown), such as loudspeakers and other audiovisual devices.
- the display device 47 may be configured for the video editing and content creation applications as described herein.
- the computer system 20 may operate in a network environment, using a network connection to one or more remote computers 49.
- the remote computer (or computers) 49 may be local computer workstations or servers comprising most or all of the aforementioned elements of the computer system 20.
- the remote computer (or computers) 49 can correspond to any one of the remote processing nodes or client devices as described above with respect to FIGS. 1 and 2 as well as generally to a cloud computing platform for configuring the video editing and media production system.
- the computer system 20 may include one or more network interfaces 51 or network adapters for communicating with the remote computers 49 via one or more networks such as a local-area computer network (LAN) 50, a wide-area computer network (WAN), an intranet, and the Internet (e.g., Internet 103).
- Examples of the network interface 51 may include an Ethernet interface, a Frame Relay interface, SONET interface, and wireless interfaces.
- the above-noted components may be implemented using a combination of both hardware and software. Accordingly, in one or more example aspects, the functions described may be implemented in hardware, software, firmware, or any combination thereof.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Television Signal Processing For Recording (AREA)
- Image Generation (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2317591.2A GB2621515A (en) | 2021-04-16 | 2022-04-19 | System and method for rendering key and fill video streams for video processing |
DE112022002186.1T DE112022002186T5 (en) | 2021-04-16 | 2022-04-19 | SYSTEM AND METHOD FOR DISPLAYING KEY AND FILL VIDEO STREAMS FOR VIDEO PROCESSING |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163176069P | 2021-04-16 | 2021-04-16 | |
US63/176,069 | 2021-04-16 | ||
US17/658,943 | 2022-04-12 | ||
US17/658,943 US11967345B2 (en) | 2021-04-16 | 2022-04-12 | System and method for rendering key and fill video streams for video processing |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022219202A1 true WO2022219202A1 (en) | 2022-10-20 |
Family
ID=81654562
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2022/060310 WO2022219202A1 (en) | 2021-04-16 | 2022-04-19 | System and method for rendering key and fill video streams for video processing |
Country Status (3)
Country | Link |
---|---|
DE (1) | DE112022002186T5 (en) |
GB (1) | GB2621515A (en) |
WO (1) | WO2022219202A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6227456B2 (en) * | 2014-03-28 | 2017-11-08 | 株式会社エクシング | Music performance apparatus and program |
EP2675171B1 (en) * | 2012-06-11 | 2018-01-24 | BlackBerry Limited | Transparency information in image or video format not natively supporting transparency |
US20180255308A1 (en) * | 2016-12-28 | 2018-09-06 | Kt Corporation | Video transmitting device and video playing device |
US10497180B1 (en) * | 2018-07-03 | 2019-12-03 | Ooo “Ai-Eksp” | System and method for display of augmented reality |
US20200133694A1 (en) * | 2018-10-26 | 2020-04-30 | Nvidia Corporation | Individual application window streaming suitable for remote desktop applications |
- 2022
- 2022-04-19 DE DE112022002186.1T patent/DE112022002186T5/en active Pending
- 2022-04-19 WO PCT/EP2022/060310 patent/WO2022219202A1/en active Application Filing
- 2022-04-19 GB GB2317591.2A patent/GB2621515A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
DE112022002186T5 (en) | 2024-04-25 |
GB2621515A (en) | 2024-02-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10249019B2 (en) | Method and apparatus for mapping omnidirectional image to a layout output format | |
US10951874B2 (en) | Incremental quality delivery and compositing processing | |
US11967345B2 (en) | System and method for rendering key and fill video streams for video processing | |
KR101342781B1 (en) | Transforming video data in accordance with three dimensional input formats | |
TWI606718B (en) | Specifying visual dynamic range coding operations and parameters | |
US9013536B2 (en) | Augmented video calls on mobile devices | |
US11197010B2 (en) | Browser-based video decoder using multiple CPU threads | |
US10244215B2 (en) | Re-projecting flat projections of pictures of panoramic video for rendering by application | |
CN111316625B (en) | Method and apparatus for generating a second image from a first image | |
JP7359521B2 (en) | Image processing method and device | |
JP2018537898A (en) | High dynamic range video layer representation and delivery including CRC code | |
US20220141548A1 (en) | Streaming Volumetric and Non-Volumetric Video | |
WO2021093882A1 (en) | Video meeting method, meeting terminal, server, and storage medium | |
US11259036B2 (en) | Video decoder chipset | |
Diaz et al. | Integrating HEVC video compression with a high dynamic range video pipeline | |
CN107580228B (en) | Monitoring video processing method, device and equipment | |
WO2023193524A1 (en) | Live streaming video processing method and apparatus, electronic device, computer-readable storage medium, and computer program product | |
US20200269133A1 (en) | Game and screen media content streaming architecture | |
CN111246208B (en) | Video processing method and device and electronic equipment | |
CN115988171A (en) | Video conference system and immersive layout method and device thereof | |
JP2016525297A (en) | Picture reference control for video decoding using a graphics processor | |
WO2022219202A1 (en) | System and method for rendering key and fill video streams for video processing | |
CN114245027B (en) | Video data hybrid processing method, system, electronic equipment and storage medium | |
US11785281B2 (en) | System and method for decimation of image data for multiviewer display | |
US20220303596A1 (en) | System and method for dynamic bitrate switching of media streams in a media broadcast production |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22723608 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 202317591 Country of ref document: GB Kind code of ref document: A Free format text: PCT FILING DATE = 20220419 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 112022002186 Country of ref document: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13-02-2024) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22723608 Country of ref document: EP Kind code of ref document: A1 |