WO2016036285A1 - Video stream encoding using a central processing unit and a graphical processing unit - Google Patents

Video stream encoding using a central processing unit and a graphical processing unit

Info

Publication number
WO2016036285A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoder device
frames
gpu
cpu
encoding
Prior art date
Application number
PCT/SE2014/051004
Other languages
French (fr)
Inventor
Per Hermansson
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ) filed Critical Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2014/051004 priority Critical patent/WO2016036285A1/en
Publication of WO2016036285A1 publication Critical patent/WO2016036285A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Definitions

  • the invention relates to a method, encoder devices, a computer program and a computer program product for encoding a video stream.
  • a source video stream is encoded in a suitable format for distribution to video clients.
  • Each video client then decodes the video and renders the decoded video on a display for a user.
  • the process of video encoding is a matter of encoding the input video frames in accordance with a predetermined encoding standard, e.g. H.264, HEVC (High Efficiency Video Coding), MPEG (Moving Picture Experts Group)-2, etc.
  • a common GOP structure uses dyadic hierarchical picture prediction. When using this structure, frames are grouped in GOPs, e.g. with 8 pictures in each GOP. Inside the GOP, frames are assigned to layers where a layer can only have dependencies on a lower layer. The lowest temporal layer, the base layer, consists of only I-frames and/or P-frames.
  • GOP structures are not commonly used because they introduce unwanted latency. This extra latency is due to the need to buffer all the pictures in the group before starting to encode it. This leads to extra delays when switching between video input streams.
  • US 8731047 B2 discloses a method and apparatus for mixing first and second video content portions.
  • the method comprises overlapping at least one frame of a first content portion comprising multiple frames with at least one frame of a second content portion comprising multiple frames to produce a composite video content.
  • At least one reference frame comprising a frame of the first or second content portion prior to the overlapping may be designated.
  • the composite video content and the reference frame may then be transmitted, for example, to a client device.
  • the client device may use the reference frame to at least reduce an effect of the overlapping.
  • the disclosed method can only be performed once the second content portion is known, i.e. after a selection of a new video input stream.
  • a method for encoding a video stream, the method being performed in an encoder device comprising a central processing unit, CPU, and a graphic processing unit, GPU.
  • the method comprises the steps of: encoding, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • the encoder device switches source stream from one input video stream to another input video stream, the base layer (intra coded frames and/or predicted frames) has already been encoded for the next GOP, whereby the CPU of the encoder device only needs to encode the higher layers, i.e. the bi-directional predicted frames. This reduces latency and CPU usage.
  • the GPU can continue to encode the base layer of the next GOP, but now based on the newly selected source stream.
  • the method may further comprise the step of: selecting an input video stream as source stream of the encoder device; after which the method is repeated.
  • the method may further comprise the step of: moving any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.
  • the encoding is based on intra coded frames and/or predicted frames previously encoded by the GPU.
  • the method may further comprise: encoding, in the CPU, any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
  • the method may further comprise the step of: moving any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU.
  • the step of encoding bi-directional predicted frames and the step of encoding any intra coded frames and/or predicted frames may be performed in parallel.
  • the step of encoding any intra coded frames and/or predicted frames may be performed for a plurality of input video streams currently not selected as source stream of the encoder device, wherein different input video streams are encoded in different cores of the GPU.
  • the step of encoding any intra coded frames and/or predicted frames may further comprise encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
  • an encoder device for encoding a video stream.
  • the encoder device comprises: a CPU; a GPU; a CPU memory storing CPU instructions that, when executed by the CPU, causes the encoder device to: encode bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and a GPU memory storing GPU instructions that, when executed by the GPU, causes the encoder device to: encode any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to: select an input video stream as source stream of the encoder device; and to repeat the encoding of bi-directional predicted frames.
  • the GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to repeat the encoding of any intra coded frames and/or predicted frames.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode bi-directional predicted frames based on intra coded frames and/or predicted frames previously encoded by the GPU.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
  • the CPU instructions further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU.
  • the CPU and the GPU may execute their respective instructions in parallel.
  • the GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a plurality of input video streams currently not selected as source stream of the encoder device, and wherein different input video streams are encoded in different cores of the GPU.
  • the GPU instructions to encode any intra coded frames and/or predicted frames for a future GOP comprise instructions that, when executed by the GPU, causes the encoder device to encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
  • an encoder device comprising: means for encoding, in a CPU of the encoder device, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and means for encoding, in the GPU of the encoder device, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • a computer program for encoding a video stream comprising computer program code which, when run on an encoder device comprising a CPU and a GPU, causes the encoder device to: encode, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • a computer program product comprising a computer program according to the fourth aspect and a computer readable means on which the computer program is stored.
  • Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment;
  • Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment
  • Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment
  • Fig 4 is a schematic diagram showing some components of the encoder device of Fig 1;
  • Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4;
  • Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4; and
  • Fig 7 shows one example of a computer program product 90 comprising computer readable means.
  • Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment.
  • the encoder device 1 is here connected to n input video streams 20a-n, from n respective video sources 21a-n.
  • the encoder device 1 selects one of the input video streams 20a-n to be a source stream.
  • the selected source stream is encoded and the encoded video stream is provided as an output stream 22 of the encoder device 1.
  • Each input video stream 20a-n is of a format which can be understood by the encoder device 1.
  • each input video stream 20a-n can be of a raw video format (uncompressed) or it can be encoded in a video format which the encoder device 1 can decode.
  • the encoder device 1 encodes the video of the output video stream in any suitable past, present or future video format, e.g. HEVC, H.264, MPEG (Moving Picture Experts Group)-2, etc. It is to be noted that there may be different audio streams for the different input video streams 20a-n, a common audio stream for all input video streams 20a-n, or no audio stream at all.
  • Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment.
  • Each GOP 10a, 10b, 10c comprises a number of frames or pictures.
  • each GOP comprises eight frames, but this number can vary.
  • the first GOP 10a comprises a first frame 15a, a second frame 15b, a third frame 15c, a fourth frame 15d, a fifth frame 15e, a sixth frame 15f, a seventh frame 15g and an eighth frame 15h.
  • the last frame 15' of the previous GOP is an intra coded frame, also known as an I-frame.
  • the eighth frame is a predicted frame, also known as a P-frame.
  • the first to seventh frames 15a-g are all bi-directional predicted frames, also known as B-frames.
  • I-frames are coded independently of all other frames.
  • P-frames comprise motion-compensated difference information in relation to at least one previous frame.
  • B-frames comprise motion-compensated difference information in relation to two (or more) frames.
  • the frames create a hierarchy of dependencies, which dictates a chronological order of encoding, resulting in temporal layers.
  • in a base layer T0, the I-frame, being the last frame 15' of the previous GOP, and the P-frame, being the eighth frame 15h, can be encoded.
  • the P-frame only depends on the I-frame.
  • the base layer T0 only comprises the eighth frame 15h.
  • a level one layer T1 then comprises the fourth frame 15d which depends on the last frame 15' of the previous GOP and the eighth frame 15h.
  • a level two layer T2 comprises the second frame 15b, depending on the last frame 15' of the previous GOP and the fourth frame 15d, and the sixth frame 15f, depending on the fourth frame 15d and the eighth frame 15h.
  • a level three layer T3 comprises the first frame 15a, depending on the last frame 15' of the previous GOP and the second frame 15b, the third frame 15c, depending on the second frame 15b and the fourth frame 15d, the fifth frame 15e, depending on the fourth frame 15d and the sixth frame 15f, and the seventh frame 15g, depending on the sixth frame 15f and the eighth frame 15h. All layers from layer T1 and above are called higher layers (which thus only comprise B-frames) and the T0 layer is called the base layer (which comprises I-frames and/or P-frames).
  • each base layer needs to be encoded first, followed by the higher layers, layer by layer. Also, only frames of the base layer can be referred to when encoding frames in the next GOP.
  • Using such hierarchical GOP structures only allows for a few frames to be encoded at the same time since there are dependencies between frames that need to be respected, i.e. the encoding needs to happen layer by layer from the base layer. It is possible to start encoding the next GOP structure when the base layer of the current GOP has finished encoding.
  • each base layer encoding is essentially a sequential task which greatly limits the amount of parallelization within each GOP base layer encoding.
  • the base layers for future GOPs could be encoded in advance by a graphic processing unit (GPU) for multiple streams.
  • the GPU typically comprises a large number of cores which can perform independent processing.
  • Each such core, or separate groups of cores then encodes the base layer for the next GOP for its own respective input video stream.
  • the GPU encodes the base layer for a future GOP for each one of the input video streams 20a-n in parallel.
  • the central processing unit (CPU) of the encoder device only needs to encode the higher layers, i.e. the B-frames, which reduces latency and CPU usage.
  • the GPU can continue to encode the next GOP, but now based on the newly selected source stream.
  • Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment.
  • the GPU 4 encodes the base layer 11, which comprises at least one I-frame or at least one P-frame.
  • the CPU 3 encodes at least one higher layer 12 which comprises only B-frames. It is the CPU 3 that is responsible for providing the output stream 22, which comprises a number of sequential GOPs 10. As described above, each GOP 10 is made up of a base layer 11 and one or more higher layers 12.
  • Fig 4 is a schematic diagram showing some components of the encoder device 1 of Fig 1.
  • a CPU 3 is capable of executing software instructions 66 stored in a CPU instruction memory 64, which can thus be a computer program product.
  • the CPU 3 can be configured to execute the CPU related steps of the method described with reference to Figs 5A-B below.
  • the CPU instruction memory 64 can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the CPU instruction memory 64 comprises persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
  • a GPU 4 is capable of executing software instructions 67 stored in a GPU instruction memory 65, which can thus be a computer program product.
  • the GPU 4 can be configured to execute the GPU related steps of the method described with reference to Figs 5A-B below.
  • the GPU instruction memory 65 can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the GPU instruction memory 65 may comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
  • the software instructions 67 for the GPU can be preloaded in the GPU instruction memory 65 or dynamically provided by the CPU 3 at run-time.
  • a data memory 5 associated with the CPU 3 is also provided for reading and/or storing data during execution of software instructions in the CPU 3.
  • the data memory 5 associated with the CPU can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the data memory 5 associated with the CPU can store video frames 7 used during encoding, e.g. higher level frames encoded by the CPU.
  • a data memory 6 associated with the GPU 4 is also provided for reading and/or storing data during execution of software instructions in the GPU 4.
  • the data memory 6 associated with the GPU can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the data memory 6 associated with the GPU can store video frames 8 used during encoding, e.g. base level frames encoded by the GPU.
  • the encoder device 1 further comprises an I/O interface 62 for
  • the I/O interface 62 also includes a user interface.
  • Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4.
  • the encoder device 1 comprises both a CPU and a GPU. The method illustrated in Fig 5A will be described first.
  • B-frames for a current GOP of an input video stream are encoded in the CPU. This input video stream is currently selected as source stream of the encoder device. When I-frames and/or P-frames previously encoded by the GPU are available, these form a base for the encoding of B-frames.
  • any I-frames and/or P-frames (i.e. the base layer) for a future GOP (e.g. the next GOP) of at least one input video stream currently not selected as source stream are encoded in the GPU.
  • the encoding of base layer frames can be performed for a plurality of input video streams, all of which currently are not selected as the source.
  • different input video streams are encoded in different cores of the GPU. Since the GPU can contain a great number of cores, it is able to handle a great number of parallel input video streams in this way.
  • this step comprises encoding, in the GPU, the base layer of the input video stream which is currently selected as source stream of the encoder device.
  • the GPU may be assigned to encode the base layer also for the active input video stream.
  • the GPU only encodes the base layer for input video streams which are not selected as source stream of the encoder device. In this way, when no switch occurs, the CPU continuously encodes the input video stream selected to be the source stream of the encoder device.
  • the CPU can encode the base layer of a future GOP.
  • since the encode higher layers step 42 is performed in the CPU and the encode future base layer step 44 is performed in the GPU, the encode higher layers step 42 and the encode future base layer step 44 can be performed in parallel.
  • the CPU is relieved of the task of encoding the base layer. This not only reduces CPU usage, but also allows a switch between input video streams with reduced latency, since after a switch, the CPU can directly start on higher layer encoding and does not need to encode the base layer. Since there is a greater degree of parallelism possible for the higher layers, multiple cores of the CPU can be utilised to perform the higher layer encoding with relatively low latency.
  • any I-frames and/or P-frames (i.e. the base layer) of the input video stream currently selected as source stream are moved from the memory 5 associated with the CPU 3 to the memory 6 associated with the GPU 4.
  • the encoder device can switch to any of the input video streams for which the base layer is encoded in the GPU.
  • in a conditional base layer done step 43, it is determined whether the base layer frame(s) of a current GOP have been encoded or not. If this is the case, the method continues to the encode higher layers step 42. Otherwise, the method continues to an encode base layer step 41. For instance, if the GPU is configured to encode the base layer also for the active stream, the base layer for the current GOP has been encoded. On the other hand, if the GPU is configured to only encode the base layer for the inactive streams, the base layer for the current GOP has not been encoded if there is no switch of input video stream for the current GOP (compared to the previous GOP). In the encode base layer step 41, the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device is encoded by the CPU. This is of course more demanding on the CPU than letting the GPU do this, but when the base layer for the current GOP is not available, the CPU needs to perform this task.
  • in a select video stream step 46, an input video stream is selected as source stream of the encoder device 1.
  • the source stream could be the same as before (which is most common over time) or switched to a new input stream.
  • the selection of input video stream can be based on user input or a selection signal from another entity (not shown).
  • in a conditional switch step 47, it is determined whether the source stream has been selected to be a new input stream. If this is the case, the method continues to a move base layer frames to CPU step 48. Otherwise, the method returns to the move base layer frames to GPU step 40. In the move base layer frames to CPU step 48, any of the base layer frames for the future (i.e. next) GOP are moved from the memory 6 associated with the GPU 4 to the memory 5 associated with the CPU 3.
  • Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4.
  • the modules can be implemented using software instructions such as a computer program executing in the encoder device 1 and/or using hardware, such as application specific integrated circuits, field programmable gate arrays, discrete logical components, etc.
  • the modules correspond to the steps in the methods illustrated in Figs 5A-B.
  • a CPU higher layer encoder 72 is configured to encode B-frames for the current GOP of an input video stream currently selected as source stream of the encoder device. This module corresponds to the encode higher layers step 42 of Figs 5A-B.
  • a GPU base layer encoder 74 is configured to encode the base layer frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device. This module corresponds to the encode future base layer step 44 of Figs 5A-B.
  • a frame mover 70 is configured to move frames between the memories associated with the CPU and the GPU. This module corresponds to the move base layer frames to GPU step 40 and move base layer frames to CPU step 48 of Fig 5B.
  • a CPU base layer encoder 71 is configured to encode the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device. This module corresponds to the encode base layer step 41 of Fig 5B.
  • a video stream selector 76 is configured to select an input video stream as source stream of the encoder device 1. This module corresponds to the select video stream step 46 of Fig 5B.
  • Fig 7 shows one example of a computer program product 90 comprising computer readable means.
  • a computer program 91 can be stored, which computer program can cause a processor to execute a method according to embodiments described herein.
  • the computer program product is an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc.
  • the computer program product could also be embodied in a memory of a device, such as the computer program products 64 and/or 65 of Fig 4.
  • while the computer program 91 is here schematically shown as a track on the depicted optical disc, the computer program can be stored in any way which is suitable for the computer program product, such as a removable solid state memory (e.g. a universal serial bus memory).
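To summarize the flow of Figs 5A-B described in the bullets above, one loop iteration can be sketched in code (a hedged, illustrative sketch, not the patent's implementation: the dictionaries standing in for the CPU and GPU memories and the string placeholders are assumptions; the step numbers from the text are kept as comments):

```python
def encode_gop(state, select_stream):
    """One iteration over a GOP, following the flow of Figs 5A-B."""
    if state["switched"]:
        # Step 48: move the prepared base-layer frames for the next GOP
        # from the GPU memory to the CPU memory.
        state["cpu_mem"]["base"] = state["gpu_mem"].pop("base", None)
    else:
        # Step 40: move base-layer frames of the selected stream to the
        # GPU memory so the GPU can reference them for the next GOP.
        state["gpu_mem"]["base"] = state["cpu_mem"].get("base")
    # Step 43: has the base layer of the current GOP been encoded?
    if state["cpu_mem"].get("base") is None:
        state["cpu_mem"]["base"] = "base layer (CPU)"       # step 41 fallback
    higher = "higher layers (CPU)"                          # step 42, on the CPU
    state["gpu_mem"]["base"] = "next-GOP base layer (GPU)"  # step 44, in parallel
    new_stream = select_stream()                            # step 46
    state["switched"] = new_stream != state["selected"]     # step 47
    state["selected"] = new_stream
    return higher

state = {"switched": False, "selected": 0,
         "cpu_mem": {"base": "GOP base layer"}, "gpu_mem": {}}
encode_gop(state, select_stream=lambda: 1)  # a switch to stream 1 is requested
print(state["switched"])  # True
```

On the iteration after a switch, step 48 hands the base layer already prepared by the GPU to the CPU, so the CPU can start directly on the higher layers.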

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

It is provided a method for encoding a video stream, the method being performed in an encoder device comprising a central processing unit, CPU, and a graphic processing unit, GPU. The method comprises the steps of: encoding, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.

Description

VIDEO STREAM ENCODING USING A CENTRAL PROCESSING UNIT AND A GRAPHICAL PROCESSING UNIT
TECHNICAL FIELD
The invention relates to a method, encoder devices, a computer program and a computer program product for encoding a video stream.
BACKGROUND
In digital video processing a source video stream is encoded in a suitable format for distribution to video clients. Each video client then decodes the video and renders the decoded video on a display for a user.
When there is only one source video stream, the process of video encoding is a matter of encoding the input video frames in accordance with a
predetermined encoding standard, e.g. H.264, HEVC (High Efficiency Video Coding), MPEG (Moving Picture Experts Group)-2, etc. There may be some latency due to the encoding and decoding, but this is typically not an issue since the encoding and decoding is continuous. HEVC as well as other video coding standards use a technique called group of pictures (GOP). This can be used to exploit redundant information in
consecutive frames. A common GOP structure uses dyadic hierarchical picture prediction. When using this structure, frames are grouped in GOPs, e.g. with 8 pictures in each GOP. Inside the GOP, frames are assigned to layers where a layer can only have dependencies on a lower layer. The lowest temporal layer, the base layer, consists of only I-frames and/or P-frames.
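As an illustrative sketch (not part of the patent text), the dyadic layer assignment described above can be computed from a frame's 1-based position within the GOP: the temporal layer is the GOP depth minus the number of trailing zero bits of the index.

```python
def temporal_layer(i, gop_size=8):
    """Temporal layer of frame i (1-based position within a dyadic GOP).

    Frame gop_size is the P-frame of the base layer T0; odd-numbered
    frames land in the highest layer. Assumes gop_size is a power of two.
    """
    trailing_zeros = (i & -i).bit_length() - 1
    depth = gop_size.bit_length() - 1  # log2(gop_size)
    return depth - trailing_zeros

# Layers T3 T2 T3 T1 T3 T2 T3 T0 for the eight frames of a GOP:
print([temporal_layer(i) for i in range(1, 9)])  # [3, 2, 3, 1, 3, 2, 3, 0]

# Encoding must proceed layer by layer, base layer first:
print(sorted(range(1, 9), key=temporal_layer))   # [8, 4, 2, 6, 1, 3, 5, 7]
```

The second list makes the sequential bottleneck visible: the base layer (frame 8) must be encoded before any other frame of the GOP can be.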
In low latency scenarios, such as real-time communication, GOP structures are not commonly used because they introduce unwanted latency. This extra latency is due to the need to buffer all the pictures in the group before
starting to encode it. This leads to extra delays when switching between video input streams.
US 8731047 B2 discloses a method and apparatus for mixing first and second video content portions. The method comprises overlapping at least one frame of a first content portion comprising multiple frames with at least one frame of a second content portion comprising multiple frames to produce a composite video content. At least one reference frame comprising a frame of the first or second content portion prior to the overlapping may be
designated. The composite video content and the reference frame may then be transmitted, for example, to a client device. The client device may use the reference frame to at least reduce an effect of the overlapping. However, the disclosed method can only be performed once the second content portion is known, i.e. after a selection of a new video input stream.
SUMMARY
It is an object to improve the efficiency of encoding a video stream, particularly in the situation where there are multiple input video streams to choose from.
According to a first aspect, it is provided a method for encoding a video stream, the method being performed in an encoder device comprising a central processing unit, CPU, and a graphic processing unit, GPU. The method comprises the steps of: encoding, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device. In this way, when the encoder device switches source stream from one input video stream to another input video stream, the base layer (intra coded frames and/or predicted frames) has already been encoded for the next GOP, whereby the CPU of the encoder device only needs to encode the higher layers, i.e. the bi-directional predicted frames. This reduces latency and CPU usage. After the switch, the GPU can continue to encode the base layer of the next GOP, but now based on the newly selected source stream.
The method may further comprise the step of: selecting an input video stream as source stream of the encoder device; after which the method is repeated. The method may further comprise the step of: moving any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.
In the step of encoding bi-directional predicted frames, the encoding is based on intra coded frames and/or predicted frames previously encoded by the GPU.
The method may further comprise: encoding, in the CPU, any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
The method may further comprise the step of: moving any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU. The step of encoding bi-directional predicted frames and the step of encoding any intra coded frames and/or predicted frames may be performed in parallel.
The step of encoding any intra coded frames and/or predicted frames may be performed for a plurality of input video streams currently not selected as source stream of the encoder device, wherein different input video streams are encoded in different cores of the GPU.
The step of encoding any intra coded frames and/or predicted frames may further comprise encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
According to a second aspect, it is provided an encoder device for encoding a video stream. The encoder device comprises: a CPU; a GPU; a CPU memory storing CPU instructions that, when executed by the CPU, causes the encoder device to: encode bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and a GPU memory storing GPU instructions that, when executed by the GPU, causes the encoder device to: encode any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to: select an input video stream as source stream of the encoder device; and to repeat the encoding of bi-directional predicted frames. In such a case, the GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to repeat the encoding of any intra coded frames and/or predicted frames.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode bi-directional predicted frames based on intra coded frames and/or predicted frames previously encoded by the GPU.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
The CPU instructions further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU. The CPU and the GPU may execute their respective instructions in parallel.
The GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a plurality of input video streams currently not selected as source stream of the encoder device, and wherein different input video streams are encoded in different cores of the GPU.
The CPU instructions to encode any intra coded frames and/or predicted frames for a future GOP comprise instructions that, when executed by the CPU, causes the encoder device to encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
According to a third aspect, it is provided an encoder device comprising: means for encoding, in a CPU of the encoder device, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and means for encoding, in the GPU of the encoder device, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
According to a fourth aspect, it is provided a computer program for encoding a video stream, the computer program comprising computer program code which, when run on an encoder device comprising a CPU and a GPU, causes the encoder device to: encode, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
According to a fifth aspect, it is provided a computer program product comprising a computer program according to the fourth aspect and a computer readable means on which the computer program is stored. Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is now described, by way of example, with reference to the accompanying drawings, in which:
Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment;
Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment;
Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment;
Fig 4 is a schematic diagram showing some components of the encoder device of Fig 1;
Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4;
Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4; and Fig 7 shows one example of a computer program product 90 comprising computer readable means.

DETAILED DESCRIPTION
The invention will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout the description.

Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment.
The encoder device 1 is here connected to n input video streams 20a-n, from n respective video sources 21a-n. The encoder device 1 selects one of the input video streams 20a-n to be a source stream. The selected source stream is encoded and the encoded video stream is provided as an output stream 22 of the encoder device 1.
Each input video stream 20a-n is of a format which can be understood by the encoder device 1. For example, each input video stream 20a-n can be of a raw video format (uncompressed) or it can be encoded in a video format which the encoder device 1 can decode.
The encoder device 1 encodes the video of the output video stream in any suitable past, present or future video format, e.g. HEVC, H.264, MPEG (Moving Picture Experts Group)-2, etc. It is to be noted that there may be different audio streams for the different input video streams 20a-n, a common audio stream for all input video streams 20a-n, or no audio stream at all.

Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment.
There is here a first Group of Pictures (GOP) 10a, a second GOP 10b and a third GOP 10c. Each GOP 10a, 10b, 10c comprises a number of frames or pictures. In this example, each GOP comprises eight frames, but this number can vary. The first GOP 10a comprises a first frame 15a, a second frame 15b, a third frame 15c, a fourth frame 15d, a fifth frame 15e, a sixth frame 15f, a seventh frame 15g and an eighth frame 15h. There is also a last frame 15' of the previous GOP.
The last frame 15' of the previous GOP is an intra coded frame, also known as an I-frame. The eighth frame 15h is a predicted frame, also known as a P-frame. The first to seventh frames 15a-g are all bi-directional predicted frames, also known as B-frames. I-frames are coded independently of all other frames. P-frames comprise motion-compensated difference information in relation to at least one previous frame. B-frames comprise motion-compensated difference information in relation to two (or more) frames.
Looking now to the first GOP 10a, the frames create a hierarchy of dependencies, which dictates a chronological order of encoding, resulting in temporal layers. In a base layer T0, the I-frame being the last frame 15' of the previous GOP and the P-frame being the eighth frame 15h can be encoded. The P-frame only depends on the I-frame. In fact, as part of the first GOP 10a, the base layer T0 only comprises the eighth frame 15h. A level one layer T1 then comprises the fourth frame 15d, which depends on the last frame 15' of the previous GOP and the eighth frame 15h. A level two layer T2 comprises the second frame 15b, depending on the last frame 15' of the previous GOP and the fourth frame 15d, and the sixth frame 15f, depending on the fourth frame 15d and the eighth frame 15h. A level three layer T3 comprises the first frame 15a, depending on the last frame 15' of the previous GOP and the second frame 15b; the third frame 15c, depending on the second frame 15b and the fourth frame 15d; the fifth frame 15e, depending on the fourth frame 15d and the sixth frame 15f; and the seventh frame 15g, depending on the sixth frame 15f and the eighth frame 15h. All layers from layer T1 and above are called higher layers (which thus only comprise B-frames) and the T0 layer is called the base layer (which comprises I-frames and/or P-frames).
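The layer assignment described above follows a simple pattern: with a GOP of 2^k frames, a frame's temporal layer is determined by how many times its (1-based) index within the GOP is divisible by two. A minimal Python sketch (illustrative only, not part of the disclosed method) reproducing the layering of Fig 2:

```python
def temporal_layer(frame_index, gop_size=8):
    """Return the temporal layer of a frame within a hierarchical GOP.

    frame_index is 1-based within the GOP (1..gop_size); gop_size is
    assumed to be a power of two. The last frame (the P-frame) lands in
    the base layer T0; odd-indexed frames land in the highest layer.
    """
    if not 1 <= frame_index <= gop_size:
        raise ValueError("frame_index outside GOP")
    top = gop_size.bit_length() - 1                      # log2(gop_size)
    trailing_zeros = (frame_index & -frame_index).bit_length() - 1
    return top - trailing_zeros

# Reproduce the layering of Fig 2 for an 8-frame GOP:
layers = {i: temporal_layer(i) for i in range(1, 9)}
# frame 8 -> T0, frame 4 -> T1, frames 2 and 6 -> T2, odd frames -> T3
```

This yields exactly the assignment of Fig 2: the eighth frame in T0, the fourth frame in T1, the second and sixth frames in T2, and the odd frames in T3.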
It can be seen from Fig 2 that for each GOP, the base layer needs to be encoded first, followed by the higher layers, layer by layer. Also, only frames of the base layer can be referred to when encoding frames in the next GOP. Using such hierarchical GOP structures only allows for a few frames to be encoded at the same time since there are dependencies between frames that need to be respected, i.e. the encoding needs to happen layer by layer from the base layer. It is possible to start encoding the next GOP structure when the base layer of the current GOP has finished encoding. However, each base layer encoding is essentially a sequential task which greatly limits the amount of parallelization within each GOP base layer encoding.
Another issue comes when switching video input streams. After switching has been done, any work done on encoding future GOP structures will have to be discarded since they are based on the old video input stream. This limits the amount of parallelization that can be exploited by a multiprocessor system until the base layer of the current GOP structure has finished encoding.
What the inventors have realised is that the base layers for future GOPs could be encoded in advance by a graphic processing unit (GPU) for multiple streams. This is beneficial since the GPU typically comprises a large number of cores which can perform independent processing. Each such core, or separate group of cores, then encodes the base layer for the next GOP for its own respective input video stream. In other words, looking also at Fig 1, the GPU encodes the base layer for a future GOP for each one of the input video streams 20a-n in parallel. In this way, when the encoder device 1 switches source stream from one input video stream to another input video stream, the base layer has already been encoded for the next GOP, whereby the central processing unit (CPU) of the encoder device only needs to encode the higher layers, i.e. the B-frames, which reduces latency and CPU usage. After the switch, the GPU can continue to encode the next GOP, but now based on the newly selected source stream.
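The division of labour just described can be sketched in Python. The function names below (encode_base_layer, encode_higher_layers) are hypothetical placeholders for real encoder kernels, and encoded frames are represented as strings; the sketch also pre-encodes the base layer for the currently selected stream, which the description presents as an optional variant:

```python
# Hypothetical placeholder kernels standing in for real base-layer and
# higher-layer encoders; encoded frames are represented as strings.
def encode_base_layer(stream_id, gop_index):
    return f"base({stream_id},{gop_index})"

def encode_higher_layers(stream_id, gop_index, base):
    return f"{base}+higher({stream_id},{gop_index})"

def encode_gop(streams, selected, gop_index, prepared_bases):
    """One GOP iteration of the CPU/GPU work split.

    prepared_bases maps stream id -> base layer encoded ahead of time
    (conceptually by the GPU during the previous GOP). The CPU then only
    has to add the B-frame layers for the selected stream.
    """
    base = prepared_bases.get(selected)
    if base is None:                 # no pre-encoded base: encode it now
        base = encode_base_layer(selected, gop_index)
    output = encode_higher_layers(selected, gop_index, base)
    # Conceptually on GPU cores: prepare the next GOP's base layers for
    # every stream, so a switch costs no extra base-layer work.
    next_bases = {s: encode_base_layer(s, gop_index + 1) for s in streams}
    return output, next_bases
```

A switch to another stream then simply means passing a different `selected` id to the next call, with the pre-encoded base layer already present in `prepared_bases`.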
Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment.
There is here a CPU 3 and a GPU 4 which are both part of the encoder device. The GPU 4 encodes the base layer 11, which comprises at least one I-frame or at least one P-frame. The CPU 3 encodes at least one higher layer 12 which comprises only B-frames. It is the CPU 3 that is responsible for providing the output stream 22, which comprises a number of sequential GOPs 10. As described above, each GOP 10 is made up of a base layer 11 and one or more higher layers 12.

Fig 4 is a schematic diagram showing some components of the encoder device 1 of Fig 1. A CPU 3 is capable of executing software instructions 66 stored in a CPU instruction memory 64, which can thus be a computer program product. The CPU 3 can be configured to execute the CPU related steps of the method described with reference to Figs 5A-B below. The CPU instruction memory 64 can be any combination of read and write memory (RAM) and read only memory (ROM). The CPU instruction memory 64 comprises persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.

A GPU 4 is capable of executing software instructions 67 stored in a GPU instruction memory 65, which can thus be a computer program product. The GPU 4 can be configured to execute the GPU related steps of the method described with reference to Figs 5A-B below. The GPU instruction memory 65 can be any combination of read and write memory (RAM) and read only memory (ROM). The GPU instruction memory 65 may comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory. The software instructions 67 for the GPU can be preloaded in the GPU instruction memory 65 or dynamically provided by the CPU 3 at run-time.
A data memory 5 associated with the CPU 3 is also provided for reading and/or storing data during execution of software instructions in the CPU 3. The data memory 5 associated with the CPU can be any combination of read and write memory (RAM) and read only memory (ROM). For instance, the data memory 5 associated with the CPU can store video frames 7 used during encoding, e.g. higher level frames encoded by the CPU.
A data memory 6 associated with the GPU 4 is also provided for reading and/or storing data during execution of software instructions in the GPU 4. The data memory 6 associated with the GPU can be any combination of read and write memory (RAM) and read only memory (ROM). For instance, the data memory 6 associated with the GPU can store video frames 8 used during encoding, e.g. base level frames encoded by the GPU.
The encoder device 1 further comprises an I/O interface 62 for communicating with other external entities, e.g. to provide the output video stream for distribution to video clients or to obtain input video streams. Optionally, the I/O interface 62 also includes a user interface.
Other components of the encoder device 1 are omitted in order not to obscure the concepts presented herein.
Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4. As shown in Fig 4, the encoder device 1 comprises both a CPU and a GPU. The method illustrated in Fig 5A will be described first.
In an encode higher layers step 42, B-frames for a current GOP of an input video stream are encoded in the CPU. This input video stream is currently selected as source stream of the encoder device. When I-frames and/or P-frames previously encoded by the GPU are available, these form a base for the encoding of B-frames.
In an encode future base layer step 44, any I-frames and/or P-frames (i.e. the base layer) for a future GOP (e.g. the next GOP) of at least one input video stream currently not selected as source stream of the encoder device are encoded in the GPU.
The encoding of base layer frames can be performed for a plurality of input video streams, all of which currently are not selected as the source. In such a case, different input video streams are encoded in different cores of the GPU. Since the GPU can contain a great number of cores, it is able to handle a great number of parallel input video streams in this way.
Optionally, this step comprises encoding, in the GPU, the base layer of the input video stream which is currently selected as source stream of the encoder device. In other words, the GPU may be assigned to encode the base layer also for the active input video stream.
Alternatively, the GPU only encodes the base layer for input video streams which are not selected as source stream of the encoder device. In this way, when no switch occurs, the CPU continuously encodes the input video stream selected to be the source stream of the encoder device. Optionally, the CPU can encode the base layer of a future GOP.
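Assigning each non-selected input video stream to its own core amounts to a parallel map over the streams. A sketch using Python's standard thread pool as a stand-in for GPU cores (encode_base_layer is again a hypothetical placeholder, not a disclosed function):

```python
from concurrent.futures import ThreadPoolExecutor

def encode_base_layer(stream_id):
    """Hypothetical placeholder for the real per-stream base-layer kernel."""
    return (stream_id, f"base-layer({stream_id})")

def encode_inactive_bases(all_streams, selected):
    """Encode next-GOP base layers for all non-selected streams in
    parallel, one worker per stream (standing in for one GPU core or
    core group per stream)."""
    inactive = [s for s in all_streams if s != selected]
    with ThreadPoolExecutor(max_workers=max(len(inactive), 1)) as pool:
        return dict(pool.map(encode_base_layer, inactive))
```

Because each stream's base-layer encoding is independent of the others, the map scales with the number of available cores rather than with the number of streams.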
Since the encode higher layers step 42 is performed in the CPU and the encode future base layer step 44 is performed in the GPU, the encode higher layers step 42 and the encode future base layer step 44 can be performed in parallel.
In this way, the CPU is relieved of the task of encoding the base layer. This not only reduces CPU usage, but also allows a switch between input video streams with reduced latency, since after a switch, the CPU can directly start on higher layer encoding and does not need to encode the base layer. Since there is a greater degree of parallelism possible for the higher layers, multiple cores of the CPU can be utilised to perform the higher layer encoding with relatively low latency.
Due to its multiple cores, the GPU can handle a great number of potential input video streams in parallel, compared to using a CPU for the base layer.
Since the last encoded GOP structure forms the reference for the input video stream encoding in the GPU, there is a potentially high work-per-memory-transfer ratio. This greatly benefits the GPU, where transferring reference frames is often a bottleneck.
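The parallel execution of steps 42 and 44 noted above can be illustrated with two concurrent workers. This is a minimal thread-based sketch with placeholder work functions; a real implementation would dispatch step 44 to the GPU rather than to a second host thread:

```python
import threading

results = {}

def cpu_encode_higher_layers():
    # Step 42 (placeholder work): B-frames of the current GOP.
    results["higher"] = "B-frames of current GOP"

def gpu_encode_future_base():
    # Step 44 (placeholder work): I-/P-frames of the next GOP.
    results["base"] = "I/P-frames of next GOP"

worker = threading.Thread(target=gpu_encode_future_base)
worker.start()
cpu_encode_higher_layers()   # CPU works while the "GPU" thread runs
worker.join()
# Both the current higher layers and the future base layer are now ready.
```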
Looking now to the method illustrated by the flow chart in Fig 5B, only new or modified steps compared to the method illustrated by the flow chart in Fig 5A will be described.
In a move base layer frames to GPU step 40, any I-frames and/or P-frames (i.e. the base layer) of the input video stream currently selected as source stream is moved from the memory 5 associated with the CPU 3 to the memory 6 associated with the GPU 4. This allows the GPU to, in the encode future base layer step 44, base the future base layer on the base layer of the current GOP conveniently stored in the memory 6 associated with the GPU 4. In this way, the encoder device can switch to any of the input video streams for which the base layer is encoded in the GPU.
In a conditional base layer done step 43, it is determined whether the base layer frame(s) of a current GOP have been encoded or not. If this is the case, the method continues to the encode higher layers step 42. Otherwise, the method continues to an encode base layer step 41. For instance, if the GPU is configured to encode the base layer also for the active stream, the base layer for the current GOP has been encoded. On the other hand, if the GPU is configured to only encode the base layer for the inactive streams, the base layer for the current GOP has not been encoded if there is no switch of input video stream for the current GOP (compared to the previous GOP). In the encode base layer step 41, the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device is encoded by the CPU. This is of course more demanding on the CPU than letting the GPU do this, but when the base layer for the current GOP is not available, the CPU needs to perform this task.
In a select video stream step 46, an input video stream is selected as source stream of the encoder device 1. The source stream could be the same as before (which is most common over time) or switched to a new input stream. The selection of input video stream can be based on user input or a selection signal from another entity (not shown).
In a conditional switch step 47, it is determined whether the source stream has been selected to be a new input stream. If this is the case, the method continues to a move base layer frames to CPU step 48. Otherwise, the method returns to the move base layer frames to GPU step 40. In the move base layer frames to CPU step 48, any of the base layer frames for the future (i.e. next) GOP is moved from the memory 6 associated with the GPU 4 to the memory 5 associated with the CPU 3.
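Putting steps 40-48 together, one pass of the Fig 5B flow can be sketched as follows. This is an illustrative Python model, not the claimed implementation: the two dicts stand in for the memories associated with the CPU and the GPU, and encoded frames are represented as strings:

```python
def run_gop_iteration(state, streams, select_stream):
    """One pass of the Fig 5B flow.

    state = {"cpu_mem": {...}, "gpu_mem": {...}, "selected": stream_id};
    select_stream() returns the stream chosen for the next GOP.
    """
    cpu_mem, gpu_mem = state["cpu_mem"], state["gpu_mem"]
    sel = state["selected"]

    # Step 40: move the current base layer frames to the GPU memory.
    if sel in cpu_mem:
        gpu_mem[sel] = cpu_mem.pop(sel)

    # Steps 43/41: encode the base layer on the CPU only if the GPU
    # has not already done it.
    base = gpu_mem.get(sel) or f"cpu-base({sel})"

    # Step 42: the CPU encodes the higher (B-frame) layers.
    output = f"{base}+higher({sel})"

    # Step 44: the GPU pre-encodes next-GOP base layers for the
    # non-selected streams.
    for s in streams:
        if s != sel:
            gpu_mem[s] = f"gpu-base({s})"

    # Steps 46-48: select the next source stream; on a switch, move its
    # pre-encoded base layer from the GPU memory to the CPU memory.
    new_sel = select_stream()
    if new_sel != sel and new_sel in gpu_mem:
        cpu_mem[new_sel] = gpu_mem.pop(new_sel)
    state["selected"] = new_sel
    return output
```

On a switch, the next iteration finds the pre-encoded base layer already moved into the CPU-side memory, so only the higher layers remain to be encoded.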
Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4. The modules can be implemented using software instructions such as a computer program executing in the encoder device 1 and/or using hardware, such as application specific integrated circuits, field programmable gate arrays, discrete logical components, etc. The modules correspond to the steps in the methods illustrated in Figs 5A-B.
A CPU higher layer encoder 72 is configured to encode B-frames for the current GOP of an input video stream currently selected as source stream of the encoder device. This module corresponds to the encode higher layers step 42 of Figs 5A-B.
A GPU base layer encoder 74 is configured to encode the base layer frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device. This module corresponds to the encode future base layer step 44 of Figs 5A-B.
A frame mover 70 is configured to move frames between the memories associated with the CPU and the GPU. This module corresponds to the move base layer frames to GPU step 40 and move base layer frames to CPU step 48 of Fig 5B.
A CPU base layer encoder 71 is configured to encode the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device. This module corresponds to the encode base layer step 41 of Fig 5B.
A video stream selector 76 is configured to select an input video stream as source stream of the encoder device 1. This module corresponds to the select video stream step 46 of Fig 5B.
Fig 7 shows one example of a computer program product 90 comprising computer readable means. On this computer readable means a computer program 91 can be stored, which computer program can cause a processor to execute a method according to embodiments described herein. In this example, the computer program product is an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. As explained above, the computer program product could also be embodied in a memory of a device, such as the computer program products 64 and/or 65 of Fig 4. While the computer program 91 is here schematically shown as a track on the depicted optical disk, the computer program can be stored in any way which is suitable for the computer program product, such as a removable solid state memory (e.g. a universal serial bus memory).
The invention has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the invention, as defined by the appended patent claims.

CLAIMS
1. A method for encoding a video stream, the method being performed in an encoder device (1) comprising a central processing unit, CPU, (3) and a graphic processing unit, GPU, (4), the method comprising the steps of:
encoding (42), in the CPU, bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
encoding (44), in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
2. The method according to claim 1, further comprising the step of:
selecting (46) an input video stream as source stream of the encoder device (1);
after which the method is repeated.
3. The method according to claim 2, further comprising the step of:
moving (48) any intra coded frames (I) and/or predicted frames (P) for the future GOP (10b) from a memory (6) associated with the GPU (4) to a memory (5) associated with the CPU (3).
4. The method according to any one of the preceding claims, wherein in the step of encoding (42) bi-directional predicted frames (B), the encoding is based on intra coded frames (I) and/or predicted frames (P) previously encoded by the GPU.
5. The method according to claim 4, further comprising:
encoding (41), in the CPU, any intra coded frames (I) and/or predicted frames (P) for a current GOP (10a) of at least one input video stream currently selected as source stream of the encoder device (1) when these frames have not previously been encoded by the GPU (4).
6. The method according to any one of the preceding claims, further comprising the step of: moving (40) any intra coded frames (I) and/ or predicted frames (P) of the input video stream currently selected as source stream from a memory (5) associated with the CPU (3) to a memory (6) associated with the GPU (4).
7. The method according to any one of the preceding claims, wherein the step of encoding (42) bi-directional predicted frames (B) and the step of encoding (44) any intra coded frames (I) and/ or predicted frames (P), are performed in parallel.
8. The method according to any one of the preceding claims, wherein the step of encoding (44) any intra coded frames (I) and/or predicted frames (P) is performed for a plurality of input video streams currently not selected as source stream of the encoder device, wherein different input video streams are encoded in different cores of the GPU (4).
9. The method according to any one of the preceding claims, wherein the step of encoding (44) any intra coded frames (I) and/or predicted frames (P) further comprises encoding (44), in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of the input video stream currently selected as source stream of the encoder device.
10. An encoder device (1) for encoding a video stream, the encoder device comprising:
a central processing unit, CPU (3);
a graphic processing unit, GPU (4);
a CPU memory (64) storing CPU instructions (66) that, when executed by the CPU (3), causes the encoder device to:
encode bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
a GPU memory (65) storing GPU instructions (67) that, when executed by the GPU (4), causes the encoder device to:
encode any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
11. The encoder device (1) according to claim 10, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to:
select an input video stream as source stream of the encoder device (1); and to repeat the encoding of bi-directional predicted frames (B); and wherein the GPU instructions further comprise instructions that, when executed by the GPU (4), causes the encoder device to repeat the encoding of any intra coded frames (I) and/or predicted frames (P).
12. The encoder device (1) according to claim 11, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to move any intra coded frames (I) and/or predicted frames (P) for the future GOP (10b) from a memory (6) associated with the GPU (4) to a memory (5) associated with the CPU (3).
13. The encoder device (1) according to any one of claims 10 to 12, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to encode bi-directional predicted frames (B) based on intra coded frames (I) and/or predicted frames (P) previously encoded by the GPU.
14. The encoder device (1) according to claim 13, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to encode any intra coded frames (I) and/or predicted frames (P) for a current GOP (10a) of at least one input video stream currently selected as source stream of the encoder device (1) when these frames have not previously been encoded by the GPU (4).
15. The encoder device (1) according to any one of claims 10 to 14, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to move any intra coded frames (I) and/or predicted frames (P) of the input video stream currently selected as source stream from a memory (5) associated with the CPU (3) to a memory (6) associated with the GPU (4).
16. The encoder device (1) according to any one of claims 10 to 15, wherein the CPU (3) and the GPU (4) execute their respective instructions in parallel.
17. The encoder device (1) according to any one of claims 10 to 16, wherein the GPU instructions further comprise instructions that, when executed by the GPU (4), causes the encoder device to encode any intra coded frames (I) and/or predicted frames (P) for a plurality of input video streams currently not selected as source stream of the encoder device, and wherein different input video streams are encoded in different cores of the GPU (4).
18. The encoder device (1) according to any one of claims 10 to 17, wherein the CPU instructions to encode any intra coded frames (I) and/or predicted frames (P) for a future GOP comprise instructions that, when executed by the CPU (3), causes the encoder device to encode, in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of the input video stream currently selected as source stream of the encoder device.
19. An encoder device (1) comprising:
means for encoding, in a CPU (3) of the encoder device (1), bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
means for encoding, in a GPU (4) of the encoder device (1), any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
20. A computer program (91) for encoding a video stream, the computer program comprising computer program code which, when run on an encoder device (1) comprising a central processing unit, CPU, (3) and a graphic processing unit, GPU (4), causes the encoder device (1) to:
encode, in the CPU, bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
encode, in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
21. A computer program product (90) comprising a computer program according to claim 20 and a computer readable means on which the computer program is stored.
PCT/SE2014/051004 2014-09-02 2014-09-02 Video stream encoding using a central processing unit and a graphical processing unit WO2016036285A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2014/051004 WO2016036285A1 (en) 2014-09-02 2014-09-02 Video stream encoding using a central processing unit and a graphical processing unit

Publications (1)

Publication Number Publication Date
WO2016036285A1 true WO2016036285A1 (en) 2016-03-10

Family

ID=51627338

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2014/051004 WO2016036285A1 (en) 2014-09-02 2014-09-02 Video stream encoding using a central processing unit and a graphical processing unit

Country Status (1)

Country Link
WO (1) WO2016036285A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6393057B1 (en) * 1998-08-14 2002-05-21 Dominique Thoreau MPEG stream switching process
JP2008092364A (en) * 2006-10-03 2008-04-17 Sanyo Electric Co Ltd Data processing device
US20090060032A1 (en) * 2007-05-11 2009-03-05 Advanced Micro Devices, Inc. Software Video Transcoder with GPU Acceleration
US20100008419A1 (en) * 2008-07-10 2010-01-14 Apple Inc. Hierarchical Bi-Directional P Frames

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BRIGHTWELL ET AL: "Flexible Switching and Editing of MPEG-2 Video Bitstreams", INTERNATIONAL BROADCASTING CONVENTION, LONDON, GB, 12 September 1997 (1997-09-12), pages 547 - 552, XP002098561 *
OLIVARES T ET AL: "Parallelization of the MPEG coding algorithm over a multicomputer. A proposal to evaluate its interconnection network", COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING, 1997. 10 YEARS PACRIM 1987-1997 - NETWORKING THE PACIFIC RIM. 1997 IEEE PACIFIC RIM CONFERE NCE ON VICTORIA, BC, CANADA 20-22 AUG. 1997, NEW YORK, NY, USA,IEEE, US, vol. 1, 20 August 1997 (1997-08-20), pages 113 - 116, XP010244929, ISBN: 978-0-7803-3905-7, DOI: 10.1109/PACRIM.1997.619914 *
WEISS S M: "SWITCHING FACILITIES IN MPEG-2: NECESSARY BUT NOT SUFFICIENT", SMPTE - MOTION IMAGING JOURNAL, SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS, WHITE PLAINS, NY, US, vol. 104, no. 12, 1 December 1995 (1995-12-01), pages 788 - 802, XP000543847, ISSN: 0036-1682 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108933762A (en) * 2017-05-25 2018-12-04 腾讯科技(深圳)有限公司 A kind of play handling method and device of Media Stream
CN108933762B (en) * 2017-05-25 2020-12-04 腾讯科技(深圳)有限公司 Media stream playing processing method and device

Similar Documents

Publication Publication Date Title
TWI603609B (en) Constraints and unit types to simplify video random access
JP6449852B2 (en) Motion-restricted tileset for region of interest coding
CN102447906B (en) Low-latency video decoding
CA2682461C (en) Selective information handling for video processing
JP4825644B2 (en) Image decoding apparatus, image encoding apparatus, and system LSI
CN100508585C (en) Apparatus and method for controlling reverse-play for digital video bit stream
JP2008118616A (en) Method and apparatus for multi-threaded video decoding
KR20150037944A (en) Transmitting apparatus and method thereof for video processing
EP3202145B1 (en) Encoding and decoding a video frame in separate processing units
KR102123620B1 (en) Method and apparatus for entropy encoding or entropy decoding of video signals for large-scale parallel processing
US9451251B2 (en) Sub picture parallel transcoding
JP5116704B2 (en) Image coding apparatus and image coding method
JP2010288166A (en) Moving picture encoder, broadcast wave recorder, and program
CN111147926A (en) Data transcoding method and device
WO2010062466A2 (en) Device for decoding a video stream and method therof
US8300692B2 (en) Moving picture coding method, moving picture decoding method, moving picture coding device, and moving picture decoding device
JP2023506876A (en) Video data stream, video encoder, apparatus and method for hypothetical reference decoder and output layer set
WO2016036285A1 (en) Video stream encoding using a central processing unit and a graphical processing unit
JP2007259323A (en) Image decoding apparatus
US20220109891A1 (en) Features of range asymmetric number system encoding and decoding
JP2007150569A (en) Device and method for decoding image
WO2013114826A1 (en) Image decoding device
US20200137134A1 (en) Multi-session low latency encoding
US8428444B2 (en) Video server and seamless playback method
US20140044165A1 (en) Method and Apparatus for Inverse Scan of Transform Coefficients in HEVC

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14777199

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14777199

Country of ref document: EP

Kind code of ref document: A1