WO2016036285A1 - Video stream encoding using a central processing unit and a graphical processing unit - Google Patents

Video stream encoding using a central processing unit and a graphical processing unit

Info

Publication number
WO2016036285A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoder device
frames
gpu
cpu
encoding
Prior art date
Application number
PCT/SE2014/051004
Other languages
French (fr)
Inventor
Per Hermansson
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ) filed Critical Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2014/051004 priority Critical patent/WO2016036285A1/en
Publication of WO2016036285A1 publication Critical patent/WO2016036285A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Definitions

  • the invention relates to a method, encoder devices, a computer program and a computer program product for encoding a video stream.
  • a source video stream is encoded in a suitable format for distribution to video clients.
  • Each video client then decodes the video and renders the decoded video on a display for a user.
  • the process of video encoding is a matter of encoding the input video frames in accordance with a predetermined encoding standard, e.g. H.264, HEVC (High Efficiency Video Coding), MPEG (Moving Picture Experts Group)-2, etc.
  • a common GOP structure uses dyadic hierarchical picture prediction. When using this structure, frames are grouped in GOPs, e.g. with 8 pictures in each GOP. Inside the GOP, frames are assigned to layers where a layer can only have dependencies on a lower layer. The lowest temporal layer, the base layer, consists of only I-frames and/or P-frames.
  • GOP structures are not commonly used because they introduce unwanted latency. This extra latency is due to the need to buffer all the pictures in the group before starting to encode it. This leads to extra delays when switching between video input streams.
  • US 8731047 B2 discloses a method and apparatus for mixing first and second video content portions.
  • the method comprises overlapping at least one frame of a first content portion comprising multiple frames with at least one frame of a second content portion comprising multiple frames to produce a composite video content.
  • At least one reference frame comprising a frame of the first or second content portion prior to the overlapping may be designated.
  • the composite video content and the reference frame may then be transmitted, for example, to a client device.
  • the client device may use the reference frame to at least reduce an effect of the overlapping.
  • the disclosed method can only be performed once the second content portion is known, i.e. after a selection of a new video input stream.
  • a method for encoding a video stream, the method being performed in an encoder device comprising a central processing unit, CPU, and a graphic processing unit, GPU.
  • the method comprises the steps of: encoding, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • the encoder device switches source stream from one input video stream to another input video stream, the base layer (intra coded frames and/or predicted frames) has already been encoded for the next GOP, whereby the CPU of the encoder device only needs to encode the higher layers, i.e. the bi-directional predicted frames. This reduces latency and CPU usage.
  • the GPU can continue to encode the base layer of the next GOP, but now based on the newly selected source stream.
  • the method may further comprise the step of: selecting an input video stream as source stream of the encoder device; after which the method is repeated.
  • the method may further comprise the step of: moving any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.
  • the encoding is based on intra coded frames and/or predicted frames previously encoded by the GPU.
  • the method may further comprise: encoding, in the CPU, any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
  • the method may further comprise the step of: moving any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU.
  • the step of encoding bi-directional predicted frames and the step of encoding any intra coded frames and/or predicted frames may be performed in parallel.
  • the step of encoding any intra coded frames and/or predicted frames may be performed for a plurality of input video streams currently not selected as source stream of the encoder device, wherein different input video streams are encoded in different cores of the GPU.
  • the step of encoding any intra coded frames and/or predicted frames may further comprise encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
  • an encoder device for encoding a video stream.
  • the encoder device comprises: a CPU; a GPU; a CPU memory storing CPU instructions that, when executed by the CPU, causes the encoder device to: encode bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and a GPU memory storing GPU instructions that, when executed by the GPU, causes the encoder device to: encode any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to: select an input video stream as source stream of the encoder device; and to repeat the encoding of bi-directional predicted frames.
  • the GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to repeat the encoding of any intra coded frames and/or predicted frames.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode bi-directional predicted frames based on intra coded frames and/or predicted frames previously encoded by the GPU.
  • the CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
  • the CPU instructions further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU.
  • the CPU and the GPU may execute their respective instructions in parallel.
  • the GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a plurality of input video streams currently not selected as source stream of the encoder device, and wherein different input video streams are encoded in different cores of the GPU.
  • the GPU instructions to encode any intra coded frames and/or predicted frames for a future GOP comprise instructions that, when executed by the GPU, causes the encoder device to encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
  • an encoder device comprising: means for encoding, in a CPU of the encoder device, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and means for encoding, in the GPU of the encoder device, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • a computer program for encoding a video stream comprising computer program code which, when run on an encoder device comprising a CPU and a GPU, causes the encoder device to: encode, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
  • a computer program product comprising a computer program according to the fourth aspect and a computer readable means on which the computer program is stored.
  • Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment;
  • Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment
  • Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment
  • Fig 4 is a schematic diagram showing some components of the encoder device of Fig 1;
  • Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4;
  • Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4; and
  • Fig 7 shows one example of a computer program product 90 comprising computer readable means.
  • Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment.
  • the encoder device 1 is here connected to n input video streams 20a-n, from n respective video sources 21a-n.
  • the encoder device 1 selects one of the input video streams 20a-n to be a source stream.
  • the selected source stream is encoded and the encoded video stream is provided as an output stream 22 of the encoder device 1.
  • Each input video stream 20a-n is of a format which can be understood by the encoder device 1.
  • each input video stream 20a-n can be of a raw video format (uncompressed) or it can be encoded in a video format which the encoder device 1 can decode.
  • the encoder device 1 encodes the video of the output video stream in any suitable past, present or future video format, e.g. HEVC, H.264, MPEG (Moving Picture Experts Group)-2, etc. It is to be noted that there may be different audio streams for the different input video streams 20a-n, a common audio stream for all input video streams 20a-n, or no audio stream at all.
  • Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment.
  • Each GOP 10a, 10b, 10c comprises a number of frames or pictures.
  • each GOP comprises eight frames, but this number can vary.
  • the first GOP 10a comprises a first frame 15a, a second frame 15b, a third frame 15c, a fourth frame 15d, a fifth frame 15e, a sixth frame 15f, a seventh frame 15g and an eighth frame 15h.
  • the last frame 15' of the previous GOP is an intra coded frame, also known as an I-frame.
  • the eighth frame is a predicted frame, also known as a P-frame.
  • the first to seventh frames 15a-g are all bi-directional predicted frames, also known as B-frames.
  • I-frames are coded independently of all other frames.
  • P-frames comprise motion-compensated difference information in relation to at least one previous frame.
  • B-frames comprise motion-compensated difference information in relation to two (or more) frames.
  • the frames create a hierarchy of dependencies, which dictates a chronological order of encoding, resulting in temporal layers.
  • in a base layer T0, the I-frame, being the last frame 15' of the previous GOP, and the P-frame, being the eighth frame 15h, can be encoded.
  • the P-frame only depends on the I-frame.
  • the base layer T0 only comprises the eighth frame 15h.
  • a level one layer T1 then comprises the fourth frame 15d which depends on the last frame 15' of the previous GOP and the eighth frame 15h.
  • a level two layer T2 comprises the second frame 15b, depending on the last frame 15' of the previous GOP and the fourth frame 15d, and the sixth frame 15f, depending on the fourth frame 15d and the eighth frame 15h.
  • a level three layer T3 comprises the first frame 15a, depending on the last frame 15' of the previous GOP and the second frame 15b, the third frame 15c, depending on the second frame 15b and the fourth frame 15d, the fifth frame 15e, depending on the fourth frame 15d and the sixth frame 15f, and the seventh frame 15g, depending on the sixth frame 15f and the eighth frame 15h. All layers from layer T1 and above are called higher layers (which thus only comprise B-frames) and the T0 layer is called the base layer (which comprises I-frames and/or P-frames).
  • each base layer needs to be encoded first, followed by the higher layers, layer by layer. Also, only frames of the base layer can be referred to when encoding frames in the next GOP.
  • Using such hierarchical GOP structures only allows for a few frames to be encoded at the same time since there are dependencies between frames that need to be respected, i.e. the encoding needs to happen layer by layer from the base layer. It is possible to start encoding the next GOP structure when the base layer of the current GOP has finished encoding.
  • each base layer encoding is essentially a sequential task which greatly limits the amount of parallelization within each GOP base layer encoding.
  • the base layers for future GOPs could be encoded in advance by a graphic processing unit (GPU) for multiple streams.
  • the GPU typically comprises a large number of cores which can perform independent processing.
  • Each such core, or separate groups of cores then encodes the base layer for the next GOP for its own respective input video stream.
  • the GPU encodes the base layer for a future GOP for each one of the input video streams 20a-n in parallel.
  • the central processing unit (CPU) of the encoder device only needs to encode the higher layers, i.e. the B-frames, which reduces latency and CPU usage.
  • the GPU can continue to encode the next GOP, but now based on the newly selected source stream.
  • Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment.
  • the GPU 4 encodes the base layer 11, which comprises at least one I-frame or at least one P-frame.
  • the CPU 3 encodes at least one higher layer 12 which comprises only B-frames. It is the CPU 3 that is responsible for providing the output stream 22, which comprises a number of sequential GOPs 10. As described above, each GOP 10 is made up of a base layer 11 and one or more higher layers 12.
  • Fig 4 is a schematic diagram showing some components of the encoder device 1 of Fig 1.
  • a CPU 3 is capable of executing software instructions 66 stored in a CPU instruction memory 64, which can thus be a computer program product.
  • the CPU 3 can be configured to execute the CPU related steps of the method described with reference to Figs 5A-B below.
  • the CPU instruction memory 64 can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the CPU instruction memory 64 comprises persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
  • a GPU 4 is capable of executing software instructions 67 stored in a GPU instruction memory 65, which can thus be a computer program product.
  • the GPU 4 can be configured to execute the GPU related steps of the method described with reference to Figs 5A-B below.
  • the GPU instruction memory 65 can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the GPU instruction memory 65 may comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
  • the software instructions 67 for the GPU can be preloaded in the GPU instruction memory 65 or dynamically provided by the CPU 3 at run-time.
  • a data memory 5 associated with the CPU 3 is also provided for reading and/or storing data during execution of software instructions in the CPU 3.
  • the data memory 5 associated with the CPU can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the data memory 5 associated with the CPU can store video frames 7 used during encoding, e.g. higher level frames encoded by the CPU.
  • a data memory 6 associated with the GPU 4 is also provided for reading and/or storing data during execution of software instructions in the GPU 4.
  • the data memory 6 associated with the GPU can be any combination of read and write memory (RAM) and read only memory (ROM).
  • the data memory 6 associated with the GPU can store video frames 8 used during encoding, e.g. base level frames encoded by the GPU.
  • the encoder device 1 further comprises an I/O interface 62 for
  • the I/O interface 62 also includes a user interface.
  • Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4.
  • the encoder device 1 comprises both a CPU and a GPU. The method illustrated in Fig 5A will be described first.
  • B-frames for a current GOP of an input video stream are encoded in the CPU. This input video stream is currently selected as source stream of the encoder device. When I-frames and/or P-frames previously encoded by the GPU are available, these form a base for the encoding of B-frames.
  • any I-frames and/or P-frames (i.e. the base layer) for a future GOP (e.g. the next GOP) of at least one input video stream currently not selected as source stream are encoded in the GPU.
  • the encoding of base layer frames can be performed for a plurality of input video streams, all of which currently are not selected as the source.
  • different input video streams are encoded in different cores of the GPU. Since the GPU can contain a great number of cores, it is able to handle a great number of parallel input video streams in this way.
  • this step comprises encoding, in the GPU, the base layer of the input video stream which is currently selected as source stream of the encoder device.
  • the GPU may be assigned to encode the base layer also for the active input video stream.
  • the GPU only encodes the base layer for input video streams which are not selected as source stream of the encoder device. In this way, when no switch occurs, the CPU continuously encodes the input video stream selected to be the source stream of the encoder device.
  • the CPU can encode the base layer of a future GOP.
  • since the encode higher layers step 42 is performed in the CPU and the encode future base layer step 44 is performed in the GPU, the encode higher layers step 42 and the encode future base layer step 44 can be performed in parallel.
  • the CPU is relieved of the task of encoding the base layer. This not only reduces CPU usage, but also allows a switch between input video streams with reduced latency, since after a switch, the CPU can directly start on higher layer encoding and does not need to encode the base layer. Since there is a greater degree of parallelism possible for the higher layers, multiple cores of the CPU can be utilised to perform the higher layer encoding with relatively low latency.
  • any I-frames and/or P-frames (i.e. the base layer) of the input video stream currently selected as source stream are moved from the memory 5 associated with the CPU 3 to the memory 6 associated with the GPU 4.
  • the encoder device can switch to any of the input video streams for which the base layer is encoded in the GPU.
  • in a conditional base layer done step 43, it is determined whether the base layer frame(s) of a current GOP have been encoded or not. If this is the case, the method continues to the encode higher layers step 42. Otherwise, the method continues to an encode base layer step 41. For instance, if the GPU is configured to encode the base layer also for the active stream, the base layer for the current GOP has been encoded. On the other hand, if the GPU is configured to only encode the base layer for the inactive streams, the base layer for the current GOP has not been encoded if there is no switch of input video stream for the current GOP (compared to the previous GOP). In the encode base layer step 41, the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device is encoded by the CPU. This is of course more demanding on the CPU than letting the GPU do this, but when the base layer for the current GOP is not available, the CPU needs to perform this task.
  • in a select video stream step 46, an input video stream is selected as source stream of the encoder device 1.
  • the source stream could be the same as before (which is most common over time) or switched to a new input stream.
  • the selection of input video stream can be based on user input or a selection signal from another entity (not shown).
  • in a conditional switch step 47, it is determined whether the source stream has been selected to be a new input stream. If this is the case, the method continues to a move base layer frames to CPU step 48. Otherwise, the method returns to the move base layer frames to GPU step 40. In the move base layer frames to CPU step 48, any of the base layer frames for the future (i.e. next) GOP are moved from the memory 6 associated with the GPU 4 to the memory 5 associated with the CPU 3.
  • Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4.
  • the modules can be implemented using software instructions such as a computer program executing in the encoder device 1 and/or using hardware, such as application specific integrated circuits, field programmable gate arrays, discrete logical components, etc.
  • the modules correspond to the steps in the methods illustrated in Figs 5A-B.
  • a CPU higher layer encoder 72 is configured to encode B-frames for the current GOP of an input video stream currently selected as source stream of the encoder device. This module corresponds to the encode higher layers step 42 of Figs 5A-B.
  • a GPU base layer encoder 74 is configured to encode the base layer frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device. This module corresponds to the encode future base layer step 44 of Figs 5A-B.
  • a frame mover 70 is configured to move frames between the memories associated with the CPU and the GPU. This module corresponds to the move base layer frames to GPU step 40 and move base layer frames to CPU step 48 of Fig 5B.
  • a CPU base layer encoder 71 is configured to encode the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device. This module corresponds to the encode base layer step 41 of Fig 5B.
  • a video stream selector 76 is configured to select an input video stream as source stream of the encoder device 1. This module corresponds to the select video stream step 46 of Fig 5B.
  • Fig 7 shows one example of a computer program product 90 comprising computer readable means.
  • a computer program 91 can be stored, which computer program can cause a processor to execute a method according to embodiments described herein.
  • the computer program product is an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc.
  • the computer program product could also be embodied in a memory of a device, such as the computer program products 64 and/or 65 of Fig 4.
  • while the computer program 91 is here schematically shown as a track on the depicted optical disc, the computer program can be stored in any way which is suitable for the computer program product, such as a removable solid state memory (e.g. a universal serial bus memory).
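To summarize the flow of Figs 5A-B described in the bullets above, one loop iteration can be sketched in code (a hedged, illustrative sketch, not the patent's implementation: the dictionaries standing in for the CPU and GPU memories and the string placeholders are assumptions; the step numbers from the text are kept as comments):

```python
def encode_gop(state, select_stream):
    """One iteration over a GOP, following the flow of Figs 5A-B."""
    if state["switched"]:
        # Step 48: move the prepared base-layer frames for the next GOP
        # from the GPU memory to the CPU memory.
        state["cpu_mem"]["base"] = state["gpu_mem"].pop("base", None)
    else:
        # Step 40: move base-layer frames of the selected stream to the
        # GPU memory so the GPU can reference them for the next GOP.
        state["gpu_mem"]["base"] = state["cpu_mem"].get("base")
    # Step 43: has the base layer of the current GOP been encoded?
    if state["cpu_mem"].get("base") is None:
        state["cpu_mem"]["base"] = "base layer (CPU)"       # step 41 fallback
    higher = "higher layers (CPU)"                          # step 42, on the CPU
    state["gpu_mem"]["base"] = "next-GOP base layer (GPU)"  # step 44, in parallel
    new_stream = select_stream()                            # step 46
    state["switched"] = new_stream != state["selected"]     # step 47
    state["selected"] = new_stream
    return higher

state = {"switched": False, "selected": 0,
         "cpu_mem": {"base": "GOP base layer"}, "gpu_mem": {}}
encode_gop(state, select_stream=lambda: 1)  # a switch to stream 1 is requested
print(state["switched"])  # True
```

On the iteration after a switch, step 48 hands the base layer already prepared by the GPU to the CPU, so the CPU can start directly on the higher layers.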

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

It is provided a method for encoding a video stream, the method being performed in an encoder device comprising a central processing unit, CPU, and a graphic processing unit, GPU. The method comprises the steps of: encoding, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.

Description

VIDEO STREAM ENCODING USING A CENTRAL PROCESSING UNIT AND A GRAPHICAL PROCESSING UNIT
TECHNICAL FIELD
The invention relates to a method, encoder devices, a computer program and a computer program product for encoding a video stream.
BACKGROUND
In digital video processing a source video stream is encoded in a suitable format for distribution to video clients. Each video client then decodes the video and renders the decoded video on a display for a user.
When there is only one source video stream, the process of video encoding is a matter of encoding the input video frames in accordance with a
predetermined encoding standard, e.g. H.264, HEVC (High Efficiency Video Coding), MPEG (Moving Picture Experts Group)-2, etc. There may be some latency due to the encoding and decoding, but this is typically not an issue since the encoding and decoding is continuous. HEVC as well as other video coding standards use a technique called group of pictures (GOP). This can be used to exploit redundant information in
consecutive frames. A common GOP structure uses dyadic hierarchical picture prediction. When using this structure, frames are grouped in GOPs, e.g. with 8 pictures in each GOP. Inside the GOP, frames are assigned to layers where a layer can only have dependencies on a lower layer. The lowest temporal layer, the base layer, consists of only I-frames and/or P-frames.
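As an illustrative sketch (not part of the patent text), the dyadic layer assignment described above can be computed from a frame's 1-based position within the GOP: the temporal layer is the GOP depth minus the number of trailing zero bits of the index.

```python
def temporal_layer(i, gop_size=8):
    """Temporal layer of frame i (1-based position within a dyadic GOP).

    Frame gop_size is the P-frame of the base layer T0; odd-numbered
    frames land in the highest layer. Assumes gop_size is a power of two.
    """
    trailing_zeros = (i & -i).bit_length() - 1
    depth = gop_size.bit_length() - 1  # log2(gop_size)
    return depth - trailing_zeros

# Layers T3 T2 T3 T1 T3 T2 T3 T0 for the eight frames of a GOP:
print([temporal_layer(i) for i in range(1, 9)])  # [3, 2, 3, 1, 3, 2, 3, 0]

# Encoding must proceed layer by layer, base layer first:
print(sorted(range(1, 9), key=temporal_layer))   # [8, 4, 2, 6, 1, 3, 5, 7]
```

The second list makes the sequential bottleneck visible: the base layer (frame 8) must be encoded before any other frame of the GOP can be.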
In low latency scenarios, such as real-time communication, GOP structures are not commonly used because they introduce unwanted latency. This extra latency is due to the need to buffer all the pictures in the group before
starting to encode it. This leads to extra delays when switching between video input streams.
US 8731047 B2 discloses a method and apparatus for mixing first and second video content portions. The method comprises overlapping at least one frame of a first content portion comprising multiple frames with at least one frame of a second content portion comprising multiple frames to produce a composite video content. At least one reference frame comprising a frame of the first or second content portion prior to the overlapping may be
designated. The composite video content and the reference frame may then be transmitted, for example, to a client device. The client device may use the reference frame to at least reduce an effect of the overlapping. However, the disclosed method can only be performed once the second content portion is known, i.e. after a selection of a new video input stream.
SUMMARY
It is an object to improve the efficiency of encoding a video stream, particularly in the situation where there are multiple input video streams to choose from.
According to a first aspect, it is provided a method for encoding a video stream, the method being performed in an encoder device comprising a central processing unit, CPU, and a graphic processing unit, GPU. The method comprises the steps of: encoding, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device. In this way, when the encoder device switches source stream from one input video stream to another input video stream, the base layer (intra coded frames and/or predicted frames) has already been encoded for the next GOP, whereby the CPU of the encoder device only needs to encode the higher layers, i.e. the bi-directional predicted frames. This reduces latency and CPU usage. After the switch, the GPU can continue to encode the base layer of the next GOP, but now based on the newly selected source stream.
The method may further comprise the step of: selecting an input video stream as source stream of the encoder device; after which the method is repeated. The method may further comprise the step of: moving any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.
In the step of encoding bi-directional predicted frames, the encoding is based on intra coded frames and/or predicted frames previously encoded by the GPU.
The method may further comprise: encoding, in the CPU, any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
The method may further comprise the step of: moving any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU. The step of encoding bi-directional predicted frames and the step of encoding any intra coded frames and/or predicted frames may be performed in parallel.
The step of encoding any intra coded frames and/or predicted frames may be performed for a plurality of input video streams currently not selected as source stream of the encoder device, wherein different input video streams are encoded in different cores of the GPU.
The step of encoding any intra coded frames and/or predicted frames may further comprise encoding, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
According to a second aspect, it is provided an encoder device for encoding a video stream. The encoder device comprises: a CPU; a GPU; a CPU memory storing CPU instructions that, when executed by the CPU, causes the encoder device to: encode bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and a GPU memory storing GPU instructions that, when executed by the GPU, causes the encoder device to: encode any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to: select an input video stream as source stream of the encoder device; and to repeat the encoding of bi-directional predicted frames. In such a case, the GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to repeat the encoding of any intra coded frames and/or predicted frames.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames for the future GOP from a memory associated with the GPU to a memory associated with the CPU.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode bi-directional predicted frames based on intra coded frames and/or predicted frames previously encoded by the GPU.

The CPU instructions may further comprise instructions that, when executed by the CPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a current GOP of at least one input video stream currently selected as source stream of the encoder device when these frames have not previously been encoded by the GPU.
The CPU instructions further comprise instructions that, when executed by the CPU, causes the encoder device to move any intra coded frames and/or predicted frames of the input video stream currently selected as source stream from a memory associated with the CPU to a memory associated with the GPU. The CPU and the GPU may execute their respective instructions in parallel.
The GPU instructions further comprise instructions that, when executed by the GPU, causes the encoder device to encode any intra coded frames and/or predicted frames for a plurality of input video streams currently not selected as source stream of the encoder device, and wherein different input video streams are encoded in different cores of the GPU.
The CPU instructions to encode any intra coded frames and/or predicted frames for a future GOP comprise instructions that, when executed by the CPU, causes the encoder device to encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of the input video stream currently selected as source stream of the encoder device.
According to a third aspect, it is provided an encoder device comprising: means for encoding, in a CPU of the encoder device, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and means for encoding, in the GPU of the encoder device, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
According to a fourth aspect, it is provided a computer program for encoding a video stream, the computer program comprising computer program code which, when run on an encoder device comprising a CPU and a GPU, causes the encoder device to: encode, in the CPU, bi-directional predicted frames for a current group of pictures, GOP, of an input video stream currently selected as source stream of the encoder device; and encode, in the GPU, any intra coded frames and/or predicted frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device.
According to a fifth aspect, it is provided a computer program product comprising a computer program according to the fourth aspect and a computer readable means on which the computer program is stored. Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is now described, by way of example, with reference to the accompanying drawings, in which:
Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment;
Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment;
Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment;
Fig 4 is a schematic diagram showing some components of the encoder device of Fig 1;
Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4;
Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4; and Fig 7 shows one example of a computer program product 90 comprising computer readable means.

DETAILED DESCRIPTION
The invention will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout the description.

Fig 1 is a schematic diagram illustrating an encoder device having multiple input video streams and one output video stream according to one embodiment.
The encoder device 1 is here connected to n input video streams 20a-n, from n respective video sources 21a-n. The encoder device 1 selects one of the input video streams 20a-n to be a source stream. The selected source stream is encoded and the encoded video stream is provided as an output stream 22 of the encoder device 1.
Each input video stream 20a-n is of a format which can be understood by the encoder device 1. For example, each input video stream 20a-n can be of a raw video format (uncompressed) or it can be encoded in a video format which the encoder device 1 can decode.
The encoder device 1 encodes the video of the output video stream in any suitable past, present or future video format, e.g. HEVC, H.264, MPEG (Moving Picture Experts Group)-2, etc. It is to be noted that there may be different audio streams for the different input video streams 20a-n, a common audio stream for all input video streams 20a-n, or no audio stream at all.

Fig 2 is a schematic diagram illustrating how encoding of video frames is performed in the encoder device of Fig 1 using GOPs according to one embodiment.
There is here a first Group of Pictures (GOP) 10a, a second GOP 10b and a third GOP 10c. Each GOP 10a, 10b, 10c comprises a number of frames or pictures. In this example, each GOP comprises eight frames, but this number can vary. The first GOP 10a comprises a first frame 15a, a second frame 15b, a third frame 15c, a fourth frame 15d, a fifth frame 15e, a sixth frame 15f, a seventh frame 15g and an eighth frame 15h. There is also a last frame 15' of the previous GOP.
The last frame 15' of the previous GOP is an intra coded frame, also known as an I-frame. The eighth frame 15h is a predicted frame, also known as a P-frame. The first to seventh frames 15a-g are all bi-directional predicted frames, also known as B-frames. I-frames are coded independently of all other frames. P-frames comprise motion-compensated difference information in relation to at least one previous frame. B-frames comprise motion-compensated difference information in relation to two (or more) frames.
Looking now to the first GOP 10a, the frames create a hierarchy of dependencies, which dictates a chronological order of encoding, resulting in temporal layers. In a base layer T0, the I-frame being the last frame 15' of the previous GOP and the P-frame being the eighth frame 15h can be encoded. The P-frame only depends on the I-frame. In fact, as part of the first GOP 10a, the base layer T0 only comprises the eighth frame 15h. A level one layer T1 then comprises the fourth frame 15d, which depends on the last frame 15' of the previous GOP and the eighth frame 15h. A level two layer T2 comprises the second frame 15b, depending on the last frame 15' of the previous GOP and the fourth frame 15d, and the sixth frame 15f, depending on the fourth frame 15d and the eighth frame 15h. A level three layer T3 comprises the first frame 15a, depending on the last frame 15' of the previous GOP and the second frame 15b; the third frame 15c, depending on the second frame 15b and the fourth frame 15d; the fifth frame 15e, depending on the fourth frame 15d and the sixth frame 15f; and the seventh frame 15g, depending on the sixth frame 15f and the eighth frame 15h. All layers from layer T1 and above are called higher layers (which thus only comprise B-frames) and the T0 layer is called the base layer (which comprises I-frames and/or P-frames).
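The layer assignment described above follows a simple pattern: with a GOP of 2^k frames, a frame's temporal layer is determined by how many times its (1-based) index within the GOP is divisible by two. A minimal Python sketch (illustrative only, not part of the disclosed method) reproducing the layering of Fig 2:

```python
def temporal_layer(frame_index, gop_size=8):
    """Return the temporal layer of a frame within a hierarchical GOP.

    frame_index is 1-based within the GOP (1..gop_size); gop_size is
    assumed to be a power of two. The last frame (the P-frame) lands in
    the base layer T0; odd-indexed frames land in the highest layer.
    """
    if not 1 <= frame_index <= gop_size:
        raise ValueError("frame_index outside GOP")
    top = gop_size.bit_length() - 1                      # log2(gop_size)
    trailing_zeros = (frame_index & -frame_index).bit_length() - 1
    return top - trailing_zeros

# Reproduce the layering of Fig 2 for an 8-frame GOP:
layers = {i: temporal_layer(i) for i in range(1, 9)}
# frame 8 -> T0, frame 4 -> T1, frames 2 and 6 -> T2, odd frames -> T3
```

This yields exactly the assignment of Fig 2: the eighth frame in T0, the fourth frame in T1, the second and sixth frames in T2, and the odd frames in T3.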
It can be seen from Fig 2 that for each GOP, the base layer needs to be encoded first, followed by the higher layers, layer by layer. Also, only frames of the base layer can be referred to when encoding frames in the next GOP. Using such hierarchical GOP structures only allows for a few frames to be encoded at the same time since there are dependencies between frames that need to be respected, i.e. the encoding needs to happen layer by layer from the base layer. It is possible to start encoding the next GOP structure when the base layer of the current GOP has finished encoding. However, each base layer encoding is essentially a sequential task which greatly limits the amount of parallelization within each GOP base layer encoding.
Another issue comes when switching video input streams. After switching has been done, any work done on encoding future GOP structures will have to be discarded since they are based on the old video input stream. This limits the amount of parallelization that can be exploited by a multiprocessor system until the base layer of the current GOP structure has finished encoding.
What the inventors have realised is that the base layers for future GOPs could be encoded in advance by a graphic processing unit (GPU) for multiple streams. This is beneficial since the GPU typically comprises a large number of cores which can perform independent processing. Each such core, or separate group of cores, then encodes the base layer for the next GOP for its own respective input video stream. In other words, looking also at Fig 1, the GPU encodes the base layer for a future GOP for each one of the input video streams 20a-n in parallel. In this way, when the encoder device 1 switches source stream from one input video stream to another input video stream, the base layer has already been encoded for the next GOP, whereby the central processing unit (CPU) of the encoder device only needs to encode the higher layers, i.e. the B-frames, which reduces latency and CPU usage. After the switch, the GPU can continue to encode the next GOP, but now based on the newly selected source stream.
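The division of labour just described can be sketched in Python. The function names below (encode_base_layer, encode_higher_layers) are hypothetical placeholders for real encoder kernels, and encoded frames are represented as strings; the sketch also pre-encodes the base layer for the currently selected stream, which the description presents as an optional variant:

```python
# Hypothetical placeholder kernels standing in for real base-layer and
# higher-layer encoders; encoded frames are represented as strings.
def encode_base_layer(stream_id, gop_index):
    return f"base({stream_id},{gop_index})"

def encode_higher_layers(stream_id, gop_index, base):
    return f"{base}+higher({stream_id},{gop_index})"

def encode_gop(streams, selected, gop_index, prepared_bases):
    """One GOP iteration of the CPU/GPU work split.

    prepared_bases maps stream id -> base layer encoded ahead of time
    (conceptually by the GPU during the previous GOP). The CPU then only
    has to add the B-frame layers for the selected stream.
    """
    base = prepared_bases.get(selected)
    if base is None:                 # no pre-encoded base: encode it now
        base = encode_base_layer(selected, gop_index)
    output = encode_higher_layers(selected, gop_index, base)
    # Conceptually on GPU cores: prepare the next GOP's base layers for
    # every stream, so a switch costs no extra base-layer work.
    next_bases = {s: encode_base_layer(s, gop_index + 1) for s in streams}
    return output, next_bases
```

A switch to another stream then simply means passing a different `selected` id to the next call, with the pre-encoded base layer already present in `prepared_bases`.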
Fig 3 is a schematic entity relationship diagram illustrating entities related to encoding in the encoder device of Fig 1 according to one embodiment.
There is here a CPU 3 and a GPU 4 which are both part of the encoder device. The GPU 4 encodes the base layer 11, which comprises at least one I-frame or at least one P-frame. The CPU 3 encodes at least one higher layer 12 which comprises only B-frames. It is the CPU 3 that is responsible for providing the output stream 22, which comprises a number of sequential GOPs 10. As described above, each GOP 10 is made up of a base layer 11 and one or more higher layers 12.

Fig 4 is a schematic diagram showing some components of the encoder device 1 of Fig 1. A CPU 3 is capable of executing software instructions 66 stored in a CPU instruction memory 64, which can thus be a computer program product. The CPU 3 can be configured to execute the CPU related steps of the method described with reference to Figs 5A-B below. The CPU instruction memory 64 can be any combination of read and write memory (RAM) and read only memory (ROM). The CPU instruction memory 64 comprises persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.

A GPU 4 is capable of executing software instructions 67 stored in a GPU instruction memory 65, which can thus be a computer program product. The GPU 4 can be configured to execute the GPU related steps of the method described with reference to Figs 5A-B below. The GPU instruction memory 65 can be any combination of read and write memory (RAM) and read only memory (ROM). The GPU instruction memory 65 may comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory. The software instructions 67 for the GPU can be preloaded in the GPU instruction memory 65 or dynamically provided by the CPU 3 at run-time.
A data memory 5 associated with the CPU 3 is also provided for reading and/or storing data during execution of software instructions in the CPU 3. The data memory 5 associated with the CPU can be any combination of read and write memory (RAM) and read only memory (ROM). For instance, the data memory 5 associated with the CPU can store video frames 7 used during encoding, e.g. higher level frames encoded by the CPU.
A data memory 6 associated with the GPU 4 is also provided for reading and/or storing data during execution of software instructions in the GPU 4. The data memory 6 associated with the GPU can be any combination of read and write memory (RAM) and read only memory (ROM). For instance, the data memory 6 associated with the GPU can store video frames 8 used during encoding, e.g. base level frames encoded by the GPU.
The encoder device 1 further comprises an I/O interface 62 for communicating with other external entities, e.g. to provide the output video stream for distribution to video clients or to obtain input video streams. Optionally, the I/O interface 62 also includes a user interface.
Other components of the encoder device 1 are omitted in order not to obscure the concepts presented herein.
Figs 5A-B are flow charts illustrating embodiments of methods for encoding a video stream performed in the encoder device of Figs 1 and 4. As shown in Fig 4, the encoder device 1 comprises both a CPU and a GPU. The method illustrated in Fig 5A will be described first.
In an encode higher layers step 42, B-frames for a current GOP of an input video stream are encoded in the CPU. This input video stream is currently selected as source stream of the encoder device. When I-frames and/or P-frames previously encoded by the GPU are available, these form a base for the encoding of B-frames.
In an encode future base layer step 44, any I-frames and/or P-frames (i.e. the base layer) for a future GOP (e.g. the next GOP) of at least one input video stream currently not selected as source stream of the encoder device are encoded in the GPU.
The encoding of base layer frames can be performed for a plurality of input video streams, all of which currently are not selected as the source. In such a case, different input video streams are encoded in different cores of the GPU. Since the GPU can contain a great number of cores, it is able to handle a great number of parallel input video streams in this way.
Optionally, this step comprises encoding, in the GPU, the base layer of the input video stream which is currently selected as source stream of the encoder device. In other words, the GPU may be assigned to encode the base layer also for the active input video stream.
Alternatively, the GPU only encodes the base layer for input video streams which are not selected as source stream of the encoder device. In this way, when no switch occurs, the CPU continuously encodes the input video stream selected to be the source stream of the encoder device. Optionally, the CPU can encode the base layer of a future GOP.
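Assigning each non-selected input video stream to its own core amounts to a parallel map over the streams. A sketch using Python's standard thread pool as a stand-in for GPU cores (encode_base_layer is again a hypothetical placeholder, not a disclosed function):

```python
from concurrent.futures import ThreadPoolExecutor

def encode_base_layer(stream_id):
    """Hypothetical placeholder for the real per-stream base-layer kernel."""
    return (stream_id, f"base-layer({stream_id})")

def encode_inactive_bases(all_streams, selected):
    """Encode next-GOP base layers for all non-selected streams in
    parallel, one worker per stream (standing in for one GPU core or
    core group per stream)."""
    inactive = [s for s in all_streams if s != selected]
    with ThreadPoolExecutor(max_workers=max(len(inactive), 1)) as pool:
        return dict(pool.map(encode_base_layer, inactive))
```

Because each stream's base-layer encoding is independent of the others, the map scales with the number of available cores rather than with the number of streams.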
Since the encode higher layers step 42 is performed in the CPU and the encode future base layer step 44 is performed in the GPU, the encode higher layers step 42 and the encode future base layer step 44 can be performed in parallel.
In this way, the CPU is relieved of the task of encoding the base layer. This not only reduces CPU usage, but also allows a switch between input video streams with reduced latency, since after a switch, the CPU can directly start on higher layer encoding and does not need to encode the base layer. Since there is a greater degree of parallelism possible for the higher layers, multiple cores of the CPU can be utilised to perform the higher layer encoding with relatively low latency.
Due to its multiple cores, the GPU can handle a great number of potential input video streams in parallel, compared to using a CPU for the base layer.
Since the last encoded GOP structure forms the reference for the input video stream encoding in the GPU, there is a potentially high work-per-memory-transfer ratio. This greatly benefits the GPU, where transferring reference frames is often a bottleneck.
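The parallel execution of steps 42 and 44 noted above can be illustrated with two concurrent workers. This is a minimal thread-based sketch with placeholder work functions; a real implementation would dispatch step 44 to the GPU rather than to a second host thread:

```python
import threading

results = {}

def cpu_encode_higher_layers():
    # Step 42 (placeholder work): B-frames of the current GOP.
    results["higher"] = "B-frames of current GOP"

def gpu_encode_future_base():
    # Step 44 (placeholder work): I-/P-frames of the next GOP.
    results["base"] = "I/P-frames of next GOP"

worker = threading.Thread(target=gpu_encode_future_base)
worker.start()
cpu_encode_higher_layers()   # CPU works while the "GPU" thread runs
worker.join()
# Both the current higher layers and the future base layer are now ready.
```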
Looking now to the method illustrated by the flow chart in Fig 5B, only new or modified steps compared to the method illustrated by the flow chart in Fig 5A will be described.
In a move base layer frames to GPU step 40, any I-frames and/or P-frames (i.e. the base layer) of the input video stream currently selected as source stream is moved from the memory 5 associated with the CPU 3 to the memory 6 associated with the GPU 4. This allows the GPU to, in the encode future base layer step 44, base the future base layer on the base layer of the current GOP conveniently stored in the memory 6 associated with the GPU 4. In this way, the encoder device can switch to any of the input video streams for which the base layer is encoded in the GPU.
In a conditional base layer done step 43, it is determined whether the base layer frame(s) of a current GOP have been encoded or not. If this is the case, the method continues to the encode higher layers step 42. Otherwise, the method continues to an encode base layer step 41. For instance, if the GPU is configured to encode the base layer also for the active stream, the base layer for the current GOP has been encoded. On the other hand, if the GPU is configured to only encode the base layer for the inactive streams, the base layer for the current GOP has not been encoded if there is no switch of input video stream for the current GOP (compared to the previous GOP). In the encode base layer step 41, the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device is encoded by the CPU. This is of course more demanding on the CPU than letting the GPU do this, but when the base layer for the current GOP is not available, the CPU needs to perform this task.
In a select video stream step 46, an input video stream is selected as source stream of the encoder device 1. The source stream could be the same as before (which is most common over time) or switched to a new input stream. The selection of input video stream can be based on user input or a selection signal from another entity (not shown).
In a conditional switch step 47, it is determined whether the source stream has been selected to be a new input stream. If this is the case, the method continues to a move base layer frames to CPU step 48. Otherwise, the method returns to the move base layer frames to GPU step 40. In the move base layer frames to CPU step 48, any of the base layer frames for the future (i.e. next) GOP is moved from the memory 6 associated with the GPU 4 to the memory 5 associated with the CPU 3.
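Putting steps 40-48 together, one pass of the Fig 5B flow can be sketched as follows. This is an illustrative Python model, not the claimed implementation: the two dicts stand in for the memories associated with the CPU and the GPU, and encoded frames are represented as strings:

```python
def run_gop_iteration(state, streams, select_stream):
    """One pass of the Fig 5B flow.

    state = {"cpu_mem": {...}, "gpu_mem": {...}, "selected": stream_id};
    select_stream() returns the stream chosen for the next GOP.
    """
    cpu_mem, gpu_mem = state["cpu_mem"], state["gpu_mem"]
    sel = state["selected"]

    # Step 40: move the current base layer frames to the GPU memory.
    if sel in cpu_mem:
        gpu_mem[sel] = cpu_mem.pop(sel)

    # Steps 43/41: encode the base layer on the CPU only if the GPU
    # has not already done it.
    base = gpu_mem.get(sel) or f"cpu-base({sel})"

    # Step 42: the CPU encodes the higher (B-frame) layers.
    output = f"{base}+higher({sel})"

    # Step 44: the GPU pre-encodes next-GOP base layers for the
    # non-selected streams.
    for s in streams:
        if s != sel:
            gpu_mem[s] = f"gpu-base({s})"

    # Steps 46-48: select the next source stream; on a switch, move its
    # pre-encoded base layer from the GPU memory to the CPU memory.
    new_sel = select_stream()
    if new_sel != sel and new_sel in gpu_mem:
        cpu_mem[new_sel] = gpu_mem.pop(new_sel)
    state["selected"] = new_sel
    return output
```

On a switch, the next iteration finds the pre-encoded base layer already moved into the CPU-side memory, so only the higher layers remain to be encoded.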
Fig 6 is a schematic diagram showing functional modules of the encoder device 1 of Figs 1 and 4. The modules can be implemented using software instructions such as a computer program executing in the encoder device 1 and/or using hardware, such as application specific integrated circuits, field programmable gate arrays, discrete logical components, etc. The modules correspond to the steps in the methods illustrated in Figs 5A-B.
A CPU higher layer encoder 72 is configured to encode B-frames for the current GOP of an input video stream currently selected as source stream of the encoder device. This module corresponds to the encode higher layers step 42 of Figs 5A-B.
A GPU base layer encoder 74 is configured to encode the base layer frames for a future GOP of at least one input video stream currently not selected as source stream of the encoder device. This module corresponds to the encode future base layer step 44 of Figs 5A-B.
A frame mover 70 is configured to move frames between the memories associated with the CPU and the GPU. This module corresponds to the move base layer frames to GPU step 40 and move base layer frames to CPU step 48 of Fig 5B.
A CPU base layer encoder 71 is configured to encode the base layer for a current GOP of the input video stream currently selected as source stream of the encoder device. This module corresponds to the encode base layer step 41 of Fig 5B.
A video stream selector 76 is configured to select an input video stream as source stream of the encoder device 1. This module corresponds to the select video stream step 46 of Fig 5B.
Fig 7 shows one example of a computer program product 90 comprising computer readable means. On this computer readable means a computer program 91 can be stored, which computer program can cause a processor to execute a method according to embodiments described herein. In this example, the computer program product is an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. As explained above, the computer program product could also be embodied in a memory of a device, such as the computer program products 64 and/or 65 of Fig 4. While the computer program 91 is here schematically shown as a track on the depicted optical disk, the computer program can be stored in any way which is suitable for the computer program product, such as a removable solid state memory (e.g. a universal serial bus memory).
The invention has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the invention, as defined by the appended patent claims.

CLAIMS
1. A method for encoding a video stream, the method being performed in an encoder device (1) comprising a central processing unit, CPU, (3) and a graphic processing unit, GPU, (4), the method comprising the steps of:
encoding (42), in the CPU, bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
encoding (44), in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
2. The method according to claim 1, further comprising the step of:
selecting (46) an input video stream as source stream of the encoder device (1);
after which the method is repeated.
3. The method according to claim 2, further comprising the step of:
moving (48) any intra coded frames (I) and/or predicted frames (P) for the future GOP (10b) from a memory (6) associated with the GPU (4) to a memory (5) associated with the CPU (3).
4. The method according to any one of the preceding claims, wherein in the step of encoding (42) bi-directional predicted frames (B), the encoding is based on intra coded frames (I) and/or predicted frames (P) previously encoded by the GPU.
5. The method according to claim 4, further comprising:
encoding (41), in the CPU, any intra coded frames (I) and/or predicted frames (P) for a current GOP (10a) of at least one input video stream currently selected as source stream of the encoder device (1) when these frames have not previously been encoded by the GPU (4).
6. The method according to any one of the preceding claims, further comprising the step of: moving (40) any intra coded frames (I) and/ or predicted frames (P) of the input video stream currently selected as source stream from a memory (5) associated with the CPU (3) to a memory (6) associated with the GPU (4).
7. The method according to any one of the preceding claims, wherein the step of encoding (42) bi-directional predicted frames (B) and the step of encoding (44) any intra coded frames (I) and/ or predicted frames (P), are performed in parallel.
8. The method according to any one of the preceding claims, wherein the step of encoding (44) any intra coded frames (I) and/or predicted frames (P) is performed for a plurality of input video streams currently not selected as source stream of the encoder device, wherein different input video streams are encoded in different cores of the GPU (4).
9. The method according to any one of the preceding claims, wherein the step of encoding (44) any intra coded frames (I) and/or predicted frames (P) further comprises encoding (44), in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of the input video stream currently selected as source stream of the encoder device.
10. An encoder device (1) for encoding a video stream, the encoder device comprising:
a central processing unit, CPU (3);
a graphic processing unit, GPU (4);
a CPU memory (64) storing CPU instructions (66) that, when executed by the CPU (3), causes the encoder device to:
encode bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
a GPU memory (65) storing GPU instructions (67) that, when executed by the GPU (4), causes the encoder device to:
encode any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
11. The encoder device (1) according to claim 10, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to:
select an input video stream as source stream of the encoder device (1); and to repeat the encoding of bi-directional predicted frames (B); and wherein the GPU instructions further comprise instructions that, when executed by the GPU (4), causes the encoder device to repeat the encoding of any intra coded frames (I) and/or predicted frames (P).
12. The encoder device (1) according to claim 11, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to move any intra coded frames (I) and/or predicted frames (P) for the future GOP (10b) from a memory (6) associated with the GPU (4) to a memory (5) associated with the CPU (3).
13. The encoder device (1) according to any one of claims 10 to 12, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to encode bi-directional predicted frames (B) based on intra coded frames (I) and/or predicted frames (P) previously encoded by the GPU.
14. The encoder device (1) according to claim 13, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to encode any intra coded frames (I) and/or predicted frames (P) for a current GOP (10a) of at least one input video stream currently selected as source stream of the encoder device (1) when these frames have not previously been encoded by the GPU (4).
15. The encoder device (1) according to any one of claims 10 to 14, wherein the CPU instructions further comprise instructions that, when executed by the CPU (3), causes the encoder device to move any intra coded frames (I) and/or predicted frames (P) of the input video stream currently selected as source stream from a memory (5) associated with the CPU (3) to a memory (6) associated with the GPU (4).
16. The encoder device (1) according to any one of claims 10 to 15, wherein the CPU (3) and the GPU (4) execute their respective instructions in parallel.
17. The encoder device (1) according to any one of claims 10 to 16, wherein the GPU instructions further comprise instructions that, when executed by the GPU (4), causes the encoder device to encode any intra coded frames (I) and/or predicted frames (P) for a plurality of input video streams currently not selected as source stream of the encoder device, and wherein different input video streams are encoded in different cores of the GPU (4).
18. The encoder device (1) according to any one of claims 10 to 17, wherein the CPU instructions to encode any intra coded frames (I) and/or predicted frames (P) for a future GOP comprise instructions that, when executed by the CPU (3), causes the encoder device to encode, in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of the input video stream currently selected as source stream of the encoder device.
19. An encoder device (1) comprising:
means for encoding, in a CPU (3) of the encoder device (1), bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
means for encoding, in a GPU (4) of the encoder device (1), any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
20. A computer program (91) for encoding a video stream, the computer program comprising computer program code which, when run on an encoder device (1) comprising a central processing unit, CPU, (3) and a graphic processing unit, GPU (4), causes the encoder device (1) to:
encode, in the CPU, bi-directional predicted frames (B) for a current group of pictures, GOP, (10a) of an input video stream currently selected as source stream of the encoder device (1); and
encode, in the GPU, any intra coded frames (I) and/or predicted frames (P) for a future GOP (10b) of at least one input video stream currently not selected as source stream of the encoder device.
21. A computer program product (90) comprising a computer program according to claim 20 and a computer readable means on which the computer program is stored.
PCT/SE2014/051004 2014-09-02 2014-09-02 Video stream encoding using a central processing unit and a graphical processing unit WO2016036285A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2014/051004 WO2016036285A1 (en) 2014-09-02 2014-09-02 Video stream encoding using a central processing unit and a graphical processing unit

Publications (1)

Publication Number Publication Date
WO2016036285A1 true WO2016036285A1 (en) 2016-03-10

Family

ID=51627338

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2014/051004 WO2016036285A1 (en) 2014-09-02 2014-09-02 Video stream encoding using a central processing unit and a graphical processing unit

Country Status (1)

Country Link
WO (1) WO2016036285A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6393057B1 (en) * 1998-08-14 2002-05-21 Dominique Thoreau MPEG stream switching process
JP2008092364A (en) * 2006-10-03 2008-04-17 Sanyo Electric Co Ltd Data processing device
US20090060032A1 (en) * 2007-05-11 2009-03-05 Advanced Micro Devices, Inc. Software Video Transcoder with GPU Acceleration
US20100008419A1 (en) * 2008-07-10 2010-01-14 Apple Inc. Hierarchical Bi-Directional P Frames

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BRIGHTWELL ET AL: "Flexible Switching and Editing of MPEG-2 Video Bitstreams", INTERNATIONAL BROADCASTING CONVENTION, LONDON, GB, 12 September 1997 (1997-09-12), pages 547 - 552, XP002098561 *
OLIVARES T ET AL: "Parallelization of the MPEG coding algorithm over a multicomputer. A proposal to evaluate its interconnection network", COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING, 1997. 10 YEARS PACRIM 1987-1997 - NETWORKING THE PACIFIC RIM. 1997 IEEE PACIFIC RIM CONFERE NCE ON VICTORIA, BC, CANADA 20-22 AUG. 1997, NEW YORK, NY, USA,IEEE, US, vol. 1, 20 August 1997 (1997-08-20), pages 113 - 116, XP010244929, ISBN: 978-0-7803-3905-7, DOI: 10.1109/PACRIM.1997.619914 *
WEISS S M: "SWITCHING FACILITIES IN MPEG-2: NECESSARY BUT NOT SUFFICIENT", SMPTE - MOTION IMAGING JOURNAL, SOCIETY OF MOTION PICTURE AND TELEVISION ENGINEERS, WHITE PLAINS, NY, US, vol. 104, no. 12, 1 December 1995 (1995-12-01), pages 788 - 802, XP000543847, ISSN: 0036-1682 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108933762A (en) * 2017-05-25 2018-12-04 腾讯科技(深圳)有限公司 A kind of play handling method and device of Media Stream
CN108933762B (en) * 2017-05-25 2020-12-04 腾讯科技(深圳)有限公司 Media stream playing processing method and device

Similar Documents

Publication Publication Date Title
TWI603609B (en) Constraints and unit types to simplify video random access
JP6449852B2 (en) Motion-restricted tileset for region of interest coding
CN102447906B (en) Low-latency video decoding
CA2682461C (en) Selective information handling for video processing
JP4825644B2 (en) Image decoding apparatus, image encoding apparatus, and system LSI
CN100508585C (en) Apparatus and method for controlling reverse-play for digital video bit stream
JP2008118616A (en) Method and apparatus for multi-threaded video decoding
KR20150037944A (en) Transmitting apparatus and method thereof for video processing
EP3202145B1 (en) Encoding and decoding a video frame in separate processing units
KR102123620B1 (en) Method and apparatus for entropy encoding or entropy decoding of video signals for large-scale parallel processing
US9451251B2 (en) Sub picture parallel transcoding
JP5116704B2 (en) Image coding apparatus and image coding method
JP2010288166A (en) Moving picture encoder, broadcast wave recorder, and program
CN111147926A (en) Data transcoding method and device
WO2010062466A2 (en) Device for decoding a video stream and method therof
US8300692B2 (en) Moving picture coding method, moving picture decoding method, moving picture coding device, and moving picture decoding device
JP2023506876A (en) Video data stream, video encoder, apparatus and method for hypothetical reference decoder and output layer set
WO2016036285A1 (en) Video stream encoding using a central processing unit and a graphical processing unit
JP2007259323A (en) Image decoding apparatus
US20220109891A1 (en) Features of range asymmetric number system encoding and decoding
JP2007150569A (en) Device and method for decoding image
WO2013114826A1 (en) Image decoding device
US20200137134A1 (en) Multi-session low latency encoding
US8428444B2 (en) Video server and seamless playback method
US20140044165A1 (en) Method and Apparatus for Inverse Scan of Transform Coefficients in HEVC

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14777199

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14777199

Country of ref document: EP

Kind code of ref document: A1