WO2015002582A1 - Method and arrangement for video transcoding - Google Patents

Method and arrangement for video transcoding

Publication number
WO2015002582A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
format
pixel
motion vector
video signal
Prior art date
Application number
PCT/SE2013/050849
Other languages
French (fr)
Inventor
Jacob STRÖM
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2013/050849
Publication of WO2015002582A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution

Definitions

  • Embodiments herein generally relate to the field of video coding, in particular to methods and arrangements for video transcoding.
  • a multimedia system might consist of various devices, such as PCs, laptops, PDAs and smart phones etc., interconnected via heterogeneous wireline and wireless networks.
  • multimedia content originally authored and compressed with a certain format might need bit rate adjustment and format conversion in order to allow access by receiving devices with diverse capabilities.
  • a transcoding mechanism is required to make the multimedia content adaptive to the capabilities of diverse networks and client devices. For example, if the bandwidth required for a particular video is fluctuating due to congestion or other causes, a transcoder can provide fine and dynamic adjustments in the bit rate of the video bitstream in the compressed domain without imposing additional functional requirements in the decoder.
  • a video transcoder can change the coding parameters of the compressed video, adjust spatial and temporal resolution, and modify the video content and/or the coding standard used.
  • transcoding between different codecs. For instance, in a video conference scenario, not all users might be able to decode a particular video format. It might then be necessary for a node or device in the network to transcode from the current video format to a video format that the end user can accept.
  • the simplest way to do video transcoding is to first decode the video to pure pixels, and then encode the pixels again to the desired format. This is shown in Figure 1.
  • one problem with this is that it is not very efficient. Typically the encoding process is much more time consuming than the decoding process. It is possible to do the encoding quickly, but then the quality or bit rate suffers.
  • a known and more efficient way is to directly transcode from e.g. H.264 to VP8, as is shown in Figure 2.
  • the transcoder can remember selections made in the H.264 format and use similar settings in the VP8 format. This increases the speed in relation to compression efficiency, e.g. quality per bit.
  • Figure 3 depicts the five codecs likely to be of interest in the next couple of years. As can be seen, implementing direct transcoding from every codec to every other codec requires 25 transcoder implementations, and this is only for the five most common codecs.
  • 25 transcoder implementations are needed. This is a huge number of transcoders to keep track of.
  • every time there is a new codec, the number of needed transcoders goes up further due to the combinatorics.
  • transcoding to pixels does not provide sufficient speed in combination with compression efficiency.
  • a video transcoder arrangement which includes at least one decoder arrangement, each configured for decoding a received video signal of a respective first video format into video data of a common intermediate format, and at least two encoder arrangements, each configured for selectively encoding the video data of the common intermediate format into a video signal of a respective second video format.
  • a video transcoding method which includes the steps of providing a video signal of at least a first predetermined video format, and decoding, in a decoder arrangement, the provided video signal of the at least a first predetermined video format into video data of a common intermediate format. Further, the method includes the step of selectively encoding, in an encoder arrangement, the video data of the common intermediate format into a video signal of one of at least two second predetermined formats.
  • An advantage of the proposed technology is that it enables quickly creating, maintaining and using a system of transcoders, where transcoding can happen from any codec to any other codec.
  • Fig. 1 illustrates a known transcoder
  • Fig. 2 illustrates a known transcoder
  • Fig. 3 illustrates necessary transcoder combinations
  • Fig. 4 illustrates an embodiment of the current technology
  • Fig. 5 illustrates an embodiment of a transcoding method of the current technology
  • Fig. 6 illustrates an embodiment of a transcoder arrangement according to the current technology
  • Fig. 7 illustrates a computer implementation of an embodiment of the current technology.
  • Motion estimation is the process of determining motion vectors that describe the transformation from one two-dimensional image to another, such as from adjacent video frames in a video sequence.
  • the motion vectors can relate to the entire image or specific parts thereof, such as rectangular blocks, arbitrarily shaped patches or even individual image pixels. Applying the motion vectors to an image to synthesize the transformation to the next image is called motion compensation. By determining motion vectors for the pixels or blocks of pixels in an image, it is possible to predict a subsequent image.
  • the above described motion estimation can also be referred to as inter-prediction. Consequently, a prediction model is created from one or more previously encoded video frames.
  • the model is formed by shifting samples in a reference frame, which is a case of motion compensated prediction.
  • An image frame which includes a multitude of individual image pixels is typically divided into one or more sub-frames or macro-blocks, each of which can be further divided into sub-blocks.
  • a motion vector can be determined for one or more pixels, for one or more sub-blocks, or for one or more macro-blocks. Finding an appropriate motion vector during inter-prediction gives rise to a certain amount of computational load. With known transcoding arrangements, the result of this computational load is sometimes discarded during a transcoding operation, something that the current disclosure addresses and avoids.
  • a basic idea of the present disclosure is to use or define a common intermediate media format or an interface other than pure pixels such that every decoder can be coupled with every encoder.
  • only one respective decoder and encoder per video format needs to be implemented, thus reducing the complexity of a transcoder arrangement and also simplifying the addition of new video formats.
  • Every new video format only requires one encoder and one decoder functionality to be added to the transcoder. This is shown in Figure 4, where a number of video formats are illustrated, and the encoding/decoding process is indicated by the arrows.
  • the method can be implemented in a network node and/or user device in a wireless or wired communication network.
  • a video signal of at least a first predetermined video format is provided in step S10, as input to a transcoder arrangement or functionality.
  • the provided video signal is subsequently decoded in step S20, in a decoder arrangement, from the first predetermined video format into video data of a common intermediate format.
  • the video data of the common intermediate format is selectively encoded in step S30, in an encoder arrangement, into a video signal of one of at least two second predetermined formats.
  • the decoding step S20 includes decoding the video signal of the first predetermined video format into image pixel values and additional information relating to the video signal of the first predetermined video format. The additional information is then selectively used to encode the video data to the second predetermined video format.
  • the additional information can comprise a multitude of different types of information relating to the provided video signal, where the information can be used or ignored in a subsequent encoding operation. Thereby, it is possible to utilize already existing video information data that would otherwise be lost, thus further improving the efficiency of the transcoding operation.
  • the additional information includes at least motion vector information for one or more pixels of a decoded image.
  • This motion vector information could include one or more motion vectors related to some or each of the one or more pixels of the decoded image. As an example, one motion vector points towards a previous frame and another motion vector points towards a future frame.
  • the common intermediate format or interface can be a set of pixel images, but where every pixel not only has a color but also additional information such as a motion vector associated with it.
  • the aim is to transcode from one video format such as H.264 to another video format such as VP8.
  • the video signal is decoded into the above-mentioned interchangeable format where every pixel has a color and potentially a motion vector.
  • the encoder arrangement takes this interchangeable format as input when starting to encode to VP8.
  • When the encoder compresses a block, it can examine the motion vector for e.g. the center pixel in the block.
  • the VP8 encoder arrangement can then try this motion vector position, which is likely to be good, instead of trying all possible motion vectors. It is thus likely that this will work much better than decoding followed by encoding as in Figure 1.
  • If the motion vector associated with the center pixel does not yield a satisfactory result, the motion vector associated with one or more of the other pixels in the block can be tried, for instance the one associated with the top left pixel. If this also fails, the encoder can of course disregard the extra information and search for a motion vector from scratch. It should therefore never be worse than decoding followed by encoding as in Figure 1.
  • the common interchangeable format is described to include pixels and "more".
  • “more” is the motion vector per pixel information described above.
  • Another type of information can be about the block structure, such as the number of surrounding pixels with a same motion vector as a current pixel. Are there 16, 32, or 64*64 pixels?
  • Other types of information can be whether the block was intra-coded or inter-coded, which reference picture or reference pictures the motion vector was referring to if it was inter-coded etc. In case a block has been intra- coded, the block is typically also predicted from previously coded parts of the image in question.
  • the pixels of the block are predicted from the left or from the top of the block.
  • the intra-prediction can have a directional quality.
  • the additional information could include an indication that directional intra-prediction has been used and if so, an indication on which direction a particular pixel has been intra-predicted from.
  • Any type of information that is possible to parse from the bit stream of the video signal and that is present in several of the other codecs can be of interest and part of the interchangeable format. Not all of the information needs to be used by the encoder. For example, some coders might have a fixed block structure, and will therefore ignore information about block structures.
  • the motion vector information can comprise motion vector information of every motion compensated pixel, and/or a predetermined value if the pixel is not motion compensated.
  • the common intermediate format can also comprise an indication whether a current pixel was intra coded or not.
  • the encoding step S30 can be based on at least part of the provided additional information. Thereby, an encoder which does not recognize all the provided additional information only utilizes the information that it does recognize.
  • the data that is available per pixel can be scaled using e.g. nearest neighbor to the new resolution. For instance, if a downscale from 640x480 to 320x240 is performed, the information stored in the interchangeable format might be of 320x240 pixels and contain the information of every second pixel in the x- and y-dimension from the 640x480 size.
  • a video transcoding method can be used to transcode from a first video format to a second, different video format, or from a first video format to the same video format, for example when changing the bitrate or resolution.
  • the video transcoder arrangement 1 is configured for decoding received video signals of at least one first predetermined format into video data of a common intermediate format, and further configured for selectively encoding the video data into a transcoded video signal of a second video format.
  • an embodiment of a video transcoder arrangement 1 includes at least one decoder arrangement 20, each configured for decoding a received video signal of a respective first video format into video data of a common intermediate format. Further, the video transcoder arrangement 1 includes at least two encoder arrangements 30, each of which is configured for selectively encoding the video data from the common intermediate format into a transcoded video signal of a respective second video format. In this embodiment one decoder arrangement 20 and two encoder arrangements 30 are disclosed. However, in a particular embodiment a plurality of both encoder arrangements 30 and decoder arrangements 20 are provided, thus enabling transcoding from any one of a plurality of first video formats into any one of a plurality of second video formats.
  • the first and the second video formats can be a same video format but with different bitrate or other characteristics, or be altogether different video formats.
  • the video transcoder arrangement is beneficially implemented in a network node or a user equipment node or device in a wireless or wired communication system.
  • the video transcoder arrangement includes all necessary equipment e.g. RX/TX, antenna etc. for successfully receiving and transmitting video signals. Consequently, the transcoder can be included and utilized in a device or node before transmitting the transcoded signal, or included and utilized in a device or node upon reception of a video signal.
  • the video transcoder arrangement 1 is configured to enable all functionality described with relation to Figure 6. This includes decoding video signals into an intermediate format comprising image pixel values and motion vector information for the received video signal.
  • Image pixel values can be generally understood to include pixel colors such as for example red, green and blue (RGB), or cyan, magenta, yellow and black (CMYK), or luminance, chrominance-U and chrominance-V (YUV).
  • the motion vector information can comprise motion vector information of every motion compensated pixel, or a predetermined value if the pixel is not motion compensated.
  • the intermediate format can include an indication whether a current pixel was intra coded or not.
  • the transcoder arrangement and method of the current technology can be utilized to transcode a video signal from a first video format to a second different video format, or from one video format to the same video format with a different compression or bitrate. The latter case occurs when a certain bitrate is not supported by a receiving device or if the available bandwidth is reduced for the transmission of the video signal.
  • the embodiments of the present technology can also be viewed as a system capable of transcoding between at least two combinations of formats (for instance A to B and A to C), where each such transcoding is divided into two steps: decoding from one format to an intermediate format containing pixel values such as R, G and B (or Y, U and V) and additional information, and encoding from said intermediate format to another format, and where the same intermediate format is used for all transcoding operations in the system.
  • Particular examples include one or more suitably configured digital signal processors and other known electronic circuits, e.g. discrete logic gates interconnected to form a specialized function, or Application Specific Integrated Circuits, ASICs.
  • processing circuitry includes, but is not limited to, one or more microprocessors, one or more Digital Signal Processors (DSPs), one or more Central Processing Units (CPUs), video acceleration hardware, and/or any suitable programmable logic circuitry, such as one or more Field Programmable Gate Arrays (FPGAs), or one or more Programmable Logic Controllers (PLCs).
  • the transcoder arrangement 200 comprises processing circuitry such as one or more processors 210, e.g. a microprocessor, which executes a software component 221 for providing a video signal of a first predetermined video format, and a software component 222 for decoding the provided video signal into video data of a common intermediate format. Further, the transcoder arrangement 200 includes a software component 223 for encoding the video data of the common intermediate format into a transcoded video signal of a second predetermined video format. These software components are stored in a memory 220.
  • the processing circuitry 210 and memory 220 are interconnected to each other to enable normal software execution.
  • An optional input/output device might also be interconnected to the processing circuitry and/or the memory to enable input and/or output of relevant data such as input parameter(s) and/or resulting output parameter(s).
  • the processor 210 communicates with the memory over a system bus.
  • a video signal is received by an input/output (I/O) controller 230 controlling an I/O bus, to which the processor 210 and the memory 220 are connected.
  • the signal received by the I/O controller 230 is stored in the memory 220, where it is processed by the software components.
  • Software component 221 might implement the functionality of the video signal-providing step S10 in the embodiment described with reference to Figure 5.
  • Software component 222 might implement the functionality of the decoding step S20, also with reference to Figure 5.
  • software component 223 might implement the functionality of the encoding step S30 described with reference to Figure 5.
  • the I/O unit 230 might be interconnected to the processor 210 and the memory 220 via an I/O bus to enable input and/or output of relevant data such as input parameters and/or resulting output parameters.
  • the term 'computer' should be interpreted in a general sense as any system or device capable of executing program code or computer program instructions to perform a particular processing, determining or computing task.
  • the computer program comprises program code which, when executed by the processing circuitry or computer, causes the processing circuitry or computer to provide S10 a video signal of at least a first predetermined video format, and to decode S20, in a decoder arrangement, the provided video signal of the at least first predetermined video format into video data of a common intermediate format.
  • the processing circuitry or computer selectively encodes S30, in an encoder arrangement, the video data of the common intermediate format into a video signal of one of at least two second predetermined formats.
  • the software or computer program might be realized as a computer program product, which is normally carried or stored on a computer-readable medium.
  • the computer-readable medium might include one or more removable or non-removable memory devices including, but not limited to a Read-Only Memory, ROM, a Random Access Memory, RAM, a Compact Disc, CD, a Digital Versatile Disc, DVD, a Universal Serial Bus, USB, memory, a Hard Disk Drive, HDD, storage device, a flash memory, or any other conventional memory device.
  • the computer program might thus be loaded into the operating memory of a computer or equivalent processing device for execution by the processing circuitry thereof.
  • the computer program stored in memory includes program instructions executable by the processing circuitry, whereby the processing circuitry is able or operative to execute the above-described steps, functions, procedures and/or blocks.
  • the transcoder arrangement 1 is thus configured to perform, when executing the computer program, well-defined processing tasks such as those described above.
  • the computer or processing circuitry does not have to be dedicated to only execute the above-described steps, functions, procedures, and/or blocks, but might also execute other tasks.
  • the main advantage of the present technology is the possibility to quickly create, maintain and use a transcoder or system of transcoders where transcoding can happen from any codec to any other codec, without the penalty of doing blind re-encoding as in Figure 1 and without the combinatorial explosion described in Figure 3.
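The resolution change mentioned in the list above (per-pixel data scaled with nearest neighbor, e.g. 640x480 to 320x240 keeping every second pixel) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `nearest_neighbor_resize` and the representation of the per-pixel data as a list of rows of `(dx, dy)` tuples are assumptions. The source does not say whether the vector values themselves are rescaled, so this sketch leaves them unchanged.

```python
def nearest_neighbor_resize(grid, new_w, new_h):
    """Subsample a 2D grid of per-pixel records with nearest-neighbor selection.

    `grid` is a list of rows; each entry is whatever is stored per pixel
    (here, hypothetically, a motion vector tuple). Values are copied as-is;
    the motion vectors are NOT rescaled, since the source text is silent on that.
    """
    old_h, old_w = len(grid), len(grid[0])
    return [
        [grid[(y * old_h) // new_h][(x * old_w) // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


# Toy 4x4 grid where each "motion vector" simply records its own position,
# so it is easy to see which source pixels survive the downscale.
full = [[(x, y) for x in range(4)] for y in range(4)]
half = nearest_neighbor_resize(full, 2, 2)
# half keeps every second pixel in x and y:
# [[(0, 0), (2, 0)], [(0, 2), (2, 2)]]
```

With a halving in both dimensions the index mapping reduces to picking source pixel (2x, 2y), which matches the "every second pixel" description for the 640x480 to 320x240 case.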

Abstract

A video transcoder arrangement (1) includes at least one decoder arrangement (20), each configured for decoding a received video signal of a respective first video format into video data of a common intermediate format, and at least two encoder arrangements (30), each configured for selectively encoding the video data of the common intermediate format into a video signal of a respective second video format.

Description

METHOD AND ARRANGEMENT FOR VIDEO TRANSCODING
TECHNICAL FIELD
Embodiments herein generally relate to the field of video coding, in particular to methods and arrangements for video transcoding.
BACKGROUND
One of the fundamental challenges in deploying multimedia systems is to deliver a smooth and uninterrupted flow of audio-visual information. A multimedia system might consist of various devices, such as PCs, laptops, PDAs and smart phones, interconnected via heterogeneous wireline and wireless networks. In such systems, multimedia content originally authored and compressed with a certain format might need bit rate adjustment and format conversion in order to allow access by receiving devices with diverse capabilities. Thus, a transcoding mechanism is required to make the multimedia content adaptive to the capabilities of diverse networks and client devices. For example, if the bandwidth required for a particular video is fluctuating due to congestion or other causes, a transcoder can provide fine and dynamic adjustments in the bit rate of the video bitstream in the compressed domain without imposing additional functional requirements in the decoder. In addition, a video transcoder can change the coding parameters of the compressed video, adjust spatial and temporal resolution, and modify the video content and/or the coding standard used.
In many situations, it is necessary to do transcoding between different codecs. For instance, in a video conference scenario, not all users might be able to decode a particular video format. It might then be necessary for a node or device in the network to transcode from the current video format to a video format that the end user can accept. The simplest way to do video transcoding is to first decode the video to pure pixels, and then encode the pixels again to the desired format. This is shown in Figure 1. However, one problem with this is that it is not very efficient. Typically the encoding process is much more time consuming than the decoding process. It is possible to do the encoding quickly, but then the quality or bit rate suffers.
A known and more efficient way is to directly transcode from e.g. H.264 to VP8, as is shown in Figure 2. In this case, the transcoder can remember selections made in the H.264 format and use similar settings in the VP8 format. This increases the speed in relation to compression efficiency, e.g. quality per bit.
For instance, most video coders use some form of motion vectors. Finding the best motion vector to use is a slow process. Forcing this search to happen quicker typically lowers quality for a certain bit rate. However, in the above described "direct transcoding" case, the motion vector used by the H.264 format can be used as a starting point for the search. This will mean that the VP8 format will converge quicker to a good solution. Another way to look at this is that when going from pixels in Figure 1, a lot of useful information that has taken a lot of computational effort to obtain is typically discarded. In the "direct transcode" case, the information is preserved.
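The benefit of reusing a motion vector as a search starting point can be sketched with a toy block-matching search. This is an illustrative sketch only: the function names (`sad`, `in_bounds`, `search`), the sum-of-absolute-differences cost and the square search window are assumptions chosen for the example, not details taken from H.264, VP8 or the cited direct transcoder.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` and a shifted block in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def in_bounds(ref, bx, by, dx, dy, size):
    """True if the shifted block lies fully inside the reference frame."""
    h, w = len(ref), len(ref[0])
    return 0 <= by + dy and by + dy + size <= h and 0 <= bx + dx and bx + dx + size <= w


def search(cur, ref, bx, by, size, center=(0, 0), radius=1):
    """Test every vector within `radius` of `center`; return (best_vector, cost, candidates_tested)."""
    best, best_cost, tested = None, None, 0
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            if not in_bounds(ref, bx, by, dx, dy, size):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            tested += 1
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost, tested


# Synthetic reference frame and a current frame that is the reference shifted by (2, -1).
ref = [[x * x + 3 * y * y + x * y for x in range(12)] for y in range(12)]
true_mv = (2, -1)
cur = [[0] * 12 for _ in range(12)]
for y in range(1, 12):
    for x in range(0, 10):
        cur[y][x] = ref[y + true_mv[1]][x + true_mv[0]]

# "Blind" search around (0, 0) versus a search seeded with a vector from the source bitstream.
full = search(cur, ref, 4, 4, 4, center=(0, 0), radius=3)
seeded = search(cur, ref, 4, 4, 4, center=(2, -1), radius=1)
# Both find (2, -1) with cost 0, but the seeded search tests 9 candidates instead of 49.
```

In this toy setting the hint happens to be exact; a real encoder would treat it as a starting point and could still fall back to a wider search if the cost is unsatisfactory.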
A real world example of a direct transcoder is presented by Thomas Rusert and Sina Tamanna [1], and this is a good example of existing technology in this area.
However, creating a transcoder that operates directly in this fashion is unique to the particular codecs between which transcoding takes place. When transcoding to HEVC instead of to VP8 the direct transcoder would be different.
With the advent of Google's codecs VP8 and the soon-to-come VP9, there are suddenly many more possible codecs around. Figure 3 depicts the five codecs likely to be of interest in the next couple of years. As can be seen, implementing direct transcoding from every codec to every other codec requires 25 transcoder implementations, and this is only for the five most common codecs.
In the example of Figure 3, the case of transcoding from one format to itself is included. This is a rather common case, for instance, when an end user cannot receive video at the current resolution and bit rate, and a network node has to down-sample the video.
As stated above, 25 transcoder implementations are needed. This is a huge number of transcoders to keep track of. In addition, every time there is a new codec the number of needed transcoders goes up further due to the combinatorics. At the same time, transcoding to pixels does not provide sufficient speed in combination with compression efficiency.
Consequently, there is a need for improving the efficiency of video transcoding, as the number of possible codecs is continuously increasing.
SUMMARY
It is an object to provide an improved transcoder.
This and other objects are met by embodiments of the proposed technology. According to a first aspect, there is provided a video transcoder arrangement which includes at least one decoder arrangement, each configured for decoding a received video signal of a respective first video format into video data of a common intermediate format, and at least two encoder arrangements, each configured for selectively encoding the video data of the common intermediate format into a video signal of a respective second video format.
According to a second aspect, there is provided a video transcoding method, which includes the steps of providing a video signal of at least a first predetermined video format, and decoding, in a decoder arrangement, the provided video signal of the at least a first predetermined video format into video data of a common intermediate format. Further, the method includes the step of selectively encoding, in an encoder arrangement, the video data of the common intermediate format into a video signal of one of at least two second predetermined formats.
An advantage of the proposed technology is that it enables quickly creating, maintaining and using a system of transcoders, where transcoding can happen from any codec to any other codec.
Other advantages will be appreciated when reading the detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
The embodiments, together with further objects and advantages thereof, might best be understood by referring to the following description taken together with the accompanying drawings, in which:
Fig. 1 illustrates a known transcoder;
Fig. 2 illustrates a known transcoder;
Fig. 3 illustrates necessary transcoder combinations;
Fig. 4 illustrates an embodiment of the current technology;
Fig. 5 illustrates an embodiment of a transcoding method of the current technology;
Fig. 6 illustrates an embodiment of a transcoder arrangement according to the current technology;
Fig. 7 illustrates a computer implementation of an embodiment of the current technology.
DETAILED DESCRIPTION
Throughout the drawings, the same reference numbers are used for similar or corresponding elements.
For a better understanding of the proposed technology, it might be useful to begin with a brief overview of what motion estimation and motion vectors in relation to pixels of decoded video signals entail.
Motion estimation is the process of determining motion vectors that describe the transformation from one two-dimensional image to another, such as from adjacent video frames in a video sequence. The motion vectors can relate to the entire image or specific parts thereof, such as rectangular blocks, arbitrarily shaped patches or even individual image pixels. Applying the motion vectors to an image to synthesize the transformation to the next image is called motion compensation. By determining motion vectors for the pixels or blocks of pixels in an image, it is possible to predict a subsequent image.
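Motion compensation as described above can be illustrated with a small block-based sketch. The function name `motion_compensate`, the fixed square block size and the clamping of out-of-frame references to the nearest edge sample are assumptions made for this example; real codecs define their own block structures and boundary rules.

```python
def motion_compensate(ref, vectors, block):
    """Predict a frame by copying, for each block, the reference samples its vector points at.

    `ref` is a 2D list of samples, `vectors[j][i]` is the (dx, dy) vector of the
    block in block-row j and block-column i, and `block` is the block size.
    """
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dx, dy = vectors[by // block][bx // block]
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    # Assumption: references outside the frame reuse the nearest edge sample.
                    sy = min(max(y + dy, 0), h - 1)
                    sx = min(max(x + dx, 0), w - 1)
                    pred[y][x] = ref[sy][sx]
    return pred


ref = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
# One vector per 2x2 block.
vectors = [
    [(0, 0), (-1, 0)],
    [(0, -1), (1, 1)],
]
pred = motion_compensate(ref, vectors, 2)
# pred == [[0, 1, 1, 2], [4, 5, 5, 6], [4, 5, 15, 15], [8, 9, 15, 15]]
```

An encoder would then code only the difference between the actual frame and `pred`, which is small when the vectors describe the motion well.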
The above described motion estimation can also be referred to as inter- prediction. Consequently, a prediction model is created from one or more previously encoded video frames. The model is formed by shifting samples in a reference frame, which is a case of motion compensated prediction. An image frame which includes a multitude of individual image pixels is typically divided into one or more sub-frames or macro blocks, each of which can be further divide into sub-blocks. A motion vector can be determined for one or more pixels, for one or more sub-blocks, or for one or more macro- blocks. Finding out an appropriate motion vector during inter-prediction gives rise to a certain amount of computational load. With known transcoding arrangements, the result of this computational load is sometimes discarded during a transcoding operation, something that is dealt with and negated in the current disclosure.
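To make the computational load of inter-prediction concrete, the following is a minimal sketch of an exhaustive block-matching search. It is not taken from the disclosure: the sum of absolute differences (SAD) as cost measure, the integer-pixel vectors, the function names `sad` and `full_search`, and the tiny 8x8 test frames are all assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` and a shifted block in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Try every displacement within +/- search_range.

    Returns (best_vector, best_cost, candidates_tried).
    """
    height, width = len(ref), len(ref[0])
    best_vector, best_cost, tried = None, None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip displacements that would read outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            tried += 1
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost, tried


# Reference frame with a unique value per pixel, and a current frame
# whose content is the reference shifted one pixel to the right.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

vector, cost, tried = full_search(cur, ref, bx=2, by=2, size=4, search_range=2)
print(vector, cost, tried)  # (-1, 0) 0 25
```

Even with a search range of only two pixels, 25 candidate positions are evaluated for a single 4x4 block; this is the work that is lost when a transcoder decodes to pixels only.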
A basic idea of the present disclosure is to use or define a common intermediate media format or an interface other than pure pixels such that every decoder can be coupled with every encoder. Thereby, only one respective decoder and encoder per video format needs to be implemented, thus reducing the complexity of a transcoder arrangement and also simplifying the addition of new video formats. Every new video format only requires one encoder and one decoder functionality to be added to the transcoder. This is shown in Figure 4, where a number of video formats are illustrated, and the encoding/decoding process is indicated by the arrows.
With reference to Figure 5, an embodiment of a video transcoding method will be described. The method can be implemented in a network node and/or user device in a wireless or wired communication network.
Initially a video signal of at least a first predetermined video format is provided in step S10, as input to a transcoder arrangement or functionality. The provided video signal is subsequently decoded in step S20, in a decoder arrangement, from the first predetermined video format into video data of a common intermediate format. Finally, the video data of the common intermediate format is selectively encoded in step S30, in an encoder arrangement, into a video signal of one of at least two second predetermined formats.
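The three steps can be sketched as a small pipeline in which decoders and encoders are looked up by format name. This is only an illustration under stated assumptions: the `Intermediate` class, the registries `DECODERS` and `ENCODERS`, and the toy formats "A", "B" and "C" are hypothetical stand-ins and do not correspond to any real codec.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Intermediate:
    """Hypothetical common intermediate format: pixel values plus side information."""
    pixels: list
    side_info: dict


# Toy stand-ins for real codecs; format names are used only as registry keys.
def decode_a(signal: bytes) -> Intermediate:
    return Intermediate(pixels=list(signal), side_info={"source_format": "A"})


def encode_b(data: Intermediate) -> bytes:
    return b"B:" + bytes(data.pixels)


def encode_c(data: Intermediate) -> bytes:
    return b"C:" + bytes(reversed(data.pixels))


DECODERS: Dict[str, Callable[[bytes], Intermediate]] = {"A": decode_a}
ENCODERS: Dict[str, Callable[[Intermediate], bytes]] = {"B": encode_b, "C": encode_c}


def transcode(signal: bytes, source_format: str, target_format: str) -> bytes:
    # S10: the video signal of the first format is provided as `signal`.
    # S20: decode into the common intermediate format.
    intermediate = DECODERS[source_format](signal)
    # S30: selectively encode with the encoder chosen for the target format.
    return ENCODERS[target_format](intermediate)


print(transcode(bytes([1, 2, 3]), "A", "B"))  # b'B:\x01\x02\x03'
print(transcode(bytes([1, 2, 3]), "A", "C"))  # b'C:\x03\x02\x01'
```

Because both encoders accept the same `Intermediate` object, one decoder serves every target format, which is the point of the common intermediate format.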
By using the common intermediate format, it is possible to decode a video signal into video data with a common intermediate format that is useable for multiple encoders. According to a further embodiment, the decoding step S20 includes decoding the video signal of the first predetermined video format into image pixel values and additional information relating to the video signal of the first predetermined video format. The additional information is then selectively used to encode the video data to the second predetermined video format.
The additional information can comprise a multitude of different types of information relating to the provided video signal, where the information can be used or ignored in a subsequent encoding operation. Thereby, it is possible to utilize already existing video information data that would otherwise be lost, thus further improving the efficiency of the transcoding operation. According to a particular embodiment, the additional information includes at least motion vector information for one or more pixels of a decoded image. This motion vector information could include one or more motion vectors related to some or each of the one or more pixels of the decoded image. As an example, one motion vector points towards a previous frame and another motion vector points towards a future frame.
According to an embodiment of the present disclosure, the common intermediate format or interface can be a set of pixel images, but where every pixel not only has a color but also additional information such as a motion vector associated with it. Assume that the aim is to transcode from one video format such as H.264 to another video format such as VP8. Instead of doing direct decoding or decoding to pixels only, as mentioned in the background section, according to embodiments of the video transcoding method of the current disclosure, the video signal is decoded into the above-mentioned interchangeable format where every pixel has a color and potentially a motion vector. For pixels belonging to so-called intra-blocks (which have no motion compensation) it is possible to introduce an indication or a dummy value, such as not-a-number (NaN). The encoder arrangement takes this interchangeable format as input when starting to encode to VP8. When the encoder compresses a block, it can examine the motion vector for e.g. the center pixel in the block. The VP8 encoder arrangement can then try this motion vector position, which is likely to be good, instead of trying all possible motion vectors. It is thus likely that this will work much better than decoding followed by encoding as in Figure 1. If the motion vector associated with the center pixel does not yield a satisfactory result, the motion vector associated with one or more of the other pixels in the block can be tried, for instance the one associated with the top left pixel. If this also fails, the encoder can of course disregard the extra information and search for a motion vector from scratch. It should therefore never be worse than decoding followed by encoding as in Figure 1.
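The hint order described above (center pixel, then top left pixel, then a search from scratch) can be sketched as follows. The text does not define what a "satisfactory result" is, so the SAD cost and the `threshold` parameter are assumptions, as are the integer-pixel vectors, the per-pixel vector map `mv_map`, and the function name `choose_vector`.

```python
import math


def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences for one candidate displacement."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(size)
        for x in range(size)
    )


def in_bounds(ref, bx, by, dx, dy, size):
    """True if the displaced block lies entirely inside the reference frame."""
    return (0 <= bx + dx and bx + dx + size <= len(ref[0])
            and 0 <= by + dy and by + dy + size <= len(ref))


def choose_vector(cur, ref, mv_map, bx, by, size, threshold, search_range):
    """Return (vector, candidates_evaluated) for the block at (bx, by)."""
    evaluated = 0
    # Hints: vector of the center pixel first, then of the top left pixel.
    for px, py in ((bx + size // 2, by + size // 2), (bx, by)):
        dx, dy = mv_map[py][px]
        if math.isnan(dx) or math.isnan(dy):
            continue  # NaN marks a pixel without motion compensation
        dx, dy = int(dx), int(dy)
        if not in_bounds(ref, bx, by, dx, dy, size):
            continue
        evaluated += 1
        if sad(cur, ref, bx, by, dx, dy, size) <= threshold:
            return (dx, dy), evaluated
    # Fallback: disregard the hints and search from scratch.
    best, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            if not in_bounds(ref, bx, by, dx, dy, size):
                continue
            evaluated += 1
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, evaluated


ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

NAN = float("nan")
good_hints = [[(-1.0, 0.0)] * 8 for _ in range(8)]   # decoder-supplied vectors
intra_hints = [[(NAN, NAN)] * 8 for _ in range(8)]   # intra block: no vectors

print(choose_vector(cur, ref, good_hints, 2, 2, 4, threshold=0, search_range=2))
# ((-1, 0), 1)
print(choose_vector(cur, ref, intra_hints, 2, 2, 4, threshold=0, search_range=2))
# ((-1, 0), 25)
```

With a usable hint the encoder evaluates a single candidate; without one it performs the same 25 evaluations as a blind re-encode, so in this sketch the hinted path costs at most two extra evaluations compared with searching from scratch.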
In addition, this also solves the above-described combinatorial explosion. Since the same common interchangeable format can be used for all codecs, it is only necessary to implement five decoders and five encoders as shown in Figure 4. In other words, one decoder for decoding from each particular format into the interchangeable format, and one encoder for encoding from the interchangeable format into each particular format. When introducing a new codec, only one new decoder and one new encoder need to be implemented. This enables a more efficient manner in which to add new video formats to the transcoder capabilities.
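The saving can be checked with a short calculation. The sketch below assumes that, without a common format, one dedicated transcoder is needed per ordered pair of distinct formats; that counting rule and the function names are assumptions for illustration and are not stated in this form in the text.

```python
def direct_transcoders(num_formats: int) -> int:
    """One dedicated transcoder per ordered pair of distinct formats."""
    return num_formats * (num_formats - 1)


def intermediate_components(num_formats: int) -> int:
    """One decoder plus one encoder per format."""
    return 2 * num_formats


for n in (5, 6):
    print(n, direct_transcoders(n), intermediate_components(n))
# 5 20 10
# 6 30 12
```

Under this assumption, going from five to six formats adds ten direct transcoders but only one decoder and one encoder when the interchangeable format is used.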
In Figure 4, the common interchangeable format is described to include pixels and "more". One example of "more" is the motion vector per pixel information described above. Another can be a "true/false" flag indicating whether the block or pixel was skipped. Another type of information can be about the block structure, such as the number of surrounding pixels with the same motion vector as the current pixel, for example whether there are 16, 32, or 64*64 such pixels. Other types of information can be whether the block was intra-coded or inter-coded, which reference picture or reference pictures the motion vector was referring to if it was inter-coded, etc. In case a block has been intra-coded, the block is typically also predicted from previously coded parts of the image in question. For instance, the pixels of the block are predicted from the left or from the top of the block. In other words, the intra-prediction can have a directional quality. Consequently, the additional information, according to a further embodiment, could include an indication that directional intra-prediction has been used and, if so, an indication of the direction from which a particular pixel has been intra-predicted. Any type of information that is possible to parse from the bit stream of the video signal and that is present in several of the other codecs can be of interest and part of the interchangeable format. Not all of the information needs to be used by the encoder. For example, some encoders might have a fixed block structure, and will therefore ignore information about block structures.
Consequently, according to various embodiments, the motion vector information can comprise motion vector information of every motion compensated pixel, and/or a predetermined value if the pixel is not motion compensated. In addition, the common intermediate format can also comprise an indication whether a current pixel was intra coded or not.
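One possible in-memory shape for a single pixel of such an intermediate format is sketched below. The class name `PixelRecord` and all field names are hypothetical; the disclosure lists the kinds of information but does not prescribe a layout.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

NAN = float("nan")


@dataclass
class PixelRecord:
    """Hypothetical per-pixel entry of the common intermediate format."""
    color: Tuple[int, int, int]                       # e.g. (Y, U, V) or (R, G, B)
    motion_vector: Tuple[float, float] = (NAN, NAN)   # NaN: not motion compensated
    intra_coded: bool = False
    skipped: bool = False
    reference_picture: Optional[int] = None           # index of the referenced picture
    intra_direction: Optional[str] = None             # e.g. "left" or "top"

    def has_motion_vector(self) -> bool:
        return not any(math.isnan(c) for c in self.motion_vector)


inter_pixel = PixelRecord(color=(120, 128, 128), motion_vector=(-1.0, 0.0), reference_picture=0)
intra_pixel = PixelRecord(color=(90, 128, 128), intra_coded=True, intra_direction="left")

print(inter_pixel.has_motion_vector(), intra_pixel.has_motion_vector())  # True False
```

Using NaN as the default motion vector follows the dummy-value idea mentioned for intra-blocks; a separate boolean flag would work equally well.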
Since not all codecs utilize motion vectors in the same manner, the encoding step S30 can be based on at least part of the provided additional information. Thereby, an encoder which does not recognize all the provided additional information only utilizes the information that it does recognize.
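As a minimal sketch of this selective use, an encoder can filter the side information down to the fields it recognizes. The field names and the helper `usable_side_info` are assumptions for illustration only.

```python
def usable_side_info(side_info: dict, recognized: set) -> dict:
    """Keep only the fields this encoder recognizes; everything else is ignored."""
    return {key: value for key, value in side_info.items() if key in recognized}


side_info = {
    "motion_vector": (-1.0, 0.0),
    "block_size": 16,
    "intra_direction": None,
}

# A hypothetical encoder with a fixed block structure ignores "block_size".
fixed_block_encoder_fields = {"motion_vector", "intra_direction"}
print(usable_side_info(side_info, fixed_block_encoder_fields))
# {'motion_vector': (-1.0, 0.0), 'intra_direction': None}
```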
When going between resolutions, the data that is available per pixel can be scaled using e.g. nearest neighbor to the new resolution. For instance, if a downscale from 640x480 to 320x240 is performed, the information stored in the interchangeable format might be of 320x240 pixels and contain the information of every second pixel in the x- and y-dimension from the 640x480 size.
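The downscaling example can be sketched as plain subsampling of the per-pixel records. The function name `downscale_nearest` and the tiny 4x4 grid are assumptions; the sketch copies the stored values unchanged, since the text does not state whether the motion vectors themselves should also be rescaled.

```python
def downscale_nearest(grid, factor):
    """Keep every `factor`-th entry in x and y (nearest-neighbor subsampling)."""
    return [row[::factor] for row in grid[::factor]]


# Per-pixel records for a tiny 4x4 picture; each entry stands for the
# color and side information stored for that pixel.
grid = [[(x, y) for x in range(4)] for y in range(4)]
small = downscale_nearest(grid, 2)
print(small)  # [[(0, 0), (2, 0)], [(0, 2), (2, 2)]]
```

Applied to a 640x480 grid with a factor of 2, the same call yields the 320x240 grid described above.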
The above-described embodiments of a video transcoding method can be used to transcode from a first video format to a second, different video format, or from a first video format to the same video format, for example when changing the bitrate, the resolution or the like.
With reference to Figure 6, an embodiment of a video transcoder arrangement 1 according to the present technology will be described. The video transcoder arrangement 1 is configured for decoding received video signals of at least one first predetermined format into video data of a common intermediate format, and further configured for selectively encoding the video data into a transcoded video signal of a second video format.
Consequently, an embodiment of a video transcoder arrangement 1 includes at least one decoder arrangement 20, each configured for decoding a received video signal of a respective first video format into video data of a common intermediate format. Further, the video transcoder arrangement 1 includes at least two encoder arrangements 30, each of which is configured for selectively encoding the video data from the common intermediate format into a transcoded video signal of a respective second video format. In this embodiment one decoder arrangement 20 and two encoder arrangements 30 are disclosed. However, in a particular embodiment a plurality of both encoder arrangements 30 and decoder arrangements 20 are provided, thus enabling transcoding from any one of a plurality of first video formats into any one of a plurality of second video formats.
According to a further embodiment, the first and the second video formats can be a same video format but with different bitrate or other characteristics, or be altogether different video formats.
The video transcoder arrangement is beneficially implemented in a network node or a user equipment node or device in a wireless or wired communication system.
Although not disclosed in Figure 6, the video transcoder arrangement includes all necessary equipment e.g. RX/TX, antenna etc. for successfully receiving and transmitting video signals. Consequently, the transcoder can be included and utilized in a device or node before transmitting the transcoded signal, or included and utilized in a device or node upon reception of a video signal.
Further, the video transcoder arrangement 1 is configured to enable all functionality described with relation to Figure 6. This includes decoding video signals into an intermediate format comprising image pixel values and motion vector information for the received video signal.
Image pixel values can be generally understood to include pixel colors such as for example red, green and blue (RGB), or cyan, magenta, yellow and black (CMYK), or luminance, chrominance-U and chrominance-V (YUV). The motion vector information can comprise motion vector information of every motion compensated pixel, or a predetermined value if the pixel is not motion compensated. In addition, according to a further embodiment, the intermediate format can include an indication whether a current pixel was intra coded or not. As mentioned previously, the transcoder arrangement and method of the current technology can be utilized to transcode a video signal from a first video format to a second different video format, or from one video format to the same video format with a different compression or bitrate. The latter case occurs when a certain bitrate is not supported by a receiving device or if the available bandwidth is reduced for the transmission of the video signal.
The embodiments of the present technology can also be viewed as a system capable of transcoding between at least two combinations of formats (for instance A to B and A to C), where each such transcoding is divided into two steps: decoding from one format to an intermediate format containing pixel values such as R, G and B (or Y, U and V) and additional information, and encoding from said intermediate format to another format. The same intermediate format is used for all transcoding operations in the system.
The steps, functions, procedures, and/or blocks described above might be implemented in hardware using any conventional technology, such as discrete circuit or integrated circuit technology, including both general-purpose electronic circuitry and application-specific circuitry.
Particular examples include one or more suitably configured digital signal processors and other known electronic circuits, e.g. discrete logic gates interconnected to form a specialized function, or Application Specific Integrated Circuits, ASICs.
Alternatively, at least some of the steps, functions, procedures, and/or blocks described above might be implemented in software such as a computer program for execution by suitable processing circuitry including one or more processing units. Examples of processing circuitry include, but are not limited to, one or more microprocessors, one or more Digital Signal Processors (DSPs), one or more Central Processing Units (CPUs), video acceleration hardware, and/or any suitable programmable logic circuitry, such as one or more Field Programmable Gate Arrays (FPGAs), or one or more Programmable Logic Controllers (PLCs).
It should also be understood that it might be possible to re-use the general processing capabilities of any conventional device or unit in which the proposed technology is implemented. It might also be possible to re-use existing software, e.g. by adding new software components.
In the following, an example of a computer implementation will be described with reference to Figure 7. The transcoder arrangement 200 comprises processing circuitry such as one or more processors 210, e.g. a microprocessor, which executes a software component 221 for providing a video signal of a first predetermined video format, and a software component 222 for decoding the provided video signal into video data of a common intermediate format. Further, the transcoder arrangement 200 includes a software component 223 for encoding the video data of the common intermediate format into a transcoded video signal of a second predetermined video format. These software components are stored in a memory 220. In this particular example, at least some of the steps, functions, procedures, and/or blocks described above are implemented in a computer program, which is loaded into the memory for execution by the processing circuitry. The processing circuitry 210 and memory 220 are interconnected to each other to enable normal software execution. An optional input/output device might also be interconnected to the processing circuitry and/or the memory to enable input and/or output of relevant data such as input parameter(s) and/or resulting output parameter(s). The processor 210 communicates with the memory over a system bus. A video signal is received by an input/output (I/O) controller 230 controlling an I/O bus, to which the processor 210 and the memory 220 are connected. In this embodiment, the signal received by the I/O controller 230 is stored in the memory 220, where it is processed by the software components.
Software component 221 might implement the functionality of the video signal-providing step S10 in the embodiment described with reference to Figure 5. Software component 222 might implement the functionality of the decoding step S20, also with reference to Figure 5. Finally, software component 223 might implement the functionality of the encoding step S30 described with reference to Figure 5. The I/O unit 230 might be interconnected to the processor 210 and the memory 220 via an I/O bus to enable input and/or output of relevant data such as input parameters and/or resulting output parameters.
The term 'computer' should be interpreted in a general sense as any system or device capable of executing program code or computer program instructions to perform a particular processing, determining or computing task. In a particular embodiment, the computer program comprises program code which, when executed by the processing circuitry or computer, causes the processing circuitry or computer to provide S10 a video signal of at least a first predetermined video format, and to decode S20, in a decoder arrangement, the provided video signal of the at least first predetermined video format into video data of a common intermediate format. Finally, the processing circuitry or computer selectively encodes S30, in an encoder arrangement, the video data of the common intermediate format into a video signal of one of at least two second predetermined formats.
The software or computer program might be realized as a computer program product, which is normally carried or stored on a computer-readable medium. The computer-readable medium might include one or more removable or non-removable memory devices including, but not limited to a Read-Only Memory, ROM, a Random Access Memory, RAM, a Compact Disc, CD, a Digital Versatile Disc, DVD, a Universal Serial Bus, USB, memory, a Hard Disk Drive, HDD, storage device, a flash memory, or any other conventional memory device. The computer program might thus be loaded into the operating memory of a computer or equivalent processing device for execution by the processing circuitry thereof.
For example, the computer program stored in memory includes program instructions executable by the processing circuitry, whereby the processing circuitry is able or operative to execute the above-described steps, functions, procedures and/or blocks.
The transcoder arrangement 1 is thus configured to perform, when executing the computer program, well-defined processing tasks such as those described above.
The computer or processing circuitry does not have to be dedicated to only execute the above-described steps, functions, procedures, and/or blocks, but might also execute other tasks. The main advantage of the present technology is the possibility to quickly create, maintain and use a transcoder or system of transcoders where transcoding can happen from any codec to any other codec, without the penalty of doing blind re-encoding as in Figure 1 and without the combinatorial explosion described in Figure 3.
The embodiments described above are merely given as examples, and it should be understood that the proposed technology is not limited thereto. It will be understood by those skilled in the art that various modifications, combinations and changes might be made to the embodiments without departing from the present scope as defined by the appended claims. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.

Claims

1. A video transcoder arrangement (1), comprising:
at least one decoder arrangement (10), each configured for decoding a received video signal of a respective first video format into video data of a common intermediate format;
at least two encoder arrangements (20), each configured for selectively encoding said video data of said common intermediate format into a video signal of a respective second video format.
2. The video transcoder arrangement according to claim 1, wherein said common intermediate format comprises image pixel values and motion vector information for said decoded received video signal.
3. The video transcoder arrangement according to claim 2, wherein said motion vector information comprises motion vector information of every motion compensated pixel.
4. The video transcoder arrangement according to claim 3, wherein said motion vector information comprises a predetermined value if the pixel is not motion compensated.
5. The video transcoder arrangement according to claim 1, wherein said common intermediate format further comprises an indication whether the pixel was intra coded or not.
6. The video transcoder arrangement according to claim 5, wherein if the pixel was intra-coded, said additional information comprises an indication whether said pixel was directionally predicted.
7. The video transcoder arrangement according to claim 6, wherein said additional information comprises an indication that said pixel was predicted directionally and an indication of a direction from which said pixel was predicted.
8. The video transcoder arrangement according to claim 1, wherein said first video format and said at least two second video formats comprise the same or different video formats.
9. The video transcoder arrangement according to any of claims 1-8, wherein said video transcoder arrangement (1) comprises a plurality of decoder arrangements (10) and a plurality of encoder arrangements (20), thereby enabling selectively transcoding a video signal from any of a plurality of first formats to a video signal of any of a plurality of second formats.
10. A video transcoding method, comprising the steps of:
providing (S10) a video signal of at least a first predetermined video format;
decoding (S20), in a decoder arrangement, said provided video signal of said at least a first predetermined video format into video data of a common intermediate format;
selectively encoding (S30), in an encoder arrangement, said video data of said common intermediate format into a video signal of one of at least two second predetermined formats.
11. The video transcoding method according to claim 10, wherein said decoding step (S20) comprises decoding said video signal into image pixel values and additional information relating to said video signal of said first predetermined video format.
12. The video transcoding method according to claim 11, wherein said additional information includes at least motion vector information.
13. The video transcoding method according to claim 12, wherein said at least motion vector information comprises motion vector information of every motion compensated pixel.
14. The video transcoding method according to claim 12, wherein said at least motion vector information comprises a predetermined value if the pixel is not motion compensated.
15. The video transcoding method according to claim 11, wherein said intermediate format also comprises an indication whether the pixel was intra coded or not.
16. The video transcoding method according to claim 15, wherein if the pixel was intra-coded, said additional information comprises an indication whether said pixel was directionally predicted.
17. The video transcoding method according to claim 14, wherein said additional information comprises an indication that said pixel was predicted directionally and an indication of a direction from which said pixel was predicted.
18. The video transcoding method according to claim 11, wherein said encoding step (S30) is based on at least part of the provided additional information.
PCT/SE2013/050849 2013-07-02 2013-07-02 Method and arrangement for video transcoding WO2015002582A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2013/050849 WO2015002582A1 (en) 2013-07-02 2013-07-02 Method and arrangement for video transcoding

Publications (1)

Publication Number Publication Date
WO2015002582A1 true WO2015002582A1 (en) 2015-01-08

Family

ID=48856918

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2013/050849 WO2015002582A1 (en) 2013-07-02 2013-07-02 Method and arrangement for video transcoding

Country Status (1)

Country Link
WO (1) WO2015002582A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3267686A4 (en) * 2015-03-03 2018-08-08 Tencent Technology (Shenzhen) Company Limited Video source access method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13740074

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13740074

Country of ref document: EP

Kind code of ref document: A1