WO2009149564A1 - Method and device for controlling bit-rate for video encoding, video encoding system using same and computer product therefor - Google Patents
- Publication number
- WO2009149564A1 (application PCT/CA2009/000833)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame
- quantization parameter
- encoded
- video
- frames
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/192—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding the adaptation method, adaptation tool or adaptation type being iterative or recursive
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/15—Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- TITLE METHOD AND DEVICE FOR CONTROLLING BIT-RATE FOR VIDEO ENCODING, VIDEO ENCODING SYSTEM USING SAME AND COMPUTER PRODUCT THEREFOR
- the present invention relates generally to the field of video encoding and, more specifically, to a method and apparatus for controlling bit-rate in a video service application.
- the invention is particularly useful in low bandwidth connections such as found, for example, in mobile networks.
- the quality of video services is subject to bandwidth availability of the carrier as well as the acceptable latency from the perspective of the user.
- video-conferencing, because of its real-time nature, has stringent latency requirements in addition to bandwidth limitations. Delays must be kept to a minimum; otherwise, interactions become awkward.
- Other video applications also have limited bandwidth/acceptable latency requirements, which must be taken into account in the video services context.
- media gateways and media servers are responsible for converting and adapting video content to the network conditions and endpoint devices.
- Media gateways and media servers that support video are responsible for, amongst other things, converting a video stream from one codec format to another. In doing so they typically must transcode between video codecs, adapt the frame rates, and change the frame size so as to adapt it to the screen on the target device.
- the media gateway and media server also ensure that the video stream on a given connection does not exceed its bandwidth budget.
- Video gateways and media servers make use of video encoding/decoding processors to encode/decode video frames.
- a video encoding processor can apply different levels (amounts) of compression to a frame in order to derive respective encoded versions of that frame.
- the levels of compression applied by the video encoding processor are controlled by parameter values, commonly referred to as quantization parameter values.
- a rate controller, in the context of video encoding, is a device that dynamically adjusts parameter values provided to a video encoding processor in order to attempt to achieve a target bit rate.
- a first type of technique for controlling the bit-rate for a video stream makes use of a multi-pass encoding approach.
- a set of frames is repeatedly encoded using different sets of parameter values, and the set of parameter values resulting in the best encoded series of frames for the given bandwidth budget is selected.
- Such multi-pass encoding techniques are used, for example, in the video broadcast or video authoring domains.
- Another popular approach for controlling bit-rate in a video stream involves controlling the average number of bits transmitted over a period of time.
- such an approach is implemented by monitoring encoded frames "after the fact", after they've been encoded, and keeping track of their size. When frames cause the allocated bandwidth to be exceeded, frames are dropped until there is sufficient bandwidth to continue.
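The "after the fact" approach described above can be illustrated with a minimal sketch. It assumes a simple leaky-bucket model in which the channel drains a fixed number of bits per frame interval; the function name `drop_frames`, its parameters and the bucket model are illustrative assumptions and are not taken from the TMN8 specification or from the present document.

```python
def drop_frames(frame_sizes_bits, bitrate_bps, fps, buffer_bits):
    """Return the indices of the frames that are sent; all other frames are dropped.

    Assumption: frames are already encoded (their sizes are known), and the
    channel drains bitrate_bps / fps bits during each frame interval.
    """
    per_frame_budget = bitrate_bps / fps
    fullness = 0.0  # bits currently waiting in the (hypothetical) send buffer
    sent = []
    for index, size in enumerate(frame_sizes_bits):
        # The channel drains the buffer during one frame interval.
        fullness = max(0.0, fullness - per_frame_budget)
        if fullness + size <= buffer_bits:
            fullness += size
            sent.append(index)
        # Otherwise the frame is dropped and the buffer keeps draining.
    return sent
```

With a 64 kbit/s channel at 10 frames per second and a 10,000-bit buffer, two consecutive 9,000-bit frames cause the second one to be dropped, which is the behaviour the paragraph above describes.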
- the reader is invited to refer to the specification document for the TMN8 rate controller, a copy of which can be found at http://ftp3.itu.int/av-arch/video-site/9706 Por/ql5a20rl.doc. The content of the aforementioned document is incorporated herein by reference.
- the invention provides a method using statistical information for determining a level of compression to be applied to a certain frame by a video encoding processor in order to generate an encoded frame.
- the statistical information is obtained by encoding a plurality of representative video frames and observing statistical trends of a resulting encoded video stream.
- the statistical information provides estimates of encoded frame sizes of encoded frames resulting from the video encoding processor using different levels of compression to encode the certain frame.
- the method comprises selecting the level of compression to be applied to the certain frame at least in part based on the statistical information and a target frame size.
- an improvement in video quality can be obtained, in particular in applications having limited bandwidth availability and requiring low latency such as, for example, mobile video applications.
- the method for selecting a frame quantization parameter value takes into account the type of frame being encoded (reference frame or non-reference frame, for example) in the selection.
- this may allow, amongst other things, the bandwidth priority between reference frames and non-reference frames to be optimized.
- the invention provides a bit-rate controller using statistical information for determining a level of compression to be applied to a frame by a video encoding processor in accordance with the above-described method.
- the invention provides a method for selecting a quantization parameter value for use by a video encoding processor encoding a certain frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor.
- the method comprises a) receiving target encoded frame size information conveying a desired frame size for the certain frame; b) providing a data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value; c) selecting a quantization parameter value at least in part based on said target encoded frame size information and on said data structure; d) releasing the selected quantization parameter value to the video encoding processor.
- the quantization parameter value is a reference frame quantization parameter value, the reference frame quantization parameter value being for use by the video encoding processor in encoding the certain frame as a reference frame.
- a reference frame is an intra-encoded frame, also referred to as an I-frame.
- the quantization parameter value is a non-reference frame quantization parameter value, the non-reference frame quantization parameter value being for use by the video encoding processor in encoding the certain frame as a non-reference frame.
- a non-reference frame is an inter-encoded frame, such as for example a P-frame or a B-frame.
- the selected quantization parameter value allows the video encoding processor to derive an encoded frame based on the certain frame such that the encoded frame has a frame size tending toward the desired frame size.
- selecting the quantization parameter value corresponds to an entry in the data structure associated to an encoded frame size approximating the target encoded frame size.
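The table lookup described above can be sketched as follows. The table contents are invented for illustration, and the rule used here (pick the entry with the largest estimated size that does not exceed the target) is only one possible reading of "approximating the target encoded frame size"; `SIZE_TO_QP` and `select_qp` are hypothetical names.

```python
from bisect import bisect_right

# Hypothetical entries: (estimated encoded frame size in bytes, QP value).
# Illustrative numbers only; a larger QP means more compression, smaller frames.
SIZE_TO_QP = [(1500, 31), (2500, 24), (4000, 18), (7000, 12), (12000, 6)]

def select_qp(target_size, table=SIZE_TO_QP):
    """Return the QP of the entry whose size is the largest not exceeding target_size."""
    entries = sorted(table)
    sizes = [size for size, _ in entries]
    index = bisect_right(sizes, target_size) - 1
    if index < 0:
        index = 0  # target below the smallest entry: use the strongest compression
    return entries[index][1]
```

Choosing the entry at or below the target errs on the side of staying within the bandwidth budget; an implementation that prefers quality could instead pick the nearest entry in either direction.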
- selecting the quantization parameter value comprises: a) selecting an initial quantization parameter value corresponding to an entry in the data structure associated to an encoded frame size approximating the target encoded frame size; b) on the basis of the initial quantization parameter value, encoding the certain frame to derive a resulting encoded frame, the resulting encoded frame having a certain size; c) selecting the quantization parameter value at least in part based on the certain size of the resulting encoded frame and the desired frame size for the certain frame.
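A minimal sketch of the two-step selection above, under stated assumptions: the encoder is passed in as a callable, QP values are assumed to lie in the 1 to 31 range used by H.263/MPEG-4 style codecs, and the one-step adjustment rule and the `tolerance` parameter are illustrative choices, not details given in the source. `fake_encode` is a stand-in used only to make the example runnable.

```python
def refine_qp(frame, target_size, initial_qp, encode,
              qp_min=1, qp_max=31, tolerance=0.2):
    """Encode once with initial_qp, then adjust the QP from the observed size."""
    actual_size = len(encode(frame, initial_qp))
    if actual_size > target_size:
        # Frame too large: increase the QP to compress more.
        return min(initial_qp + 1, qp_max)
    if actual_size < target_size * (1 - tolerance):
        # Frame well under budget: decrease the QP to spend bits on quality.
        return max(initial_qp - 1, qp_min)
    return initial_qp

def fake_encode(frame, qp):
    """Stand-in encoder (not a real codec): output shrinks as the QP grows."""
    return bytes(len(frame) // qp)

raw_frame = bytes(31000)
chosen_qp = refine_qp(raw_frame, target_size=3000, initial_qp=10, encode=fake_encode)
```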
- the method comprises selecting the data structure from a set of data structures.
- Each data structure in the set of data structures includes a respective set of entries mapping encoded frame sizes to corresponding quantization parameter values.
- the selection of the data structure may be based on a number of suitable criteria.
- each data structure in the set of data structures is associated to a respective (uncompressed) frame resolution.
- a data structure associated to a frame resolution corresponding to a frame resolution approximating the frame resolution associated to the frame to be encoded is selected.
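Selecting a data structure by resolution might look like the sketch below. The resolutions, the table placeholders and the use of pixel count as the closeness measure are assumptions made for illustration; the source only states that the resolution should approximate that of the frame to be encoded.

```python
# Hypothetical set of data structures keyed by (width, height).
# The string values stand in for the actual size-to-QP tables.
TABLES_BY_RESOLUTION = {
    (176, 144): "qcif_table",
    (352, 288): "cif_table",
    (640, 480): "vga_table",
}

def select_table(width, height, tables=TABLES_BY_RESOLUTION):
    """Return the table whose resolution is closest, by pixel count, to the frame's."""
    pixels = width * height
    closest = min(tables, key=lambda res: abs(res[0] * res[1] - pixels))
    return tables[closest]
```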
- each data structure in the set of data structures is associated to a respective video sequence type.
- Video sequence types may include, for example, sports broadcast video sequence and talking head video sequence amongst others.
- the certain video frame to be encoded is part of a sequence of frames and the data structure is selected from the set of data structures in part based on the video sequence type associated with the sequence of frames.
- the video sequence type associated to the sequence of frames may be provided as an input and used for selecting the quantization parameter.
- the video sequence type associated to the sequence of frames may be derived by processing at least some frames in the sequence of frames to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
- the invention provides an apparatus for selecting a quantization parameter value in accordance with the above-described method, the quantization parameter value being for use by a video encoding processor encoding a certain video frame.
- the invention provides a computer readable storage medium storing a program element suitable for execution by a processor for selecting a quantization parameter value for use by a video encoding processor encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor.
- the quantization parameter value is selected in accordance with the above-described method.
- the invention provides a video encoding system for encoding a video stream including a sequence of frames.
- the system comprises a first input for receiving a video frame originating from a sequence of frames to be encoded and a second input for receiving target encoded frame size information conveying a desired frame size for the certain video frame.
- the system also comprises an apparatus for selecting a quantization parameter value in accordance with the above-described method and an encoding processor in communication with the first input and with the apparatus.
- the encoding processor processes the certain video frame to generate an encoded video frame based in part on the selected quantization parameter.
- the system also includes an output for releasing the encoded video frame generated by the encoding processor.
- the invention provides a method for generating information for use in selecting a quantization parameter value for a video encoding processor, the quantization parameter value corresponding to a certain level of compression of the video encoding processor.
- the method comprises: a) providing a plurality of video frames representative of types of frames expected to be encoded by the video encoding processor; b) encoding the plurality of video frames for a quantization parameter value selected from a set of quantization parameter values to generate an encoded frame group, the encoded frame group including a plurality of encoded video frames derived using the quantization parameter value; c) deriving an encoded frame size corresponding to the quantization parameter value, the corresponding encoded frame size being derived at least in part based on frame sizes of encoded video frames in the encoded frame group; d) on a computer readable storage medium, storing information mapping the quantization parameter value to its derived corresponding encoded frame size.
- steps b) c) and d) are repeated for each quantization parameter value in the set of quantization parameter values.
- deriving the encoded frame size corresponding to the quantization parameter value comprises processing the frame sizes of encoded video frames in the encoded frame group to derive a range of frame sizes, the range of frame sizes having an upper limit.
- the derived range of frame sizes is such that a certain proportion of encoded frames in the encoded frame group have a frame size falling within the derived range of frame sizes.
- the encoded frame size corresponding to the quantization parameter value is set to substantially correspond to the upper limit of the derived range of frame sizes.
- the proportion of encoded frames in the encoded frame group used to define the range of frame sizes may vary in different specific implementations. In accordance with a specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 50%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 70%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 90%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 95%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 99%.
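The derivation of an encoded frame size from the upper limit of a range can be sketched as a percentile computation. This is a minimal sketch assuming the range starts at zero, so that the upper limit is the smallest size covering the requested percentage of frames; the function names and the integer `percent` argument are illustrative.

```python
def size_for_proportion(frame_sizes, percent=95):
    """Smallest size S such that at least `percent` % of the frames have size <= S."""
    if not frame_sizes:
        raise ValueError("frame_sizes must not be empty")
    ordered = sorted(frame_sizes)
    # Integer ceiling of percent * n / 100, with a minimum of one frame.
    needed = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[needed - 1]

def build_size_table(sizes_by_qp, percent=95):
    """Map each QP to the frame size that covers `percent` % of its encoded frames."""
    return {qp: size_for_proportion(sizes, percent)
            for qp, sizes in sizes_by_qp.items()}
```

A higher percentage yields a more conservative (larger) size estimate for a given QP, so the selected QP is less likely to produce a frame that overshoots the target.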
- the plurality of video frames includes a first sub-set of frames having a first frame resolution and a second sub-set of frames having a second frame resolution distinct from the first frame resolution.
- the method comprises processing video frames in the first sub-set of frames to derive a first encoded frame size corresponding to the quantization parameter value and the first frame resolution.
- the method also comprises processing video frames in the second sub-set of frames to derive a second encoded frame size corresponding to the quantization parameter value and the second frame resolution.
- the plurality of video frames includes a first sequence of frames of a first video sequence type and a second sequence of frames of a second video sequence type distinct from the first video sequence type.
- the method comprises processing video frames in the first sequence of frames to derive a first encoded frame size corresponding to the quantization parameter value and the first video sequence type.
- the method also comprises processing video frames in the second sequence of frames to derive a second encoded frame size corresponding to the quantization parameter value and the second video sequence type.
- the first video sequence type is selected from a set of possible video sequence types including, for example, sports broadcast video sequence and talking head video sequence amongst others. The person skilled in the art will appreciate that the above process may be applied to any number of distinct video sequence types, in order to derive a mapping between encoded frame sizes and respective quantization parameter values for each distinct video sequence type.
- the invention provides a computer readable storage medium storing a data structure for use in selecting a quantization parameter for a video encoding processor, entries in the data structure being generated in accordance with the above described method.
- the invention provides a computer readable storage medium storing a data structure for use in selecting a quantization parameter value for a video encoding processor from a set of quantization parameter values, each quantization parameter value in the set corresponding to a respective level of compression of the video encoding processor.
- the data structure includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value in the set of quantization parameter values.
- the entries in the data structure are derived at least in part by encoding a plurality of representative video frames and observing statistical trends in the resulting encoded video streams.
- the invention provides an apparatus for generating information for use in selecting a quantization parameter value for a video encoding processor in accordance with the above described method.
- the invention provides a method for selecting a quantization parameter value for use by a video encoding processor in encoding a non-reference frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor.
- the method comprises: a) receiving target encoded non-reference frame size information conveying a desired frame size for a certain video frame to be encoded as a non-reference frame by the video encoding processor; b) receiving reference frame information associated to a reference frame on which the non-reference frame will be partly based, said reference frame information including: i. a reference frame quantization parameter value associated to the reference frame and corresponding to a level of compression applied to generate the reference frame; ii. reference frame size information conveying a frame size associated to the reference frame; c) selecting the quantization parameter value for use in encoding the non-reference frame at least in part based on: i. the target encoded non-reference frame size information; ii. the reference frame quantization parameter value; and iii. the reference frame size information; d) releasing the selected quantization parameter value for use in encoding the non-reference frame to the video encoding processor.
- the above-described method is used to select a non-reference frame quantization parameter value for use by a video encoding processor when encoding a frame as a first non-reference frame following a frame in a sequence of frames encoded as a reference frame.
- the method comprises providing a data structure providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values.
- the data structure comprises a plurality of entries, each entry being associated to a respective combination of a reference frame quantization parameter value and a non-reference frame quantization parameter value.
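One way to picture such a data structure is a table keyed by the (reference frame QP, non-reference frame QP) combination. The sketch below assumes each entry stores an expected ratio of P-frame size to I-frame size, which is suggested by the description of Fig. 12 but not spelled out in this passage; the values, `RATIO_TABLE` and `select_p_frame_qp` are hypothetical.

```python
# Hypothetical entries: (I-frame QP, P-frame QP) -> expected P-size / I-size ratio.
RATIO_TABLE = {
    (15, 10): 0.40,
    (15, 15): 0.25,
    (15, 20): 0.15,
    (15, 25): 0.10,
}

def select_p_frame_qp(target_p_size, i_frame_qp, i_frame_size, table=RATIO_TABLE):
    """Pick the lowest P-frame QP whose expected size fits within the target."""
    target_ratio = target_p_size / i_frame_size
    candidates = {p_qp: ratio for (i_qp, p_qp), ratio in table.items()
                  if i_qp == i_frame_qp}
    if not candidates:
        raise KeyError(f"no entries for I-frame QP {i_frame_qp}")
    fitting = [p_qp for p_qp, ratio in candidates.items() if ratio <= target_ratio]
    # If nothing fits, fall back to the strongest available compression.
    return min(fitting) if fitting else max(candidates)
```

The three inputs match those listed in the method: the target non-reference frame size, the reference frame QP and the reference frame size.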
- the method comprises selecting the data structure from a set of data structures. Each data structure includes a respective set of entries providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values. The selection may be based on a number of suitable criteria.
- each data structure in the set of data structures is associated to a respective frame resolution.
- a data structure associated to a frame resolution corresponding to a frame resolution approximating the frame resolution associated to the certain video frame to be encoded is selected.
- each data structure in the set of data structures is associated to a respective video sequence type.
- the certain video frame to be encoded is part of a sequence of frames and the data structure is selected from the set of data structures based in part on the video sequence type associated to the sequence of frames.
- the invention provides an apparatus for selecting a non-reference frame quantization parameter value for use by a video encoding processor in accordance with the above-described method.
- the invention provides a computer readable storage medium storing a program element suitable for execution by a processor for selecting a non-reference frame quantization parameter value for use by a video encoding processor.
- the non-reference frame quantization parameter value is selected in accordance with the above-described method.
- the invention provides a video encoding system for encoding a video stream including a sequence of frames.
- the system comprises a first input for receiving a video frame originating from a sequence of frames to be encoded and a second input for receiving target encoded frame size information conveying a desired frame size for the certain video frame.
- the system also comprises an apparatus for selecting a non-reference frame quantization parameter value in accordance with the above-described method and an encoding processor in communication with the first input and with the apparatus.
- the encoding processor processes the certain video frame to generate an encoded video frame based in part on the selected non-reference frame quantization parameter.
- the system also includes an output for releasing the encoded video frame generated by the encoding processor.
- the invention provides a method for generating information for use in selecting a non-reference frame quantization parameter value from a set of possible non-reference frame quantization parameter values for use by a video encoding processor.
- Each non-reference frame quantization parameter value in the set corresponds to a certain level of compression of the video encoding processor.
- the method comprises: a) providing a sequence of video frames representative of types of frames to be encoded by the video encoding processor; b) encoding the sequence of video frames so as to generate: i. a plurality of reference frames associated with a given reference frame quantization parameter value, the reference frame quantization parameter value corresponding to a level of compression applied to generate the plurality of reference frames; ii. a plurality of non-reference frames, the plurality of non-reference frames being arranged in sets, each set of non-reference frames being associated with a respective non-reference frame quantization parameter value corresponding to a level of compression applied to generate the non-reference frames in the associated set of non-reference frames; c) deriving a mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values based on the plurality of reference frames and on the plurality of non-reference frames; d) storing the derived mapping in a data structure on a computer readable storage medium.
- steps b), c) and d) described above are repeated for each reference frame quantization parameter value in a set of possible reference frame quantization parameter values. This results in a mapping between each reference frame quantization parameter value in the set of possible reference frame quantization parameter values and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values.
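The repetition of steps b), c) and d) over every reference frame QP can be sketched as a nested loop. This is a simplified sketch: the encoders are hypothetical callables, every frame is encoded both ways for convenience, and the stored quantity (ratio of mean P-frame size to mean I-frame size) is an assumption based on the description of Fig. 12.

```python
from statistics import mean

def build_ratio_table(frames, i_qps, p_qps, encode_i, encode_p):
    """Return {(i_qp, p_qp): mean P-frame size / mean I-frame size}."""
    table = {}
    for i_qp in i_qps:                      # repeat for each reference frame QP
        mean_i_size = mean(len(encode_i(frame, i_qp)) for frame in frames)
        for p_qp in p_qps:                  # one set of P-frames per P-frame QP
            mean_p_size = mean(len(encode_p(frame, i_qp, p_qp)) for frame in frames)
            table[(i_qp, p_qp)] = mean_p_size / mean_i_size
    return table

# Stand-in encoders (not real codecs), used only to make the sketch runnable.
def fake_encode_i(frame, i_qp):
    return bytes(len(frame) // i_qp)

def fake_encode_p(frame, i_qp, p_qp):
    return bytes(len(frame) // (4 * p_qp))

sample_frames = [bytes(6000), bytes(12000)]
ratio_table = build_ratio_table(sample_frames, [15], [10, 30],
                                fake_encode_i, fake_encode_p)
```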
- Fig. 1 shows a simplified high-level functional block diagram of a video encoding system in accordance with a specific example of implementation of the present invention
- Fig. 2 shows a high level flow diagram of a process implemented by the video encoding system shown in Figure 1 in accordance with a specific example of implementation of the present invention
- Fig. 3 is a functional block diagram of an I-frame processing module suitable for use in the video encoding system depicted in Figure 1 in accordance with a specific example of implementation of the present invention
- Fig. 4 is a functional block diagram of a bit-rate controller module suitable for use in connection with the I-frame processing module shown in Figure 3 in accordance with a specific example of implementation of the present invention
- Fig. 5 shows a flow diagram of a process implemented by the bit-rate controller module shown in Figure 4 for selecting a quantization parameter in accordance with a specific example of implementation of the present invention
- Fig. 6 shows a process for generating a data structure including statistical information for use by the bit-rate controller module of Figure 4 in accordance with a specific example of implementation of the present invention
- Fig. 8 is a functional block diagram of a P-frame processing module suitable for use in the video encoding system depicted in Figure 1 in accordance with a specific example of implementation of the present invention
- Fig. 9 is a block diagram of a bit-rate controller module suitable for use in connection with the P-frame processing module shown in Figure 8 in accordance with a specific example of implementation of the present invention
- Fig. 10 shows a flow diagram of a process implemented by the bit-rate controller module shown in Figure 9 for selecting a quantization parameter in accordance with a specific example of implementation of the present invention
- Fig. 11 shows a process for generating a data structure including statistical information for use by the bit-rate controller module of Figure 9 in accordance with a specific example of implementation of the present invention
- Fig. 12 is a graph showing an exemplary relationship between P-frame QP values and ratios of P-frame sizes and I-frame sizes for a given I-frame QP value of 15, the relationship being derived as part of the process depicted in Figure 11
- Fig. 13 illustrates an example of a display order of I-frame, P-frame and B-frame encoded video frames that can be generated using MPEG-4 coding
- Fig. 14 is a graph depicting a relationship between I-frame QP values used for generating an I-frame and the size of the I-frame in accordance with a specific example of implementation of the present invention
- Fig. 15 is a block diagram of an apparatus for selecting a quantization parameter value in accordance with a specific example of implementation of the present invention
- Fig. 16 is a block diagram of a bit-rate controller module suitable for use in connection with the I-frame processing module shown in Figure 3 in accordance with a variant of the present invention
- Fig. 17 is a block diagram of a bit-rate controller module suitable for use in connection with the P-frame processing module shown in Figure 8 in accordance with a variant of the present invention.
- the specific example of implementation of the invention will describe the selection of a quantization parameter value (reference frame QP or non-reference frame QP) for the purpose of encoding an entire frame.
- the embodiment(s) described provide that a quantization parameter value is selected for the purpose of encoding all macroblocks in a frame.
- alternative embodiments of the invention may apply the processes described in the present application for the selection of a quantization parameter value (reference frame QP or non-reference frame QP) for one macroblock (or a subset of macroblocks) in a given frame.
- different quantization parameter values can be selected for encoding different portions of a frame corresponding to individual macroblocks or subsets of macroblocks. It is also to be appreciated that the macroblocks in a given subset of macroblocks need not be adjacent to one another in a frame but may be positioned anywhere within the frame.
- alternative embodiments of the invention may modify the quantization parameter value dynamically so that different quantization parameter values are used for different macroblocks (or subsets of macroblocks). Such alternative embodiments will become readily apparent to the person skilled in the art of video processing in light of the present description.
- macroblock is a term used in video compression, which generally represents a block of 16 by 16 pixels. Macroblocks can be subdivided further into smaller blocks. H.264, for example, supports block sizes as small as 4x4.
- a frame is composed of macroblocks. The higher (larger) the frame resolution, the more macroblocks it is composed of.
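- as an illustration of how the macroblock count grows with resolution, the following minimal sketch counts the 16 by 16 macroblocks needed to cover a frame. The function name and the choice to count partial edge blocks as whole macroblocks are assumptions made for this example only.

```python
import math

MACROBLOCK_SIZE = 16  # pixels per side, matching the 16 by 16 macroblock described above


def macroblock_count(width: int, height: int, mb_size: int = MACROBLOCK_SIZE) -> int:
    """Return the number of macroblocks needed to cover a width x height frame."""
    # Assumption: a partial macroblock at the right or bottom edge counts as a whole one.
    columns = math.ceil(width / mb_size)
    rows = math.ceil(height / mb_size)
    return columns * rows


qcif_blocks = macroblock_count(176, 144)  # 11 columns x 9 rows
cif_blocks = macroblock_count(352, 288)   # 22 columns x 18 rows
```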
- frame is used interchangeably with image.
- encoding processors often make use of spatial and temporal redundancies when encoding video frames.
- "Intra frame" coding exploits the spatial redundancies in a frame to compress the information.
- "Inter frame" coding exploits the temporal redundancies between a current frame and previous or following frames.
- a reference frame is an encoded version of a single raw (uncompressed) frame that can take advantage of spatial redundancy.
- reference frames are independent, meaning that they do not depend on data in preceding or following frames to be decoded.
- Non-reference frames are derived based on a reference frame and typically provide more compression than reference frames because they take advantage of the temporal redundancies in a previous and/or a subsequent reference frame.
- Reference and non-reference frames may be used in the context of video encoding depending on the coding protocol being used.
- the case where the reference frame is an I-frame and the non-reference frames are either P-frames or B-frames will be considered.
- P-frames are derived by taking advantage of the temporal redundancies in a previous I-frame (but not in a subsequent I-frame) while B-frames are derived taking advantage of the temporal redundancies in a previous I-frame and in a subsequent I-frame.
- I-frames, P-frames and B-frames are commonly used in the art of video encoding and will therefore not be described in further detail here.
- Figure 13 of the drawings illustrates an example of a display order of encoded video frames that can be generated using MPEG-4 coding and exemplifies one implementation of I-frames, P-frames and B-frames in a given frame sequence.
- the present example of implementation of the invention will describe situations where I-frames and P-frames are generated (not B-frames).
- FIG. 1 there is shown a functional block diagram of a video encoding system 100 in accordance with a specific example of implementation of the present invention.
- the video encoding system 100 includes a frame type selector module 106, an I-frame (reference frame) processing module 112, a P-frame (non-reference frame) processing module 120, a reconstruction module 150 and an output signal generator 126.
- the video encoding system 100 also includes a first input 104 for receiving a frame type selector parameter, a second input 102 for receiving a sequence of video frames to be encoded ("raw" frames) and an output 128 for releasing a coded video stream.
- the video encoding system 100 also includes a data line 114 for receiving a target encoded size for an encoded I-frame (reference frame) and a data line 116 for receiving a target encoded size for an encoded P-frame (non-reference frame). It is to be appreciated that the target encoded size for an encoded P-frame (non-reference frame) and the target encoded size for an encoded I-frame (reference frame) can be the same size and can be received at a same data line.
- the target encoded size for an encoded P-frame (non-reference frame) and the target encoded size for an encoded I-frame (reference frame) are generated in accordance with any suitable process for determining a desired frame size for an encoded frame given a bandwidth budget. Such processes are well known in the art and as such will not be described in further detail here.
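- the description leaves the process for deriving target encoded sizes open. As one hedged illustration only, the sketch below splits a bandwidth budget evenly across frames; the function name, the even split and the sample figures are assumptions, not the method of the present description.

```python
def target_frame_size_bits(bitrate_bps: float, frames_per_second: float) -> float:
    """Return an even per-frame bit budget for a given channel bit-rate.

    Assumption: every frame gets the same share of the budget. A practical
    system may instead give I-frames a larger share than P-frames.
    """
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return bitrate_bps / frames_per_second


# Hypothetical mobile link: 128 kbit/s at 15 frames per second.
budget = target_frame_size_bits(128_000, 15)
```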
- the frames in the sequence of frames received at input 102 have a certain resolution.
- the resolution of the frames received, which may be expressed in terms of a frame size in bits, may be the same for all frames received at input 102 or may vary over time so that frames of different resolutions are received at input 102.
- the frame type selector module 106 determines, based on the frame type selector parameter received at the first input 104, what type of encoded frame will be generated.
- the type of frame to be generated may be an I-frame or a P-frame; however, other types of frames (such as, for example, B-frames) may also be contemplated in alternative examples of implementation.
- the sequence of frame type selector parameters generated for a sequence of video frames is dictated by a certain video transmission standard and/or policy specified by the video application service provider.
- the frame type selector parameters received at input 104 are generated in accordance with any suitable process. Such processes are known in the art of video signal encoding and as such will not be described in further detail here.
- the frame type selector parameter received at input 104 conveys an I-frame type
- the next video frame in the sequence of frames at input 102 is transmitted to the I-frame processing module 112 over data line 108.
- the I-frame processing module 112 generates an encoded I-frame based in part on a "raw" video frame received over data line 108 and the target frame size information received over data line 114.
- the encoded I-frame is released over data line 122.
- the quantization parameter (I-frame QP) used by the I-frame processing module 112 to generate the encoded I-frame, is also released over data line 118.
- the manner in which the I-frame processing module 112 generates the encoded I-frame and selects the quantization parameter value (I-frame QP) will be described later in the specification.
- the reconstruction module 150 generates a reconstructed version of a video frame based on the encoded I-frame released by the I-frame processing module 112 over data line 122. Reconstructing a video frame based on a compressed version of a frame (the encoded I-frame) may be done according to well-known methods.
- the P-frame processing module 120 generates an encoded P-frame based in part on a "raw" video frame received over data line 110 and the target frame size information received over data line 116.
- the P-frame processing module 120 receives from reconstruction module 150 over data line 152 the reconstructed version of the video frame, which was derived from an encoded I-frame released over data line 122 by the I-frame processing module 112.
- the P-frame processing module 120 also receives from data line 118 the quantization parameter (I-frame QP) used by the I-frame processing module 112 when generating the encoded I-frame.
- the encoded P-frame is released over data line 124. The manner in which the P-frame processing module 120 generates the encoded P-frame will be described later in the specification.
- the output signal generator 126 combines the I-frames and P-frames released over data lines 122 and 124 to generate an output coded video stream 128.
- the coded video stream is in any suitable form for transmission over a communication path.
- the output signal generator 126 combines the I-frames and P-frames in accordance with any suitable process. Such processes are known in the art of video signal encoding and as such will not be described in further detail here.
- A simplified flow diagram of a process implemented by the system 100 is shown in Figure 2.
- a video frame from a sequence of video frames to be encoded is received at step 200. It should be appreciated that the frame received at this step may be the first in the sequence of video frames or may be any other frame within the sequence of frames.
- the frame type selector parameter is received. This parameter identifies the type of frame the video frame received in step 200 should be encoded as.
- the frame type selector parameter specifies either an I-frame type or a P-frame type although other types of frames may be contemplated in alternative examples of implementation of the invention.
- at step 204, depending on the frame type selector parameter received in step 202, a different path is taken.
- at step 204, if the frame type selector parameter received in step 202 indicates that the video frame received in step 200 should be encoded as an I-frame, the system proceeds to perform steps 206 and 208.
- steps 206 and 208 are implemented by the I-frame processing module 112 (shown in Figure 1).
- a quantization parameter (I-frame QP) for use in encoding the video frame received in step 200 is derived by a first process making use of statistical information. This first process will be described later below.
- the I-frame QP value derived in step 206 is used by a video encoding processor to encode the video frame received at step 200 as an I-frame. The result of step 208 is the release of an encoded I-frame.
- steps 210 and 212 are implemented by the P-frame processing module 120 (shown in Figure 1).
- a quantization parameter (P-frame QP) for the P-frame to be encoded from the video frame received in step 200 is derived by a second process making use of statistical information. This second process will be described later below.
- the P-frame QP value derived in step 210 is used by a video encoding processor to encode the video frame received at step 200 as a P-frame.
- the result of step 212 is the release of an encoded P-frame.
- at step 214, the released I-frame or P-frame derived in either one of steps 208 or 212 is processed and combined with other encoded I-frames and P-frames to create a coded video stream in accordance with any suitable process for output to a target device (not shown).
- Processes for creating coded video streams based on I-frames and P-frames are known in the art of video encoding and as such will not be further described here. If the released I-frame or P-frame is not the last frame in the plurality of video frames contained within a sequence of video frames to be encoded, the next frame to be processed is selected and another iteration of the process shown in Figure 2 is performed.
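- the flow of Figure 2 can be sketched as a simple dispatch loop. This is a minimal illustration only: the function name, the callables passed in and the string labels "I" and "P" are hypothetical stand-ins for the modules of Figure 1.

```python
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")
Encoded = TypeVar("Encoded")


def encode_sequence(
    frames: Iterable[Frame],
    frame_types: Iterable[str],
    select_i_qp: Callable[[Frame], int],
    select_p_qp: Callable[[Frame], int],
    encode: Callable[[Frame, str, int], Encoded],
) -> List[Encoded]:
    """Encode each frame as an I-frame or a P-frame and collect the results."""
    stream: List[Encoded] = []
    for frame, frame_type in zip(frames, frame_types):  # steps 200 and 202
        if frame_type == "I":                           # step 204: choose the path
            qp = select_i_qp(frame)                     # step 206: derive I-frame QP
        elif frame_type == "P":
            qp = select_p_qp(frame)                     # step 210: derive P-frame QP
        else:
            raise ValueError(f"unsupported frame type: {frame_type!r}")
        stream.append(encode(frame, frame_type, qp))    # steps 208/212, then 214
    return stream


# Stub callables, used only to show the control flow.
result = encode_sequence(
    ["f0", "f1", "f2"],
    "IPP",
    select_i_qp=lambda frame: 30,
    select_p_qp=lambda frame: 32,
    encode=lambda frame, frame_type, qp: (frame_type, qp, frame),
)
```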
- the I-frame processing module 112 and the P-frame processing module 120 make use of statistical information in order to determine the level of compression to apply when generating a given I-frame and/or given P-frame based on the constraints provided by the target frame sizes received over data lines 114 and 116.
- the statistical information used was derived by encoding a plurality of representative video frames and observing statistical trends of the resulting encoded video stream.
- an improvement in video quality can be obtained, in particular in video applications having limited bandwidth availability and requiring low latency, such as mobile video applications.
- the I-frame processing module 112 and the P-frame processing module 120 will now be described in greater detail below.
- Figure 3 is a block diagram showing in greater detail components of the I-frame processing module 112 in accordance with a specific example of implementation.
- the I-frame processing module 112 includes a bit-rate controller 304, herein referred to as the "I-frame" bit-rate controller 304, and an encoder module 306 in communication with the "I-frame" bit-rate controller 304.
- the encoder 306 includes a video encoding processor and implements any suitable intra-frame coding process. It may be recalled that intra-frame coding can exploit, if applicable, the spatial redundancies in a frame to compress the information.
- the encoder 306 includes a video encoding processor (codec) selected from H.263, MPEG-4, VC-1 and H.264 codecs amongst other types.
- codec video encoding processor
- the encoding process implemented by the encoder 306 can be controlled such as to modify the level of compression applied to a frame based on a quantization parameter (QP) value selected amongst a set of possible QP values and received from the "I-frame" bit-rate controller 304 over data line 308.
- QP quantization parameter
- a Discrete Cosine Transform is used to transform a frame from the spatial domain to the frequency domain. This operation is typically done at the macroblock level.
- the DCT represents a macroblock as a matrix of coefficients, which can be quantized with different step sizes depending on the desired level of compression. These step sizes are controlled by the value of the quantization parameter (QP); in general, the smaller the QP value, the more bits are allocated to encoding each macroblock and the less image information is lost during encoding.
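- the effect of the step size can be illustrated with a uniform quantizer applied to a handful of coefficients. This is a sketch under stated assumptions: the coefficient values are made up, and the step sizes are passed in directly rather than derived from a QP value, since the QP-to-step mapping is codec specific.

```python
from typing import List


def quantize(coefficients: List[float], step: float) -> List[int]:
    """Uniform quantization: divide each coefficient by the step and round."""
    return [round(c / step) for c in coefficients]


def dequantize(levels: List[int], step: float) -> List[float]:
    """Approximate reconstruction of the coefficients from quantized levels."""
    return [level * step for level in levels]


def total_error(original: List[float], restored: List[float]) -> float:
    """Sum of absolute reconstruction errors."""
    return sum(abs(a - b) for a, b in zip(original, restored))


# Hypothetical DCT coefficients for one block (not taken from the description).
coeffs = [312.0, -45.5, 18.2, -7.9, 3.1, -1.4, 0.6, 0.2]

fine = quantize(coeffs, step=2.0)     # small step, standing in for a small QP
coarse = quantize(coeffs, step=16.0)  # large step, standing in for a large QP

fine_error = total_error(coeffs, dequantize(fine, 2.0))
coarse_error = total_error(coeffs, dequantize(coarse, 16.0))
```

- with the larger step, more coefficients collapse to zero, so fewer bits are needed, and the reconstruction error grows, which mirrors the trade-off described above.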
- the quantization parameter value can be set to any value in a range (or set) of possible values.
- the quantization parameter (QP) for the H.264 codec can be set as any integer value between 1 (the highest quality value) and 51 (the lowest quality value). It will however be appreciated that certain embodiments may make use of floating-point quantization parameter (QP) values.
- the "I-frame" bit-rate controller 304 selects a quantization parameter value for use by the encoder 306, the quantization parameter value corresponding to a certain level of compression. As shown in Figure 3, the "I-frame" bit-rate controller 304 receives a target encoded frame size from data line 114. The target encoded frame size conveys a desired encoded frame size for the certain video frame to be encoded. As also shown in Figure 3, the "I-frame" bit-rate controller 304 receives, over data line 300, information conveying a resolution (size) of the raw frame to be encoded.
- the information conveying a resolution (size) of the raw frame to be encoded received over data line 300 may be omitted.
- the system 100 shown in Figure 1
- the resolution (size) of the raw frames to be encoded is not needed, since no other types of frames will be encoded.
- the bit-rate controller 304 selects a quantization parameter value from a set of possible quantization parameter values, at least in part based on the target encoded frame size information received over data line 114.
- the selected quantization parameter value herein referred to as the I-frame QP, is released to the encoder 306 over data line 308.
- the selected I-frame QP is also released as an output of the I-frame processing module 112 over data line 118.
- the "I-frame" bit-rate controller 304 is shown in greater detail in Figure 4.
- the "I-frame" bit-rate controller 304 includes a first input 450, a second input 452, a memory module 402, a processing unit 475 and an output 454.
- the "I-frame" bit-rate controller 304 also includes a third input 456.
- the first input 450 is for receiving from data line 300 information conveying a resolution (size) of the raw frame to be encoded. In implementations where the system 100 (shown in Figure 1) is designed to be used for a single frame resolution, this input 450 may be omitted.
- the second input 452 is for receiving information from data line 114 conveying a target encoded frame size.
- the output 454 is for releasing information derived by the processing unit 475 and conveying the level of compression to be applied to the raw frame to be encoded.
- the information conveying the level of compression is in the form of an I-frame quantization parameter (I-frame QP) and is released over data line 308.
- the optional third input 456 is for receiving the raw frame to be encoded from data line 302.
- the memory module 402 stores statistical information related to the relationship between levels of compression and resulting encoded frame sizes.
- the memory module 402 stores one or more data structures, each conveying an estimate of the relationship between different QP values and encoded frame sizes.
- each data structure provides a mapping between encoded frame sizes and corresponding QP values, wherein this mapping is derived based on statistical data obtained by encoding a reference set of video frames.
- each data structure in the set of data structures is associated to a respective raw (uncompressed) frame resolution and conveys an estimate of the relationship between different QP values and encoded frame sizes for the respective raw (uncompressed) frame resolution to which it is associated.
- each data structure in the set of data structures is associated to a respective video sequence type and conveys an estimate of the relationship between different QP values and encoded frame sizes for the respective video sequence type to which it is associated.
- Video sequence types may include, for example, sports broadcast video sequence and talking head video sequence of the type used in a newscast for example amongst others.
- each data structure in the set of data structures is associated to a combination of a certain frame resolution and a certain video sequence type.
- the data structure 404 is in the form of a table including a plurality of entries, each entry providing a mapping between an encoded frame size and a corresponding quantization parameter value.
- each entry in the data structure maps an expected maximum encoded frame size to a corresponding quantization parameter value.
- the expected maximum encoded frame size may be based on a certain confidence interval (such as, for example, a 90%, 95% or 99% confidence interval).
- each entry in the data structure maps an expected average encoded frame size to a corresponding quantization parameter value.
- mappings have been presented for the purpose of illustration only and that other types of mappings based on a statistical relationship between encoded frame sizes and corresponding quantization parameter values can be used in alternative examples of implementations of the invention.
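- the two kinds of mapping described above (expected average size and expected maximum size) can be sketched as follows. This is an illustration under stated assumptions, not the exemplary process referred to in the description: the function name, the input layout and the use of an empirical percentile in place of a confidence interval are all hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple

SizeTable = List[Tuple[int, int]]  # entries of (encoded frame size, QP value)


def build_size_table(
    samples: Dict[int, List[int]], mode: str = "average", coverage: float = 0.95
) -> SizeTable:
    """Build a table mapping an expected encoded frame size to each QP value.

    samples maps a QP value to the encoded sizes observed when a reference
    set of frames was encoded with that QP.
    """
    table: SizeTable = []
    for qp in sorted(samples):
        sizes = sorted(samples[qp])
        if mode == "average":
            size = round(mean(sizes))
        elif mode == "max":
            # Assumption: the smallest observed size that at least `coverage`
            # of the reference frames do not exceed.
            index = min(len(sizes) - 1, math.ceil(coverage * len(sizes)) - 1)
            size = sizes[index]
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        table.append((size, qp))
    return table


# Hypothetical observations for two QP values.
observed = {20: [900, 1000, 1100, 1400], 30: [400, 500, 600, 700]}
average_table = build_size_table(observed)
maximum_table = build_size_table(observed, mode="max")
```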
- An exemplary process for determining the information stored within each data structure in memory module 402 will be described later in the present specification.
- the processing unit 475, as shown in Figure 4, is in communication with the first and second inputs 450, 452, with the optional third input 456, with the memory module 402 and with output 454.
- the processing unit 475 determines the level of compression to be applied at least in part based on the information conveying a resolution (size) of the raw frame to be encoded received at the first input 450, the target encoded frame size received at the second input 452 and the statistical information in the memory module 402.
- the processing unit 475 includes a data structure selector module 400, an initial QP (quantization parameter) selector module 406 in communication with the second input 452, and an output module 412 in communication with output 454.
- the processing unit 475 may also include a QP adjustment module 410 in communication with optional input 456.
- the data structure selector module 400 is in communication with the first input 450 and with the memory module 402 and is for selecting a data structure from the set of data structures in memory unit 402.
- each data structure in the set of data structures in memory 402 is associated to a respective raw (uncompressed) frame resolution.
- the data structure selector module 400 receives from input 450 the resolution (size) of the raw frame to be encoded and accesses memory 402 to select a data structure associated to a frame resolution corresponding to or approximating the frame resolution associated to the frame to be encoded.
- each data structure in the set of data structures in memory 402 is associated to a respective video sequence type.
- the raw frame to be encoded is part of a sequence of frames and the data structure selector module 400 selects a data structure from the set of data structures based in part on the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part.
- the video sequence type associated to the sequence of frames may be provided as an input to the data structure selector module 400 and used for selecting the data structure from the set of data structures.
- the video sequence type associated to the sequence of frames of which the frame to be encoded is part may be derived by a processing module programmed for processing at least some frames in the sequence of frames to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
- the resolution (size) of the raw frame to be encoded as an I-frame is of QCIF resolution (176x144 pixels).
- the frame is to be encoded using the H.264 standard for video compression to 320x240 pixels for playback on a mobile end-point.
- the memory module 402 stores a respective data structure for each or a subset of target video frame resolutions of which the following are common examples:
- the data structure selector module 400 would determine that the resolution of the raw frame that is to be encoded is 176x144 pixels and would select from memory unit 402 the data structure corresponding to QCIF. In a non-limiting example of implementation, if the resolution of the raw frame to be encoded does not correspond exactly to a resolution associated to a data structure in memory 402, the data structure selector module 400 selects a data structure associated to a frame resolution approximating the frame resolution associated to the frame to be encoded. It will be appreciated by the person skilled in the art that the above list of video frame resolutions constitutes a non-exhaustive list and has been presented here for the purpose of illustration only. It will also be appreciated that in alternative implementations in which the memory unit 402 includes a single data structure associated to a single resolution, the data structure selector module 400 as well as input 450 may be omitted.
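- the selection performed by the data structure selector module 400 can be sketched as a nearest-resolution lookup. The measure of "approximating" used here (smallest difference in pixel count) and the placeholder tables are assumptions made for illustration; the description does not fix either.

```python
from typing import Dict, List, Tuple

Resolution = Tuple[int, int]       # (width, height) in pixels
SizeTable = List[Tuple[int, int]]  # entries of (encoded frame size, QP value)


def select_resolution(available: Dict[Resolution, SizeTable], width: int, height: int) -> Resolution:
    """Return the stored resolution equal to, or closest to, the raw frame resolution."""
    if not available:
        raise ValueError("no data structures available")
    pixels = width * height
    # Assumption: closeness is measured by the difference in total pixel count.
    return min(available, key=lambda res: abs(res[0] * res[1] - pixels))


# Placeholder tables keyed by resolution; the entries are hypothetical.
tables: Dict[Resolution, SizeTable] = {
    (176, 144): [(3000, 30)],   # QCIF
    (320, 240): [(7000, 30)],   # QVGA
    (352, 288): [(9000, 30)],   # CIF
    (640, 480): [(25000, 30)],  # VGA
}

exact = select_resolution(tables, 176, 144)
nearest = select_resolution(tables, 400, 300)
```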
- the initial QP selector module 406 is in communication with the second input 452 for receiving the target encoded frame size.
- the initial QP selector module 406 is also in communication with the data structure selector module 400 and receives therefrom information conveying the data structure selected from memory unit 402. Recall that the data structure selected by the data structure selector 400 includes a set of entries mapping encoded frame sizes to QP values.
- the initial QP selector 406 locates in the data structure selected by the data structure selector module 400 an entry corresponding to an encoded frame size approximating the target encoded frame size received at input 452. In a non-limiting example of implementation, the entry selected corresponds to an encoded frame size that is closest to the target encoded frame size.
- the initial QP selector 406 obtains the QP value corresponding to the located entry and sets it as the initial I-frame QP value.
- the initial QP value is then released over data line 308.
- the output module 412 is in communication with the initial QP selector module 406 and with output 454.
- the output module 412 releases to output 454 the initial QP value received over data line 408 from the initial QP selector module 406.
- the output module 412 releases to output 454 a current I-frame QP value which is either the initial QP value received over data line 408 from the initial QP selector module 406 or an adjusted QP value received from the optional QP adjustment module 410.
- the QP adjustment module 410 is for adjusting the last computed I-frame QP value based on how close the size of the encoded I-frame generated when using this last computed value of QP comes to the target encoded frame size.
- the QP adjustment module 410 is in communication with input 452 for receiving the target encoded frame size, with input 456 for receiving the raw frame to be encoded and with the output module 412 for receiving the last computed value of QP.
- the QP adjustment module 410 includes an encoder 460 to encode the raw frame received at input 456 using the last computed value of QP to generate an encoded I-frame.
- the encoder 460 in the QP adjustment module 410 implements the same encoding process as encoder 306 (shown in Figure 3).
- although encoder 460 and encoder 306 have been shown as being separate components in the Figures, it will be readily appreciated that a same encoder may perform the functionality of encoder 460 and encoder 306 in some implementations. While its operation will be described in more detail later, it may be appreciated at this point that the QP adjustment module 410 provides functionality for determining whether a given QP value is suitable for producing I-frames with sizes that fall within a range deemed acceptable in relation to the target encoded frame size. If the frame size of the I-frame falls outside of this range, the QP adjustment module 410 modifies the QP value to find a QP value that produces an encoded frame with a frame size that is closer to the target encoded frame size.
- the QP adjustment module 410 is optional in the sense that it is not required for the operation of the "I-frame" bit-rate controller 304 and therefore may be omitted from certain specific implementations thereof.
- the use of the QP adjustment module 410 is optional as well and may depend, for example, on factors such as the current overall computational load on the "I-frame" bit-rate controller 304 as well as the availability of spare processing capacity within the system 100 (shown in figure
- the "I-frame" bit-rate controller 304 (shown in figures 3 and 4) implements a process for selecting a quantization parameter value for use by the video encoding processor of encoder 306 (shown in Figure 3) for encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor in encoder 306.
- the selected quantization parameter value is selected such as to attempt to allow the video encoding processor to derive an encoded video frame having a frame size tending toward the desired frame size.
- step 500 represents one optional starting point for the process, where the "I-frame" bit-rate controller 304 receives the information pertaining to the raw frame to be encoded.
- the information received at step 500 conveys the resolution (size) of the raw frame to be encoded.
- the raw frame to be encoded is part of a sequence of frames associated to a video sequence type and the information received at step 500 conveys the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part.
- step 500 includes processing at least some frames in the sequence of frames of which the frame to be encoded is part to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
- the resolution (size) of the raw frame to be encoded and the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part may be used in combination.
- a data structure is selected from an available set of data structures based in part on the information received in step 500.
- the selected data structure includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value.
- this step is implemented by the data structure selector module 400 (shown in Figure 4) which accesses the set of data structures in memory module 402 (also shown in Figure 4).
- at step 504, the target encoded frame size, which conveys a desired encoded frame size, is received.
- Step 504 is an alternate starting point for the process in cases where optional steps 500 and 502 are omitted. The process then proceeds to step 505.
- an initial quantization parameter value is selected.
- the selected initial quantization parameter value corresponds to an entry in the data structure (selected at step 502), the entry being associated to an encoded frame size approximating the target encoded frame size (received at step 504).
- the entry selected is associated to an encoded frame size that is closest to the target encoded frame size without exceeding the latter. The process then proceeds to step 507.
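- the lookup performed at step 505 can be sketched as below, covering both the "closest" variant and the "closest without exceeding" variant. The function name, the table values and the fallback used when every entry exceeds the target are assumptions made for this example.

```python
from typing import List, Tuple

SizeTable = List[Tuple[int, int]]  # entries of (encoded frame size, QP value)


def select_initial_qp(table: SizeTable, target_size: int, allow_exceed: bool = True) -> int:
    """Return the QP of the entry whose encoded frame size approximates the target."""
    if not table:
        raise ValueError("empty data structure")
    if allow_exceed:
        # Variant 1: entry whose size is closest to the target, above or below.
        return min(table, key=lambda entry: abs(entry[0] - target_size))[1]
    # Variant 2: closest entry whose size does not exceed the target.
    fitting = [entry for entry in table if entry[0] <= target_size]
    if not fitting:
        # Assumption: if nothing fits, fall back to the smallest expected size.
        return min(table, key=lambda entry: entry[0])[1]
    return max(fitting, key=lambda entry: entry[0])[1]


# Hypothetical table: expected size in bits for a few QP values.
table = [(12000, 20), (8000, 26), (5000, 32), (3000, 38), (1800, 44)]

closest_qp = select_initial_qp(table, 7000)
safe_qp = select_initial_qp(table, 7000, allow_exceed=False)
```

- the second variant trades some quality for a lower risk of overshooting the target, since it always picks the more strongly compressed of the two neighbouring entries.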
- the initial QP (quantization parameter) value selected at step 505 is set as the current QP value.
- steps 508, 510, 512, 514 and 516 are optional and pertain to the adjustment of the initial quantization parameter value selected at step 505 based on one or more actual observed encoded frame sizes.
- if the optional adjustment steps are to be performed, the process moves forward to step 508. Otherwise, the process moves forward to step 518.
- at step 508, the raw frame is encoded using the current QP value.
- step 508 is implemented by the encoder 460 of the QP adjustment module 410 (shown in Figure 4).
- the result of step 508 is a resulting encoded video frame having a certain size. The process then proceeds to step 512.
- at step 512, the size of the resulting encoded video frame generated using the current QP value at step 508 is compared against the target encoded frame size received at step 504 to determine whether to adapt or release the current QP value. It will be appreciated that, in practical implementations, the actual frame size of the resulting encoded video frame generated at step 508 is unlikely to be an exact match to the target encoded frame size. Therefore, in a specific example of implementation, step 512 determines if the actual frame size of the resulting encoded video frame generated at step 508 falls within an acceptable range of sizes near the target encoded frame size.
- the range of what is deemed an acceptable frame size for resulting encoded video frames may be expressed in numerical values as a range of frame sizes, as a percentage in relation to the target encoded frame size (such as ±10%) or in any other suitable fashion.
- the outcome of decision step 512 is identified by the following two result branches: if the frame size of the resulting encoded video frame is considered acceptable in relation to the target encoded frame size, step 512 is answered in the negative and we proceed to step 518; if the frame size of the resulting encoded video frame is not considered acceptable in relation to the target encoded frame size, step 512 is answered in the positive and we proceed to step 514.
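- decision step 512 can be sketched as a tolerance test around the target encoded frame size. The function name is hypothetical and the 10% default simply reuses the percentage given above as an example.

```python
def needs_adjustment(actual_size: float, target_size: float, tolerance: float = 0.10) -> bool:
    """Return True when step 512 is answered in the positive.

    True means the encoded frame size falls outside the acceptable range, so a
    new QP value should be selected (step 514). False means the current QP
    value can be released (step 518).
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return abs(actual_size - target_size) > tolerance * target_size


within_range = needs_adjustment(10_500, 10_000)  # 5% above the target
too_large = needs_adjustment(11_500, 10_000)     # 15% above the target
too_small = needs_adjustment(8_500, 10_000)      # 15% below the target
```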
- a new QP value is selected based in part on a difference between the size of the resulting encoded video frame generated using the current QP value at step 508 and the target encoded frame size. More specifically, at step 514, a new QP value is selected so that when the new QP value is used to encode the raw frame, the newly obtained resulting encoded frame is likely to have a size that is closer to the target encoded frame size than the resulting encoded raw frame obtained by using the current QP value.
- the approach makes use of information in the data structure (selected at step 502) and of the size of the resulting encoded video frame generated at step 508 in order to select a new QP value. More specifically, as described above, the data structure (selected at step 502) provides a mapping between encoded frame sizes and corresponding QP values. This mapping is derived based on statistical data and conveys a relationship between QP values and the encoded frame sizes as derived based on a reference set of video frames.
- the amount by which the size of the resulting encoded video frame generated at step 508 deviates from the relationship between QP values and the encoded frame sizes conveyed by the mapping in the data structure provides an indication of the extent to which the current QP value must be adapted/modified.
- the result of step 514 is a newly selected QP value.
- Figure 14 of the drawings illustrates graphically a manner in which the amount by which the size of the resulting encoded video frame obtained at step 508 deviates from the relationship between QP values and the encoded frame sizes (as conveyed by the data structure selected at step 502) can be used to select a new QP value.
- curve 1402 depicts an exemplary relationship between QP values and encoded frame sizes as stored in a data structure in memory 402 (shown in Figure 4). As can be observed from curve 1402, the relationship between QP values and encoded frame sizes in the specific example shown generally follows a negative exponential curve.
- Dotted line 1404 conveys a target encoded frame size.
- Location 1410 on the graph represents the size of a resulting encoded video frame obtained by using the initial QP value (1408) to encode a raw frame, wherein the y-axis value conveys the size of the resulting encoded frame.
- the difference between locations 1410 and 1406 represents the amount by which the size of the resulting encoded video frame deviates from the relationship between QP values and encoded frame sizes provided by curve 1402.
- Curve 1412, which goes through location 1410, is derived based on the difference between location 1410 and 1406 and is a revised estimate of the relationship between QP values and encoded frame sizes.
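The revised estimate described for Figure 14 can be sketched as follows. This is only an illustration under a stated assumption: the revised curve is obtained by scaling the stored curve so that it passes through the observed point. The text does not specify how curve 1412 is derived from the deviation, and the function name and table values are hypothetical.

```python
def select_new_qp(qp_to_size, current_qp, observed_size, target_size):
    """Pick a new QP value from a revised size estimate.

    qp_to_size: dict mapping QP value -> expected encoded frame size
                (the stored relationship, in the role of curve 1402).
    Assumption: the revised curve (in the role of curve 1412) is the
    stored curve scaled so that it passes through the observed point.
    """
    expected_size = qp_to_size[current_qp]      # point on the stored curve
    scale = observed_size / expected_size       # deviation from the curve
    revised = {qp: size * scale for qp, size in qp_to_size.items()}
    # Choose the QP whose revised size estimate is closest to the target.
    return min(revised, key=lambda qp: abs(revised[qp] - target_size))


# Hypothetical table: sizes in Kbits, decreasing as QP increases.
table = {10: 200.0, 20: 100.0, 30: 50.0, 40: 25.0}
new_qp = select_new_qp(table, current_qp=20, observed_size=200.0, target_size=100.0)
```

In this made-up case the frame came out twice as large as the table predicted, so every estimate is doubled and the QP whose revised estimate meets the 100 Kbit target is 30 rather than 20.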
- the current QP value is set to the new QP value selected at step 514. We then proceed to optional step 517 or to step 508 in cases where step 517 is omitted.
- a test is made to determine whether a maximum allowable number of iterations of steps 508, 510, 512, 514 and 516 has been reached.
- the number of iterations or "adaptations" for determining a current QP value may be limited so the current QP may only be adjusted a certain number of times before being released at step 518 to the encoding processor.
- the number of permitted iterations will vary from one implementation to the next and may depend on a number of different factors which may include, without being limited to, a maximum allowable latency in the video encoding system.
- the maximum number of iterations of steps 508, 510, 512, 514 and 516 is set to two (2); however, any suitable number of iterations may be used. If, at step 517, it is determined that the maximum number of iterations has been reached, step 517 is answered in the affirmative and we proceed to step 518. If, at step 517, it is determined that the maximum number of iterations has not yet been reached, step 517 is answered in the negative and we return to step 508. In another non-limiting example of implementation, the span of QP values is restricted between a minimum value and a maximum value.
- at step 517, once the maximum available QP value has been reached, there is no reason to perform another iteration if the goal is to compress more; step 517 is therefore answered in the affirmative and we proceed to step 518. Similarly, once the minimum available QP value has been reached, there is no reason to perform another iteration if the goal is to compress less; step 517 is again answered in the affirmative and we proceed to step 518. Otherwise, step 517 is answered in the negative and we return to step 508.
- step 517 is optional: in implementations where a single iteration of steps 508, 510, 512, 514 and 516 is permitted, step 517 is omitted and the process goes directly from step 516 to step 518. It has been observed that one iteration of steps 508, 512, 514 and 516 to derive the I-frame QP value does not materially affect the real-time aspect of the encoding process.
- motion estimation typically consumes 50% of the encoding time in an encoder. I-frame coding does not require motion estimation, therefore making it possible to encode the same I-frame twice without exceeding the normally budgeted processing time per frame.
- step 517 may also be omitted.
- although steps 508, 512, 514, 516 and 517 provide useful functionality in adapting the selected QP value in order to attempt to obtain a better result, these steps are optional and may be omitted from certain implementations.
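The bounded adaptation loop described above might look like the following minimal sketch. The `encode` and `select_new_qp` callables, the acceptance tolerance and the default QP bounds are all hypothetical placeholders; only the iteration limit of two and the early exit at the QP bounds come from the text.

```python
def refine_qp(encode, raw_frame, qp, target_size, select_new_qp,
              tolerance=0.1, max_iterations=2, qp_min=0, qp_max=51):
    """Iteratively adapt a QP value, then return the value to release.

    encode(raw_frame, qp) -> encoded size; select_new_qp(qp, size) -> new QP.
    Both callables, the tolerance and the QP bounds are assumptions made
    for this sketch.
    """
    for _ in range(max_iterations):
        size = encode(raw_frame, qp)                     # trial encode (step 508)
        if abs(size - target_size) <= tolerance * target_size:
            break                                        # size is acceptable
        candidate = select_new_qp(qp, size)              # step 514
        new_qp = max(qp_min, min(qp_max, candidate))     # keep within the QP span
        if new_qp == qp:
            break    # pinned at a bound or unchanged: another pass is pointless
        qp = new_qp                                      # step 516
    return qp                                            # released at step 518
```

Clamping before comparing means that a request to compress more at the maximum QP (or less at the minimum QP) ends the loop immediately, which mirrors the affirmative answer at step 517.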
- the current QP value selected is released to the video encoding processor of encoder 306 (shown in Figure 3) for use in encoding a certain video frame.
- the information generated is in the form of a data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value.
- the process described with reference to Figure 6 is for generating a single data structure associated to a single frame resolution (size).
- the reader skilled in the art will readily appreciate, in light of the present description, that the process depicted in Figure 6 can be performed repeatedly for different video sequence types, different frame resolutions (sizes) and/or different combinations of video sequence types/ frame resolutions (sizes) in order to generate different respective data structures.
- the manner in which such multiple data structures can be generated will become readily apparent to the person skilled in the art in light of the present specification and as such will not be described in further detail here.
- a plurality of video frames is provided as a basis for generating entries in a data structure.
- the video frames provided at this step preferably include frames representative of the types of frames expected to be encoded.
- using frames representative of the types of frames expected to be encoded allows generating estimates of the relationship between different QP values and encoded frame sizes that more closely resemble the relationship of the actual frames to be encoded.
- Steps 602, 604, 606, 608 and 610 are performed for each QP value in a set of possible QP values.
- at step 602, a quantization parameter (QP) value that has not yet been processed is selected.
- each video frame in the plurality of video frames provided at step 600 is encoded as an I-frame by a video encoding processor using the QP value selected in step 602 to generate an encoded frame group associated with the QP value selected in step 602.
- the resulting encoded frame group includes a plurality of encoded video frames derived using the quantization parameter value selected in step 602.
- an encoded frame size corresponding to the quantization parameter value selected at step 602 is derived at least in part based on frame sizes of encoded video frames in the encoded frame group generated at step 604. Different approaches for deriving the encoded frame size corresponding to the quantization parameter value selected at step 602 may be contemplated. In a specific example of implementation, the encoded frame size corresponding to the quantization parameter value selected at step 602 is derived by observing statistical trends in the resulting encoded video frames.
- the mean encoded frame size of the encoded frames in the encoded frame group obtained at step 604 is derived.
- the encoded frame size corresponding to the quantization parameter value selected at step 602 is set to correspond to the mean frame size.
- a statistical distribution of the frame sizes of the frames in the encoded frame group generated at step 604 is obtained.
- Figure 7 is a graphical representation of the frame size distribution for the resulting encoded frames obtained using a QP value of 10 to encode a sample set of frames.
- the frame size distribution depicted in Figure 7 is not a normal distribution but is shown for the purpose of illustration only. Based on this distribution 704, for a given QP value, a range of frame sizes having an upper limit can be derived so that a certain proportion of encoded frames have a frame size falling within the derived range of frame sizes. In the field of statistics, this is commonly referred to as a confidence interval.
- an X% confidence interval indicates that X% of the frames processed have an encoded frame size falling within the interval.
- the encoded frame size corresponding to the quantization parameter value selected at step 602 is set to correspond substantially to the upper limit of the range of frame sizes.
- an upper limit 702 for the range of frame sizes can be obtained for the given I-frame QP, so that 99% of the frames processed have an encoded frame size falling below the upper limit of the range.
- although the above example uses a 99% confidence interval, it will be appreciated that other suitable confidence intervals providing a range of frame sizes could also be used.
- a 50% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 50%.
- a 70% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 70%.
- a 90% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 90%.
- a 95% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 95%.
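The two derivations described for step 606 (mean frame size, or an upper limit below which a chosen proportion of frames fall) can be sketched as below. The rank-based way of picking the upper limit is an assumption made for this sketch, and the sample sizes are invented.

```python
import math
from statistics import mean


def size_for_qp(frame_sizes, proportion=None):
    """Derive one representative encoded frame size for a QP value.

    proportion=None -> mean frame size.
    proportion=0.99 -> smallest observed size such that at least 99% of
                       the frames are no larger than it (an empirical
                       upper limit; the rank method is an assumption).
    """
    if proportion is None:
        return mean(frame_sizes)
    ordered = sorted(frame_sizes)
    rank = math.ceil(proportion * len(ordered))   # 1-based rank
    return ordered[max(rank, 1) - 1]


sizes = [80, 90, 100, 110, 120, 130, 140, 150, 160, 400]  # hypothetical, Kbits
mean_size = size_for_qp(sizes)
upper_99 = size_for_qp(sizes, 0.99)
```

With a long tail such as the single 400 Kbit frame here, the upper limit is far more conservative than the mean, which is the trade-off the choice of proportion controls.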
- once the encoded frame size corresponding to the quantization parameter value selected at step 602 has been derived, we proceed to step 608.
- at step 608, the encoded frame size derived in step 606 is stored in association with the QP value selected at step 602 on a computer readable storage medium. We then proceed to step 610.
- at step 610, if not all QP values in the set of QP values have been processed, step 610 is answered in the affirmative, we return to step 602, and the process including steps 604, 606 and 608 is repeated for the next unprocessed QP value. If, on the other hand, all QP values in the set of QP values have been processed, step 610 is answered in the negative and the process ends at step 612. Based on the above, a data structure including a plurality of entries is generated, wherein each entry maps a QP value to a corresponding encoded frame size. An example of a data structure in the form of a look-up table that could be generated by the process depicted in Figure 6 is shown below for a CIF type frame resolution.
- a set of data structures may be generated, wherein each data structure is associated to a respective frame resolution (size).
- the plurality of video frames provided at step 600 includes sub-sets of frames associated with respective frame resolutions.
- the process depicted in Figure 6 is modified such that steps 602, 604, 606, 608 and 610 are repeated for each subset of frames in the plurality of frames so that a respective data structure is generated for each frame resolution, wherein each respective data structure maps an encoded frame size to a corresponding QP value.
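As a rough sketch of this variant, the loop of Figure 6 can be repeated per resolution to produce one table each. The `encode_i_frame` callable is a hypothetical stand-in for the video encoding processor, and the mean is used here as the step 606 derivation purely for brevity.

```python
from statistics import mean


def build_tables(frames_by_resolution, qp_values, encode_i_frame):
    """Build one QP -> encoded-frame-size table per frame resolution.

    frames_by_resolution: dict resolution -> list of raw frames (step 600).
    encode_i_frame(frame, qp) -> encoded size; a hypothetical stand-in
    for the video encoding processor.
    """
    tables = {}
    for resolution, frames in frames_by_resolution.items():
        table = {}
        for qp in qp_values:                                  # step 602
            sizes = [encode_i_frame(f, qp) for f in frames]   # step 604
            table[qp] = mean(sizes)                           # step 606 (mean variant)
        tables[resolution] = table                            # stored, as in step 608
    return tables
```

The same structure would apply to the video sequence type variant by keying the outer dictionary on sequence type instead of resolution.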
- a set of data structures may be generated, each data structure being associated to a respective video sequence type.
- the plurality of video frames provided at step 600 includes sub-sets of frames associated with respective video sequence types.
- the process depicted in Figure 6 is modified such that steps 602, 604, 606, 608 and 610 are repeated for each subset of frames in the plurality of frames so that a respective data structure is generated for each video sequence type, wherein each respective data structure maps an encoded frame size to a corresponding QP value.
- the data structures obtained in accordance with this variant provide mappings that are particularly suited for sequences of frames of the same types as those used for generating entries in the data structures.
- for example, if the sequence of frames used to generate the entries in the data structure is of a sports broadcast type, the entries in the data structure will be particularly well suited to predict the size of encoded frames for frames of the sports broadcast type.
- similarly, if the sequence of frames used to generate the entries in the data structure is of a 'talking head' type, then the entries in the data structure will be particularly well suited to predict the size of encoded frames for frames of the talking head type, and so on.
- all or part of the process previously described herein with reference to Figure 6 may be implemented as software consisting of a series of instructions for execution by a computing unit.
- the series of instructions could be stored on a medium which is fixed, tangible and readable directly by the computing unit (e.g., removable diskette, CD-ROM, ROM, PROM, EPROM or fixed disk), or the instructions could be stored remotely but transmittable to the computing unit via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium.
- the transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared, RF or other transmission schemes).
- the process described with reference to Figure 6 may be implemented on a computing platform separate from system 100 (shown in Figure 1). As such, the process described with reference to Figure 6 may be performed off-line in order to generate the entries of the data structure(s) in memory module 402.
- the computing platform that is used for generating a data structure in accordance with the process described in Figure 6 may be placed in communication with the system 100 for the purpose of storing the generated data structure in the memory module 402 of "I-frame" bit-rate controller 304 (both shown in Figure 4).
- the computing platform may be integrated within system 100 (shown in Figure 1) as an integral component thereof.
- bit-rate controller 304 includes a self-update module implementing a process for generating updated information for use in selecting a quantization parameter value for a video encoding processor.
- Figure 16 of the drawings depicts a bit-rate controller 304' which is analogous to bit-rate controller 304 (shown in Figure 4) but modified to include self-update module 1800.
- the self-update module 1800 is in communication with output module 412 for receiving therefrom the last computed QP value and is in communication with input 456 for receiving therefrom the raw frame to be encoded.
- the self-update module 1800 uses the last computed QP value released by output module 412 to encode the raw frames received at input 456 and stores the resulting frame size in a memory (not shown) in association with the QP value used. In this fashion, the self-update module 1800 gathers statistical information pertaining to the relationship between encoded frame sizes and at least some QP values.
- the self-update module 1800 processes this information pertaining to encoded frame sizes to derive a new encoded frame size corresponding to the certain QP value using methods similar to those described with reference to step 606 in Figure 6.
- the "sufficiency" of the information gathered pertaining to encoded frame sizes for a certain QP value will vary from one implementation to the other.
- the self-update module 1800 processes the frame sizes of the encoded frames to derive a new encoded frame size corresponding to the given QP value.
- the self-update module 1800 accesses memory 402 to locate in a data structure an entry corresponding to the certain QP value. Once the entry has been located, the self-update module 1800 replaces the encoded frame size in that entry with the new encoded frame size.
- bit-rate controller 304' can be adapted over time based on the actual video frames to be encoded received at input 456.
- the functionality of the self-update module 1800 may be adapted for updating statistical information in memory 402 in implementations where the memory 402 stores a plurality of data structures associated with respective frame resolutions (sizes) and/or respective video sequence types. Manners in which the self-update module 1800 could implement this functionality in such implementations will be readily apparent to the person skilled in the art in light of the present description and as such will not be described in further detail here.
- Figure 8 is a block diagram showing in greater detail components of the P-frame processing module 120 in accordance with a specific example of implementation.
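The behaviour of the self-update module can be illustrated with the sketch below. The class name, the `min_samples` threshold used as the notion of "sufficient" information, and the use of the mean as the derivation are all assumptions; the text leaves these choices to the implementation.

```python
from collections import defaultdict
from statistics import mean


class SelfUpdater:
    """Gather encoded sizes per QP and refresh the table when enough exist.

    min_samples is a hypothetical notion of "sufficient" information; the
    mean is used as the derivation, one of the options described for
    step 606.
    """

    def __init__(self, table, min_samples=30):
        self.table = table                  # QP -> encoded frame size
        self.min_samples = min_samples
        self.samples = defaultdict(list)

    def record(self, qp, encoded_size):
        """Store one observation and update the entry once enough exist."""
        self.samples[qp].append(encoded_size)
        if len(self.samples[qp]) >= self.min_samples:
            self.table[qp] = mean(self.samples[qp])   # replace the entry
            self.samples[qp].clear()                  # start a new window
```

For instance, with `min_samples=3`, recording sizes 90, 120 and 150 for QP 10 leaves the entry untouched after the first two calls and replaces it with 120 after the third.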
- the P-frame processing module 120 includes a bit-rate controller 804, herein referred to as the "P-frame" bit-rate controller 804, and an encoder module 806 in communication with the "P-frame" bit-rate controller 804.
- the encoder 806 includes a video encoding processor and implements an inter-frame coding process corresponding to the intra-frame coding process implemented by encoder 306 (shown in Figure 3).
- inter-frame coding exploits the temporal redundancies between the current frame and previous and/or following frames.
- the encoder 806 generates a P-frame based on a raw frame received over data path 110 and a reconstructed version of an encoded I-frame received from the reconstruction module 150 over data line 152.
- the resulting encoded P-frame is based in part on the I-frame derived by the I-frame processing module 112 (shown in Figure 1), which was reconstructed by the reconstruction module 150.
- the encoding process implemented by the encoder 806 can be controlled such as to modify the level of compression applied to a frame based on a quantization parameter (QP) value received from the "P-frame" bit-rate controller 804 over data line 810.
- the "P-frame" bit-rate controller 804 selects a quantization parameter value, herein referred to as the P-frame QP, for use by the encoder 806, the quantization parameter value corresponding to a certain level of compression from a set of possible quantization parameter values.
- the selection is effected at least in part based on target encoded frame size information conveying a desired frame size for the certain video frame to be encoded.
- the selected P-frame quantization parameter value is selected such as to attempt to cause encoder 806 to generate an encoded video frame having a frame size tending toward the desired frame size.
- the "P-frame" bit-rate controller 804 makes use of statistical information conveying a relationship between I-frame QP values and P-frame QP values as well as the associated I-frame and P-frame sizes of the resulting encoded frames.
- the selected P-frame QP value is released to the encoder 806 over data line 810.
- the "P-frame" bit-rate controller 804 receives the target encoded frame size over data line 116. As shown in Figure 8, the "P-frame" bit-rate controller 804 also receives information over data line 802 conveying a resolution (size) of the raw frame to be encoded as well as information related to the I-frame on which the P-frame will be partly based.
- the information related to the I-frame includes the I-frame QP value derived by the I-frame processing module 112 (shown in Figure 1) and present on data line 118, and the size of the encoded I-frame generated by the I-frame processing module 112 (shown in Figure 1) and present on data line 808.
- the information conveying a resolution (size) of the raw frame to be encoded, conveyed by data line 802 may be omitted.
- the "P-frame" bit-rate controller 804 also receives the encoded P-frame from data line 124.
- the "P-frame" bit-rate controller 804 includes a set of inputs 950, 952, 954, 956 and 990, a memory module 902, a processing unit 975 and an output 958.
- the first input 950 is for receiving from data line 802 information conveying a resolution (size) of the raw frame to be encoded. In implementations where the system 100 (shown in Figure 1) is designed to be used for a single frame resolution, this input 950 may be omitted.
- the second input 952 is for receiving information from data line 116 conveying a target encoded P-frame size.
- the target encoded P-frame size conveys a desired size of a P-frame resulting from the encoding of the raw frame present on data line 110 (shown in Figure 8) by the video encoding processor in encoder 806 (also shown in Figure 8).
- the third input 954 is for receiving information from data line 808 conveying an encoded I-frame size of an I-frame on which the P-frame to be encoded is to be partly based.
- the fourth input 956 is for receiving information from data line 118 conveying an I-frame quantization parameter (I-frame QP) value.
- the I-frame QP value corresponds to the level of compression applied to generate the I-frame on which the P-frame to be encoded is to be based.
- the fifth input 990 is for receiving information related to a previously encoded P-frame.
- the information related to a previously encoded P-frame may include, for example, size information associated with a previously encoded P-frame and/or the previously encoded P-frame released over data line 124 (shown in Figure 8).
- the output 958 is for releasing information derived by the processing unit 975 and conveying the level of compression to be applied to the raw frame to be encoded as a P-frame.
- the information conveying the level of compression is in the form of a P-frame quantization parameter (P-frame QP) and is released over data line 810.
- the memory module 902 stores statistical information related to the relationship between levels of compression of I-frames, levels of compression of P-frames, encoded P-frame sizes and encoded I-frame sizes.
- the memory module 902 stores one or more data structures.
- Each data structure includes information conveying an estimate of the relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes and is derived based on statistical data.
- each data structure in the set of data structures is associated to a respective raw (uncompressed) frame resolution and conveys an estimate of the relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes for the respective raw (uncompressed) frame resolution to which it is associated.
- each data structure in the set of data structures is associated to a respective video sequence type and conveys an estimate of the relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes for the respective video sequence type to which it is associated.
- each data structure in the set of data structures is associated to a combination of a certain frame resolution and a certain video sequence type.
- Each data structure for a given resolution (size) of a raw frame includes a plurality of entries, each entry being associated to a respective "I-frame QP value"/"P-frame QP value" combination.
- the entries in the data structure convey estimates of the relationship between the QP value used to encode an I-frame and the QP value that can be used to encode subsequent P-frames that are encoded based on this I-frame.
- these I-frame and P-frame QP values are linked through a ratio between the target encoded frame size and the encoded I-frame size.
- the columns of the table are each associated to a respective I-frame QP value and the rows of the table are each associated to a respective P-frame QP value.
- Each entry in the table conveys a ratio of an encoded P-frame size to an encoded I-frame size associated to a given I-frame QP value/ P-frame QP value combination.
- the data structure 906 conveys a relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes for I-frames and P-frames.
- the processing unit 975 is in communication with the inputs 950, 952, 954, 956 and 990 and with the memory module 902. In a specific example of implementation, the processing unit 975 determines the level of compression to be applied when generating a P-frame at least in part based on the information conveying a resolution (size) of the raw frame to be encoded received at the first input 950, the target encoded frame size received at the second input 952, the encoded I-frame size received at the third input 954, the I-frame QP received at the fourth input 956 and the statistical information in the memory module 902.
- the processing unit 975 includes a data structure selector module 900, a ratio computation module 904 in communication with the second and third inputs 952 and 954, a P-frame QP selector module 908 and an output module 910 in communication with output 958.
- the data structure selector module 900 is in communication with the first input 950 and with the memory module 902 and is for selecting a data structure from the set of data structures in memory unit 902.
- each data structure in the set of data structures in memory 902 is associated to a respective raw (uncompressed) frame resolution.
- the data structure selector module 900 receives from input 950 the resolution (size) of the raw frame to be encoded and accesses memory 902 to select a data structure associated to a frame resolution corresponding to or approximating the frame resolution associated to the frame to be encoded.
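One way to picture the selection of a data structure "corresponding to or approximating" the frame resolution is sketched below. Interpreting "approximating" as the closest pixel count is an assumption of this sketch, as are the function name and the dictionary layout.

```python
def select_table(tables, width, height):
    """Return the table whose resolution best matches the raw frame.

    tables: dict mapping (width, height) -> data structure.
    Assumption: "approximating" is taken to mean closest pixel count.
    """
    if (width, height) in tables:
        return tables[(width, height)]          # exact match
    pixels = width * height
    nearest = min(tables, key=lambda res: abs(res[0] * res[1] - pixels))
    return tables[nearest]
```

Other distance measures (for example, matching on aspect ratio first) would be equally plausible; the choice depends on how encoded size scales with resolution for the encoder at hand.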
- each data structure in the set of data structures in memory 902 is associated to a respective video sequence type.
- the raw frame 110 (shown in Figure 8) to be encoded is part of a sequence of frames and the data structure selector module 900 selects a data structure from the set of data structures based in part on the video sequence type associated to the sequence of frames of which the frame to be encoded is part.
- in implementations where the memory unit 902 includes a single data structure associated to a single resolution, the data structure selector module 900 as well as input 950 may be omitted.
- the ratio computation module 904 is in communication with the second and third inputs 952 and 954 for receiving a target encoded frame size and an encoded I-frame size from data lines 116 and 808 respectively.
- the ratio computation module 904 is adapted to calculate the ratio between the target encoded frame size and an encoded I-frame size. Mathematically, this is expressed as:
- $\text{Ratio} = \dfrac{\text{Target encoded frame size}}{\text{Encoded I-frame size}}$
- the P-frame QP selector module 908 is in communication with the fourth input 956, with the fifth input 990, with the ratio computation module 904 and with the data structure selection module 900.
- the P-frame QP selector 908 selects the P-frame QP value that will be used to encode the raw frame as a P-frame.
- the P-frame QP selector module 908 selects the P-frame quantization parameter (P-frame QP) based in part on statistical information in memory unit 902.
- the P-frame QP selector module 908 selects the P-frame QP value based on the I-frame QP value received at input 956, the data structure selected by selection module 900 and the ratio computed by the ratio computation module 904.
- the processing unit 975 selects a P-frame QP value based in part on information related to one or more previously encoded P-frames received at input 990 over data line 124.
- the processing unit 975 may make use of any suitable technique, such as PID (Proportional-Integral-Derivative) based techniques for example, in order to select a P-frame QP value. PID-based techniques are well-known in the art and as such will not be described in further detail here.
- the P-frame QP selector module 908 releases the selected P-frame QP value to output module 910.
- the output module 910 is in communication with the QP selector module 908 and with output 958.
- the output module 910 releases to output 958 the QP value derived by the QP selector module 908.
- the "P-frame" bit-rate controller 804 may include a P-frame QP adjustment module, analogous to the QP adjustment module 410 described with reference to the "I-frame" bit-rate controller 304 (shown in Figure 4).
- the P-frame QP adjustment module is for adjusting the last computed P-frame QP value based on how close the size of the encoded P-frame generated when using this last computed value of QP comes to the target encoded frame size.
- the P-frame QP adjustment module is in communication with input 990 for receiving information related to one or more previously encoded P-frames, with data line 110 (shown in Figure 8) for receiving the raw frame to be encoded, with data line 152 for receiving a reconstructed version of a video frame on which the encoded P-frame will be based and with the output module 910 for receiving the last computed value of QP.
- the P-frame QP adjustment module includes an encoder to encode the raw frame received at input 110 using the last computed value of QP to generate an encoded P-frame.
- the encoder in the P-frame QP adjustment module implements the same encoding process as encoder 806 (shown in Figure 8).
- the P-frame QP adjustment module provides functionality for determining whether a given QP value is suitable for producing P-frames with sizes that fall within a range deemed acceptable in relation to the target encoded frame size. If the frame size of the P-frame falls outside of this range, the P-frame QP adjustment module modifies the QP value to find a QP value that produces an encoded frame with a frame size that is closer to the target encoded frame size.
- the P-frame QP adjustment module is optional in the sense that it is not required for the operation of the "P-frame" bit-rate controller 804 and therefore may be omitted from certain specific implementations thereof.
- the use of the P-frame QP adjustment module is optional as well and may depend, for example, on factors such as the current overall computational load on the "P-frame" bit-rate controller 804 as well as the availability of spare processing capacity within the system 100 (shown in Figure 1).
- the "P-frame" bit-rate controller 804 implements a method for selecting a P-frame quantization parameter (QP) value for use by a video encoding processor in encoder 806 (shown in Figure 8), wherein the P-frame QP value corresponds to a certain level of compression of the video encoding processor.
- at step 1050, if the P-frame to be generated is a first P-frame following an I-frame, condition 1050 is answered in the affirmative and we proceed to step 1000. Otherwise, if the P-frame to be generated is not a first P-frame following an I-frame, condition 1050 is answered in the negative and we proceed to step 1052.
- a P-frame QP value is selected in accordance with any suitable conventional QP value selection process known in the art, such as PID based (Proportional-Integral-Derivative) techniques for example.
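For readers unfamiliar with PID-based selection, a very small sketch is given below. It is a generic PID update applied to the QP value, not a technique taken from this description; the class name, the gains, the normalised error and the QP bounds are all assumptions.

```python
class PidQpController:
    """Generic PID update of a QP value from the previous frame's size.

    Gains and QP bounds are hypothetical placeholders.
    """

    def __init__(self, kp=4.0, ki=0.5, kd=1.0, qp_min=0, qp_max=51):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.qp_min, self.qp_max = qp_min, qp_max
        self.integral = 0.0
        self.prev_error = 0.0

    def next_qp(self, qp, encoded_size, target_size):
        # Positive error: the previous frame was too large, so raise QP.
        error = (encoded_size - target_size) / target_size
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        delta = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(self.qp_min, min(self.qp_max, round(qp + delta)))
```

A controller of this kind relies only on previously encoded P-frames, which is why it suits the branch taken when the P-frame is not the first one following an I-frame.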
- at step 1000, information pertaining to a raw frame to be encoded is received.
- the information received at step 1000 conveys the resolution (size) of the raw frame to be encoded.
- the raw frame to be encoded is part of a sequence of frames associated to a video sequence type and the information received at step 1000 conveys the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part.
- step 1000 includes processing at least some frames in the sequence of frames of which the frame to be encoded is part to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
- the resolution (size) of the raw frame to be encoded and the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part may be used in combination in certain implementations.
- a data structure is selected from an available set of data structures based in part on the information received in step 1000.
- this step is implemented by the data structure selector module 900 (shown in Figure 9) which accesses the set of data structures in memory module 902 (also shown in Figure 9).
- steps 1000 and 1002 would be omitted.
- the size of the encoded I-frame on which the P-frame will be partly based; and the I-frame QP value (the QP value that was used to encode the I-frame on which the P-frame will be partly based).
- the target encoded P-frame size conveys a desired size of a P-frame resulting from the encoding of the raw frame present on data line 110 (shown in Figure 8) by the video encoding processor in encoder 806. We then proceed to step 1006. At step 1006, a ratio between the target encoded frame size and the size of the encoded I-frame is computed. Once this calculation is performed, we proceed to step 1008.
- the P-frame QP value is selected at least in part based on the I-frame QP value received at step 1004 and the ratio computed at step 1006.
- the data structure selected at step 1002 includes a plurality of entries, each entry being associated to a respective {I-frame QP value; P-frame QP value} combination.
- a set of entries associated to the I-frame QP value received at step 1004 is first identified in the data structure selected at step 1002.
- this process involves identifying in data structure 906 the column corresponding to the I-frame QP value received at step 1004.
- the entries in the column associated with the I-frame QP value are then compared to the ratio computed at step 1006 to identify amongst these entries a specific entry including a ratio approaching the ratio computed at step 1006. It will be appreciated that the ratio computed at step 1006 may not exactly match any of the ratios in the data structure 906 corresponding to the I-frame QP value received at step 1004. As such, an entry including a ratio that is closest to the ratio computed at step 1006 may be selected. Alternative approaches may opt to select an entry including a ratio that is closest to but less than (or greater than) the ratio computed at step 1006.
- the identified specific entry is associated to an I-frame QP value / P-frame QP value pair.
- the P-frame QP value of the I-frame QP value / P-frame QP value pair is then selected and released for use by a video encoding processor.
- in implementations where the "P-frame" bit-rate controller 804 includes a P-frame QP adjustment module, analogous to the QP adjustment module 410 described with reference to the "I-frame" bit-rate controller 304 (shown in Figure 4), an adjustment of the P-frame QP may be made following step 1008.
- the adjustment to the selected P-frame QP may be made in a manner analogous to that of the I-frame QP described with reference to steps 508, 510, 512, 514 and 516 shown in Figure 5.
- steps pertaining to the adjustment of the P-frame QP will not be described in greater detail here as their implementation will become apparent to the person skilled in the art in light of the present description.
- at step 1004 we receive a target encoded frame size of 25 Kbits, an encoded I-frame size of 100 Kbits and an I-frame QP value of 10. Based on the above information, the ratio between the target encoded frame size and the encoded I-frame size is computed at step 1006 as being: 25 Kbits / 100 Kbits = 0.25.
- I-frame QP value 15
- the relationship between P-frame QP values (shown on the x-axis) and ratios between P-frame sizes and I-frame sizes for a fixed I-frame QP value generally follows a negative exponential curve.
- Dotted line 1204 conveys a ratio computed based on a target encoded frame size and the size of the encoded I-frame on which the P-frame to be encoded is based.
- the P-frame bit-rate controller 804 (shown in Figures 8 and 9) makes use of statistical information conveying a relationship between I-frame QP values and P-frame QP values as well as the associated I-frame and P-frame sizes of resulting frames encoded using different combinations of I-frame QP values and P-frame QP values.
- An exemplary process for generating information suitable for use in selecting a quantization parameter value for a video encoding processor will now be described with reference to Figure 11.
- the information generated is in the form of a data structure including entries conveying a statistical relationship between I-frame QP values and P-frame QP values and resulting encoded frame sizes.
- the process described with reference to Figure 11 is for generating a single data structure associated to a single frame resolution (size).
- the reader skilled in the art will readily appreciate in light of the present description, that the process depicted in Figure 11 can be performed repeatedly for different video sequence types, different frame resolutions (sizes) and/or different combinations of video sequence types/ frame resolutions (sizes) in order to generate different respective data structures.
- the manner in which such multiple data structures can be generated will become readily apparent to the person skilled in the art in light of the present specification and as such will not be described further here.
- a sequence of video frames is provided as a basis for generating the data structure.
- the sequence of video frames provided at this step preferably includes frame sequences representative of the types of frame sequences expected to be encoded.
- the frames in the sequence of video frames provided at step 1100 are assigned to be encoded as either I-frames or P-frames in such a manner as to interleave I-frame and P-frame assignment in the encoded video sequence.
- the frames in the sequence of frames are assigned to be encoded as either I-frames or P-frames according to the following pattern: I-P-I-P-I-P-(etc.)
- each frame assigned to be encoded as a P-frame immediately follows a frame assigned to be encoded as an I-frame.
- each P-frame will be based in part on the I-frame immediately preceding it. It is to be appreciated that the above described pattern of I-P-I-P-I-P-(etc.) constitutes only one possible pattern and other suitable patterns, including patterns making use of frame types other than I-frames and P-frames, may be used.
- Steps 1104, 1106, 1108, 1110, 1112, 1114, 1116, 1118 and 1120 are performed for each I-frame QP value from a set of possible I-frame QP values.
- step 1104 an I-frame QP value that has not yet been processed is selected. We then proceed to step 1106.
- the frames assigned to be encoded as I-frames at step 1102 are encoded using the I-frame QP value selected at step 1104 to generate an I-frame encoded frame group.
- the resulting I-frame encoded frame group includes a plurality of encoded video frames derived using the I-frame QP value selected in step 1104. We then proceed to step 1108.
- an encoded frame size corresponding to the I-frame QP value selected in step 1104 is derived at least in part based on frame sizes of encoded video frames in the I-frame encoded frame group generated at step 1106.
- the encoded frame size corresponding to the I-frame QP value selected in step 1104 will be designated as: $F(\text{size of I-frames}_{\{\text{I-frame QP value}\}})$
- F(x) is used to designate a certain function of variable x, and wherein "size of I-frames {I-frame QP value}" is used to designate the size of the I-frames in the frame group associated with the I-frame QP value selected at step 1104.
- function F(x) is an average function which computes the average encoded frame size of the encoded frames in the I-frame encoded frame group obtained at step 1106.
- the encoded frame size corresponding to the I-frame QP value selected in step 1104 is set to correspond to the average I-frame size and is associated to the specific I-frame QP value selected at step 1104. We then proceed to step 1110.
- steps 1112, 1114, 1116 and 1118 are repetitively performed for each P-frame QP value from a set of possible P-frame QP values.
- at step 1110, a P-frame QP value that has not yet been processed is selected.
- the frames assigned to be encoded as P-frames at step 1102 are encoded using the P-frame QP value selected at step 1110 to generate a P-frame encoded frame group. More specifically, the frames assigned to be encoded as P-frames in step 1102 are encoded using the I-frames generated in step 1106 (using the I-frame QP value selected at step 1104), as well as the selected P-frame QP value selected in step 1110.
- the methods for encoding a video frame as a P-frame based on a preceding I-frame are well known in the art and as such will not be described further here.
- the P-frame encoded frame group includes a plurality of encoded video frames derived using the P-frame QP value selected in step 1110.
- each P-frame generated at step 1112 is associated to a specific combination of an I-frame QP value and P-frame QP value.
- an encoded frame size corresponding to the P-frame QP value selected in step 1110 is derived at least in part based on frame sizes of encoded video frames in the P-frame encoded frame group generated at step 1112. For the purpose of simplicity, this encoded frame size corresponding to the P-frame QP value selected in step 1110 will be designated as: $F(\text{size of P-frames}_{\{\text{I-frame QP value},\ \text{P-frame QP value}\}})$
- F(x) is used to designate a certain function of variable x, and wherein "size of P-frames {I-frame QP value, P-frame QP value}" is used to designate the size of the P-frames in the frame group associated with the I-frame QP value selected at step 1104 and the P-frame QP value selected at step 1110.
- Different functions for deriving the encoded frame size corresponding to the P-frame QP value selected in step 1110 may be contemplated.
- function F(x) is an average function which computes the average encoded frame size of the encoded frames in the P-frame encoded frame group obtained at step 1112.
- the encoded frame size corresponding to the P-frame QP value selected in step 1110 is set to correspond to the average encoded P-frame size.
- the average encoded P-frame size computed at step 1116 is associated to a specific combination of an I-frame QP value and P-frame QP value.
- a data element conveying a relationship between the I-frame size derived at step 1108 and the P-frame size derived at step 1114 is derived.
- This data element is associated to a specific ⁇ I-frame QP value; P-frame QP value ⁇ combination.
- the data element corresponding to the I-frame QP value selected in step 1104 and to the P-frame QP value selected at step 1110 will be designated as:
- G(x, y) is used to designate a certain function of variables x and y, m is used to designate an I-frame QP value and n is used to designate a P-frame QP value.
- Different functions for deriving a data element conveying a relationship between the I-frame size and the P-frame size derived at steps 1108 and 1114 respectively may be contemplated.
- function G(x, y) is a ratio function which computes:
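- $\text{Ratio}_{\{\text{I-frame QP},\ \text{P-frame QP}\}} = \dfrac{F(\text{size of P-frames}_{\{\text{I-frame QP},\ \text{P-frame QP}\}})}{F(\text{size of I-frames}_{\{\text{I-frame QP}\}})}$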
- the result of step 1116 is a data element associated with a given {I-frame QP value; P-frame QP value} combination conveying a relationship between the I-frame sizes and P-frame sizes associated with that given {I-frame QP value; P-frame QP value} combination.
- the data element derived in step 1116 is stored in a data structure on a computer readable storage medium.
- the data element is stored in association with an {I-frame QP value; P-frame QP value} combination, wherein the I-frame QP value of the combination is the I-frame QP value selected at step 1104 and the P-frame QP value of the combination is the P-frame QP value selected at step 1110.
- at step 1120, if not all P-frame QP values in the set of P-frame QP values have been processed for the I-frame QP value selected at the last iteration of step 1104, step 1120 is answered in the negative and we return to step 1110 where steps 1112, 1114, 1116 and 1118 are repeated for the next unprocessed P-frame QP value. If on the other hand all P-frame QP values in the set of P-frame QP values have been processed for the I-frame QP value selected at the last iteration of step 1104, step 1120 is answered in the positive and the process continues at step 1124.
- at step 1124, if not all I-frame QP values in the set of I-frame QP values have been processed, step 1124 is answered in the negative and we return to step 1104 where steps 1106, 1108, 1110, 1112, 1114, 1116, 1118 and 1120 are repeated for the next unprocessed I-frame QP value. If, on the other hand, all I-frame QP values in the set of I-frame QP values have been processed, step 1124 is answered in the positive and we proceed to step 1128 where the process terminates.
- a data structure including a plurality of entries is generated, wherein each entry is associated to a respective {I-frame QP value; P-frame QP value} combination and conveys a relationship between I-frame sizes and P-frame sizes derived using the respective {I-frame QP value; P-frame QP value} combination to which it is associated.
- a set of data structures may be generated, wherein each data structure is associated to a respective frame resolution (size).
- a set of data structures may be generated, each data structure being associated to a respective video sequence type.
- all or part of the process previously described herein with reference to Figure 11 may be implemented as software consisting of a series of instructions for execution by a computing unit.
- the series of instructions could be stored on a medium which is fixed, tangible and readable directly by the computing unit (e.g., removable diskette, CD-ROM, ROM, PROM, EPROM or fixed disk), or the instructions could be stored remotely but transmittable to the computing unit via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium.
- the transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared, RF or other transmission schemes).
- the process described with reference to Figure 11 may be implemented on a computing platform separate from system 100 (shown in Figure 1). As such, the process described with reference to Figure 11 may be performed off-line in order to generate the entries to the data structure(s) in memory module 902.
- the computing platform that is used for generating a data structure in accordance with the process described in Figure 11 may be placed in communication with the system 100 for the purpose of storing the generated data structure in the memory module 902 of P-frame bit-rate controller 804 (both shown in Figure 9).
- the computing platform may be integrated within system 100 (shown in Figure 1) as an integral component thereof.
- bit-rate controller 804 (shown in Figure 9) includes a self-update module implementing a process for generating updated information for use in selecting a P-frame quantization parameter value for a video encoding processor.
- Figure 17 of the drawings depicts a bit-rate controller 804' which is analogous to bit-rate controller 804 (shown in Figure 9) but modified to include self-update module 1900.
- the self-update module 1900 is in communication with output module 910 for receiving a P-frame QP value, with input 990 for receiving information related to a P-frame previously encoded using the P-frame QP value, with input 954 for receiving information conveying the size of the I-frame on which the previously encoded P-frame was based, with input 956 for receiving the I-frame QP value used to generate that I-frame and with input 950 for receiving the resolution (size) of the raw frames to be encoded.
- the self-update module 1900 stores the information it receives from the aforementioned inputs 990, 956, 954 and 950 in a memory (not shown) in association with the P-frame QP value released by output module 910. In this fashion, the self-update module 1900 gathers statistical information pertaining to the relationship between I-frame QP values and P-frame QP values and the associated I-frame and P-frame sizes of the resulting encoded frames.
- the self-update module 1900 processes this information to derive information using methods similar to those described with reference to Figure 11.
- the "sufficiency" of the information gathered pertaining to encoded frame sizes for certain {P-frame QP value; I-frame QP value} combinations will vary from one implementation to the other.
- the self-update module 1900 then accesses memory 902 to locate in a data structure an entry corresponding to the certain {P-frame QP value; I-frame QP value} combination and replaces the data in that entry with the newly generated information.
- bit-rate controller 804' can be adapted over time based on actual video frames to be encoded.
- the functionality of the self-update module 1900 may be adapted for updating statistical information in memory 902 in implementations where the memory 902 stores a plurality of data structures associated with respective frame resolutions (sizes) and/or respective video sequence types. Manners in which the self-update module 1900 could implement this functionality in such implementations will be readily apparent to the person skilled in the art in light of the present description and as such will not be described in further detail here.
- bit-rate controller 804' may also include a P-frame QP adjustment module (not shown in Figure 17), analogous to the QP adjustment module 410 described with reference to the "I-frame" bit-rate controller 304 (shown in Figure 4), for adjusting the last computed P-frame QP value based on how close the size of the encoded P-frame generated using this last computed QP value comes to the target encoded frame size.
- all or part of the functionality previously described herein with respect to either one of the "I-frame" bit-rate controllers 304 and 304' and either one of the "P-frame" bit-rate controllers 804 and 804' may be implemented as software consisting of a series of instructions for execution by a computing unit.
- the series of instructions could be stored on a medium which is fixed, tangible and readable directly by the computing unit (e.g., removable diskette, CD-ROM, ROM, PROM, EPROM or fixed disk), or the instructions could be stored remotely but transmittable to the computing unit via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium.
- the transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared, RF or other transmission schemes).
- the apparatus implementing any of the "I-frame" bit-rate controllers 304 and 304' and the "P-frame" bit-rate controllers 804 and 804' may be configured as a computing unit of the type depicted in Figure 15, including a processing unit 1502 and a memory 1504 connected by a communication bus 1508.
- the memory 1504 includes data 1510 and program instructions 1506.
- the processing unit 1502 is adapted to process the data 1510 and the program instructions 1506 in order to implement the functional blocks described in the specification and depicted in the drawings.
- the data 1510 includes a set of data structures in accordance with the set of data structures 402 described with reference to Figure 4 and the program instructions 1506 implement the functionality of the processing unit 475 described above with reference to Figure 4.
- the data 1510 includes a set of data structures in accordance with the set of data structures 902 described with reference to Figure 9 and the program instructions 1506 implement the functionality of the processing unit 975 described above with reference to Figure 9.
- the computing unit 1502 may also comprise a number of interfaces (not shown) for receiving or sending data elements to external modules.
- program instructions 1506 may be written in a number of programming languages for use with many computer architectures or operating systems. For example, some embodiments may be implemented in a procedural programming language (e.g., "C") or an object oriented programming language (e.g., "C++" or "JAVA").
- the selection of a quantization parameter (either reference frame QP or non-reference frame QP) may be made for a subset of macroblocks in a given frame, rather than for a frame as a whole.
- different quantization parameters can be selected for different portions of a frame, each portion including one or more macroblocks, based on the concepts and processes described above.
- the data structures used for storing information providing a relationship between different encoded frame sizes and quantization parameters can be further indexed to correspond to subsets of macroblocks in a frame so that the data structures map macroblock sizes (rather than frame sizes) to corresponding quantization parameters.
- a first data structure may be assigned to macroblocks of a frame positioned in the top right quadrant of the frame
- a second data structure may be assigned to macroblocks of a frame positioned in the centre of the frame.
- the principles described above in accordance with specific examples of implementation of the invention can be used in combination with existing methods and systems for selecting a quantization parameter value. For example, if while encoding macroblocks in a frame using a quantization parameter (QP) selected according to one of the methods described above, the encoder realizes that too many bits will be used if the selected compression level (QP) is maintained, the compression level can be increased (higher QP) for the remaining macroblocks in the frame.
- QP quantization parameter
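The data structure generation process of Figure 11 and the P-frame QP selection of steps 1004 to 1008 described above can be sketched together as follows. This is only a minimal illustration under stated assumptions: the function names `build_ratio_table` and `select_p_qp`, the callables `encode_i` and `encode_p`, and the toy size model used in the usage lines are hypothetical and do not come from the specification. F(x) is taken to be the average function and G(x, y) the ratio of average P-frame size to average I-frame size, as in the example given above.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, List

# Hypothetical layout: table[i_qp][p_qp] = average P-frame size / average I-frame size
RatioTable = Dict[int, Dict[int, float]]


def build_ratio_table(
    i_qps: Iterable[int],
    p_qps: Iterable[int],
    encode_i: Callable[[int], List[float]],
    encode_p: Callable[[int, int], List[float]],
) -> RatioTable:
    """Sketch of the Figure 11 loops. encode_i / encode_p are assumed to return
    the encoded sizes of the frames assigned as I-frames / P-frames."""
    p_qps = list(p_qps)
    table: RatioTable = {}
    for i_qp in i_qps:                              # steps 1104 to 1108
        avg_i_size = mean(encode_i(i_qp))           # F(x) taken as the average
        table[i_qp] = {}
        for p_qp in p_qps:                          # steps 1110 to 1118
            avg_p_size = mean(encode_p(i_qp, p_qp))
            table[i_qp][p_qp] = avg_p_size / avg_i_size   # G(x, y) taken as a ratio
    return table


def select_p_qp(table: RatioTable, i_qp: int,
                target_p_size: float, i_frame_size: float) -> int:
    """Sketch of steps 1006 and 1008: compute the ratio, then pick the P-frame QP
    whose stored ratio is closest in the column for the given I-frame QP."""
    ratio = target_p_size / i_frame_size
    column = table[i_qp]
    return min(column, key=lambda p_qp: abs(column[p_qp] - ratio))


# Toy size model (an assumption for illustration only): sizes shrink as 1/QP.
toy_encode_i = lambda i_qp: [1_000_000 / i_qp, 1_200_000 / i_qp]
toy_encode_p = lambda i_qp, p_qp: [300_000 / p_qp, 500_000 / p_qp]

table = build_ratio_table([5, 10, 15], [5, 10, 15, 20, 25, 30],
                          toy_encode_i, toy_encode_p)
# Target of 25 Kbits, I-frame of 100 Kbits, I-frame QP of 10 (ratio 0.25).
print(select_p_qp(table, 10, 25_000, 100_000))
```

The selected value depends entirely on the toy table; with real encoder statistics the table would be filled off-line and only `select_p_qp` would run per frame. The "closest ratio" rule is one of the options mentioned above; a "closest but not greater" rule would only change the `min` key.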
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A method and apparatus using statistical information for determining a level of compression to be applied to a certain frame by a video encoding processor in order to generate an encoded frame is provided. The level of compression to be applied to the certain frame is selected at least in part based on the statistical information and a target frame size. The statistical information is obtained by encoding a plurality of representative video frames and observing statistical trends of a resulting encoded video stream. The statistical information provides estimates of encoded frame sizes associated with encoded frames resulting from the video encoding processor using different levels of compression to encode a certain frame.
Description
TITLE: METHOD AND DEVICE FOR CONTROLLING BIT-RATE FOR VIDEO ENCODING, VIDEO ENCODING SYSTEM USING SAME AND COMPUTER PRODUCT THEREFOR
CROSS-REFERENCE TO RELATED APPLICATION
For the purpose of the United States, the present application claims the benefit of priority under 35 USC §119 based on U.S. provisional patent application serial number 61/061,329 filed on June 13, 2008 by D. Leclerc et al. and presently pending. The contents of the above-referenced document are incorporated herein by reference.
FIELD OF THE INVENTION
The present invention relates generally to the field of video encoding and, more specifically, to a method and apparatus for controlling bit-rate in a video service application. The invention is particularly useful in low bandwidth connections such as found, for example, in mobile networks.
BACKGROUND
Generally speaking, the quality of video services is subject to bandwidth availability of the carrier as well as the acceptable latency from the perspective of the user. For example, video-conferencing, because of its real-time nature, has stringent latency requirements in addition to bandwidth limitations. Delays must be kept to a minimum otherwise interactions become awkward. Other video applications also have limited bandwidth/acceptable latency requirements, which must be taken into account in the video services context.
Located at the heart of a communication network, media gateways and media servers are responsible for converting and adapting video content to the network conditions and endpoint devices. Media gateways and media servers that support video are responsible for, amongst other things, converting a video stream from one codec format to another. In doing so they typically must transcode between video codecs, adapt the frame rates, and change the frame size so as to adapt it to the screen on the target device. In addition, the media gateway and media server also ensure that the video stream on a given connection does not exceed its bandwidth budget.
Media gateways and media servers make use of video encoding/decoding processors to encode/decode video frames. Typically, a video encoding processor can apply different levels (amounts) of compression to a frame in order to derive respective encoded versions of that frame. The levels of compression applied by the video encoding processor are controlled by parameter values, commonly referred to as quantization parameter values. A rate controller, in the context of video encoding, is a device that dynamically adjusts parameter values provided to a video encoding processor in order to attempt to achieve a target bit rate.
Various techniques have been developed to perform rate control.
A first type of technique for controlling the bit-rate for a video stream makes use of a multi-pass encoding approach. In such an approach, a set of frames is repetitively encoded using different sets of parameter values and the set of parameter values resulting in the best encoded series of frames for the given bandwidth budget is selected. Such multi-pass encoding techniques are used, for example, in the video broadcast or video authoring domains.
A deficiency with such multi-pass encoding techniques is that they are not well suited to real-time, or near real-time, video applications since they introduce significant delays due to the increased processing time required to "try-out" the different parameter values.
Other techniques for controlling bit-rate in a video stream involve changing the compression level within a given video frame. In other words, while encoding macroblocks in a frame, the encoder realizes that too many bits will be used if the selected compression level is maintained, and increases the compression level for the remaining macroblocks in the frame. For more information regarding such methods the reader is invited to refer to the H.263 DQUANT codec specification described in ITU-T Rec. H.263 (01/2005) specification provided by TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU. The content of the aforementioned document is incorporated herein by reference.
A deficiency with techniques of the type described above is that they often result in uneven image quality within a given frame, leading to blurry sections, or flashing effects from one frame to the next.
Another popular approach for controlling bit-rate in a video stream involves controlling the average number of bits transmitted over a period of time. Typically, such an approach is implemented by monitoring encoded frames "after the fact", after they have been encoded, and keeping track of their size. When frames cause the allocated bandwidth to be exceeded, frames are dropped until there is sufficient bandwidth to continue. For more information regarding such an approach, the reader is invited to refer to the specification document for the TMN8 rate controller, a copy of which can be found at http://ftp3.itu.int/av-arch/video-site/9706 Por/ql5a20rl.doc. The content of the aforementioned document is incorporated herein by reference.
A deficiency of the above-mentioned technique is that for limited bandwidth applications it often results in a video stream that appears choppy at the receiving end due to the dropped frames.
In the context of the above, there is a need to provide a method and a device for controlling the bit-rate of a video stream in video applications having limited bandwidth availability that alleviate at least in part problems associated with the existing methods and devices.
SUMMARY
In accordance with a broad aspect, the invention provides a method using statistical information for determining a level of compression to be applied to a certain frame by a video encoding processor in order to generate an encoded frame. The statistical information is obtained by encoding a plurality of representative video frames and observing statistical trends of a resulting encoded video stream. The statistical information provides estimates of encoded frame sizes of encoded frames resulting from the video encoding processor using different levels of compression to encode the certain frame. The method comprises selecting the level of compression to be applied to the certain frame at least in part based on the statistical information and a target frame size.
Advantageously, by using a statistical approach to bit-rate control, an improvement in video quality can be obtained, in particular in applications having limited bandwidth availability and requiring low latency such as, for example, mobile video applications.
In accordance with a specific implementation, the method for selecting a frame quantization parameter value takes into account the type of frame being encoded (for example, reference frame or non-reference frame). Advantageously, this may allow, amongst other things, bandwidth priority to be optimized between reference frames and non-reference frames.
In accordance with another broad aspect, the invention provides a bit-rate controller using statistical information for determining a level of compression to be applied to a frame by a video encoding processor in accordance with the above-described method.
In accordance with another broad aspect, the invention provides a method for selecting a quantization parameter value for use by a video encoding processor encoding a certain frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor. The method comprises
a) receiving target encoded frame size information conveying a desired frame size for the certain frame; b) providing a data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value; c) selecting a quantization parameter value at least in part based on said target encoded frame size information and on said data structure; d) releasing the selected quantization parameter value to the video encoding processor.
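Steps a) to d) above can be illustrated with a short sketch. It assumes, purely for illustration, that the data structure is a Python dictionary mapping an encoded frame size in bits to a quantization parameter value and that "selecting" means taking the entry whose size is nearest to the target; the name `select_qp` and the table values are hypothetical and not taken from the specification.

```python
from typing import Dict


def select_qp(size_to_qp: Dict[int, int], target_size_bits: int) -> int:
    """Return the QP of the entry whose encoded frame size is closest to the target."""
    if not size_to_qp:
        raise ValueError("data structure has no entries")
    closest_size = min(size_to_qp, key=lambda size: abs(size - target_size_bits))
    return size_to_qp[closest_size]


# Hypothetical entries: encoded frame size (bits) -> quantization parameter value.
EXAMPLE_TABLE = {120_000: 5, 80_000: 10, 50_000: 15, 30_000: 20, 18_000: 25}

# a) target size received, b) table provided, c) QP selected, d) QP released.
qp_for_encoder = select_qp(EXAMPLE_TABLE, target_size_bits=45_000)
print(qp_for_encoder)
```

A single dictionary lookup of this kind involves no trial encoding, which is consistent with the low-latency motivation stated in the summary.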
In a specific example of implementation, the quantization parameter value is a reference frame quantization parameter value, the reference frame quantization parameter value being for use by the video encoding processor in encoding the certain frame as a reference frame. A specific example of a reference frame is an intra-encoded frame also referred to as an I-frame.
In an alternative example of implementation, the quantization parameter value is a non-reference frame quantization parameter value, the non-reference frame quantization parameter value being for use by the video encoding processor in encoding the certain frame as a non-reference frame. A specific example of a non-reference frame is an inter-encoded frame, such as for example a P-frame or a B-frame.
In a specific example of implementation, the selected quantization parameter value allows the video encoding processor to derive an encoded frame based on the certain frame such that the encoded frame has a frame size tending toward the desired frame size.
In accordance with a specific example of implementation, the selected quantization parameter value corresponds to an entry in the data structure associated to an encoded frame size approximating the target encoded frame size.
In accordance with another specific example of implementation, selecting the quantization parameter value comprises: a) selecting an initial quantization parameter value corresponding to an entry in the data structure associated to an encoded frame size approximating the target encoded frame size; b) on the basis of the initial quantization parameter value, encoding the certain frame to derive a resulting encoded frame, the resulting encoded frame having a certain size; c) selecting the quantization parameter value at least in part based on the certain size of the resulting encoded frame and the desired frame size for the certain frame.
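The three steps above (initial selection, trial encoding, final selection) could look roughly like the following sketch. The adjustment rule shown here, stepping the QP by one when the trial size misses the desired size by more than a tolerance, is an assumption made for illustration; the text above only states that the final value depends on the resulting size and the desired size. The `encode` callable, the tolerance, and the QP bounds of 1 and 31 are likewise hypothetical.

```python
from typing import Callable, Dict


def select_qp_with_trial(
    size_to_qp: Dict[int, int],
    desired_size: int,
    encode: Callable[[int], int],
    qp_min: int = 1,        # assumed lower QP bound
    qp_max: int = 31,       # assumed upper QP bound
    tolerance: float = 0.10,
) -> int:
    # a) initial QP: entry whose tabulated size approximates the desired size
    nearest = min(size_to_qp, key=lambda size: abs(size - desired_size))
    initial_qp = size_to_qp[nearest]
    # b) encode the frame once with the initial QP and observe the resulting size
    actual_size = encode(initial_qp)
    # c) final QP: compress more if the frame came out too large, less if too small
    if actual_size > desired_size * (1 + tolerance):
        return min(initial_qp + 1, qp_max)
    if actual_size < desired_size * (1 - tolerance):
        return max(initial_qp - 1, qp_min)
    return initial_qp


# Hypothetical table and toy encoder (encoded size falls as 1/QP).
TABLE = {120_000: 5, 80_000: 10, 50_000: 15, 30_000: 20, 18_000: 25}
toy_encode = lambda qp: 600_000 // qp

print(select_qp_with_trial(TABLE, 45_000, toy_encode))
```

In the toy run the initial QP of 15 yields 40,000 bits, more than 10% under the 45,000-bit goal, so the sketch lowers the QP by one step.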
In accordance with a specific example of implementation, the method comprises selecting the data structure from a set of data structures. Each data structure in the set of data structures includes a respective set of entries mapping encoded frame sizes to corresponding quantization parameter values. The selection of the data structure may be based on a number of suitable criteria.
In a first non-limiting example of implementation, each data structure in the set of data structures is associated to a respective (uncompressed) frame resolution. In this example, a data structure associated to a frame resolution corresponding to a frame resolution approximating the frame resolution associated to the frame to be encoded is selected.
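As an illustration of this selection, the sketch below picks the data structure whose associated resolution is nearest to that of the frame to be encoded. Measuring "approximating" by the difference in pixel count is an assumption of the example, and `select_table` and the table contents are hypothetical.

```python
def select_table(frame_resolution, tables):
    """Return the table whose resolution is nearest, by pixel count.

    `tables` maps (width, height) tuples to data structures.
    """
    width, height = frame_resolution
    pixels = width * height
    nearest = min(tables, key=lambda res: abs(res[0] * res[1] - pixels))
    return tables[nearest]


# Illustrative tables only: QP -> encoded frame size in bytes.
TABLES = {
    (176, 144): {20: 4000, 30: 2000},
    (352, 288): {20: 16000, 30: 8000},
    (704, 576): {20: 64000, 30: 32000},
}
```

A 320x240 frame would be matched to the 352x288 table under this distance measure.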
In a second non-limiting example of implementation, each data structure in the set of data structures is associated to a respective video sequence type. Video sequence types may include, for example, sports broadcast video sequence and talking head video sequence amongst others. In this example, the certain video frame to be encoded is part of a sequence of frames and the data structure is selected from the set of data structures in part based on the video sequence type associated with the sequence of frames. The video sequence type associated to the sequence of frames may be provided as an input and used for selecting the quantization parameter.
Alternatively, the video sequence type associated to the sequence of frames may be derived by processing at least some frames in the sequence of frames to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
In accordance with another broad aspect, the invention provides an apparatus for selecting a quantization parameter value in accordance with the above-described method, the quantization parameter value being for use by a video encoding processor encoding a certain video frame.
In accordance with another broad aspect, the invention provides a computer readable storage medium storing a program element suitable for execution by a processor for selecting a quantization parameter value for use by a video encoding processor encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor. The quantization parameter value is selected in accordance with the above-described method.
In accordance with another broad aspect, the invention provides a video encoding system for encoding a video stream including a sequence of frames. The system comprises a first input for receiving a video frame originating from a sequence of frames to be encoded and a second input for receiving target encoded frame size information conveying a desired frame size for the certain video frame. The system also comprises an apparatus for selecting a quantization parameter value in accordance with the above-described method and an encoding processor in communication with the first input and with the apparatus. The encoding processor processes the certain video frame to generate an encoded video frame based in part on the selected quantization parameter. The system also includes an output for releasing the encoded video frame generated by the encoding processor.
In accordance with another broad aspect, the invention provides a method for generating information for use in selecting a quantization parameter value for a video encoding processor, the quantization parameter value corresponding to a certain level of compression of the video encoding processor. The method comprises: a) providing a plurality of video frames representative of types of frames expected to be encoded by the video encoding processor; b) encoding the plurality of video frames for a quantization parameter value selected from a set of quantization parameter values to generate an encoded frame group, the encoded frame group including a plurality of encoded video frames derived using the quantization parameter value; c) deriving an encoded frame size corresponding to the quantization parameter value, the corresponding encoded frame size being derived at least in part based on frame sizes of encoded video frames in the encoded frame group; d) on a computer readable storage medium, storing information mapping the quantization parameter value to its derived corresponding encoded frame size.
In accordance with a specific example, the above steps b), c) and d) are repeated for each quantization parameter value in the set of quantization parameter values.
In accordance with a specific example of implementation, deriving the encoded frame size corresponding to the quantization parameter value comprises processing the frame sizes of encoded video frames in the encoded frame group to derive a range of frame sizes, the range of frame sizes having an upper limit. The derived range of frame sizes is such that a certain proportion of encoded frames in the encoded frame group have a frame size falling within the derived range of frame sizes. The encoded frame size corresponding to the quantization parameter value is set to substantially correspond to the upper limit of the derived range of frame sizes.
The proportion of encoded frames in the encoded frame group used to define the range of frame sizes may vary in different specific implementations. In accordance with a specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 50%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 70%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 90%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 95%. In accordance with another specific non-limiting example of implementation, the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 99%.
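The derivation of the encoded frame size as the upper limit of a range covering a certain proportion of the encoded frames can be sketched as below. The sketch assumes that the range starts at the smallest observed frame size, so that its upper limit is simply an order statistic of the sizes; the names `representative_size` and `build_size_table`, and the `encode` callable, are hypothetical.

```python
import math


def representative_size(encoded_sizes, proportion=0.95):
    """Upper limit of the smallest range, starting at the minimum size,
    that contains at least `proportion` of the encoded frames."""
    ordered = sorted(encoded_sizes)
    # Number of frames needed to cover the requested proportion.
    count = max(1, math.ceil(proportion * len(ordered)))
    return ordered[count - 1]


def build_size_table(frames, qp_values, encode, proportion=0.95):
    """Repeat steps b) to d) for every QP and return a QP -> size mapping."""
    table = {}
    for qp in qp_values:
        sizes = [len(encode(frame, qp)) for frame in frames]  # step b)
        table[qp] = representative_size(sizes, proportion)    # step c)
    return table                                              # kept in memory
```

In this sketch the mapping is only returned to the caller; writing it to a computer readable storage medium, as in step d), is left out.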
In accordance with a specific example of implementation, the plurality of video frames includes a first sub-set of frames having a first frame resolution and a second sub-set of frames having a second frame resolution distinct from the first frame resolution. The method comprises processing video frames in the first sub-set of frames to derive a first encoded frame size corresponding to the quantization parameter value and the first frame resolution. The method also comprises processing video frames in the second sub-set of frames to derive a second encoded frame size corresponding to the quantization parameter value and the second frame resolution. The person skilled in the art will appreciate that the above process may be applied to any number of distinct frame resolutions, in order to derive a mapping between encoded frame sizes and respective quantization parameter values for each distinct frame resolution.
In accordance with a specific example of implementation, the plurality of video frames includes a first sequence of frames of a first video sequence type and a second sequence of frames of a second video sequence type distinct from the first video sequence type. The method comprises processing video frames in the first sequence of frames to derive a first encoded frame size corresponding to the quantization parameter value and the first video sequence type. The method also comprises processing video frames in the second sequence of frames to derive a second encoded frame size corresponding to the quantization parameter value and the second video sequence type. In accordance with a specific non-limiting example of implementation, the first video sequence type is selected from a set of possible video sequence types including, for example, sports broadcast video sequence and talking head video sequence amongst others. The person skilled in the art will appreciate that the above process may be applied to any number of distinct video sequence types, in order to derive a mapping between encoded frame sizes and respective quantization parameter values for each distinct video sequence type.
In accordance with another broad aspect, the invention provides a computer readable storage medium storing a data structure for use in selecting a quantization parameter for a video encoding processor, entries in the data structure being generated in accordance with the above described method.
In accordance with another broad aspect, the invention provides a computer readable storage medium storing a data structure for use in selecting a quantization parameter value for a video encoding processor from a set of quantization parameter values, each quantization parameter value in the set corresponding to a respective level of compression of the video encoding processor. The data structure includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value in the set of quantization parameter values. In accordance with a specific example of implementation, the entries in the data structure are derived at least in part by encoding a plurality of representative video frames and observing statistical trends in the resulting encoded video streams.
In accordance with another broad aspect, the invention provides an apparatus for generating information for use in selecting a quantization parameter value for a video encoding processor in accordance with the above described method.
In accordance with another broad aspect, the invention provides a method for selecting a quantization parameter value for use by a video encoding processor in encoding a non-reference frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor. The method comprises:
a) receiving target encoded non-reference frame size information conveying a desired frame size for a certain video frame to be encoded as a non-reference frame by the video encoding processor; b) receiving reference frame information associated to a reference frame on which the non-reference frame will be partly based, said reference frame information including: i. a reference frame quantization parameter value associated to the reference frame and corresponding to a level of compression applied to generate the reference frame; ii. reference frame size information conveying a frame size associated to the reference frame; c) selecting the quantization parameter value for use in encoding the non-reference frame at least in part based on: i. the target encoded non-reference frame size information; ii. the reference frame quantization parameter value; and iii. the reference frame size information; d) releasing the selected quantization parameter value for use in encoding the non-reference frame to the video encoding processor.
In accordance with a specific example of implementation, the above-described method is used to select a non-reference frame quantization parameter value for use by a video encoding processor when encoding a frame as a first non-reference frame following a frame in a sequence of frames encoded as a reference frame.
In accordance with a specific example of implementation, the method comprises providing a data structure providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values. In a non-limiting specific example, the data structure comprises a plurality of entries, each entry being associated to a respective combination of a reference frame quantization parameter value and a non-reference frame quantization parameter value.
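One way such a data structure could be used is sketched below. The sketch assumes that each entry stores the expected ratio of non-reference frame size to reference frame size for a given pair of quantization parameter values, which is in line with the relationship plotted in Fig. 12 but is not stated at this point in the description. The name `select_p_qp`, the table contents and the selection rule are hypothetical.

```python
# Hypothetical data structure:
# reference QP -> {non-reference QP -> expected size ratio (P size / I size)}.
RATIO_TABLE = {15: {10: 0.5, 20: 0.25, 30: 0.125}}


def select_p_qp(target_p_size, i_qp, i_size, ratio_table):
    """Select a non-reference frame QP from the three inputs of step c).

    Assumed rule: pick the lowest QP whose expected size ratio does not
    exceed target_p_size / i_size, else fall back to the highest QP.
    """
    target_ratio = target_p_size / i_size
    ratios = ratio_table[i_qp]
    fitting = [qp for qp, ratio in ratios.items() if ratio <= target_ratio]
    return min(fitting) if fitting else max(ratios)
```

For a reference frame encoded with a quantization parameter value of 15 at 10000 bytes and a target of 3000 bytes, the target ratio is 0.3 and the entry for a non-reference quantization parameter value of 20 is selected.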
Optionally, the method comprises selecting the data structure from a set of data structures. Each data structure includes a respective set of entries providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values. The selection may be based on a number of suitable criteria.
In a first non-limiting example of implementation, each data structure in the set of data structures is associated to a respective frame resolution. In this example, a data structure associated to a frame resolution corresponding to a frame resolution approximating the frame resolution associated to the certain video frame to be encoded is selected.
In a second non-limiting example of implementation, each data structure in the set of data structures is associated to a respective video sequence type. In this example, the certain video frame to be encoded is part of a sequence of frames and the data structure is selected from the set of data structures based in part on the video sequence type associated to the sequence of frames.
It will be appreciated that other criteria for selecting a data structure are possible and that criteria may be used in combination so that, for example, a given data structure may be associated to a certain frame resolution and to a certain video sequence type.
In accordance with another broad aspect, the invention provides an apparatus for selecting a non-reference frame quantization parameter value for use by a video encoding processor in accordance with the above-described method.
In accordance with another broad aspect, the invention provides a computer readable storage medium storing a program element suitable for execution by a processor for selecting a non-reference frame quantization parameter value for use by a video encoding processor. The non-reference frame quantization parameter value is selected in accordance with the above-described method.
In accordance with another broad aspect, the invention provides a video encoding system for encoding a video stream including a sequence of frames. The system comprises a first input for receiving a video frame originating from a sequence of frames to be encoded and a second input for receiving target encoded frame size information conveying a desired frame size for the certain video frame. The system also comprises an apparatus for selecting a non-reference frame quantization parameter value in accordance with the above-described method and an encoding processor in communication with the first input and with the apparatus. The encoding processor processes the certain video frame to generate an encoded video frame based in part on the selected non-reference frame quantization parameter. The system also includes an output for releasing the encoded video frame generated by the encoding processor.
In accordance with yet another broad aspect, the invention provides a method for generating information for use in selecting a non-reference frame quantization parameter value from a set of possible non-reference frame quantization parameter values for use by a video encoding processor. Each non-reference frame quantization parameter value in the set corresponds to a certain level of compression of the video encoding processor. The method comprises: a) providing a sequence of video frames representative of types of frames to be encoded by the video encoding processor; b) encoding the sequence of video frames so as to generate: i. a plurality of reference frames associated with a given reference frame quantization parameter value, the reference frame quantization parameter value corresponding to a level of compression applied to generate the plurality of reference frames; ii. a plurality of non-reference frames, the plurality of non-reference frames being arranged in sets, each set of non-reference frames being associated with a respective non-reference frame quantization parameter value corresponding to a level of compression applied to generate the non-reference frames in the associated set of non-reference frames; c) deriving a mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values based on the plurality of reference frames and on the plurality of non-reference frames; d) storing the derived mapping in a data structure on a computer readable storage medium.
In a specific example of implementation, steps b), c) and d) described above are repeated for each reference frame quantization parameter value in a set of possible reference frame quantization parameter values. This results in a mapping between each reference frame quantization parameter value in the set of possible reference frame quantization parameter values and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values.
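A possible form of step c), repeated over all reference frame quantization parameter values, is sketched below. The statistic used for the mapping (ratio of the mean non-reference frame size to the mean reference frame size) is an assumption of the example, and `build_ratio_table` and its input layout are hypothetical.

```python
from statistics import mean


def build_ratio_table(i_sizes, p_sizes):
    """Derive reference QP -> {non-reference QP -> size ratio}.

    i_sizes: {i_qp: [encoded reference frame sizes]}
    p_sizes: {(i_qp, p_qp): [encoded non-reference frame sizes]}
    """
    table = {}
    for (i_qp, p_qp), sizes in p_sizes.items():
        # Assumed statistic: mean P-frame size over mean I-frame size.
        ratio = mean(sizes) / mean(i_sizes[i_qp])
        table.setdefault(i_qp, {})[p_qp] = ratio
    return table
```

The resulting nested mapping has one entry per combination of reference and non-reference quantization parameter values that was actually encoded.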
These and other aspects and features of the present invention will now become apparent to those of ordinary skill in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the accompanying drawings:
Fig. 1 shows a simplified high-level functional block diagram of a video encoding system in accordance with a specific example of implementation of the present invention;
Fig. 2 shows a high level flow diagram of a process implemented by the video encoding system shown in Figure 1 in accordance with a specific example of implementation of the present invention;
Fig. 3 is a functional block diagram of an I-frame processing module suitable for use in the video encoding system depicted in Figure 1 in accordance with a specific example of implementation of the present invention;
Fig. 4 is a functional block diagram of a bit-rate controller module suitable for use in connection with the I-frame processing module shown in Figure 3 in accordance with a specific example of implementation of the present invention;
Fig. 5 shows a flow diagram of a process implemented by the bit-rate controller module shown in Figure 4 for selecting a quantization parameter in accordance with a specific example of implementation of the present invention;
Fig. 6 shows a process for generating a data structure including statistical information for use by the bit-rate controller module of Figure 4 in accordance with a specific example of implementation of the present invention;
Fig. 7 shows an exemplary distribution of encoded frame sizes for I-frames generated using a quantization parameter value of I-frame QP = 10, the encoded I-frames being generated as part of the process depicted in Figure 6;
Fig. 8 is a functional block diagram of a P-frame processing module suitable for use in the video encoding system depicted in Figure 1 in accordance with a specific example of implementation of the present invention;
Fig. 9 is a block diagram of a bit-rate controller module suitable for use in connection with the P-frame processing module shown in Figure 8 in accordance with a specific example of implementation of the present invention;
Fig. 10 shows a flow diagram of a process implemented by the bit-rate controller module shown in Figure 9 for selecting a quantization parameter in accordance with a specific example of implementation of the present invention;
Fig. 11 shows a process for generating a data structure including statistical information for use by the bit-rate controller module of Figure 9 in accordance with a specific example of implementation of the present invention;
Fig. 12 is a graph showing an exemplary relationship between P-frame QP values and ratios of P-frame sizes and I-frame sizes for a given I-frame QP value of 15, the relationship being derived as part of the process depicted in Figure 11;
Fig. 13 illustrates an example of a display order of I-frame, P-frame and B-frame encoded video frames that can be generated using MPEG-4 coding;
Fig. 14 is a graph depicting a relationship between I-frame QP values used for generating an I-frame and the size of the I-frame in accordance with a specific example of implementation of the present invention;
Fig. 15 is a block diagram of an apparatus for selecting a quantization parameter value in accordance with a specific example of implementation of the present invention;
Fig. 16 is a block diagram of a bit-rate controller module suitable for use in connection with the I-frame processing module shown in Figure 3 in accordance with a variant of the present invention;
Fig. 17 is a block diagram of a bit-rate controller module suitable for use in connection with the P-frame processing module shown in Figure 8 in accordance with a variant of the present invention.
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying Figures.
DETAILED DESCRIPTION
For the purpose of simplicity, the specific example of implementation of the invention will describe the selection of a quantization parameter value (reference frame QP or non-reference frame QP) for the purpose of encoding an entire frame. In other words, the embodiment(s) described provide that a quantization parameter value is selected for the purpose of encoding all macroblocks in a frame. It should be readily appreciated that alternative embodiments of the invention may apply the processes described in the present application for the selection of a quantization parameter value (reference frame QP or non-reference frame QP) for one macroblock (or a subset of macroblocks) in a given frame. In such alternative embodiments, different quantization parameter values can be selected for encoding different portions of a frame corresponding to individual macroblocks or subsets of macroblocks. It is also to be appreciated that the macroblocks in a given subset of macroblocks need not be adjacent to one another in a frame but may be positioned anywhere within the frame.
In addition, alternative embodiments of the invention may modify the quantization parameter value dynamically so that different quantization parameter values are used for different macroblocks (or subsets of macroblocks). Such alternative embodiments will become readily apparent to the person skilled in the art of video processing in light of the present description.
Simply for the purpose of clarity, the term "macroblock" is a term used in video compression, which generally represents a block of 16 by 16 pixels. Macroblocks can be subdivided further into smaller blocks. H.264, for example, supports block sizes as small as 4x4. A frame is composed of macroblocks. The higher (larger) the frame resolution, the more macroblocks it is composed of. For the purpose of the present description, the term "frame" is used interchangeably with image.
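As a small worked example of how the macroblock count grows with resolution, the sketch below counts 16 by 16 macroblocks for a given frame size. It assumes that partial macroblocks at the right and bottom edges are padded to full macroblocks; `macroblock_count` is a hypothetical helper.

```python
def macroblock_count(width, height, mb_size=16):
    """Number of mb_size x mb_size macroblocks needed to cover a frame.

    Edge macroblocks that are only partly covered count as whole ones
    (assumed padding behaviour).
    """
    columns = -(-width // mb_size)   # ceiling division
    rows = -(-height // mb_size)
    return columns * rows
```

Under these assumptions a 352x288 frame is made of 396 macroblocks, while a 1920x1080 frame is made of 8160.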
Generally speaking, encoding processors often make use of spatial and temporal redundancies when encoding video frames. For example, "Intra frame" coding exploits the spatial redundancies in a frame to compress the information while "Inter frame" coding exploits the temporal redundancies between a current frame and previous or following frames. Generally speaking, a reference frame is an encoded version of a single raw (uncompressed) frame that can take advantage of spatial redundancy. Unlike non-reference frames, reference frames are independent, meaning that they do not depend on data in preceding or following frames to be decoded. Non-reference frames are derived based on a reference frame and typically provide more compression than reference frames because they take advantage of the temporal redundancies in a previous and/or a subsequent reference frame. By using non-reference frames when encoding a video frame, significant compression can be achieved since video frames are often correlated to each other.
Different types of reference and non-reference frames may be used in the context of video encoding depending on the coding protocol being used.
For the purpose of the specific example of implementation described in the present specification, the case where the reference frame is an I-frame and the non-reference frames are either P-frames or B-frames will be considered. P-frames are derived by taking advantage of the temporal redundancies in a previous I-frame (but not in a subsequent I-frame) while B-frames are derived by taking advantage of the temporal redundancies in a previous I-frame and in a subsequent I-frame. The terms I-frame, P-frame and B-frame are commonly used in the art of video encoding and will therefore not be described in further detail here. Figure 13 of the drawings illustrates an example of a display order of encoded video frames that can be generated using MPEG-4 coding and exemplifies one implementation of I-frames, P-frames and B-frames in a given frame sequence. For the purpose of simplicity, the present example of implementation of the invention will describe situations where I-frames and P-frames are generated (not B-frames).
It will be readily appreciated by the person skilled in the art in light of the present description that the concepts presented herein for the selection of quantization parameters and other processes may apply to any suitable format of reference frames and non-reference frames.
Also for the purpose of simplicity, only components necessary for the understanding of the invention have been included in the Figures. Other components well-known in the art of video signal processing have been omitted.
With reference to Fig. 1, there is shown a functional block diagram of a video encoding system 100 in accordance with a specific example of implementation of the present invention.
As depicted, the video encoding system 100 includes a frame type selector module 106, an I-frame (reference frame) processing module 112, a P-frame (non-reference frame) processing module 120, a reconstruction module 150 and an output signal generator 126. The video encoding system 100 also includes a first input 104 for receiving a frame type selector parameter, a second input 102 for receiving a sequence of video frames to be encoded ("raw" frames) and an output 128 for releasing a coded video stream.
In the example depicted in Figure 1, the video encoding system 100 also includes a data line 114 for receiving a target encoded size for an encoded I-frame (reference frame) and a data line 116 for receiving a target encoded size for an encoded P-frame (non-reference frame). It is to be appreciated that the target encoded size for an encoded P-frame (non-reference frame) and the target encoded size for an encoded I-frame (reference frame) can be the same size and can be received at a same data line. The target encoded size for an encoded P-frame (non-reference frame) and the target encoded size for an encoded I-frame (reference frame) are generated in accordance with any suitable process for determining a desired frame size for an encoded frame given a bandwidth budget. Such processes are well known in the art and as such will not be described in further detail here.
The frames in the sequence of frames received at input 102 have a certain resolution. The resolution of the frames received, which may be expressed in terms of a frame size in bits, may be the same for all frames received at input 102 or may vary over time so that frames of different resolutions are received at input 102.
The frame type selector module 106 determines, based on the frame type selector parameter received at the first input 104, what type of encoded frame will be generated. In the example depicted, the type of frame to be generated may be an I-frame or a P-frame; however, other types of frames (such as for example B-frames) may also be contemplated in alternative examples of implementation. Typically, the sequence of frame type selector parameters generated for a sequence of video frames is dictated by a certain video transmission standard and/or policy specified by the video application service provider. The frame type selector parameters received at input 104 are generated in accordance with any suitable process. Such processes are known in the art of video signal encoding and as such will not be described in further detail here. In the example shown, when the frame type selector parameter received at input 104 conveys an I-frame type, the next video frame in the sequence of frames at input 102 is transmitted to the I-frame processing module 112 over data line 108.
When the frame type selector parameter received at input 104 conveys a P-frame type, the next video frame in the sequence of frames at input 102 is transmitted to the P-frame processing module 120 over data line 110. It is to be appreciated that data lines in Figure 1 are used to illustrate the flow of data between different components of system 100 and are not necessarily representative of physical communication channels or encoder implementation.
The I-frame processing module 112 generates an encoded I-frame based in part on a "raw" video frame received over data line 108 and the target frame size information received over data line 114. The encoded I-frame is released over data line 122. In addition, the quantization parameter (I-frame QP), used by the I-frame processing module 112 to generate the encoded I-frame, is also released over data line 118. The manner in which the I-frame processing module 112 generates the encoded I-frame and selects the quantization parameter value (I-frame QP) will be described later in the specification.
The reconstruction module 150 generates a reconstructed version of a video frame based on the encoded I-frame released by the I-frame processing module 112 over data line 122. Reconstructing a video frame based on a compressed version of a frame (the encoded I-frame) may be done according to well-known methods.
The P-frame processing module 120 generates an encoded P-frame based in part on a "raw" video frame received over data line 110 and the target frame size information received over data line 116. In addition, the P-frame processing module 120 receives from reconstruction module 150 over data line 152 the reconstructed version of the video frame, which was derived from an encoded I-frame released over data line 122 by the I-frame processing module 112. The P-frame processing module 120 also receives from data line 118 the quantization parameter (I-frame QP) used by the I-frame processing module 112 when generating the encoded I-frame. The encoded P-frame is released over data line 124. The manner in which the P-frame processing module 120 generates the encoded P-frame will be described later in the specification.
The output signal generator 126 combines the I-frames and P-frames released over data lines 122 and 124 to generate an output coded video stream 128. The coded video stream is in any suitable form for transmission over a communication path. The output signal generator 126 combines the I-frames and P-frames in accordance with any suitable process. Such processes are known in the art of video signal encoding and as such will not be described in further detail here.
A simplified flow diagram of a process implemented by the system 100 is shown in Figure 2.
With respect to this Figure, a video frame from a sequence of video frames to be encoded is received at step 200. It should be appreciated that the frame received at this step may be the first in the sequence of video frames or may be any other frame within the sequence of frames.
In step 202, the frame type selector parameter is received. This parameter identifies the type of frame the video frame received in step 200 should be encoded as. In this example, the frame type selector parameter specifies either an I-frame type or a P-frame type although other types of frames may be contemplated in alternative examples of implementation of the invention.
In step 204, depending on the frame type selector parameter received in step 202, a different path is taken.
In the example depicted, if the frame type selector parameter received in step 202 indicates that the video frame received in step 200 should be encoded as an I-frame, the system proceeds to perform steps 206 and 208.
Conversely, if the frame type selector parameter received in step 202 indicates that the video frame received in step 200 should be encoded as a P-frame, the system proceeds to perform steps 210 and 212.
In a specific example of implementation, steps 206 and 208 are implemented by the I-frame processing module 112 (shown in Figure 1). In step 206, a quantization parameter (I-frame QP) for use in encoding the video frame received in step 200 is derived by a first process making use of statistical information. This first process will be described later below. In step 208, the I-frame QP value derived in step 206 is used by a video encoding processor to encode the video frame received at step 200 as an I-frame. The result of step 208 is the release of an encoded I-frame.
In a specific example of implementation, steps 210 and 212 are implemented by the P-frame processing module 120 (shown in Figure 1). In step 210, a quantization parameter (P-frame QP) for the P-frame to be encoded from the video frame received in step 200 is derived by a second process making use of statistical information. This second process will be described later below. In step 212, the P-frame QP value derived in step 210 is used by a video encoding processor to encode the video frame received at step 200 as a P-frame. The result of step 212 is the release of an encoded P-frame.
In step 214, the released I-frame or P-frame derived in either one of steps 208 or 212 is processed and combined with other encoded I-frames and P-frames to create a coded video stream in accordance with any suitable process for output to a target device (not shown). Processes for creating coded video streams based on I-frames and P-frames are known in the art of video encoding and as such will not be further described here. If the released I-frame or P-frame is not the last frame in the plurality of video frames contained within a sequence of video frames to be encoded, the next frame to be processed is selected and another iteration of the process shown in Figure 2 is performed.
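For the purpose of illustration only, the per-frame flow of Figure 2 can be sketched as follows. This is a minimal sketch and not the implementation of system 100: the function name `encode_sequence` and the callables `select_i_qp`, `select_p_qp` and `encode` are hypothetical placeholders standing in for the I-frame process (step 206), the P-frame process (step 210) and the video encoding processor (steps 208 and 212).

```python
def encode_sequence(frames, frame_types, select_i_qp, select_p_qp, encode):
    """Sketch of the loop of Figure 2 (steps 200 to 214).

    All callables are hypothetical stand-ins supplied by the caller:
      select_i_qp(frame) -> QP for an I-frame (step 206)
      select_p_qp(frame) -> QP for a P-frame (step 210)
      encode(frame, frame_type, qp) -> encoded frame (step 208 or 212)
    """
    stream = []
    for frame, frame_type in zip(frames, frame_types):  # steps 200 and 202
        if frame_type == "I":                           # step 204: branch on type
            qp = select_i_qp(frame)                     # step 206
        elif frame_type == "P":
            qp = select_p_qp(frame)                     # step 210
        else:
            raise ValueError(f"unsupported frame type: {frame_type!r}")
        # Step 208 or 212 produces the encoded frame; step 214 is reduced
        # here to appending it to the output stream.
        stream.append(encode(frame, frame_type, qp))
    return stream
```

In this sketch the combination performed at step 214 is reduced to a simple append; an actual output signal generator would multiplex the frames according to the coding protocol in use.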
The I-frame processing module 112 and the P-frame processing module 120 (shown in Figure 1) make use of statistical information in order to determine the level of compression to apply when generating a given I-frame and/or given P-frame based on the constraints provided by the target frame sizes received over data lines 114 and 116. The statistical information used was derived by encoding a plurality of representative video frames and observing statistical trends of the resulting encoded video stream. Advantageously, by taking a statistical approach to bit-rate control, an improvement in video quality can be obtained, in particular in video applications having limited bandwidth availability and requiring low latency, such as mobile video applications.
The I-frame processing module 112 and the P-frame processing module 120 will now be described in greater detail below.
I-Frame Processing Module 112
Figure 3 is a block diagram showing in greater detail components of the I-frame processing module 112 in accordance with a specific example of implementation.
As shown in Figure 3, the I-frame processing module 112 includes a bit-rate controller 304, herein referred to as the "I-frame" bit-rate controller 304, and an encoder module 306 in communication with the "I-frame" bit-rate controller 304.
The encoder 306 includes a video encoding processor and implements any suitable intra-frame coding process. It may be recalled that intra-frame coding can exploit, if applicable, the spatial redundancies in a frame to compress the information. In a specific non-limiting example of implementation, the encoder 306 includes a video encoding processor (codec) selected from H.263, MPEG-4, VC-1 and H.264 codecs amongst other types. The encoding process implemented by the encoder 306 can be controlled such as to modify the level of compression applied to a frame based on a quantization parameter (QP) value selected amongst a set of possible QP values and received from the "I-frame" bit-rate controller 304 over data line 308. Because most codecs used for encoding/compressing frames are lossy, some image information is lost even when the QP is set to maximum quality (a value of QP=1). Generally, the lower the QP used, the higher the quality and the higher the bandwidth requirement.
For example, in the H.263 codec, a Discrete Cosine Transform (DCT) is used to transform a frame from the spatial domain to the frequency domain. This operation is typically done at the macroblock level. The DCT represents a macroblock as a matrix of coefficients, which can be quantized with different step sizes depending on the desired level of compression. These step sizes are controlled by the value of the quantization parameter (QP); in general, the smaller the QP value, the more bits are allocated to encoding each macroblock and the less image information is lost during encoding. For different types of encoding, the quantization parameter value can be set to any value in a range (or set) of possible values. For example, the quantization parameter (QP) for the H.264 codec can be set as any integer value between 1 (the highest quality value) and 51 (the lowest quality value). It will however be appreciated that certain embodiments may make use of a floating-point quantization parameter (QP).
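The effect of the quantization step size on a block of transform coefficients can be illustrated with the short sketch below. It is an assumption-laden illustration rather than the quantizer of any particular codec: the step size is passed in directly because the mapping from QP to step size is codec-specific, the coefficient values are made up, and the helper names `quantize` and `dequantize` are hypothetical.

```python
def quantize(coefficients, step):
    """Uniform quantization: divide by the step size, truncate toward zero."""
    return [int(c / step) for c in coefficients]


def dequantize(levels, step):
    """Approximate reconstruction of the coefficients from quantized levels."""
    return [level * step for level in levels]


# Made-up coefficient values, for illustration only.
coefficients = [52.0, -18.0, 7.0, 3.0, -1.0]

fine = quantize(coefficients, step=2.0)    # small step: more non-zero levels
coarse = quantize(coefficients, step=8.0)  # large step: more levels become zero

nonzero_fine = sum(1 for level in fine if level != 0)
nonzero_coarse = sum(1 for level in coarse if level != 0)
```

With the larger step, more coefficients quantize to zero, so fewer bits are needed to represent the block, at the cost of a larger reconstruction error after `dequantize`.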
The "I-frame" bit-rate controller 304 selects a quantization parameter value for use by the encoder 306, the quantization parameter value corresponding to a certain level of compression. As shown in Figure 3, the "I-frame" bit-rate controller 304 receives a target encoded frame size from data line 114. The target encoded frame size conveys a desired encoded frame size for the certain video frame to be encoded. Also as shown in Figure 3, the "I-frame" bit-rate controller 304 receives information conveying a resolution (size) of the raw frame to be encoded over data line 300. In embodiments where the resolution of the raw frames to be encoded is the same for all frames, the information conveying a resolution (size) of the raw frame to be encoded received over data line 300 may be omitted. For example, if the system 100 (shown in Figure 1) is pre-configured to encode only NTSC resolution frames, which have a constant size of 720x486 pixels, the resolution (size) of the raw frames to be encoded is not needed, since no other types of frames will be encoded.
The bit-rate controller 304 selects a quantization parameter value from a set of possible quantization parameter values, at least in part based on the target encoded frame size information received over data line 114. The selected quantization parameter value, herein referred to as the I-frame QP, is released to the encoder 306 over data line 308. The selected I-frame QP is also released as an output of the I-frame processing module 112 over data line 118.
The "I-frame" bit-rate controller 304 is shown in greater detail in Figure 4.
As shown in Figure 4, the "I-frame" bit-rate controller 304 includes a first input 450, a second input 452, a memory module 402, a processing unit 475 and an output 454. Optionally, the "I-frame" bit-rate controller 304 also includes a third input 456.
The first input 450 is for receiving from data line 300 information conveying a resolution (size) of the raw frame to be encoded. In implementations where the system 100 (shown in Figure 1) is designed to be used for a single frame resolution, this input 450 may be omitted.
The second input 452 is for receiving information from data line 114 conveying a target encoded frame size.
The output 454 is for releasing information derived by the processing unit 475 and conveying the level of compression to be applied to the raw frame to be encoded. In the example depicted, the information conveying the level of compression is in the form of an I-frame quantization parameter (I-frame QP) and is released over data line 308.
The optional third input 456 is for receiving the raw frame to be encoded from data line 302.
The memory module 402 stores statistical information related to the relationship between levels of compression and resulting encoded frame sizes. In a specific example of implementation, the memory module 402 stores one or more data structures, each conveying an estimate of the relationship between different QP values and encoded frame sizes. In a specific example, each data structure provides a mapping between encoded frame sizes and corresponding QP values, wherein this mapping is derived based on statistical data obtained by encoding a reference set of video frames. In the example of implementation depicted in Figure 4, each data structure in the set of data structures is associated to a respective raw (uncompressed) frame resolution and conveys an estimate of the relationship between different QP values and encoded frame sizes for the respective raw (uncompressed) frame resolution to which it is associated. In an alternative example of implementation (not shown in the Figures), each data structure in the set of data structures is associated to a respective video sequence type and conveys an estimate of the relationship between different QP values and encoded frame sizes for the respective video sequence type to which it is associated. Video sequence types may include, for example, a sports broadcast video sequence and a talking head video sequence of the type used in a newscast, amongst others. In yet another alternative example of implementation (not shown in the Figures), each data structure in the set of data structures is associated to a combination of a certain frame resolution and a certain video sequence type.
Referring back to Figure 4, a representation is shown, designated by reference numeral 404, of one of the data structures in memory 402. The data structure 404 is in the form of a table including a plurality of entries, each entry providing a mapping between an encoded frame size and a corresponding quantization parameter value. In a first non-limiting example of implementation, each entry in the data structure maps an expected maximum encoded frame size to a corresponding quantization parameter value. The expected maximum encoded frame size may be based on a certain confidence interval (such as, for example, a 90%, 95% or 99% confidence interval). In a second non-limiting example of implementation, each entry in the data structure maps an expected average encoded frame size to a corresponding quantization parameter value. It will be appreciated that the above types of mapping have been presented for the purpose of illustration only and that other types of mappings based on a statistical relationship between encoded frame sizes and corresponding quantization parameter values can be used in alternative examples of implementations of the invention. An exemplary process for determining the information stored within each data structure in memory module 402 will be described later in the present specification.
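The two types of mapping described above can be illustrated by the following sketch, which builds a table of (encoded frame size, QP) entries from observed frame sizes. It is only a sketch under stated assumptions: the names `summarize_sizes` and `build_table` are hypothetical, the sample sizes are made up, and the "expected maximum" is approximated here by a nearest-rank empirical percentile, which is one possible reading of a confidence level and not necessarily the statistic used to populate data structure 404.

```python
import math
from statistics import mean


def summarize_sizes(sizes, mode="average", confidence=0.95):
    """Reduce the observed encoded frame sizes for one QP to a single value.

    mode="average": expected average encoded frame size.
    mode="maximum": size not exceeded by roughly `confidence` of the samples
                    (nearest-rank percentile; an assumption of this sketch).
    """
    if mode == "average":
        return mean(sizes)
    if mode == "maximum":
        ordered = sorted(sizes)
        rank = max(1, math.ceil(confidence * len(ordered)))
        return ordered[rank - 1]
    raise ValueError(f"unknown mode: {mode!r}")


def build_table(samples_by_qp, mode="average", confidence=0.95):
    """Return a list of (encoded_frame_size, qp) entries, ordered by QP."""
    return [
        (summarize_sizes(sizes, mode, confidence), qp)
        for qp, sizes in sorted(samples_by_qp.items())
    ]


# Made-up observations: encoded sizes (in bytes) of reference frames per QP.
samples = {1: [100, 120, 140], 2: [60, 70, 80]}
average_table = build_table(samples, mode="average")
maximum_table = build_table(samples, mode="maximum", confidence=0.95)
```

A table built with the "maximum" mode is the more conservative choice, since a QP selected from it is less likely to yield a frame larger than the target.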
The processing unit 475, as shown in Figure 4, is in communication with the first and second inputs 450 and 452, with the optional third input 456, with the memory module 402 and with output 454. In a specific example of implementation, the processing unit 475 determines the level of compression to be applied at least in part based on the information conveying a resolution (size) of the raw frame to be encoded received at the first input 450, the target encoded frame size received at the second input 452 and the statistical information in the memory module 402.
In the specific example of implementation depicted in Figure 4, the processing unit 475 includes a data structure selector module 400, an initial QP (quantization parameter) selector module 406 in communication with the second input 452, and an output module 412 in communication with output 454. Optionally, the processing unit 475 may also include a QP adjustment module 410 in communication with optional input 456.
The data structure selector module 400 is in communication with the first input 450 and with the memory module 402 and is for selecting a data structure from the set of data structures in memory unit 402. In a first example of implementation, of the type shown in Figure 4, each data structure in the set of data structures in memory 402 is associated to a respective raw (uncompressed) frame resolution. In this example, the data structure selector module 400 receives from input 450 the resolution (size) of the raw frame to be encoded and accesses memory 402 to select a data structure associated to a frame resolution corresponding to or approximating the frame resolution associated to the frame to be encoded. In a second example of implementation (not shown in the Figures), each data structure in the set of data structures in memory 402 is associated to a respective video sequence type. The raw frame to be encoded is part of a sequence of frames and the data structure selector module 400 selects a data structure from the set of data structures based in part on the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part. The video sequence type associated to the sequence of frames may be provided as an input to the data structure selector module 400 and used for selecting the data structure from the set of data structures. Alternatively, the video sequence type associated to the sequence of frames of which the frame to be encoded is part may be derived by a processing module programmed for processing at least some frames in the sequence of frames to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
For the purpose of illustration, assume that the resolution (size) of the raw frame to be encoded as an I-frame is of QCIF resolution (176x144 pixels). Further assume that the frame is to be encoded using the H.264 standard for video compression to 320x240 pixels for playback on a mobile end-point. Further assume that the memory module 402 stores a respective data structure for each or a subset of target video frame resolutions of which the following are common examples:
- VGA (640x480 pixels)
- DV-NTSC (720x480 pixels)
- DV-PAL (768x576 pixels)
- HD 720 (1280x720 pixels)
- HD 1080 (1920x1080 pixels)
- SQCIF (128x96 pixels)
- QCIF (176x144 pixels)
- CIF (352x288 pixels)
- QVGA (320x240 pixels)
In this case, the data structure selector module 400 would determine that the resolution of the raw frame that is to be encoded is 176x144 pixels and would select from memory unit 402 the data structure corresponding to QCIF. In a non-limiting example of implementation, if the resolution of the raw frame to be encoded does not correspond exactly to a resolution associated to a data structure in memory 402, the data structure selector module 400 selects a data structure associated to a frame resolution approximating the frame resolution associated to the frame to be encoded. It will be appreciated by the person skilled in the art that the above list of video frame resolutions constitutes a non-exhaustive list and has been presented here for the purpose of illustration only.
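The selection performed by the data structure selector module 400 can be sketched as follows. This is a minimal sketch under assumptions: the function name `select_table` is hypothetical, the data structures are represented by placeholder strings, and "approximating" is interpreted here as the stored resolution with the closest total pixel count, which is only one of several reasonable interpretations.

```python
def select_table(tables, width, height):
    """Pick the data structure for a raw frame of the given resolution.

    `tables` maps (width, height) tuples to data structures. An exact match
    is preferred; otherwise the resolution with the closest pixel count is
    used (an assumption of this sketch).
    """
    if (width, height) in tables:
        return tables[(width, height)]
    target_pixels = width * height
    nearest = min(tables, key=lambda wh: abs(wh[0] * wh[1] - target_pixels))
    return tables[nearest]


# Placeholder data structures keyed by resolution, for illustration only.
tables = {
    (176, 144): "QCIF table",
    (352, 288): "CIF table",
    (640, 480): "VGA table",
}
exact = select_table(tables, 176, 144)
approximate = select_table(tables, 320, 240)  # no QVGA table is stored here
```

In this example a 320x240 frame falls back to the CIF table, because 352x288 is the stored resolution with the nearest pixel count.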
It will also be appreciated that in alternative implementations in which the memory unit 402 includes a single data structure associated to a single resolution, the data structure selector module 400 as well as input 450 may be omitted.
The initial QP selector module 406 is in communication with the second input 452 for receiving the target encoded frame size. The initial QP selector module 406 is also in communication with the data structure selector module 400 and receives therefrom information conveying the data structure selected from memory unit 402. Recall that the data structure selected by the data structure selector 400 includes a set of entries mapping encoded frame sizes to QP values. The initial QP selector 406 locates in the data structure selected by the data structure selector module 400 an entry corresponding to an encoded frame size approximating the target encoded frame size received at input 452. In a non-limiting example of implementation, the entry selected corresponds to an encoded frame size that is closest to the target encoded frame size. Alternative approaches may opt to select an entry having an encoded frame size that is closest to but less than (or greater than) the target encoded frame size. Once this entry is located, the initial QP selector 406 obtains the QP value corresponding to the located entry and sets it as the initial I-frame QP value. The initial QP value is then released over data lines 308.
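The lookup performed by the initial QP selector 406 can be sketched as follows, covering the "closest" policy and the two alternative policies mentioned above. The function name `select_initial_qp`, the policy labels and the table values are hypothetical; the fallback used when no entry satisfies the chosen policy is also an assumption of this sketch.

```python
def select_initial_qp(table, target_size, policy="closest"):
    """Return the QP of the entry whose size best approximates the target.

    `table` is a list of (encoded_frame_size, qp) entries.
    policy="closest":  entry nearest to the target size.
    policy="at_most":  nearest entry whose size does not exceed the target.
    policy="at_least": nearest entry whose size is not below the target.
    """
    if policy == "closest":
        candidates = list(table)
    elif policy == "at_most":
        candidates = [entry for entry in table if entry[0] <= target_size]
    elif policy == "at_least":
        candidates = [entry for entry in table if entry[0] >= target_size]
    else:
        raise ValueError(f"unknown policy: {policy!r}")
    if not candidates:
        # Assumption: fall back to the nearest entry when none qualifies.
        candidates = list(table)
    _size, qp = min(candidates, key=lambda entry: abs(entry[0] - target_size))
    return qp


# Made-up table: larger QP values map to smaller expected frame sizes.
table = [(9000, 1), (6000, 5), (4000, 10), (2500, 20)]
qp_closest = select_initial_qp(table, 5500)
qp_at_most = select_initial_qp(table, 5500, policy="at_most")
```

The "at_most" policy trades some quality for a lower likelihood of exceeding the target encoded frame size, since it never selects an entry whose expected size is above the target.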
The output module 412 is in communication with the initial QP selector module 406 and with output 454. The output module 412 releases to output 454 the initial QP value received over data line 408 from the initial QP selector module 406. In implementations where the "I-frame" bit-rate controller 304 includes the optional QP adjustment module 410, the output module 412 releases to output 454 a current I-frame QP value which is either the initial QP value received over data line 408 from the initial QP selector module 406 or an adjusted QP value received from the optional QP adjustment module 410.
The QP adjustment module 410 is for adjusting the last computed I-frame QP value based on how close the size of the encoded I-frame generated when using this last computed value of QP comes to the target encoded frame size. The QP adjustment module 410 is in communication with input 452 for receiving the target encoded frame size, with input 456 for receiving the raw frame to be encoded and with the output module 412 for receiving the last computed value of QP. The QP adjustment module 410 includes an encoder 460 to encode the raw frame received at input 456 using the last computed value of QP to generate an encoded I-frame. The encoder 460 in the QP adjustment module 410 implements the same encoding process as encoder 306 (shown in Figure 3). Although encoder 460 and encoder 306 have been shown as being separate components in the Figures, it will be readily appreciated that a same encoder may perform the functionality of encoder 460 and encoder 306 in some implementations. While its operation will be described in more detail later, it may be appreciated at this point that the QP adjustment module 410 provides functionality for determining whether a given QP value is suitable for producing I-frames with sizes that fall within a range deemed acceptable in relation to the target encoded frame size. If the frame size of the I-frame falls outside of this range, the QP adjustment module 410 modifies the QP value to find a QP value that produces an encoded frame with a frame size that is closer to the target encoded frame size.
The QP adjustment module 410 is optional in the sense that it is not required for the operation of the "I-frame" bit-rate controller 304 and therefore may be omitted from certain specific implementations thereof. In addition, the use of the QP adjustment module 410 is optional as well and may depend, for example, on factors such as the current overall computational load on the "I-frame" bit-rate controller 304 as well as the availability of spare processing capacity within the system 100 (shown in Figure 1).
The "I-frame" bit-rate controller 304 (shown in Figures 3 and 4) implements a process for selecting a quantization parameter value for use by the video encoding processor of encoder 306 (shown in Figure 3) for encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor in encoder 306. The quantization parameter value is selected so as to allow the video encoding processor to derive an encoded video frame having a frame size tending toward the desired frame size. An exemplary process implemented by the "I-frame" bit-rate controller 304 will now be described with reference to Figure 5.
With respect to this figure, step 500 represents one optional starting point for the process, where the "I-frame" bit-rate controller 304 receives the information pertaining to the raw frame to be encoded.
In a first specific example of implementation, the information received at step 500 conveys the resolution (size) of the raw frame to be encoded. In a second specific example of implementation, the raw frame to be encoded is part of a sequence of frames associated to a video sequence type and the information received at step 500 conveys the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part. Optionally, step 500 includes processing at least some frames in the sequence of frames of which the frame to be encoded is part to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
It will be readily appreciated that the resolution (size) of the raw frame to be encoded and the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part may be used in combination.
The process then proceeds to step 502.
At step 502, a data structure is selected from an available set of data structures based in part on the information received in step 500. The selected data structure includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value. In a non-limiting implementation, this step is implemented by the data structure selector module 400 (shown in Figure 4) which accesses the set of data structures in memory module 402 (also shown in Figure 4).
The process then proceeds to step 504. It will be appreciated that in implementations where the resolution of the raw frames is set to a preset resolution (or otherwise never varies) and in situations where the video sequence type is not used as a discriminator, steps 500 and 502 would be omitted.
At step 504, the target encoded frame size, which conveys a desired encoded frame size, is received. Step 504 is an alternate starting point for the process in cases where optional steps 500 and 502 are omitted. The process then proceeds to step 505.
At step 505, an initial quantization parameter value is selected. The selected initial quantization parameter value corresponds to an entry in the data structure (selected at step 502), the entry being associated to an encoded frame size approximating the target encoded frame size (received at step 504). In a non-limiting example of implementation, the entry selected is associated to an encoded frame size that is closest to the target encoded frame size without exceeding the latter. The process then proceeds to step 507.
At step 507, the initial QP (quantization parameter) value selected at step 505 is set as the current QP value. As can be seen from Figure 5, steps 508, 510, 512, 514 and 516 are optional and pertain to the adjustment of the initial quantization parameter value selected at step 505 based on one or more actual observed encoded frame sizes. In implementations providing functionality for adjusting the initial quantization parameter value selected at step 505, the process moves forward to step 508. Otherwise, the process moves forward to step 518.
At step 508, the raw frame is encoded using the current QP value. In a non-limiting example of implementation, step 508 is implemented by the encoder 460 of the QP adjustment module 410 (shown in Figure 4). The result of step 508 is a resulting encoded video frame having a certain size. The process then proceeds to step 512.
At step 512, the size of the resulting encoded video frame generated using the current QP value at step 508 is compared against the target encoded frame size received at step 504 to determine whether to adapt or release the current QP value. It will be appreciated that, in practical implementations, the actual frame size of the resulting encoded video frame generated at step 508 is unlikely to be an exact match to the target encoded frame size. Therefore, in a specific example of implementation, step 512 determines if the actual frame size of the resulting encoded video frame generated at step 508 falls within an acceptable range of sizes near the target encoded frame size. In a non-limiting example, the range of what is deemed an acceptable frame size for resulting encoded video frames may be expressed in numerical values as a range of frame sizes, as a percentage in relation to the target encoded frame size (such as ±10%) or in any other suitable fashion. The outcome of decision step 512 is identified by the following two result branches, each of which is described below:
- If the frame size of the resulting encoded video frame is considered acceptable in relation to the target encoded frame size, step 512 is answered in the negative and we proceed to step 518.
- If the frame size of the resulting encoded video frame is not considered acceptable in relation to the target encoded frame size, step 512 is answered in the positive and we proceed to step 514.
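The decision made at step 512 can be sketched as a simple tolerance test. The function name `needs_adjustment` is hypothetical and the default tolerance of 10% merely mirrors the ±10% example given above; the sketch returns True when step 512 would be answered in the positive.

```python
def needs_adjustment(actual_size, target_size, tolerance=0.10):
    """Sketch of decision step 512.

    Returns True when the encoded frame size falls outside the acceptable
    range (target size plus or minus `tolerance`, expressed as a fraction),
    meaning the current QP value should be adapted (step 514). Returns False
    when the current QP value can be released (step 518).
    """
    return abs(actual_size - target_size) > tolerance * target_size


within_range = needs_adjustment(5400, 5000)   # 8% above the target
out_of_range = needs_adjustment(5600, 5000)   # 12% above the target
```

A symmetric range is used here for simplicity; an implementation that must never exceed the target could instead accept only sizes at or below the target.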
At step 514, a new QP value is selected based in part on a difference between the size of the resulting encoded video frame generated using the current QP value at step 508 and the target encoded frame size. More specifically, at step 514, a new QP value is selected so that when the new QP value is used to encode the raw frame, the newly obtained resulting encoded frame is likely to have a size that is closer to the target encoded frame size than the resulting encoded raw frame obtained by using the current QP value.
A number of different approaches may be used to select the new QP value. In a specific example, the approach makes use of information in the data structure (selected at step 502) and of the size of the resulting encoded video frame generated at step 508 in order to select a new QP value. More specifically, as described above, the data structure (selected at step 502) provides a mapping between encoded frame sizes and corresponding QP values. This mapping is derived based on statistical data and conveys a relationship between QP values and the encoded frame sizes as derived based on a reference set of video frames. The amount by which the size of the resulting encoded video frame generated at step 508 deviates from the relationship between QP values and the encoded frame sizes conveyed by the mapping in the data structure provides an indication of the extent to which the current QP value must be adapted/modified.
The result of step 514 is a newly selected QP value.
Figure 14 of the drawings illustrates graphically a manner in which the amount by which the size of the resulting encoded video frame obtained at step 508 deviates from the relationship between QP values and the encoded frame sizes (as conveyed by the data structure selected at step 502) can be used to select a new QP value. In Figure 14, curve 1402 depicts an exemplary relationship between QP values and encoded frame sizes as stored in a data structure in memory 402 (shown in Figure 4). As can be observed from curve 1402, the relationship between QP values and encoded frame sizes in the specific example shown generally follows a negative exponential curve. Dotted line 1404 conveys a target encoded frame size. The intersection between curve 1402 and dotted line 1404 allows obtaining an initial QP value (1408), which for the purpose of illustration only is QP = 7. Location 1410 on the graph represents the size of a resulting encoded video frame obtained by using the initial QP value (1408) to encode a raw frame, wherein the y-axis value conveys the size of the resulting encoded frame. The difference between locations 1410 and 1406 represents the amount by which the size of the resulting encoded video frame deviates from the relationship between QP values and the encoded frame sizes provided by curve 1402. Curve 1412, which goes through location 1410, is derived based on the difference between locations 1410 and 1406 and is a revised estimate of the relationship between QP values and encoded frame sizes. The intersection between curve 1412 and dotted line 1404 allows obtaining a new QP value (1414), which for the purpose of illustration is QP = 2. It is to be noted that the relationship between QP values and encoded frame sizes shown in Figure 14, which has been provided for the purpose of illustration only, is associated with a specific coding protocol and that each coding protocol will be associated with a respective relationship between QP values and encoded frame sizes.
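One possible way of deriving a revised curve that goes through the observed location and reading a new QP value from it is sketched below. This is an illustration under assumptions and not necessarily the derivation used for curve 1412: the revised curve is obtained here by scaling every stored size by the ratio of the observed size to the stored size at the current QP. The function name `revise_qp` and the table values are hypothetical, chosen so that the example reproduces the illustrative QP values of 7 and 2.

```python
def revise_qp(table, current_qp, actual_size, target_size):
    """Select a new QP from a revised size-versus-QP estimate (step 514).

    `table` is a list of (expected_size, qp) entries. The stored curve is
    scaled so that it passes through the observed point
    (current_qp, actual_size); the QP whose scaled size is closest to the
    target is returned. Multiplicative scaling is an assumption of this
    sketch.
    """
    expected_by_qp = {qp: size for size, qp in table}
    scale = actual_size / expected_by_qp[current_qp]
    _size, new_qp = min(
        table, key=lambda entry: abs(entry[0] * scale - target_size)
    )
    return new_qp


# Made-up stored relationship: expected size decreases as QP increases.
table = [
    (8000, 1), (6500, 2), (5300, 3), (4300, 4),
    (3500, 5), (2900, 6), (2400, 7), (2000, 8),
]
# Target of 2400 gives an initial QP of 7; the frame actually encodes to 900,
# well below what the stored curve predicts at that QP.
new_qp = revise_qp(table, current_qp=7, actual_size=900, target_size=2400)
```

Because the observed frame is much smaller than predicted, the revised estimate lies below the stored curve and the new QP value is lower, that is, less compression is applied.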
Returning now to Figure 5, we now proceed to step 516.
At step 516, the current QP value is set to the new QP value selected at step 514. We then proceed to optional step 517 or to step 508 in cases where step 517 is omitted.
At optional step 517, a test is made to determine whether a maximum allowable number of iterations of steps 508, 510, 512, 514 and 516 has been reached. In other words, the number of iterations or "adaptations" for determining a current QP value may be limited so the current QP may only be adjusted a certain number of times before being released at step 518 to the encoding processor. The number of permitted iterations will vary from one implementation to the next and may depend on a number of different factors which may include, without being limited to, a maximum allowable latency in the video encoding system. In a non-limiting example of implementation, the maximum number of iterations of steps 508, 510, 512, 514 and 516 is set to two (2); however, any suitable number of iterations may be used. If at step 517 it is determined that the maximum number of iterations has been reached, step 517 is answered in the affirmative and we proceed to step 518. If at step 517 it is determined that the maximum number of iterations has not yet been reached, step 517 is answered in the negative and we return to step 508. In another non-limiting example of implementation, the span of QP values is restricted between a minimum value and a maximum value. In such cases, at step 517, once the maximum value of QP available has been reached, there is no reason to do another iteration if the goal is to compress more and therefore step 517 is answered in the affirmative and we proceed to step 518. Similarly, once the minimum QP available has been reached, there is no reason to do another iteration if the goal is to compress less and therefore step 517 is also answered in the affirmative and we proceed to step 518. Otherwise, step 517 is answered in the negative and we return to step 508.
It is to be appreciated that step 517 is optional and that in implementations where a single iteration of steps 508, 510, 512, 514 and 516 is permitted, step 517 is omitted and the process goes directly from step 516 to step 518. It has been observed that one iteration of steps 508, 512, 514 and 516 to derive the I-frame QP value does not materially affect the real-time aspect of the encoding process. The person skilled in the art will appreciate that when encoding P or B frames, motion estimation typically consumes 50% of the encoding time in an encoder. I-frame coding does not require motion estimation, therefore making it possible to encode the same I-frame twice without exceeding the normally budgeted processing time per frame.
It is also to be appreciated that certain implementations may repetitively execute steps 508, 510, 512, 514 and 516 until condition 512 is answered in the negative, indicating that the frame size of the resulting encoded video frame encoded using the currently selected QP value is considered acceptable in relation to the target encoded frame size. In such implementations, step 517 may also be omitted.
It is to be appreciated that although steps 508, 512, 514, 516 and 517 provide useful functionality in adapting the selected QP value in order to attempt to obtain a better result, these steps are optional and may be omitted from certain implementations.
At step 518, the current QP value selected is released to the video encoding processor of encoder 306 (shown in Figure 3) for use in encoding a certain video frame.
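The bounded adaptation loop described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the function name `adapt_qp`, the callable `encode_size`, the 10% acceptance tolerance, the unit QP step and the 0 to 51 QP range (the range used by H.264) are all hypothetical choices that do not come from the present description. One pass of the loop stands in for one iteration of the adaptation steps, and the loop exits either when the encoded size is acceptable, when the iteration limit of step 517 is reached, or when the QP is already at the limit of its span.

```python
from typing import Callable


def adapt_qp(encode_size: Callable[[int], int], initial_qp: int, target_size: int,
             tolerance: float = 0.10, max_iterations: int = 2,
             qp_min: int = 0, qp_max: int = 51) -> int:
    """Return the QP value to release after a bounded number of adjustments.

    encode_size is a hypothetical callable that trial-encodes the frame with
    the given QP and returns the encoded frame size.
    """
    qp = initial_qp
    for _ in range(max_iterations):          # iteration limit (step 517)
        size = encode_size(qp)               # trial encode with the current QP
        if abs(size - target_size) <= tolerance * target_size:
            break                            # size acceptable, stop adapting
        step = 1 if size > target_size else -1   # compress more or compress less
        if (step > 0 and qp >= qp_max) or (step < 0 and qp <= qp_min):
            break                            # QP already at the end of its span
        qp += step
    return qp                                # value released at step 518
```

With `max_iterations` set to two, the frame is trial-encoded at most twice, which matches the non-limiting example given above.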
Generation of Data Structure for I-frame QP
An exemplary process for generating information of the type stored in memory module 402 (shown in Figure 4) for use in selecting a quantization parameter value will now be described with reference to Figure 6. For the purpose of this example, the information generated is in the form of a data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value.
For the purpose of simplicity, the process described with reference to Figure 6 is for generating a single data structure associated to a single frame resolution (size). The reader skilled in the art will readily appreciate, in light of the present description, that
the process depicted in Figure 6 can be performed repeatedly for different video sequence types, different frame resolutions (sizes) and/or different combinations of video sequence types/ frame resolutions (sizes) in order to generate different respective data structures. The manner in which such multiple data structures can be generated will become readily apparent to the person skilled in the art in light of the present specification and as such will not be described in further detail here.
At step 600, a plurality of video frames is provided as a basis for generating entries in a data structure. The video frames provided at this step preferably include frames representative of the types of frames expected to be encoded. Advantageously, using frames representative of the types of frames expected to be encoded allows generating estimates of the relationship between different QP values and encoded frame sizes that more closely resemble the relationship of the actual frames to be encoded.
Steps 602, 604, 606, 608 and 610 are performed for each QP value in a set of possible QP values.
More specifically, at step 602, a quantization parameter (QP) value that has not yet been processed is selected. We then proceed to step 604.
At step 604, each video frame in the plurality of video frames provided at step 600 is encoded as an I-frame by a video encoding processor using the QP value selected in step 602 to generate an encoded frame group associated with the QP value selected in step 602. The resulting encoded frame group includes a plurality of encoded video frames derived using the quantization parameter value selected in step 602. We then proceed to step 606.
At step 606, an encoded frame size corresponding to the quantization parameter value selected at step 602 is derived at least in part based on frame sizes of encoded video frames in the encoded frame group generated at step 604. Different approaches for deriving the encoded frame size corresponding to the quantization parameter value selected at step 602 may be contemplated. In a specific example of implementation,
the encoded frame size corresponding to the quantization parameter value selected at step 602 is derived by observing statistical trends in the resulting encoded video frames.
In a first non-limiting example of implementation, the mean encoded frame size of the encoded frames in the encoded frame group obtained at step 604 is derived. In such an implementation, the encoded frame size corresponding to the quantization parameter value selected at step 602 is set to correspond to the mean frame size.
In another non-limiting example of implementation, a statistical distribution of the frame sizes of the frames in the encoded frame group generated at step 604 is obtained. Figure 7 is a graphical representation of the frame size distribution for the resulting encoded frames obtained using a QP value of 10 to encode a sample set of frames. The frame size distribution depicted in Figure 7 is not a normal distribution but is shown for the purpose of illustration only. Based on this distribution 704, for a given QP value, a range of frame sizes having an upper limit can be derived so that a certain proportion of encoded frames have a frame size falling within the derived range of frame sizes. In the field of statistics, this is commonly referred to as a confidence interval. For example, an X% confidence interval indicates that X% of the frames processed have an encoded frame size falling within the interval. In such implementations, the encoded frame size corresponding to the quantization parameter value selected at step 602 is set to correspond substantially to the upper limit of the range of frame sizes.
Using a 99% confidence interval, an upper limit 702 for the range of frame sizes can be obtained for the given I-frame QP, so that 99% of the frames processed have an encoded frame size falling below the upper limit of the range. Although the example depicted in Figure 7 shows the use of a 99% confidence interval, it will be appreciated that other suitable confidence intervals providing a range of frame sizes could also be used.
In an alternative example (not shown in the Figure), a 50% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 50%. In yet another alternative example (not shown in the Figure), a 70% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 70%. In yet another alternative example (not shown in the Figure), a 90% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 90%. In yet another alternative example (not shown in the Figure), a 95% confidence interval may be used so that the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 95%.
The methods described above have been presented for the purpose of illustration as many other suitable statistical methods may also be used in order to derive an encoded frame size corresponding to the quantization parameter value selected at step 602 such as to convey information on the statistical relationship between QP values and encoded frame sizes.
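The two statistical approaches described above for step 606 can be illustrated with a short sketch. The function names `mean_frame_size` and `upper_limit` are hypothetical, and the nearest-rank rule used to pick the upper limit is only one of several reasonable ways to obtain a bound below which a given percentage of the observed frame sizes fall; the present description does not prescribe a particular estimator.

```python
from statistics import mean
from typing import Sequence


def mean_frame_size(sizes: Sequence[int]) -> float:
    """First approach: represent the encoded frame group by its mean size."""
    return mean(sizes)


def upper_limit(sizes: Sequence[int], percent: int = 99) -> int:
    """Second approach: smallest observed size such that at least `percent`
    percent of the encoded frames have a size less than or equal to it
    (nearest-rank rule, computed with integer arithmetic)."""
    ordered = sorted(sizes)
    rank = max(1, (percent * len(ordered) + 99) // 100)  # ceiling of percent*n/100
    return ordered[rank - 1]
```

Choosing a higher percentage yields a more conservative (larger) encoded frame size for a given QP value, which reduces the chance that an actual encoded frame exceeds the size stored in the data structure.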
Once the encoded frame size corresponding to the quantization parameter value selected at step 602 has been derived, we proceed to step 608.
At step 608, the encoded frame size derived in step 606 is stored in association with the QP value selected at step 602 on a computer readable storage medium. We then proceed to step 610.
At step 610, if not all QP values in the set of QP values have been processed, step 610 is answered in the affirmative and we return to step 602 and the process including steps 604, 606 and 608 is repeated for the next unprocessed QP value. If, on the other hand, all QP values in the set of QP values have been processed, step 610 is answered in the negative and the process ends at step 612.
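The loop formed by steps 602 to 610 can be sketched as follows. This is a minimal sketch, assuming a hypothetical callable `encode_i_frame(frame, qp)` that returns the encoded frame as bytes; the names `build_qp_table` and `summarize` are likewise illustrative and do not appear in the present description. The `summarize` argument stands in for whichever statistical method is chosen at step 606, the mean being used by default.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, Sequence


def build_qp_table(frames: Sequence[bytes],
                   qp_values: Iterable[int],
                   encode_i_frame: Callable[[bytes, int], bytes],
                   summarize: Callable[[Sequence[int]], float] = mean
                   ) -> Dict[int, float]:
    """Map each QP value to a representative encoded I-frame size."""
    table: Dict[int, float] = {}
    for qp in qp_values:                                    # steps 602 and 610
        group = [encode_i_frame(f, qp) for f in frames]     # step 604
        sizes = [len(encoded) for encoded in group]
        table[qp] = summarize(sizes)                        # steps 606 and 608
    return table
```

Because the table is keyed by QP value, a controller that needs the QP for a target frame size can scan the entries for the size closest to the target.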
Based on the above, a data structure including a plurality of entries is generated, wherein each entry maps a QP value to a corresponding encoded frame size. An example of a data structure in the form of a look-up table that could be generated by the process depicted in Figure 6 is shown below for a CIF type frame resolution.
It will be appreciated that the values presented in the above table were provided solely for the purpose of illustration.
As a variant, a set of data structures may be generated, wherein each data structure is associated to a respective frame resolution (size). In a non-limiting example of implementation, the plurality of video frames provided at step 600 includes sub-sets of frames associated with respective frame resolutions. In such an implementation the process depicted in Figure 6 is modified such that steps 602, 604, 606, 608 and 610 are repeated for each subset of frames in the plurality of frames so that a respective data structure is generated for each frame resolution, wherein each respective data structure maps an encoded frame size to a corresponding QP value.
As yet another variant, a set of data structures may be generated, each data structure being associated to a respective video sequence type. In a non-limiting example of implementation, the plurality of video frames provided at step 600 includes sub-sets of frames associated with respective video sequence types. In such an implementation the process depicted in Figure 6 is modified such that steps 602, 604, 606, 608 and 610 are repeated for each subset of frames in the plurality of frames so that a respective data structure is generated for each video sequence type, wherein each respective data structure maps an encoded frame size to a corresponding QP value. Advantageously, the data structures obtained in accordance with this variant provide mappings that are particularly suited for sequences of frames of the same types as those used for generating entries in the data structures. As will be appreciated, if the sequence of frames used to generate the entries in the data structure is of a sports broadcast type, then the entries in the data structure will be particularly well suited to predict the size of encoded frames for frames of the sports broadcast type. Similarly, if the sequence of frames used to generate the entries in the data structure is of a 'talking head' type, then the entries in the data structure will be particularly well suited to predict the size of encoded frames for frames of the talking head type and so on.
Those skilled in the art should appreciate that in some embodiments of the invention, all or part of the process previously described herein with reference to Figure 6 may be implemented as pre-programmed hardware or firmware elements (e.g., application specific integrated circuits (ASICs), DSPs, electrically erasable programmable read-only memories (EEPROMs), etc.), or other related components.
In other embodiments of the invention, all or part of the process previously described herein with reference to Figure 6 may be implemented as software consisting of a series of instructions for execution by a computing unit. The series of instructions could be stored on a medium which is fixed, tangible and readable directly by the computing unit (e.g., removable diskette, CD-ROM, ROM, PROM, EPROM or fixed disk), or the instructions could be stored remotely but transmittable to the computing unit via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium. The transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared, RF or other transmission schemes).
The process described with reference to Figure 6 may be implemented on a computing platform separate from system 100 (shown in Figure 1). As such, the process described with reference to Figure 6 may be performed off-line in order to generate the entries of the data structure(s) in memory module 402. The computing platform that is used for generating a data structure in accordance with the process described in Figure 6 may be placed in communication with the system 100 for the purpose of storing the generated data structure in the memory module 402 of "I-frame" bit-rate controller 304 (both shown in Figure 4). Alternatively, the computing platform may be integrated within system 100 (shown in Figure 1) as an integral component thereof.
Variant of bit-rate controller 304
As a variant, bit-rate controller 304 (shown in Figure 4) includes a self-update module implementing a process for generating updated information for use in selecting a quantization parameter value for a video encoding processor.
Figure 16 of the drawings depicts a bit-rate controller 304' which is analogous to bit-rate controller 304 (shown in Figure 4) but modified to include self-update module 1800.
In the example depicted in Figure 16, the self-update module 1800 is in communication with output module 412 for receiving therefrom the last computed QP value and is in communication with input 456 for receiving therefrom the raw frame to be encoded. The self-update module 1800 uses the last computed QP value released by output module 412 to encode the raw frames received at input 456 and stores the resulting frame size in a memory (not shown) in association with the QP value used. In this fashion, the self-update module 1800 gathers statistical information pertaining to the relationship between encoded frame sizes and at least some QP values.
In a specific example, once the self-update module 1800 has gathered a sufficient amount of information pertaining to encoded frame sizes for a certain QP value, the self-update module 1800 processes this information pertaining to encoded frame sizes
to derive a new encoded frame size corresponding to the certain QP value using methods similar to those described with reference to step 606 in Figure 6. The "sufficiency" of the information gathered pertaining to encoded frame sizes for a certain QP value will vary from one implementation to the other. In a non-limiting example, once a pre-determined number of frames have been encoded using a given QP value, the self-update module 1800 processes the frame sizes of the encoded frames to derive a new encoded frame size corresponding to the given QP value.
Once the new encoded frame size corresponding to the given QP value has been derived, the self-update module 1800 accesses memory 402 to locate in a data structure an entry corresponding to the certain QP value. Once the entry has been located, the self-update module 1800 replaces the encoded frame size in that entry with the new encoded frame size.
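The behaviour of the self-update module can be sketched as follows. This is an illustrative sketch only: the class name `SelfUpdater`, the `min_samples` threshold of 100 frames, the choice of the mean as the default statistic and the decision to discard samples after each update are assumptions made for the example, not details taken from the present description.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List, Sequence


class SelfUpdater:
    """Gathers encoded frame sizes per QP value and refreshes table entries."""

    def __init__(self, table: Dict[int, float], min_samples: int = 100,
                 summarize: Callable[[Sequence[int]], float] = mean) -> None:
        self.table = table                  # QP value -> encoded frame size
        self.min_samples = min_samples      # hypothetical "sufficiency" threshold
        self.summarize = summarize          # statistic of the kind used at step 606
        self.samples: Dict[int, List[int]] = defaultdict(list)

    def record(self, qp: int, encoded_size: int) -> None:
        """Store one observation; update the entry once enough were gathered."""
        sizes = self.samples[qp]
        sizes.append(encoded_size)
        if len(sizes) >= self.min_samples:
            self.table[qp] = self.summarize(sizes)  # replace the stored size
            sizes.clear()                           # start a fresh batch
```

Clearing the samples after each update lets the entry track recent content; an implementation that prefers a longer memory could keep a sliding window instead.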
In this fashion, the behaviour of bit-rate controller 304' can be adapted over time based on actual video frames to be encoded received at input 456.
It will be readily appreciated that the functionality of the self-update module 1800 may be adapted for updating statistical information in memory 402 in implementations where the memory 402 stores a plurality of data structures associated with respective frame resolutions (sizes) and/or respective video sequence types. Manners in which the self-update module 1800 could implement this functionality in such implementations will be readily apparent to the person skilled in the art in light of the present description and as such will not be described in further detail here.
P-Frame Processing Module 120
Figure 8 is a block diagram showing in greater detail components of the P-frame processing module 120 in accordance with a specific example of implementation.
As shown in Figure 8, the P-frame processing module 120 includes a bit-rate controller 804, herein referred to as the "P-frame" bit-rate controller 804, and an encoder module 806 in communication with the "P-frame" bit-rate controller 804.
The encoder 806 includes a video encoding processor and implements an inter-frame coding process corresponding to the intra-frame coding process implemented by encoder 306 (shown in Figure 3). Generally speaking, inter-frame coding exploits the temporal redundancies between the current frame and previous and/or following frames. In the specific example depicted, the encoder 806 generates a P-frame based on a raw frame received over data path 110 and a reconstructed version of an encoded I-frame received from the reconstruction module 150 over data line 152. The resulting encoded P-frame is based in part on the I-frame derived by the I-frame processing module 112 (shown in Figure 1), which was reconstructed by the reconstruction module 150. The encoding process implemented by the encoder 806 can be controlled such as to modify the level of compression applied to a frame based on a quantization parameter (QP) value received from the "P-frame" bit-rate controller 804 over data line 810.
The "P-frame" bit-rate controller 804 selects a quantization parameter value, herein referred to as the P-frame QP, for use by the encoder 806, the quantization parameter value corresponding to a certain level of compression from a set of possible quantization parameter values. The selection is effected at least in part based on target encoded frame size information conveying a desired frame size for the certain video frame to be encoded. The selected P-frame quantization parameter value is selected such as to attempt to cause encoder 806 to generate an encoded video frame having a frame size tending toward the desired frame size.
It has been noted by the inventors that it is generally possible, for various values of P-frame QP, to estimate the size of a resulting P-frame following an I-frame if the size of the I-frame and the I-frame QP value are known. Consequently, when selecting a P-frame QP value, the "P-frame" bit-rate controller 804 makes use of statistical information conveying a relationship between I-frame QP values and P-frame QP values as well as the associated I-frame and P-frame sizes of the resulting encoded frames.
The selected P-frame QP value is released to the encoder 806 over data line 810.
As shown in Figure 8, the "P-frame" bit-rate controller 804 receives the target encoded frame size over data line 116. The "P-frame" bit-rate controller 804 also receives information over data line 802 conveying a resolution (size) of the raw frame to be encoded as well as information related to the I-frame on which the P-frame will be partly based. The information related to the I-frame includes the I-frame QP value derived by the I-frame processing module 112 (shown in Figure 1) and present on data line 118 and the size of the encoded I-frame generated by the I-frame processing module 112 (shown in Figure 1) present on data line 808. In embodiments in which the resolution of the raw frames to be encoded is the same for all frames, the information conveying a resolution (size) of the raw frame to be encoded, conveyed by data line 802, may be omitted. In the embodiment shown in Figure 8, the "P-frame" bit-rate controller 804 also receives the encoded P-frame from data line 124.
The "P-frame" bit-rate controller 804 will now be described in greater detail with reference to Figure 9.
As shown in Figure 9, the "P-frame" bit-rate controller 804 includes a set of inputs 950, 952, 954, 956 and 990, a memory module 902, a processing unit 975 and an output 958.
The first input 950 is for receiving from data line 802 information conveying a resolution (size) of the raw frame to be encoded. In implementations where the system 100 (shown in Figure 1) is designed to be used for a single frame resolution, this input 950 may be omitted.
The second input 952 is for receiving information from data line 116 conveying a target encoded P-frame size. The target encoded P-frame size conveys a desired size of a P-frame resulting from the encoding of the raw frame present on data line 110 (shown in Figure 8) by the video encoding processor in encoder 806 (also shown in Figure 8).
The third input 954 is for receiving information from data line 808 conveying an encoded I-frame size of an I-frame on which the P-frame to be encoded is to be partly based.
The fourth input 956 is for receiving information from data line 118 conveying an I-frame quantization parameter (I-frame QP) value. The I-frame QP value corresponds to a level of compression applied to generate the I-frame on which the P-frame to be encoded is to be based.
The fifth input 990 is for receiving information related to a previously encoded P-frame. The information related to a previously encoded P-frame may include, for example, size information associated with a previously encoded P-frame and/or the previously encoded P-frame released over data line 124 (shown in Figure 8).
The output 958 is for releasing information derived by the processing unit 975 and conveying the level of compression to be applied to the raw frame to be encoded as a P-frame. In the example depicted, the information conveying the level of compression is in the form of a P-frame quantization parameter (P-frame QP) and is released over data line 810.
The memory module 902 stores statistical information related to the relationship between levels of compression of I-frames, levels of compression of P-frames, encoded P-frame sizes and encoded I-frame sizes. In a specific example of implementation, the memory module 902 stores one or more data structures. Each data structure includes information conveying an estimate of the relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes and is derived based on statistical data. In the example of implementation depicted in Figure 9, each data structure in the set of data structures is associated to a respective raw (uncompressed) frame resolution and conveys an estimate of the relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes for the respective raw (uncompressed) frame resolution to which it is associated. In an alternative example of implementation (not shown in the Figures), each data structure in the set of data structures is associated to a respective video sequence type and conveys an estimate of the relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes for the respective video sequence type to which it is associated. In yet another alternative example of implementation (not shown in the Figures), each data structure in the set of data structures is associated to a combination of a certain frame resolution and a certain video sequence type.
Each data structure for a given resolution (size) of a raw frame includes a plurality of entries, each entry being associated to a respective "I-frame QP value"/"P-frame QP value" combination. The entries in the data structure convey estimates of the relationship between the QP value used to encode an I-frame and the QP value that can be used to encode subsequent P-frames that are encoded based on this I-frame. In a specific example, these I-frame and P-frame QP values are linked through a ratio between the target encoded frame size and the encoded I-frame size. Referring back to Figure 9, a representation of a data structure 906 in memory 902 is shown. The data structure 906 is in the form of a table including a plurality of entries. The columns of the table are each associated to a respective I-frame QP value and the rows of the table are each associated to a respective P-frame QP value. Each entry in the table conveys a ratio of an encoded P-frame size to an encoded I-frame size associated to a given I-frame QP value/P-frame QP value combination. In this fashion, the data structure 906 conveys a relationship between P-frame QP values, I-frame QP values and resulting encoded frame sizes for I-frames and P-frames. An example of a process for deriving entries in each data structure in memory module 902 will be described later in the present specification. It is to be appreciated that, although the relationship between I-frame QP value/P-frame QP value combinations has been conveyed by means of ratios between encoded frame sizes for I-frames and P-frames, other manners of providing mappings between I-frame QP value/P-frame QP value combinations are possible and will become apparent to the reader in light of the present description.
The processing unit 975 is in communication with the inputs 950, 952, 954, 956 and 990 and with the memory module 902. In a specific example of implementation, the processing unit 975 determines the level of compression to be applied when generating a P-frame at least in part based on the information conveying a resolution (size) of the raw frame to be encoded received at the first input 950, the target encoded frame size received at the second input 952, the encoded I-frame size received at the third input 954, the I-frame QP received at the fourth input 956 and the statistical information in the memory module 902.
In the specific example of implementation depicted in Figure 9, the processing unit 975 includes a data structure selector module 900, a ratio computation module 904 in communication with the second and third inputs 952 and 954, a P-frame QP selector module 908 and an output module 910 in communication with output 958.
The data structure selector module 900 is in communication with the first input 950 and with the memory module 902 and is for selecting a data structure from the set of data structures in memory unit 902. In a first example of implementation, of the type shown in Figure 9, each data structure in the set of data structures in memory 902 is associated to a respective raw (uncompressed) frame resolution. In this example, the data structure selector module 900 receives from input 950 the resolution (size) of the raw frame to be encoded and accesses memory 902 to select a data structure associated to a frame resolution corresponding to or approximating the frame resolution associated to the frame to be encoded. In a second example of implementation (not shown in the Figures), each data structure in the set of data structures in memory 902 is associated to a respective video sequence type. The raw frame 110 (shown in Figure 8) to be encoded is part of a sequence of frames and the data structure selector module 900 selects a data structure from the set of data structures based in part on the video sequence type associated to the sequence of frames of which the frame to be encoded is part. It will be appreciated that in alternative implementations in which the memory unit 902 includes a single data structure associated to a single resolution, the data structure selector module 900 as well as input 950 may be omitted.
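The resolution-based selection performed by the data structure selector module can be sketched as follows. The function name `select_table` and the use of the pixel count as the measure of how closely one resolution approximates another are assumptions made for illustration; the present description does not specify how an approximating resolution is chosen.

```python
from typing import Any, Dict, Tuple


def select_table(tables: Dict[Tuple[int, int], Any], width: int, height: int) -> Any:
    """Return the data structure whose (width, height) key matches the raw
    frame resolution, or failing that the one closest in pixel count."""
    pixels = width * height
    closest = min(tables, key=lambda res: abs(res[0] * res[1] - pixels))
    return tables[closest]
```

For example, with data structures stored for QCIF (176x144), CIF (352x288) and 4CIF (704x576), a 320x240 raw frame would be served by the CIF data structure under this distance measure.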
The ratio computation module 904 is in communication with the second and third inputs 952 and 954 for receiving a target encoded frame size and an encoded I-frame size from data lines 116 and 808 respectively. The ratio computation module 904 is adapted to calculate the ratio between the target encoded frame size and the encoded I-frame size. Mathematically, this is expressed as:
$$\text{Ratio} = \frac{\text{Target encoded frame size}}{\text{Encoded I-frame size}}$$
The P-frame QP selector module 908 is in communication with the fourth input 956, with the fifth input 990, with the ratio computation module 904 and with the data structure selector module 900. The P-frame QP selector module 908 selects the P-frame QP value that will be used to encode the raw frame as a P-frame. In a specific example of implementation, in cases where the P-frame to be generated is a first P-frame following an I-frame, herein referred to as the "first" P-frame, the P-frame QP selector module 908 selects the P-frame quantization parameter (P-frame QP) based in part on statistical information in memory unit 902. In particular, the P-frame QP selector module 908 selects the P-frame QP value based on the I-frame QP value received at input 956, the data structure selected by selector module 900 and the ratio computed by the ratio computation module 904. In cases where the P-frame to be generated is not the first P-frame following an I-frame, the processing unit 975 selects a P-frame QP value based in part on information related to one or more previously encoded P-frames received at input 990 over data line 124. The processing unit 975 may make use of any suitable technique, such as PID-based (Proportional-Integral-Derivative) techniques for example, in order to select a P-frame QP value. PID-based techniques are well-known in the art and as such will not be described in further detail here. The P-frame QP selector module 908 releases the selected P-frame QP value to output module 910.
The output module 910 is in communication with the QP selector module 908 and with output 958. The output module 910 releases to output 958 the QP value derived by the QP selector module 908.
Optionally (not shown in the figures), the "P-frame" bit-rate controller 804 may include a P-frame QP adjustment module, analogous to the QP adjustment module 410 described with reference to the "I-frame" bit-rate controller 304 (shown in Figure 4). Similarly to the QP adjustment module 410, the P-frame QP adjustment module is for adjusting the last computed P-frame QP value based on how close the size of the encoded P-frame generated when using this last computed value of QP comes to the target encoded frame size. The P-frame QP adjustment module is in communication with input 990 for receiving information related to one or more previously encoded P-frames, with data line 110 (shown in Figure 8) for receiving the raw frame to be encoded, with data line 152 for receiving a reconstructed version of a video frame on which the encoded P-frame will be based and with the output module 910 for receiving the last computed value of QP. The P-frame QP adjustment module includes an encoder to encode the raw frame received over data line 110 using the last computed value of QP to generate an encoded P-frame. The encoder in the P-frame QP adjustment module implements the same encoding process as encoder 806 (shown in Figure 8). It is to be appreciated at this point that the P-frame QP adjustment module provides functionality for determining whether a given QP value is suitable for producing P-frames with sizes that fall within a range deemed acceptable in relation to the target encoded frame size. If the frame size of the P-frame falls outside of this range, the P-frame QP adjustment module modifies the QP value to find a QP value that produces an encoded frame with a frame size that is closer to the target encoded frame size.
The P-frame QP adjustment module is optional in the sense that it is not required for the operation of the "P-frame" bit-rate controller 804 and therefore may be omitted
from certain specific implementations thereof. In addition, the use of the P-frame QP adjustment module is optional as well and may depend, for example, on factors such as the current overall computational load on the "P-frame" bit-rate controller 804 as well as the availability of spare processing capacity within the system 100 (shown in figure 1).
The "P-frame" bit-rate controller 804 implements a method for selecting a P-frame quantization parameter (QP) value for use by a video encoding processor in encoder 806 (shown in Figure 8), wherein the P-frame QP value corresponds to a certain level of compression of the video encoding processor. An exemplary process implemented by the "P-frame" bit-rate controller 804 will now be described with reference to Figure 10.
With respect to this Figure, at step 1050, if the P-frame to be generated is a first P-frame following an I-frame, condition 1050 is answered in the affirmative and we proceed to step 1000. Otherwise, if the P-frame to be generated is not a first P-frame following an I-frame, condition 1050 is answered in the negative and we proceed to step 1052.
At step 1052, a P-frame QP value is selected in accordance with any suitable conventional QP value selection process known in the art, such as PID based (Proportional-Integral-Derivative) techniques for example.
Returning now to the case where the P-frame to be generated is a first P-frame following an I-frame, we proceed with step 1000. At step 1000 information pertaining to a raw frame to be encoded is received.
In a first specific example of implementation, the information received at step 1000 conveys the resolution (size) of the raw frame to be encoded.
In a second specific example of implementation, the raw frame to be encoded is part of a sequence of frames associated to a video sequence type and the information received at step 1000 conveys the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part. Optionally, step 1000 includes processing at least some frames in the sequence of frames of which the frame to be encoded is part to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
It will be readily appreciated that the resolution (size) of the raw frame to be encoded and the video sequence type associated to the sequence of frames of which the raw frame to be encoded is part may be used in combination in certain implementations.
We then proceed to step 1002.
At step 1002, a data structure is selected from an available set of data structures based in part on the information received in step 1000. In a non-limiting implementation, this step is implemented by the data structure selector module 900 (shown in Figure 9) which accesses the set of data structures in memory module 902 (also shown in Figure 9). We then proceed to step 1004. It will be appreciated that in implementations where the resolution of the raw frames is set to a preset resolution (or otherwise never varies) and in situations where the video sequence type is not used as a discriminator, steps 1000 and 1002 would be omitted.
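By way of non-limiting illustration, the selection performed at step 1002 may be sketched in Python as follows. The function name select_data_structure, the key layout (width, height, video sequence type) and the fall-back to a resolution-only data structure are assumptions made for this sketch and are not taken from the present description. Each data structure is assumed to map an I-frame QP value to a set of {P-frame QP value: ratio} entries.

```python
from typing import Dict, Optional, Tuple

# Assumed layout: {I-frame QP: {P-frame QP: ratio of P-frame size to I-frame size}}
RatioTable = Dict[int, Dict[int, float]]
# Assumed key: (frame width, frame height, video sequence type or None)
TableKey = Tuple[int, int, Optional[str]]


def select_data_structure(
    tables: Dict[TableKey, RatioTable],
    width: int,
    height: int,
    sequence_type: Optional[str] = None,
) -> RatioTable:
    """Return the data structure matching the information received at step 1000."""
    key = (width, height, sequence_type)
    if key in tables:
        return tables[key]
    # Assumption: when no data structure exists for the sequence type,
    # fall back to the one associated with the resolution alone.
    return tables[(width, height, None)]
```

Where only a single resolution is ever encoded and the video sequence type is not used, the dictionary would hold a single data structure and this selection would not be needed.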
At step 1004, the following information is received:
- the target encoded P-frame size;
- the size of the encoded I-frame on which the P-frame will be partly based; and
- the I-frame QP value (the QP value that was used to encode the I-frame on which the P-frame will be partly based).
The target encoded P-frame size conveys a desired size of a P-frame resulting from the encoding of the raw frame present on data line 110 (shown in Figure 8) by the video encoding processor in encoder 806. We then proceed to step 1006.
At step 1006, a ratio between the target encoded frame size and the size of the encoded I-frame is computed. Once this calculation is performed, we proceed to step 1008.
At step 1008, the P-frame QP value is selected at least in part based on the I-frame QP value received at step 1004 and the ratio computed at step 1006. As the reader may recall, the data structure selected at step 1002 includes a plurality of entries, each entry being associated to a respective {I-frame QP value; P-frame QP value} combination. As such, at this step, a set of entries associated to the I-frame QP value received at step 1004 is first identified in the data structure selected at step 1002. In cases where the data structure selected is in the format of data structure 906 (shown in Figure 9), this process involves identifying in data structure 906 the column corresponding to the I-frame QP value received at step 1004. The entries in the column associated with the I-frame QP value are then compared to the ratio computed at step 1006 to identify amongst these entries a specific entry including a ratio approaching the ratio computed at step 1006. It will be appreciated that the ratio computed at step 1006 may not exactly match any of the ratios in the data structure 906 corresponding to the I-frame QP value received at step 1004. As such, an entry including a ratio that is closest to the ratio computed at step 1006 may be selected. Alternative approaches may opt to select an entry including a ratio that is closest to but less than (or greater than) the ratio computed at step 1006. The identified specific entry is associated to an I-frame QP value / P-frame QP value pair. The P-frame QP value of the I-frame QP value / P-frame QP value pair is then selected and released for use by a video encoding processor.
In implementations in which the "P-frame" bit-rate controller 804 includes a P-frame QP adjustment module, analogous to the QP adjustment module 410 described with reference to the "I-frame" bit-rate controller 304 (shown in Figure 4), following step 1008 an adjustment of the P-frame QP may be made. In a non-limiting example of implementation, the adjustment to the selected P-frame QP may be made in a manner analogous to that of the I-frame QP described with reference to steps 508, 510, 512, 514 and 516 shown in Figure 5. For the purpose of conciseness, steps pertaining to the adjustment of the P-frame QP will not be described in greater detail here as their implementation will become apparent to the person skilled in the art in light of the present description.
The process then ends for the current frame.
The following simplified example will better illustrate the above process. Assume that the data structure selected at step 1002 and associated to the resolution of the frame to be encoded includes the following entries:
In addition, assume that at step 1004 we receive a target encoded frame size of 25 Kbits, an encoded I-frame size of 100 Kbits and an I-frame QP value of 10. Based on the above information, the ratio between the target encoded frame size and the encoded I-frame size is computed at step 1006 as being:
25 Kbits / 100 Kbits = 0.25
Following this, and referring to the above table, we first locate in the above table the column corresponding to an I-frame QP value of 10. This column includes a set of three entries, namely {.123; .264; .843}. We then locate in this set the entry having a ratio approaching the ratio between the target encoded frame size and the encoded I-frame size, which was computed as 0.25, and find that the ratio approaching 0.25 is .264. This entry corresponds to a P-frame QP value of 2. This P-frame QP value is then selected at step 1008 and released for use by an encoder.
It is to be appreciated that the entries in the above table were provided for the purpose of illustration only and are not meant to convey in any way real values of entries as would be used in a practical implementation.
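By way of non-limiting illustration, steps 1006 and 1008 may be sketched in Python as follows. The function name select_p_frame_qp, the policy parameter and the nested-dictionary layout are assumptions made for this sketch only. The ratios .123, .264 and .843 and the pairing of .264 with a P-frame QP value of 2 are taken from the example above; the P-frame QP values 1 and 3 paired with the other two ratios are hypothetical, as is the fall-back to the closest entry when a one-sided policy finds no candidate.

```python
def select_p_frame_qp(table, i_frame_qp, target_p_size, i_frame_size,
                      policy="closest"):
    """Select a P-frame QP from {I-frame QP: {P-frame QP: ratio}} entries."""
    # Step 1006: ratio between target encoded frame size and encoded I-frame size.
    target_ratio = target_p_size / i_frame_size
    # Step 1008: entries (the "column") associated with the I-frame QP value.
    column = table[i_frame_qp]
    if policy == "closest":
        candidates = column
    elif policy == "at_most":      # closest ratio not exceeding the target
        candidates = {qp: r for qp, r in column.items() if r <= target_ratio}
    elif policy == "at_least":     # closest ratio not below the target
        candidates = {qp: r for qp, r in column.items() if r >= target_ratio}
    else:
        raise ValueError(f"unknown policy: {policy}")
    if not candidates:
        # Assumption: fall back to the closest entry overall.
        candidates = column
    return min(candidates, key=lambda qp: abs(candidates[qp] - target_ratio))


# P-frame QP values 1 and 3 are hypothetical; only {2: 0.264} is from the text.
EXAMPLE_TABLE = {10: {1: 0.843, 2: 0.264, 3: 0.123}}
print(select_p_frame_qp(EXAMPLE_TABLE, 10, 25, 100))  # prints 2
```

With the "at_most" policy the same inputs would select the entry .123, since it is the only ratio not exceeding 0.25.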
In another example for illustrating the process shown in Figure 10, consider Figure 12 of the drawings. Figure 12 graphically illustrates a relationship, represented by curve 1202, between P-frame QP values (shown on the x-axis) and ratios between P-frame sizes and I-frame sizes for a given I-frame QP value (here shown as I-frame QP value = 15) as provided by the information in memory 902 (shown in Figure 9). As can be observed from curve 1202, in the specific example shown the relationship between P-frame QP values and ratios between P-frame sizes and I-frame sizes for a fixed I-frame QP value generally follows a negative exponential curve. Dotted line 1204 conveys a ratio computed based on a target encoded frame size and the size of the encoded I-frame on which the P-frame to be encoded is based. The intersection between curve 1202 and dotted line 1204 allows obtaining a P-frame QP value (1208), which for the purpose of illustration only is P-frame QP = 4. This P-frame QP value is then selected for use by an encoder.
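By way of non-limiting illustration, locating the intersection between a sampled curve and a target ratio may be sketched in Python as follows. The function name qp_at_intersection, the use of linear interpolation between adjacent samples and the sample values are assumptions made for this sketch; they do not reproduce the values of Figure 12.

```python
def qp_at_intersection(curve, target_ratio):
    """Return the (possibly fractional) QP where the curve meets target_ratio.

    curve: list of (P-frame QP, ratio) pairs sorted by increasing QP,
    with ratios decreasing as QP increases.
    """
    # Clamp when the target lies outside the sampled range.
    if target_ratio >= curve[0][1]:
        return float(curve[0][0])
    if target_ratio <= curve[-1][1]:
        return float(curve[-1][0])
    for (qp_lo, r_lo), (qp_hi, r_hi) in zip(curve, curve[1:]):
        if r_hi <= target_ratio <= r_lo:
            if r_lo == r_hi:
                return float(qp_lo)
            # Linear interpolation between the two neighbouring samples.
            fraction = (r_lo - target_ratio) / (r_lo - r_hi)
            return qp_lo + fraction * (qp_hi - qp_lo)
    raise ValueError("curve ratios must decrease as QP increases")


# Hypothetical samples for a single fixed I-frame QP value.
SAMPLES = [(2, 0.8), (4, 0.4), (6, 0.2)]
p_frame_qp = round(qp_at_intersection(SAMPLES, 0.4))
```

Rounding the returned value yields an integer QP suitable for release to an encoder; an implementation could equally round up or down depending on whether overshooting or undershooting the target size is preferred.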
It is to be noted that the relationship between P-frame QP values and ratios of P-frame sizes to I-frame sizes (for a fixed I-frame QP value) shown in Figure 12, which has been provided for the purpose of illustration only, is associated with a specific coding protocol and that each coding protocol will be associated with a respective relationship between P-frame QP values and ratios of P-frame sizes to I-frame sizes.
Generation of Data Structure for P-frame QP
As described above, the P-frame bit-rate controller 804 (shown in Figures 8 and 9) makes use of statistical information conveying a relationship between I-frame QP values and P-frame QP values as well as the associated I-frame and P-frame sizes of resulting frames encoded using different combinations of I-frame QP values and P-frame QP values.
An exemplary process for generating information suitable for use in selecting a quantization parameter value for a video encoding processor will now be described with reference to Figure 11. For the purpose of this example, the information generated is in the form of a data structure including entries conveying a statistical relationship between I-frame QP values and P-frame QP values and resulting encoded frame sizes.
It is to be understood that other suitable approaches for generating different types of statistical information for use in selecting a quantization parameter value are possible and that the process described herein is presented here for the purpose of illustration.
For the purpose of simplicity, the process described with reference to Figure 11 is for generating a single data structure associated to a single frame resolution (size). The reader skilled in the art will readily appreciate, in light of the present description, that the process depicted in Figure 11 can be performed repeatedly for different video sequence types, different frame resolutions (sizes) and/or different combinations of video sequence types and frame resolutions (sizes) in order to generate different respective data structures. The manner in which such multiple data structures can be generated will become readily apparent to the person skilled in the art in light of the present specification and as such will not be described further here.
At step 1100, a sequence of video frames is provided as a basis for generating the data structure. The sequence of video frames provided at this step preferably includes frame sequences representative of the types of frame sequences expected to be encoded.
At step 1102, the frames in the sequence of video frames provided at step 1100 are assigned to be encoded as either I-frames or P-frames in such a manner as to interleave I-frame and P-frame assignment in the encoded video sequence.
In a specific non-limiting example of implementation, the frames in the sequence of frames are assigned to be encoded as either I-frames or P-frames according to the following pattern:
I-P-I-P-I-P- ...
As will be readily observed, in this pattern each frame assigned to be encoded as a P-frame immediately follows a frame assigned to be encoded as an I-frame. In addition, as will be appreciated by the person skilled in the art, each P-frame will be based in part on the I-frame immediately preceding it. It is to be appreciated that the above described pattern of I-P-I-P-I-P-(etc.) constitutes only one possible pattern and other suitable patterns, including patterns making use of frame types other than I-frames and P-frames, may be used.
Steps 1104, 1106, 1108, 1110, 1112, 1114, 1116, 1118 and 1120 are performed for each I-frame QP value from a set of possible I-frame QP values.
More specifically, at step 1104, an I-frame QP value that has not yet been processed is selected. We then proceed to step 1106.
At step 1106, the frames assigned to be encoded as I-frames at step 1102 are encoded using the I-frame QP value selected at step 1104 to generate an I-frame encoded frame group. The resulting I-frame encoded frame group includes a plurality of encoded video frames derived using the I-frame QP value selected in step 1104. We then proceed to step 1108.
At step 1108, an encoded frame size corresponding to the I-frame QP value selected in step 1104 is derived at least in part based on frame sizes of encoded video frames in the I-frame encoded frame group generated at step 1106. For the purpose of simplicity, the encoded frame size corresponding to the I-frame QP value selected in step 1104 will be designated as:
F (size of I-frames {I-frame QP value})
Wherein F(x) is used to designate a certain function of variable x, and wherein "size of I-frames {I-frame QP value}" is used to designate the size of the I-frames in the frame group associated with the I-frame QP value selected at step 1104. Different functions for deriving the encoded frame size corresponding to the I-frame QP value selected in step 1104 may be contemplated. In a non-limiting example of implementation, shown in Figure 11, function F(x) is an average function which computes the average encoded frame size of the encoded frames in the I-frame encoded frame group obtained at step 1106. In such an implementation, the encoded frame size corresponding to the I-frame QP value selected in step 1104 is set to correspond to the average I-frame size and is associated to the specific I-frame QP value selected at step 1104. We then proceed to step 1110.
As will become apparent from Figure 11, for each I-frame QP value selected at step 1104, steps 1112, 1114, 1116 and 1118 are repetitively performed for each P-frame QP value from a set of possible P-frame QP values.
More specifically, at step 1110, a P-frame QP value that has not yet been processed is selected. We then proceed to step 1112.
At step 1112, the frames assigned to be encoded as P-frames at step 1102 are encoded using the P-frame QP value selected at step 1110 to generate a P-frame encoded frame group. More specifically, the frames assigned to be encoded as P-frames in step 1102 are encoded using the I-frames generated in step 1106 (using the I-frame QP value selected at step 1104), as well as the P-frame QP value selected in step 1110. The methods for encoding a video frame as a P-frame based on a preceding I-frame are well known in the art and as such will not be described further here. The P-frame encoded frame group includes a plurality of encoded video frames derived using the P-frame QP value selected in step 1110. It is to be observed that each P-frame generated at step 1112 is associated to a specific combination of an I-frame QP value and P-frame QP value. We then proceed to step 1114.
In step 1114, an encoded frame size corresponding to the P-frame QP value selected in step 1110 is derived at least in part based on frame sizes of encoded video frames in the P-frame encoded frame group generated at step 1112. For the purpose of simplicity, this encoded frame size corresponding to the P-frame QP value selected in step 1110 will be designated as:
F (size of P-frames {I-frame QP value, P-frame QP value})
Wherein F(x) is used to designate a certain function of variable x, and wherein "size of P-frames {I-frame QP value, P-frame QP value}" is used to designate the size of the P-frames in the frame group associated with the I-frame QP value selected at step 1104 and the P-frame QP value selected at step 1110. Different functions for deriving the encoded frame size corresponding to the P-frame QP value selected in step 1110 may be contemplated. In a non-limiting example of implementation, shown in Figure 11, function F(x) is an average function which computes the average encoded frame size of the encoded frames in the P-frame encoded frame group obtained at step 1112. In such an implementation, the encoded frame size corresponding to the P-frame QP value selected in step 1110 is set to correspond to the average encoded P-frame size.
It is to be observed that the average encoded P-frame size computed at step 1114 is associated to a specific combination of an I-frame QP value and P-frame QP value.
We then proceed to step 1116.
At step 1116, a data element conveying a relationship between the I-frame size derived at step 1108 and the P-frame size derived at step 1114 is derived. This data element is associated to a specific {I-frame QP value; P-frame QP value} combination. For the purpose of simplicity, the data element corresponding to the I-frame QP value selected in step 1104 and to the P-frame QP value selected at step 1110 will be designated as:
G (F (size of P-frames {m, n}); F (size of I-frames {m})) for a given {m, n} combination
Wherein G(x, y) is used to designate a certain function of variables x and y, m is used to designate an I-frame QP value and n is used to designate a P-frame QP value. Different functions for deriving a data element conveying a relationship between the I-frame size and the P-frame size derived at steps 1108 and 1114 respectively may be contemplated. In a non-limiting example of implementation, shown in Figure 11, function G(x, y) is a ratio function which computes:
G(x, y) = x/y
In other words, in accordance with the above non-limiting example of implementation, for a given {I-frame QP value; P-frame QP value} combination, we compute:
Ratio {I-frame QP, P-frame QP} = F (size of P-frames {I-frame QP, P-frame QP}) / F (size of I-frames {I-frame QP})
The result of step 1116 is a data element associated with a given {I-frame QP value; P-frame QP value} combination conveying a relationship between the I-frame sizes and P-frame sizes associated with that given {I-frame QP value; P-frame QP value} combination. We then proceed to step 1118.
At step 1118, the data element derived in step 1116 is stored in a data structure on a computer readable storage medium. In particular, the data element is stored in association with an {I-frame QP value; P-frame QP value} combination, wherein the I-frame QP value of the combination is the I-frame QP value selected at step 1104 and the P-frame QP value of the combination is the P-frame QP value selected at step 1110. We then proceed to step 1120.
At step 1120, if not all P-frame QP values in the set of P-frame QP values have been processed for the I-frame QP value selected at the last iteration of step 1104, step 1120 is answered in the negative and we return to step 1110 where steps 1112, 1114, 1116 and 1118 are repeated for the next unprocessed P-frame QP value. If, on the other hand, all P-frame QP values in the set of P-frame QP values have been processed for the I-frame QP value selected at the last iteration of step 1104, step 1120 is answered in the positive and the process continues at step 1124.
At step 1124, if not all I-frame QP values in the set of I-frame QP values have been processed, step 1124 is answered in the negative and we return to step 1104 where steps 1106, 1108, 1110, 1112, 1114, 1116, 1118 and 1120 are repeated for the next unprocessed I-frame QP value. If, on the other hand, all I-frame QP values in the set of I-frame QP values have been processed, step 1124 is answered in the positive and we proceed to step 1128 where the process terminates.
Based on the above process, a data structure including a plurality of entries is generated, wherein each entry is associated to a respective {I-frame QP value; P-frame QP value} combination and conveys a relationship between I-frame sizes and P-frame sizes derived using the respective {I-frame QP value; P-frame QP value} combination to which it is associated.
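By way of non-limiting illustration, the process of Figure 11 may be sketched in Python as follows, using the average function for F and the ratio function for G. The function name build_ratio_table and the two callables i_frame_size_bits and p_frame_size_bits are hypothetical stand-ins for an actual video encoder; they are assumed to return the encoded size of a frame and are not part of the present description.

```python
from statistics import mean


def build_ratio_table(frames, i_qps, p_qps, i_frame_size_bits, p_frame_size_bits):
    """Build {I-frame QP: {P-frame QP: average P size / average I size}}."""
    # Step 1102: interleave the assignment as I-P-I-P-...
    i_frames = frames[0::2]
    p_frames = frames[1::2]
    table = {}
    for i_qp in i_qps:                                   # steps 1104 and 1124
        # Steps 1106 and 1108: encode the I-frames and average their sizes.
        avg_i = mean(i_frame_size_bits(f, i_qp) for f in i_frames)
        table[i_qp] = {}
        for p_qp in p_qps:                               # steps 1110 and 1120
            # Steps 1112 and 1114: encode each P-frame against the I-frame
            # immediately preceding it and average the resulting sizes.
            avg_p = mean(
                p_frame_size_bits(p, ref, i_qp, p_qp)
                for ref, p in zip(i_frames, p_frames)
            )
            # Steps 1116 and 1118: derive and store the ratio G(x, y) = x / y.
            table[i_qp][p_qp] = avg_p / avg_i
    return table
```

The nested loops mirror the two conditions of Figure 11: the inner loop runs over every P-frame QP value for a fixed I-frame QP value, and the outer loop then advances to the next I-frame QP value.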
As a variant, a set of data structures may be generated, wherein each data structure is associated to a respective frame resolution (size). As yet another variant, a set of data structures may be generated, each data structure being associated to a respective video sequence type. The manner in which the above variants would be implemented will be apparent to the person skilled in the art in light of the present description and as such will not be described further here.
Those skilled in the art will appreciate that in some embodiments of the invention, all or part of the process previously described herein with reference to Figure 11 may be implemented as pre-programmed hardware or firmware elements (e.g., application specific integrated circuits (ASICs), DSPs, electrically erasable programmable read-only memories (EEPROMs), etc.), or other related components.
In other embodiments of the invention, all or part of the process previously described herein with reference to Figure 11 may be implemented as software consisting of a series of instructions for execution by a computing unit. The series of instructions could be stored on a medium which is fixed, tangible and readable directly by the computing unit (e.g., removable diskette, CD-ROM, ROM, PROM, EPROM or fixed disk), or the instructions could be stored remotely but transmittable to the computing unit via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium. The transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared, RF or other transmission schemes).
It is also to be appreciated that the process described with reference to Figure 11 may be implemented on a computing platform separate from system 100 (shown in Figure 1). As such, the process described with reference to Figure 11 may be performed off-line in order to generate the entries of the data structure(s) in memory module 902. The computing platform that is used for generating a data structure in accordance with the process described in Figure 11 may be placed in communication with the system 100 for the purpose of storing the generated data structure in the memory module 902 of P-frame bit-rate controller 804 (both shown in Figure 9). Alternatively, the computing platform may be integrated within system 100 (shown in Figure 1) as an integral component thereof.
Variant of bit-rate controller 804
As a variant, bit-rate controller 804 (shown in Figure 9) includes a self-update module implementing a process for generating updated information for use in selecting a P-frame quantization parameter value for a video encoding processor.
Figure 17 of the drawings depicts a bit-rate controller 804' which is analogous to bit-rate controller 804 (shown in Figure 9) but modified to include self-update module 1900.
In the example depicted in Figure 17, the self-update module 1900 is in communication with output module 910 for receiving a P-frame QP value, with input 990 for receiving information related to a P-frame previously encoded using the P-frame QP value, with input 954 for receiving information conveying the size of the I-frame on which the previously encoded P-frame was based, with input 956 for receiving the I-frame QP value used to generate that I-frame and with input 950 for receiving the resolution (size) of the raw frames to be encoded. The self-update module 1900 stores the information it receives from the aforementioned inputs 990, 956, 954 and 950 in a memory (not shown) in association with the P-frame QP value released by output module 910. In this fashion, the self-update module 1900 gathers statistical information pertaining to the relationship between I-frame QP values and P-frame QP values and the associated I-frame and P-frame sizes of the resulting encoded frames.
In a specific example, once the self-update module 1900 has gathered a sufficient amount of information pertaining to a certain {P-frame QP value; I-frame QP value} combination, the self-update module 1900 processes this information to derive updated information using methods similar to those described with reference to Figure 11. The "sufficiency" of the information gathered pertaining to encoded frame sizes for certain {P-frame QP value; I-frame QP value} combinations will vary from one implementation to the other.
The self-update module 1900 then accesses memory 902 to locate in a data structure an entry corresponding to the certain {P-frame QP value; I-frame QP value} combination and replaces the data in that entry with the newly generated information.
In this fashion, the behaviour of bit-rate controller 804' can be adapted over time based on actual video frames to be encoded.
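By way of non-limiting illustration, the behaviour of the self-update module 1900 may be sketched in Python as follows. The class name SelfUpdater, the min_samples threshold standing in for the "sufficiency" criterion, and the choice to discard samples after each update are assumptions made for this sketch. The updated entry is computed as the ratio of average sizes, by analogy with the process of Figure 11.

```python
from collections import defaultdict
from statistics import mean


class SelfUpdater:
    """Gathers encoded frame sizes and refreshes entries of a ratio table."""

    def __init__(self, table, min_samples=30):
        self.table = table                  # {I-frame QP: {P-frame QP: ratio}}
        self.min_samples = min_samples      # hypothetical sufficiency threshold
        self._samples = defaultdict(list)   # (i_qp, p_qp) -> [(i_size, p_size)]

    def record(self, i_qp, p_qp, i_frame_size, p_frame_size):
        """Store one observation; return True if the table entry was replaced."""
        samples = self._samples[(i_qp, p_qp)]
        samples.append((i_frame_size, p_frame_size))
        if len(samples) < self.min_samples:
            return False
        avg_i = mean(s[0] for s in samples)
        avg_p = mean(s[1] for s in samples)
        # Replace the entry for this {I-frame QP; P-frame QP} combination.
        self.table.setdefault(i_qp, {})[p_qp] = avg_p / avg_i
        samples.clear()
        return True
```

A larger threshold makes the stored ratios more stable at the cost of slower adaptation to the video frames actually being encoded.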
It will be readily appreciated that the functionality of the self-update module 1900 may be adapted for updating statistical information in memory 902 in implementations where the memory 902 stores a plurality of data structures associated with respective frame resolutions (sizes) and/or respective video sequence types.
Manners in which the self-update module 1900 could implement this functionality in such implementations will be readily apparent to the person skilled in the art in light of the present description and as such will not be described in further detail here.
It is also to be appreciated that the bit-rate controller 804' may also include a P-frame QP adjustment module (not shown in Figure 17), analogous to the QP adjustment module 410 described with reference to the "I-frame" bit-rate controller 304 (shown in Figure 4), for adjusting the last computed P-frame QP value based on how close the size of the encoded P-frame generated when using this last computed QP value comes to the target encoded frame size.
Specific Physical Implementation
Those skilled in the art should appreciate that in some embodiments of the invention, all or part of the functionality previously described herein with respect to any of the "I-frame" bit-rate controller 304 (shown in Figures 3 and 4), "I-frame" bit-rate controller 304' (shown in Figure 16), the "P-frame" bit-rate controller 804 (shown in Figures 8 and 9) and the "P-frame" bit-rate controller 804' (shown in Figure 17) may be implemented as pre-programmed hardware or firmware elements (e.g., application specific integrated circuits (ASICs), DSPs, electrically erasable programmable read-only memories (EEPROMs), etc.), or other related components.
In other embodiments of the invention, all or part of the functionality previously described herein with respect to either one of the "I-frame" bit-rate controllers 304 and 304' and either one of the "P-frame" bit-rate controllers 804 and 804' may be implemented as software consisting of a series of instructions for execution by a computing unit. The series of instructions could be stored on a medium which is fixed, tangible and readable directly by the computing unit (e.g., removable diskette, CD-ROM, ROM, PROM, EPROM or fixed disk), or the instructions could be stored remotely but transmittable to the computing unit via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium. The transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared, RF or other transmission schemes).
The apparatus implementing any of the "I-frame" bit-rate controllers 304 and 304' and the "P-frame" bit-rate controllers 804 and 804' may be configured as a computing unit of the type depicted in Figure 15, including a processing unit 1502 and a memory 1504 connected by a communication bus 1508. The memory 1504 includes data 1510 and program instructions 1506. The processing unit 1502 is adapted to process the data 1510 and the program instructions 1506 in order to implement the functional blocks described in the specification and depicted in the drawings.
In a specific implementation, the data 1510 includes a set of data structures in accordance with the set of data structures 402 described with reference to Figure 4 and the program instructions 1506 implement the functionality of the processing unit 475 described above with reference to Figure 4. In another specific implementation, the data 1510 includes a set of data structures in accordance with the set of data structures 902 described with reference to Figure 9 and the program instructions 1506 implement the functionality of the processing unit 975 described above with reference to Figure 9. The computing unit 1502 may also comprise a number of interfaces (not shown) for receiving or sending data elements to external modules.
Those skilled in the art should further appreciate that the program instructions 1506 may be written in a number of programming languages for use with many computer architectures or operating systems. For example, some embodiments may be implemented in a procedural programming language (e.g., "C") or an object oriented programming language (e.g., "C++" or "JAVA").
Although the present invention has been described in considerable detail with reference to certain preferred embodiments thereof, variations and refinements are possible.
For example, in an alternative example of implementation, the selection of a quantization parameter (either reference frame QP or non-reference frame QP) may be made for a subset of macroblocks in a given frame, rather than for a frame as a whole. As such, different quantization parameters can be selected for different portions of a frame, each portion including one or more macroblocks, based on the concepts and processes described above. In addition, the data structures used for storing information providing a relationship between different encoded frame sizes and quantization parameters can be further indexed to correspond to subsets of macroblocks in a frame so that the data structures map macroblock sizes (rather than frame sizes) to corresponding quantization parameters. For example, a first data structure may be assigned to macroblocks of a frame positioned in the top right quadrant of the frame, and a second data structure may be assigned to macroblocks of a frame positioned in the centre of the frame.
In addition, the principles described above in accordance with specific examples of implementation of the invention can be used in combination with existing methods and systems for selecting a quantization parameter value. For example, if while encoding macroblocks in a frame using a quantization parameter (QP) selected according to one of the methods described above, the encoder realizes that too many bits will be used if the selected compression level (QP) is maintained, the compression level can be increased (higher QP) for the remaining macroblocks in the frame. Many other variants are possible and will become readily apparent to the person skilled in the art in light of the present description.
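By way of non-limiting illustration, increasing the QP for the remaining macroblocks of a frame may be sketched in Python as follows. The function name encode_with_budget, the linear projection of the final frame size, the qp_step and qp_max parameters and the callable bits_for (a hypothetical stand-in returning the number of bits produced for a macroblock at a given QP) are all assumptions made for this sketch.

```python
def encode_with_budget(macroblocks, initial_qp, budget_bits, bits_for,
                       qp_step=1, qp_max=51):
    """Return (QP used for each macroblock, total bits used)."""
    qp = initial_qp
    used = 0
    qps = []
    total = len(macroblocks)
    for index, macroblock in enumerate(macroblocks):
        qps.append(qp)
        used += bits_for(macroblock, qp)
        # Assumption: project the final frame size from the average cost so far.
        projected = used * total / (index + 1)
        if projected > budget_bits and qp < qp_max:
            # Too many bits would be used: raise the compression level
            # (higher QP) for the remaining macroblocks.
            qp = min(qp + qp_step, qp_max)
    return qps, used
```

The default qp_max of 51 corresponds to the upper QP limit of H.264 and is used here only as an example; the applicable range depends on the coding protocol in use.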
In yet another example, although the examples of implementation of the "I-frame" bit-rate controller 304 (shown in Figures 3 and 4), "I-frame" bit-rate controller 304' (shown in Figure 16), the "P-frame" bit-rate controller 804 (shown in Figures 8 and 9) and the "P-frame" bit-rate controller 804' (shown in Figure 17) describe implementations in which the quantization parameters are derived on the basis of data stored in data structures, alternative examples of implementation may derive quantization parameters using mathematical models mapping the relationship between various parameters (resolution, desired frame size and so on) and the suitable QP value to use. Such mathematical models may be derived based on statistical models according to well-known methods.
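By way of non-limiting illustration, one possible mathematical model may be sketched in Python as follows. The sketch assumes, for a fixed I-frame QP value, that the ratio of P-frame size to I-frame size follows ratio = a * exp(-b * QP), fits a and b by least squares on the logarithm of the ratio, and inverts the model to obtain a P-frame QP value. The model form, the function names fit_exponential and qp_from_model, and the QP limits are assumptions made for this sketch.

```python
import math


def fit_exponential(samples):
    """Fit ratio = a * exp(-b * qp) to (qp, ratio) samples; return (a, b)."""
    n = len(samples)
    xs = [qp for qp, _ in samples]
    ys = [math.log(ratio) for _, ratio in samples]   # ratios must be positive
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                                # slope of log(ratio) vs QP
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def qp_from_model(a, b, target_ratio, qp_min=0, qp_max=51):
    """Invert the model for a target ratio and clamp to the allowed QP range."""
    qp = round(math.log(a / target_ratio) / b)
    return max(qp_min, min(qp_max, qp))
```

Such a model replaces a column of stored entries with two coefficients per I-frame QP value, at the cost of a fitting error wherever the observed relationship departs from the assumed form.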
Therefore, the scope of the invention should be limited only by the appended claims and their equivalents.
Claims
1. A method using statistical information for determining a level of compression to be applied to a certain frame by a video encoding processor in order to generate an encoded frame, the statistical information being obtained by encoding a plurality of representative video frames and observing statistical trends of a resulting encoded video stream and providing estimates of encoded frame sizes of encoded frames resulting from the video encoding processor using different levels of compression to encode the certain frame, said method comprising selecting the level of compression to be applied to the certain frame at least in part based on the statistical information and a target frame size.
2. A method as defined in claim 1, said method further comprising: a) receiving frame-type information conveying an encoded frame type in which the certain frame is to be encoded; b) receiving information conveying the target frame size; c) selecting the level of compression to be applied to the certain frame at least in part based on: i. said target frame size; ii. the statistical information; and iii. the encoded frame type conveyed by the frame-type information; d) releasing the selected level of compression to the video encoding processor.
3. A method as defined in claim 2, wherein the encoded frame type conveyed by the frame-type information is selected from a set including a reference frame type and a non-reference frame type.
4. A method as defined in claim 3, wherein the reference frame type is an I-frame type and the non-reference frame type is a P-frame type.
5. A method as defined in claim 2, wherein the statistical information comprises: a) a first data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding reference frame quantization parameter value, each reference frame quantization parameter value corresponding to a respective level of compression of the video encoding processor; b) a second data structure providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values, each non-reference frame quantization parameter value corresponding to a respective level of compression of the video encoding processor; c) selecting the level of compression to be applied to the certain frame based in part on either one of said first data structure and said second data structure.
6. A bit-rate controller using statistical information for determining a level of compression to be applied to a certain frame by a video encoding processor in order to generate an encoded frame, wherein the statistical information was obtained by encoding a plurality of representative video frames and observing statistical trends of a resulting encoded video stream and providing estimates of encoded frame sizes of encoded frames resulting from the video encoding processor using different levels of compression to encode the certain frame, said bit-rate controller including: a) an input for receiving a target frame size; b) a memory module for storing the statistical information; c) a processing unit for selecting a level of compression at least in part based on the statistical information and the target frame size; d) an output for releasing data conveying the derived level of compression.
7. A bit-rate controller as defined in claim 6, wherein said input is a first input, said apparatus further comprising: a) a second input for receiving frame-type information conveying an encoded frame type in which the certain frame is to be encoded; wherein said processing unit selects the level of compression to be applied to the certain frame at least in part based on: i. said target frame size; ii. the statistical information; and iii. the encoded frame type conveyed by the frame-type information.
8. A bit-rate controller as defined in claim 7, wherein the encoded frame type conveyed by the frame-type information is selected from a set including a reference frame type and a non-reference frame type.
9. A bit-rate controller as defined in claim 8, wherein the reference frame type is an I-frame type and the non-reference frame type is a P-frame type.
10. A bit-rate controller as defined in claim 7, wherein the statistical information comprises: a) a first data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding reference frame quantization parameter value, each reference frame quantization parameter value corresponding to a respective level of compression of the video encoding processor; b) a second data structure providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values, each non-reference frame quantization parameter value corresponding to a respective level of compression of the video encoding processor; and wherein said processing unit selects the level of compression to be applied to the certain frame based in part on either one of said first data structure and said second data structure.
11. A method for selecting a quantization parameter value for use by a video encoding processor encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor, said method comprising: a) receiving target encoded frame size information conveying a desired frame size for the certain video frame; b) providing a data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value; c) selecting a quantization parameter value at least in part based on said target encoded frame size information and on said data structure; d) releasing the selected quantization parameter value to the video encoding processor.
12. A method as defined in claim 11, wherein the selected quantization parameter value allows the video encoding processor to derive an encoded video frame based on the certain video frame such that the encoded video frame has a frame size tending toward the desired frame size.
13. A method as defined in claim 11, wherein each entry in the data structure maps an expected maximum encoded frame size to a corresponding quantization parameter value.
14. A method as defined in claim 11, wherein the selected quantization parameter value corresponds to an entry in the data structure associated to an encoded frame size approximating said target encoded frame size.
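The table lookup recited in claims 11 to 14 can be illustrated with a minimal Python sketch. The table contents, the function name `select_qp`, and the selection policy (choose the lowest quantization parameter whose expected maximum size still fits the target) are assumptions made for illustration; the claims only require that the selected entry approximate the target size.

```python
from typing import Mapping

def select_qp(size_by_qp: Mapping[int, int], target_size: int) -> int:
    """Return the lowest QP whose expected maximum encoded size fits the target.

    Assumed policy: a lower QP means less compression, so the lowest QP that
    still fits is preferred. If no entry fits, fall back to the highest QP.
    """
    fitting = [qp for qp, size in size_by_qp.items() if size <= target_size]
    if fitting:
        return min(fitting)
    return max(size_by_qp)

# Hypothetical table: QP -> expected maximum encoded frame size in bytes.
I_FRAME_TABLE = {20: 60000, 24: 42000, 28: 30000, 32: 21000, 36: 15000, 40: 10000}

qp = select_qp(I_FRAME_TABLE, target_size=25000)
print(qp)
```

With this hypothetical table, a 25000-byte target selects the entry for QP 32, whose expected maximum size of 21000 bytes is the largest that does not exceed the target.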
15. A method as defined in claim 11, wherein selecting the quantization parameter value comprises: a) selecting an initial quantization parameter value, said initial quantization parameter value corresponding to an entry in the data structure associated to an encoded frame size approximating said target encoded frame size; b) on the basis of the initial quantization parameter value, encoding the certain video frame to derive a resulting encoded video frame, the resulting encoded frame having a certain size; c) selecting the quantization parameter value at least in part based on the certain size of the resulting encoded video frame and the desired frame size for the certain video frame.
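The two-step selection of claim 15 can be sketched as follows. The correction rule is an assumption, not something the claim specifies: it relies on the H.264 convention that the quantizer step size doubles every 6 QP units, and on the rough heuristic that the encoded size then halves. The names `refine_qp` and `toy_encoder` are hypothetical, and the QP range 0 to 51 is assumed.

```python
import math
from typing import Callable

QP_MIN, QP_MAX = 0, 51  # assumed H.264-style QP range

def refine_qp(initial_qp: int, target_size: int,
              trial_encode: Callable[[int], int]) -> int:
    """Trial-encode at initial_qp, then shift the QP toward the target size.

    Assumed heuristic: encoded size roughly halves for every +6 in QP.
    """
    actual_size = max(trial_encode(initial_qp), 1)  # avoid log2(0)
    delta = round(6 * math.log2(actual_size / target_size))
    return max(QP_MIN, min(QP_MAX, initial_qp + delta))

def toy_encoder(qp: int) -> int:
    """Stand-in for a real trial encode: returns a made-up size in bytes."""
    return int(200000 * 2 ** (-qp / 6))

# The trial encode at QP 30 yields 6250 bytes, twice the 3125-byte target,
# so the QP is raised by 6.
print(refine_qp(30, 3125, toy_encoder))
```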
16. A method as defined in claim 15, wherein the certain video frame is encoded to derive the resulting video encoded frame using an encoding method substantially similar to that used by the video encoding processor.
17. A method as defined in claim 11, said method comprising selecting said data structure from a set of data structures based in part on a frame resolution associated with the certain video frame, wherein: a) each data structure in said set of data structures is associated to a respective frame resolution; and b) each data structure in said set of data structures includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value.
18. A method as defined in claim 11, wherein the certain video frame is part of a sequence of frames, said method comprising selecting said data structure from a set of data structures based in part on a video sequence type associated to the sequence of frames, wherein: a) each data structure in said set of data structures is associated to a respective video sequence type; and b) each data structure in said set of data structures includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value.
19. A method as defined in claim 18, wherein selecting the given data structure comprises processing at least some frames in said sequence of frames to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
20. A method as defined in claim 11, wherein said quantization parameter value is a reference frame quantization parameter value, the reference frame quantization parameter value being for use by the video encoding processor in encoding the certain video frame as a reference frame.
21. A method as defined in claim 11, wherein the certain video frame is part of a sequence of frames, said method comprising: a) processing at least some frames in the sequence of frames based on at least one quantization parameter value to derive actual encoded frame size information associated to frames in the sequence of frames; b) modifying at least one entry in the data structure at least in part based on the derived actual encoded frame size information, the at least one entry being associated with the at least one quantization parameter value.
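The table adaptation of claim 21 can be sketched as a simple blend of a stored entry with sizes actually observed at the same quantization parameter. The blending rule, the default weight, and the name `update_entry` are assumptions for illustration; the claim only requires that the entry be modified based on the actual encoded frame sizes.

```python
from typing import Dict, Sequence

def update_entry(size_by_qp: Dict[int, int], qp: int,
                 observed_sizes: Sequence[int], weight: float = 0.25) -> None:
    """Move the stored size for `qp` toward the largest size observed at that QP.

    Assumed rule: weighted average of the old entry and the observed maximum,
    so the table keeps describing an expected maximum frame size.
    """
    observed = max(observed_sizes)
    size_by_qp[qp] = round((1 - weight) * size_by_qp[qp] + weight * observed)

# Hypothetical table: QP -> expected maximum encoded frame size in bytes.
table = {30: 20000, 34: 14000}
update_entry(table, 30, [26000, 30000])
print(table[30])
```

Only the entry for the quantization parameter that was actually used is touched; the entry for QP 34 keeps its stored value.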
22. An apparatus for selecting a quantization parameter value for use by a video encoding processor encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor, said apparatus comprising: a) a memory module for storing a data structure including a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value; b) an input for receiving target encoded frame size information conveying a desired frame size for the certain video frame; c) a processing unit in communication with said input and with said memory module, said processing unit being programmed for selecting a quantization parameter value from a set of possible quantization parameter values at least in part based on said target encoded frame size information and on said data structure; d) an output for releasing the selected quantization parameter value to the video encoding processor.
23. An apparatus for selecting a quantization parameter value in accordance with the method described in any one of claims 11 to 21, the selected quantization parameter value being for use by a video encoding processor encoding a certain video frame and corresponding to a certain level of compression of the video encoding processor.
24. A computer readable storage medium storing a program element suitable for execution by a processor for selecting a quantization parameter value for use by a video encoding processor encoding a certain video frame, the quantization parameter value corresponding to a certain level of compression of the video encoding processor, said program element when executing on the processor being operative for implementing the method described in any one of claims 11 to 21.
25. A video encoding system for encoding a video stream including a sequence of frames, said system comprising: a) a first input for receiving a certain video frame originating from a sequence of frames to be encoded; b) a second input for receiving target encoded frame size information conveying a desired frame size for the certain video frame; c) an apparatus in communication with said second input, said apparatus being for selecting a quantization parameter value in accordance with the method described in any one of claims 11 to 21; d) an encoding processor in communication with said first input and with said apparatus, said encoding processor being operative for processing the certain video frame received at said first input to generate an encoded video frame based in part on a level of compression corresponding to the quantization parameter value selected by said apparatus; e) an output for releasing the encoded video frame generated by the encoding processor.
26. A method for generating information for use in selecting a quantization parameter value for a video encoding processor, the quantization parameter value corresponding to a certain level of compression of the video encoding processor, said method comprising: a) providing a plurality of video frames representative of types of frames expected to be encoded by the video encoding processor; b) encoding the plurality of video frames for a quantization parameter value selected from a set of quantization parameter values to generate an encoded frame group, the encoded frame group including a plurality of encoded video frames derived using the quantization parameter value; c) deriving an encoded frame size corresponding to the quantization parameter value, the corresponding encoded frame size being derived at least in part based on frame sizes of encoded video frames in the encoded frame group; d) on a computer readable storage medium, storing information mapping the quantization parameter value to its derived corresponding encoded frame size.
27. A method as defined in claim 26, wherein said method comprises repeating steps b), c) and d) for multiple quantization parameter values selected from the set of quantization parameter values.
28. A method as defined in claim 26, wherein deriving the encoded frame size corresponding to the quantization parameter value includes: i. processing the frame sizes of encoded video frames in the encoded frame group to derive a range of frame sizes, the range of frame sizes having an upper limit, the range of frame sizes being such that a certain proportion of encoded frames in the encoded frame group have a frame size falling within said range of frame sizes; ii. deriving the encoded frame size corresponding to the quantization parameter value such that it substantially corresponds to the upper limit of the derived range of frame sizes.
29. A method as defined in claim 28, wherein the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 50%.
30. A method as defined in claim 29, wherein the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 70%.
31. A method as defined in claim 30, wherein the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 90%.
32. A method as defined in claim 31, wherein the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 95%.
33. A method as defined in claim 32, wherein the certain proportion of encoded frames having a frame size falling within the range of frame sizes is at least about 99%.
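The table generation of claims 26 to 33 can be sketched as follows: every representative frame is encoded at each quantization parameter, and the stored size is the upper limit below which a chosen proportion of the encoded frames fall. The nearest-rank percentile rule, the helper names, and the toy encoder are assumptions for illustration; a real implementation would call an actual video encoder.

```python
from typing import Callable, Dict, Sequence

def upper_limit(sizes: Sequence[int], proportion_pct: int) -> int:
    """Smallest observed size with at least proportion_pct percent of sizes at or below it."""
    ordered = sorted(sizes)
    rank = -(-proportion_pct * len(ordered) // 100)  # ceiling division
    return ordered[max(rank, 1) - 1]

def build_size_table(frames: Sequence[bytes], qp_values: Sequence[int],
                     encode: Callable[[bytes, int], int],
                     proportion_pct: int) -> Dict[int, int]:
    """Map each QP to the upper limit of the sizes obtained at that QP."""
    return {
        qp: upper_limit([encode(frame, qp) for frame in frames], proportion_pct)
        for qp in qp_values
    }

def toy_encode(frame: bytes, qp: int) -> int:
    """Stand-in for a real encoder: made-up size that shrinks as QP grows."""
    return len(frame) * (52 - qp)

# Ten hypothetical "frames" of 1 to 10 bytes, evaluated at a 90% proportion.
frames = [bytes(n) for n in range(1, 11)]
table = build_size_table(frames, [50, 51], toy_encode, proportion_pct=90)
print(table)
```

Raising `proportion_pct` toward 99 makes the stored size more conservative, which is the progression that claims 29 to 33 describe.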
34. A method as defined in claim 26, wherein the plurality of video frames includes a first sub-set of frames having a first frame resolution and a second sub-set of frames having a second frame resolution distinct from said first frame resolution, said method comprising: i. processing video frames in the first sub-set of frames to derive a first encoded frame size corresponding to the quantization parameter value and the first frame resolution; ii. processing video frames in the second sub-set of frames to derive a second encoded frame size corresponding to the quantization parameter value and the second frame resolution.
35. A method as defined in claim 26, wherein the plurality of video frames includes a first sequence of frames of a first video sequence type and a second sequence of frames of a second video sequence type distinct from said first video sequence type, said method comprising: i. processing video frames in the first sequence of frames to derive a first encoded frame size corresponding to the quantization parameter value and the first video sequence type; ii. processing video frames in the second sequence of frames to derive a second encoded frame size corresponding to the quantization parameter value and the second video sequence type.
36. A computer readable storage medium storing a data structure for use in selecting a quantization parameter value for a video encoding processor from a set of quantization parameter values, each quantization parameter value in said set corresponding to a respective level of compression of the video encoding processor, wherein said data structure includes a plurality of entries, each entry mapping an encoded frame size to a corresponding quantization parameter value in the set of quantization parameter values.
37. A computer readable storage medium as defined in claim 36, wherein the entries in the data structure are derived at least in part by encoding a plurality of representative video frames and observing statistical trends of resulting encoded video streams.
38. A computer readable storage medium storing a data structure for selecting a quantization parameter for use by a video encoding processor, the quantization parameter value corresponding to a certain level of compression of the video encoding processor, the data structure including a plurality of entries mapping an encoded frame size to a corresponding quantization parameter value, the entries in the data structure being generated in accordance with the method described in any one of claims 26 to 35.
39. An apparatus for generating information for use in selecting a quantization parameter value according to the method described in any one of claims 26 to 35, the quantization parameter value being for use by a video encoding processor encoding a certain video frame and corresponding to a certain level of compression of the video encoding processor.
40. A method for selecting a non-reference frame quantization parameter value for use by a video encoding processor, the non-reference frame quantization parameter value corresponding to a certain level of compression of the video encoding processor, said method comprising: a) receiving target encoded non-reference frame size information conveying a desired frame size for a certain video frame to be encoded as a non-reference frame by the video encoding processor; b) receiving reference frame information associated to a reference frame on which the non-reference frame will be partly based, said reference frame information including; i. a reference frame quantization parameter value associated to the reference frame and corresponding to a level of compression applied to generate the reference frame; ii. reference frame size information conveying a frame size associated to the reference frame; c) selecting the non-reference frame quantization parameter value at least in part based on: i. the target encoded non-reference frame size information; ii. the reference frame quantization parameter value; and iii. the reference frame size information; d) releasing the selected non-reference frame quantization parameter value to the video encoding processor.
41. A method as defined in claim 40, said method comprising providing a data structure providing a mapping between reference frame quantization parameter values and non-reference frame quantization parameter values.
42. A method as defined in claim 41, wherein said data structure comprises a plurality of entries, each entry being associated to a respective combination of: a) a reference frame quantization parameter value; and b) a non-reference frame quantization parameter value.
43. A method as defined in claim 40, wherein the selected non-reference frame quantization parameter value allows the video encoding processor to derive an encoded video frame based on the certain video frame such that the encoded video frame has a frame size tending toward the desired frame size.
44. A method as defined in claim 42, wherein selecting the non-reference frame quantization parameter value comprises: a) identifying a set of entries in the data structure corresponding to the reference frame quantization parameter value received; b) identifying amongst the set of identified entries, an entry corresponding to a combination of the target encoded non-reference frame size information and the reference frame size information.
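The two-stage lookup of claims 40 to 44 can be sketched under one stated assumption: each entry, keyed by a combination of reference and non-reference quantization parameter values, stores the expected ratio of non-reference frame size to reference frame size. The claims do not say what an entry holds, so the ratio, the table values, and the name `select_non_ref_qp` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical entries: (reference QP, non-reference QP) -> expected size ratio.
RATIO_TABLE: Dict[Tuple[int, int], float] = {
    (28, 28): 0.40, (28, 30): 0.30, (28, 32): 0.22, (28, 34): 0.16,
    (32, 32): 0.45, (32, 34): 0.33, (32, 36): 0.25,
}

def select_non_ref_qp(table: Dict[Tuple[int, int], float], ref_qp: int,
                      ref_size: int, target_size: int) -> int:
    """Pick a non-reference QP for a frame predicted from the given reference frame."""
    # Step a): keep only the entries that match the received reference QP.
    candidates = {nqp: ratio for (rqp, nqp), ratio in table.items() if rqp == ref_qp}
    if not candidates:
        raise ValueError(f"no entries for reference QP {ref_qp}")
    # Step b): combine the target size and the reference size into a ratio,
    # then take the lowest QP whose expected ratio does not exceed it.
    target_ratio = target_size / ref_size
    fitting = [nqp for nqp, ratio in candidates.items() if ratio <= target_ratio]
    return min(fitting) if fitting else max(candidates)

print(select_non_ref_qp(RATIO_TABLE, ref_qp=28, ref_size=30000, target_size=7000))
```

In this example the target ratio is about 0.23, so the entry for the combination (28, 32), with an expected ratio of 0.22, is selected.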
45. A method as defined in claim 40, said method comprising selecting said data structure from a set of data structures based in part on a frame resolution associated with the certain video frame, wherein: a) each data structure in said set of data structures is associated to a respective frame resolution; and b) each data structure in said set of data structures provides a respective mapping between reference frame quantization parameter values and non-reference frame quantization parameter values.
46. A method as defined in claim 40, wherein said method comprises selecting said data structure from a set of data structures based in part on a video sequence type associated to the sequence of frames, wherein: a) each data structure in said set of data structures is associated to a respective video sequence type; and b) each data structure in said set of data structures provides a respective mapping between reference frame quantization parameter values and non-reference frame quantization parameter values.
47. A method as defined in claim 46, wherein selecting the data structure comprises processing at least some frames in said sequence of frames to select, from a set of possible video sequence types, the video sequence type associated to the sequence of frames.
48. A method as defined in claim 40, said method comprising: a) processing at least some frames in the sequence of frames to derive actual non-reference frame size information associated to frames in the sequence of frames based on a combination of a non-reference frame quantization parameter value and a reference frame quantization parameter value; b) modifying an entry in the data structure associated with the combination of the non-reference frame quantization parameter value and the reference frame quantization parameter value at least in part based on the actual non-reference frame size information.
49. An apparatus for selecting a non-reference frame quantization parameter value according to the method defined in any one of claims 40 to 48, the non-reference frame quantization parameter value being for use by a video encoding processor, the non-reference frame quantization parameter value corresponding to a certain level of compression of the video encoding processor.
50. A computer readable storage medium storing a program element suitable for execution by a processor for selecting a non-reference frame quantization parameter value for use by a video encoding processor, the non-reference frame quantization parameter value corresponding to a certain level of compression of the video encoding processor, said program element when executing on the processor being operative for implementing the method according to any one of claims 40 to 48.
51. A video encoding system for encoding a video stream including a sequence of frames, said system comprising: a) a first input for receiving a certain video frame originating from a sequence of frames to be encoded; b) a second input for receiving target encoded frame size information conveying a desired frame size for the certain video frame; c) an apparatus in communication with said second input, said apparatus being for selecting a quantization parameter value in accordance with the method described in any one of claims 40 to 48; d) an encoding processor in communication with said first input and with said apparatus, said encoding processor being operative for processing the certain video frame received at said first input to generate an encoded video frame based in part on a level of compression corresponding to the quantization parameter value selected by said apparatus; e) an output for releasing the encoded video frame generated by the encoding processor.
52. An apparatus for selecting a non-reference frame quantization parameter value for use by a video encoding processor, the non-reference frame quantization parameter value corresponding to a certain level of compression of the video encoding processor, said apparatus comprising: a) a first input for receiving target encoded non-reference frame size information conveying a desired frame size for a certain video frame to be encoded as a non-reference frame by the video encoding processor, the certain video frame being part of a sequence of frames; b) a second input for receiving reference frame information associated to a reference frame on which the non-reference frame will be partly based, the reference frame being generated based on a video frame preceding the certain video frame in the sequence of frames, said reference frame information including: i. a reference frame quantization parameter value associated to the reference frame and corresponding to a level of compression applied to generate the reference frame; ii. reference frame size information conveying a frame size associated to the reference frame; c) a processing unit in communication with said first input and said second input, said processing unit being programmed for selecting the non-reference frame quantization parameter value at least in part based on: i. the target encoded non-reference frame size information; ii. the reference frame quantization parameter value; and iii. the reference frame size information; d) an output for releasing the non-reference frame quantization parameter value selected by the processing unit to the video encoding processor.
53. A method for generating information for use in selecting a non-reference frame quantization parameter value from a set of possible non-reference frame quantization parameter values for use by a video encoding processor, each non-reference frame quantization parameter value in said set corresponding to a certain level of compression of the video encoding processor, said method comprising: a) providing a sequence of video frames representative of types of frames expected to be encoded by the video encoding processor; b) encoding the sequence of video frames so as to generate: i. a plurality of reference frames associated with a given reference frame quantization parameter value, the reference frame quantization parameter value corresponding to a level of compression applied to generate the plurality of reference frames; ii. a plurality of non-reference frames, the plurality of non-reference frames being arranged in sets, each set of non-reference frames being associated with a respective non-reference frame quantization parameter value corresponding to a level of compression applied to generate the non-reference frames in the set of non-reference frames; c) deriving a mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values based on the plurality of reference frames and on the plurality of non-reference frames; d) storing the derived mapping in a data structure on a computer readable storage medium.
54. A method as defined in claim 53, said method comprising repeating steps b), c) and d) for each reference frame quantization parameter value from a set of possible reference frame quantization parameter values such as to derive a mapping between each reference frame quantization parameter value in the set of possible reference frame quantization parameter values and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values.
55. A method as defined in claim 53, wherein the sequence of video frames includes a first sub-set of frames having a first frame resolution and a second sub-set of frames having a second frame resolution distinct from said first frame resolution, said method comprising: a) processing video frames in the first sub-set of frames to derive a first mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values; b) processing video frames in the second sub-set of frames to derive a second mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values; c) storing the derived first mapping in a first data structure on the computer readable storage medium in association with the first frame resolution; d) storing the derived second mapping in a second data structure on the computer readable storage medium in association with the second frame resolution.
56. A method as defined in claim 53, wherein the sequence of video frames includes a first sequence of frames of a first video sequence type and a second sequence of frames of a second video sequence type distinct from said first video sequence type, said method comprising: i. processing video frames in the first sequence of frames to derive a first mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values; ii. processing video frames in the second sequence of frames to derive a second mapping between the given reference frame quantization parameter value and each non-reference frame quantization parameter value in the set of possible non-reference frame quantization parameter values; iii. storing the derived first mapping in a first data structure on the computer readable storage medium in association with the first video sequence type; iv. storing the derived second mapping in a second data structure on the computer readable storage medium in association with the second video sequence type.
57. A computer readable storage medium storing a data structure for use in selecting a non-reference frame quantization parameter value for a video encoding processor from a set of quantization parameter values, each quantization parameter value in said set corresponding to a respective level of compression of the video encoding processor, wherein said data structure includes a plurality of sets of entries, each set of entries mapping a reference frame quantization parameter value to a plurality of non-reference frame quantization parameter values.
58. A computer readable storage medium as defined in claim 57, wherein the entries in the data structure are derived at least in part by encoding a plurality of representative video frames and observing statistical trends of resulting encoded video streams.
59. A computer readable storage medium storing a data structure for selecting a quantization parameter for use by a video encoding processor, the quantization parameter value corresponding to a certain level of compression of the video encoding processor, the entries in the data structure being generated in accordance with the method described in any one of claims 53 to 56.
60. An apparatus for generating information for use in selecting a quantization parameter value according to the method described in any one of claims 43 to 56, the quantization parameter value corresponding to a certain level of compression of the video encoding processor and being for use by a video encoding processor encoding a certain video frame.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US6132908P | 2008-06-13 | 2008-06-13 | |
US61/061,329 | 2008-06-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009149564A1 (en) | 2009-12-17 |
Family
ID=41416326
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CA2009/000833 WO2009149564A1 (en) | 2008-06-13 | 2009-06-12 | Method and device for controlling bit-rate for video encoding, video encoding system using same and computer product therefor |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2009149564A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9680901B2 (en) | 2013-03-14 | 2017-06-13 | Openwave Mobility, Inc. | Method, apparatus and non-transitory computer medium for encoding data of a media file |
US9794375B2 (en) | 2013-03-14 | 2017-10-17 | Openwave Mobility, Inc. | Method, apparatus, and non-transitory computer medium for obtaining a required frame size for a compressed data frame |
US10349059B1 (en) | 2018-07-17 | 2019-07-09 | Wowza Media Systems, LLC | Adjusting encoding frame size based on available network bandwidth |
US10356149B2 (en) | 2014-03-13 | 2019-07-16 | Wowza Media Systems, LLC | Adjusting encoding parameters at a mobile device based on a change in available network bandwidth |
WO2020089702A1 (en) * | 2018-10-31 | 2020-05-07 | Ati Technologies Ulc | Content adaptive quantization strength and bitrate modeling |
WO2020103384A1 (en) * | 2018-11-19 | 2020-05-28 | 浙江宇视科技有限公司 | Video encoding method and apparatus, electronic device, and computer readable storage medium |
CN112351276A (en) * | 2020-11-04 | 2021-02-09 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
CN112351277A (en) * | 2020-11-04 | 2021-02-09 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
CN115484470A (en) * | 2021-06-15 | 2022-12-16 | 武汉斗鱼鱼乐网络科技有限公司 | Method, device, medium and computer equipment for improving quality of live broadcast picture |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5535138A (en) * | 1993-11-24 | 1996-07-09 | Intel Corporation | Encoding and decoding video signals using dynamically generated quantization matrices |
US20050036699A1 (en) * | 2003-07-18 | 2005-02-17 | Microsoft Corporation | Adaptive multiple quantization |
US20050180500A1 (en) * | 2001-12-31 | 2005-08-18 | Stmicroelectronics Asia Pacific Pte Ltd | Video encoding |
US20060262847A1 (en) * | 2005-05-17 | 2006-11-23 | Benq Corporation | Method of adaptive encoding video signal and apparatus thereof |
US20070153916A1 (en) * | 2005-12-30 | 2007-07-05 | Sharp Laboratories Of America, Inc. | Wireless video transmission system |
2009-06-12: WO application PCT/CA2009/000833 (WO2009149564A1), active Application Filing
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9794375B2 (en) | 2013-03-14 | 2017-10-17 | Openwave Mobility, Inc. | Method, apparatus, and non-transitory computer medium for obtaining a required frame size for a compressed data frame |
US9680901B2 (en) | 2013-03-14 | 2017-06-13 | Openwave Mobility, Inc. | Method, apparatus and non-transitory computer medium for encoding data of a media file |
US10356149B2 (en) | 2014-03-13 | 2019-07-16 | Wowza Media Systems, LLC | Adjusting encoding parameters at a mobile device based on a change in available network bandwidth |
US10349059B1 (en) | 2018-07-17 | 2019-07-09 | Wowza Media Systems, LLC | Adjusting encoding frame size based on available network bandwidth |
US10560700B1 (en) | 2018-07-17 | 2020-02-11 | Wowza Media Systems, LLC | Adjusting encoding frame size based on available network bandwidth |
US10848766B2 (en) | 2018-07-17 | 2020-11-24 | Wowza Media Systems, LLC | Adjusting encoding frame size based on available network bandwith |
US11368692B2 (en) | 2018-10-31 | 2022-06-21 | Ati Technologies Ulc | Content adaptive quantization strength and bitrate modeling |
WO2020089702A1 (en) * | 2018-10-31 | 2020-05-07 | Ati Technologies Ulc | Content adaptive quantization strength and bitrate modeling |
CN112868230A (en) * | 2018-10-31 | 2021-05-28 | Ati科技无限责任公司 | Content adaptive quantization strength and bit rate modeling |
WO2020103384A1 (en) * | 2018-11-19 | 2020-05-28 | 浙江宇视科技有限公司 | Video encoding method and apparatus, electronic device, and computer readable storage medium |
US11838507B2 (en) | 2018-11-19 | 2023-12-05 | Zhejiang Uniview Technologies Co., Ltd. | Video encoding method and apparatus, electronic device, and computer-readable storage medium |
CN112351276A (en) * | 2020-11-04 | 2021-02-09 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
CN112351277A (en) * | 2020-11-04 | 2021-02-09 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
CN112351277B (en) * | 2020-11-04 | 2024-04-05 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
CN112351276B (en) * | 2020-11-04 | 2024-05-31 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
CN115484470A (en) * | 2021-06-15 | 2022-12-16 | 武汉斗鱼鱼乐网络科技有限公司 | Method, device, medium and computer equipment for improving quality of live broadcast picture |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP5351040B2 (en) | Improved video rate control for video coding standards | |
WO2009149564A1 (en) | Method and device for controlling bit-rate for video encoding, video encoding system using same and computer product therefor | |
JP5180294B2 (en) | Buffer-based rate control that utilizes frame complexity, buffer level, and intra-frame location in video encoding | |
US6526097B1 (en) | Frame-level rate control for plug-in video codecs | |
Chen et al. | Recent advances in rate control for video coding | |
EP1549074A1 (en) | A bit-rate control method and device combined with rate-distortion optimization | |
JP2005236990A (en) | Video codec system equipped with real-time complexity adaptation and region-of-interest coding | |
WO2005076632A2 (en) | Encoder with adaptive rate control for h.264 | |
MXPA05002511A (en) | A method and an apparatus for controlling the rate of a video sequence; a video encoding device. | |
US20090245371A1 (en) | Method and apparatus for encoding/decoding information about intra-prediction mode of video | |
JP2007525063A (en) | How to control multipath video rate to match sliding window channel limit | |
KR101959490B1 (en) | Method for video bit rate control and apparatus thereof | |
WO2002096120A1 (en) | Bit rate control for video compression | |
CN100574442C (en) | Bit rate control method based on image histogram | |
KR20040007818A (en) | Method for controlling DCT computational quantity for encoding motion image and apparatus thereof | |
JP3779066B2 (en) | Video encoding device | |
Benyaminovich et al. | Optimal transrating via dct coefficients modification and dropping | |
Tun et al. | A novel rate control algorithm for the Dirac video codec based upon the quality factor optimization | |
Wang et al. | Bit allocation for scalable video coding of multiple video programs | |
Park et al. | An adaptive quantization using modified QP in H. 264 | |
Tan et al. | Accurate H. 264 rate control with new rate-distortion models | |
He et al. | Optimal Bit Allocation for Low Bit Rate Video Streaming Applications | |
Wu | Rate-distortion based optimal bit allocation for video streaming applications | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 09761222; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 09761222; Country of ref document: EP; Kind code of ref document: A1 |