US20100309976A1 - Method and apparatus for enhancing reference frame selection - Google Patents
- Publication number
- US20100309976A1 (application US12/478,213)
- Authority
- US
- United States
- Prior art keywords
- frame
- histogram
- scene
- intra
- reference frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/87—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
Definitions
- Embodiments of the present invention generally relate to a method and apparatus for enhancing reference frame selection.
- ISP Image Signal Processing
- State-of-the-art video coding methods use block-based approaches to compress frames in a video sequence.
- DCT-based block transforms are easy to implement and, thus, are used frequently in block-based video coding.
- These block-based methods partition frames into blocks (or macroblocks) of possibly variable sizes. Later, the blocks of the current frame are matched with the previously encoded frames (inter-coding). If appropriate match can be found only the difference is usually encoded. Hence, one is capable of only sending motion vector for the location of matched prediction and transformed difference data with fewer coefficients than original image.
- scene cut: a sudden replacement of the current scene by the next scene.
- two consecutive frames would likely have different content. Therefore, such scenes have totally different coding complexity.
- This poses a significant problem because there is no coding continuity between consecutive pictures in a scene cut. Disruption of coding continuity disables inter-coding strategy. Consequently, locating scene cuts and using intra coding for the entire image in these cases improves coding efficiency by providing better bit-allocation, saves from redundant computations spent on motion prediction, and overall consumes less overhead.
- Embodiments of the present invention relate to a method and apparatus for selecting a reference frame for producing an encoded image.
- the method includes retrieving a histogram for a current frame, determining the difference between the histogram and a previous histogram, and calculating adaptive threshold utilizing the determined difference and encoding the frame as intra frame if it is an intra frame, and selecting a reference frame and encoding the frame as non-intra frame if the frame is a non-intra frame.
- a computer readable processor is any medium accessible by a computer for saving, writing, archiving, executing and/or accessing data.
- the method described herein may be coupled to a processing unit, wherein said processing unit is capable of performing the method.
- FIG. 1 is an embodiment of a block diagram of an encoding system
- FIG. 2A is an embodiment of a graph depicting distance metrics for histogram difference
- FIG. 2B is an embodiment of a graph depicting a frame differentiation
- FIG. 3 is an embodiment of a graph depicting a sequence showing variations
- FIG. 4 is an embodiment of a reference frame selection after a scene cut.
- FIG. 5 is a flow diagram depicting an embodiment of a method for a histogram based module.
- detecting scene cuts will help to enhance rate-control performance for bit-budget allocations of each frame.
- An adaptive learning method for extraction of statistics has been implemented for statistical outlier detection.
- Single-frame scene changes are considered as a clear example to study the effect of histogram-based reference frame selection.
- An improved histogram-based cost function is utilized to determine a better reference frame among the previously encoded frames.
- reference frame assessment can help video coder to reduce compression artifacts.
- Locating scene cuts and using intra coding for the entire image improves coding efficiency by providing better bit-allocation, saves from redundant computations spent on block matching and consumes less overhead.
- the reference frame selection is a key element, for example in H.264-like video encoders, because it provides the flexibility to use previous frames as the reference frame candidates.
- Such property of the codec may be utilized in deciding whether a scene-cut frame is the best choice to be a reference frame for the trailing frames.
- such a method may assist the encoder to differentiate whether a scene cut introduces a new and continuous video content or it is just observed in a single frame like the burst of a nearby camera's flash affecting a single frame.
- a histogram-based, low-complexity scene cut detection (SCD) algorithm is used to indicate scene cuts to the video encoder before encoding the current frame and an enhanced algorithm may be used for reference frame selection.
- ISP chip extracts and uses image histogram information for pre-processing image data, for example, Gamma correction.
- the image histograms from ISP chips can be used to determine existence of a scene cut in the current frame before it is encoded. Frame difference between consecutive frames may be used for SCD decision; however, utilizing frame difference requires extra computational power and extra memory-bandwidth resources for loading current and previous frames. Since the dimensionality of the image histograms is much smaller than the frame size, a histogram-based method will not have these restrictions.
- an on-the-fly, adaptive, and low-complexity scene change detection algorithm which uses image histograms of the current and previous frames and a robust scene change detection that utilizes weighting channel histograms.
- one embodiment utilizes histograms of frames to select a new reference frame.
- One embodiment incorporates an adaptive thresholding mechanism that detects statistical outliers in the observations.
- FIG. 1 is an embodiment of a block diagram of an encoding system 100 .
- the encoding system includes a camera lens 102, a charge-coupled device 104, an Image Signal Processing (ISP) unit 106, a video encoder 108, and a histogram-based module 110.
- the histogram-based module 110 includes a scene cut detection (SCD) unit 112 and a picture coding type analysis (PCT) module 114 .
- SCD scene cut detection
- PCT picture coding type analysis
- Image projected on a CCD chip by a camera lens 102 goes first to ISP 106 for pre-processing.
- ISP 106 outputs two data types: the pre-processed image, which is sent to video encoder 108, and the histograms of that image, which go to SCD 112. These histograms can be luminance, chrominance or color histograms. Histograms are combined in histogram-based module 110, where a new histogram is created.
- SCD 112 measures the distance between current and previous histograms. This distance is added to the previously found histogram distance. A threshold value is calculated by an update mechanism and compared against the current measurement. If current measurement exceeds the threshold value, scene cut will be detected and depending on the current encoding settings either that frame will be coded as Intra frame (I-frame) or will remain as it is (as a I- or P-frame). The method of the histogram-based module will be discussed in details in FIG. 5 .
- the first step is to create a new feature vector that can best represent variations in the image.
- a histogram essentially shows the number of occurrences of intensity values in a given image.
- the ISP 106 can provide histograms of different color and illumination component of the observed scene; thus, this utilizes a new weighted histogram feature that combines different characteristics of these channels for SCD problem. For instance, scene changes that include illumination changes can be detected by using Luminance component (Y). However, if a scene cut has dominant color changes rather than illumination, using just histogram of Luminance will not suffice. Thus, following weighting scheme is proposed as a new feature vector for the observed images
- K is the number of channels (luminance, chrominance, color etc.) and M is the number of bins in the histograms.
- a histogram is a histogram obtained utilizing Eq. (1).
- CC Correlation Coefficient
- SAD Sum of Absolute Differences
- SSD Sum of Squared Differences
- Hist_curr: Histogram of current frame; Hist_prev: Histogram of previous frame; Dist(a, b): SSD between vectors a and b.
- a new measure for scene cut is defined as the absolute difference between consecutive histogram distances.
- This metric measures the accumulated variations observed in consecutive histogram distances. The metric is defined as,
- histograms are normalized to have unit sum. Shown in FIG. 2A is the distance between consecutive histograms measured by Eq. (1) and in FIG. 2B the measure ‘Change’ (Eq. (3)) in sequence dlp_352x288_cif.yuv. As shown in FIGS. 2A and 2B, the variation within the first 200 frames is reduced by using the ‘Change’ metric. Also, it is important to note that at the beginning of coding the first N frames have to arrive before an SCD can be signaled to the encoder. Therefore, the chosen N should be small. In this embodiment, N is set to five (5) and the results are reasonable.
- adaptive threshold is utilized that is drawn from learned statistics of the Change metric. It should be noted that any metric may be utilized. Therefore, SAD, CC, SSD or any difference metric between frames or histograms may be used to locate scene change by the following adaptive threshold.
- μi can be called the moving average of the measured observations before the i'th frame.
- σi is the moving standard deviation (or variance).
- k defines the confidence interval.
- Adaptation of the threshold value is accomplished by a simple update procedure for the moving average and moving variance. The update is controlled by a parameter such that the rate of adaptation of the threshold to the observations is managed. Schemes of this kind, which keep the learning and forgetting rates under control, can be implemented as follows:
- $\sigma_i^2 = (1-\alpha_i)\,\sigma_{i-1}^2 + \alpha_i\,\bigl(\mathrm{Change}(i-1)-\mu_i\bigr)^2$
- αi is called the learning rate.
- The learning rate can either be selected as a fixed number or be made adaptive as well, as given by
- $\alpha_i = \eta\bigl(\mathrm{Change}(i)\mid\mu_{i-1},\sigma_{i-1}\bigr)$
- $\eta(x\mid\mu_x,\sigma_x) = \frac{1}{\sqrt{2\pi}\,\sigma_x}\,e^{-\frac{1}{2\sigma_x^2}(x-\mu_x)^2} \qquad (6)$
- the advantage of having an adaptive learning rate is the ability to control adaptively the effect of outlier observations on the moving average and variance. For instance, occurrence of outlier values would be less likely. Thus, the learning rate, α, will have a small value. A small α value affects the update mechanism of Eq. (4).
- T lim limiting threshold
- FIG. 4 is an embodiment of a reference frame selection after a scene cut.
- FIG. 4 shows such a case, where the illumination changes in just one frame because of a nearby camera's flash. In this case, only the scene cut frame disturbs the continuity of video coding. Fortunately, H.264-like video coding strategies enable one to control the reference frame for the current frame. Therefore, the intra-coded scene cut frame is not required as the reference frame for the frames trailing it. Note that in FIG. 4, after the scene cut frame, we do not detect another scene cut because of the limit in Eq. (6) and Eq. (7) on the number of consecutive scene cuts.
- FIG. 5 is a flow diagram depicting an embodiment of a method 500 for a histogram based module.
- the method 500 starts at step 502 and proceeds to step 504 .
- the method 500 retrieves a current histogram.
- the method 500 determines the distance difference between the current histogram and the previous histogram.
- the method 500 calculates the adaptive threshold.
- the method 500 determines the picture coding type.
- the method 500 determines if the frame is an Intra frame (I-frame). If the frame is an I-frame, the method 500 proceeds to step 514 , wherein the method 500 encodes it as an I-frame for bit-allocation.
- I-frame Intra frame
- the method 500 proceeds to step 516 .
- the method 500 selects a reference frame and proceeds to step 518 .
- the method 500 encodes the frame as a non I-frame. From steps 514 and 518 , the method 500 proceeds and ends at step 516 .
- the proposed method and apparatus use image histograms that may be from ISP chip and since histogram has much smaller dimensionality compared to a frame, the proposed method and apparatus are low in complexity and do not introduce delay. Consequently, a fast, on-the-fly decision about the existence of a scene cut and reference frame selection for the current frame is made, without using extra memory-bandwidth.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A method and apparatus for selecting a reference frame for producing an encoded image. The method includes retrieving a histogram for a current frame, determining the difference between the histogram and a previous histogram, and calculating adaptive threshold utilizing the determined difference and encoding the frame as intra frame if it is an intra frame, and selecting a reference frame and encoding the frame as non-intra frame if the frame is a non-intra frame.
Description
- 1. Field of the Invention
- Embodiments of the present invention generally relate to a method and apparatus for enhancing reference frame selection.
- Most portable devices with video applications require on-chip video encoders that can process video sequences on-the-fly. An acquired image of a scene, which is projected on CCD chip by camera lens, is pre-processed by an Image Signal Processing (ISP) chip before video encoding.
- State-of-the-art video coding methods use block-based approaches to compress frames in a video sequence. DCT-based block transforms are easy to implement and, thus, are used frequently in block-based video coding. These block-based methods partition frames into blocks (or macroblocks) of possibly variable sizes. The blocks of the current frame are then matched against previously encoded frames (inter-coding). If an appropriate match is found, usually only the difference is encoded. Hence, the encoder need only send a motion vector locating the matched prediction, plus transformed difference data with fewer coefficients than the original image.
- However, when no match is found for a particular block, that block will be encoded by intra-coding. With intra-coding, no relative information from previous frames is used. As a result, such a block will be encoded by its own information. Occurrence of many intra-coded blocks in an inter-coded frame (Inter frame) actually signals a significant change of the content in the video sequence.
- Among various kinds of scene changes, a sudden replacement of the current scene by the next scene is called a scene cut. At a scene cut, two consecutive frames are likely to have different content. Therefore, the scenes on either side of the cut can have totally different coding complexity. This poses a significant problem because there is no coding continuity between consecutive pictures in a scene cut. Disruption of coding continuity disables the inter-coding strategy. Consequently, locating scene cuts and using intra coding for the entire image in these cases improves coding efficiency by providing better bit-allocation, saves redundant computations spent on motion prediction, and overall consumes less overhead.
- Traditionally, frame difference information between frames is used to assess presence of a scene cut in the observed data or for reference frame selection. However, there are some disadvantages of such an implementation. First, taking frame differences requires additional computations, and second, sufficient memory-bandwidth is needed to read frames. Moreover, these procedures introduce time-delay to the entire process.
- Therefore, there is a need for an improved method and apparatus for detecting scene cuts and reference frame selection.
- Embodiments of the present invention relate to a method and apparatus for selecting a reference frame for producing an encoded image. The method includes retrieving a histogram for a current frame, determining the difference between the histogram and a previous histogram, and calculating adaptive threshold utilizing the determined difference and encoding the frame as intra frame if it is an intra frame, and selecting a reference frame and encoding the frame as non-intra frame if the frame is a non-intra frame.
- So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments. In this application, a computer readable processor is any medium accessible by a computer for saving, writing, archiving, executing and/or accessing data. Furthermore, the method described herein may be coupled to a processing unit, wherein said processing unit is capable of performing the method.
- FIG. 1 is an embodiment of a block diagram of an encoding system;
- FIG. 2A is an embodiment of a graph depicting distance metrics for histogram difference;
- FIG. 2B is an embodiment of a graph depicting a frame differentiation;
- FIG. 3 is an embodiment of a graph depicting a sequence showing variations;
- FIG. 4 is an embodiment of a reference frame selection after a scene cut; and
- FIG. 5 is a flow diagram depicting an embodiment of a method for a histogram based module.
- Sudden scene changes in video sequences challenge both video quality and bit-rate during the video encoding process. If a scene change occurs at a frame that is intended to be coded as an inter frame, most of the macroblocks in that frame will be intra coded, which will reduce coding efficiency. Therefore, deciding on intra frame coding at scene cuts will both save the unnecessary computations done for prediction in inter frame coding and improve the visual quality of the encoded video.
- Detecting scene cuts also helps to enhance rate-control performance for bit-budget allocations of each frame. Thus, histograms of the current and previous frames are used to make the scene change decision. An adaptive learning method for extraction of statistics has been implemented for statistical outlier detection. Moreover, by utilizing histograms for detecting scene cuts and considering the flexible reference frame selection scheme in video codecs like H.264, one can select a better reference frame for each frame from the previously encoded K frames using correlations between histograms.
- Single-frame scene changes are considered as a clear example to study the effect of histogram-based reference frame selection. Briefly, there are cases when the continuity of the content of a video is disturbed by a single frame, such as a sudden burst of a nearby camera's flash. Detection of these frames plays an important role in reference frame selection. An improved histogram-based cost function is utilized to determine a better reference frame among the previously encoded frames. Combined with the scene cut detection algorithm, reference frame assessment can help the video coder to reduce compression artifacts.
- Locating scene cuts and using intra coding for the entire image improves coding efficiency by providing better bit-allocation, saves from redundant computations spent on block matching and consumes less overhead. In order to increase coding efficiency, the reference frame selection is a key element, for example in H.264-like video encoders, because it provides the flexibility to use previous frames as the reference frame candidates. Such property of the codec may be utilized in deciding whether a scene-cut frame is the best choice to be a reference frame for the trailing frames. Moreover, such a method may assist the encoder to differentiate whether a scene cut introduces a new and continuous video content or it is just observed in a single frame like the burst of a nearby camera's flash affecting a single frame. Thus, a histogram-based, low-complexity scene cut detection (SCD) algorithm is used to indicate scene cuts to the video encoder before encoding the current frame and an enhanced algorithm may be used for reference frame selection.
- In video cameras, the acquired image of a scene is projected on a CCD chip by the camera lens and passes through pre-processing done by an Image Signal Processing (ISP) chip before the video encoding step. Thus, making readily available data from the ISP chip accessible to the video encoder improves the visual quality of the encoded video sequence by signaling scene cut information.
- ISP chip extracts and uses image histogram information for pre-processing image data, for example, Gamma correction. The image histograms from ISP chips can be used to determine existence of a scene cut in the current frame before it is encoded. Frame difference between consecutive frames may be used for SCD decision; however, utilizing frame difference requires extra computational power and extra memory-bandwidth resources for loading current and previous frames. Since the dimensionality of the image histograms is much smaller than the frame size, a histogram-based method will not have these restrictions.
- As such, described herein is an on-the-fly, adaptive, and low-complexity scene change detection algorithm which uses image histograms of the current and previous frames and a robust scene change detection that utilizes weighting channel histograms. Hence, one embodiment utilizes histograms of frames to select a new reference frame. One embodiment incorporates an adaptive thresholding mechanism that detects statistical outliers in the observations.
- FIG. 1 is an embodiment of a block diagram of an encoding system 100. The encoding system 100 includes a camera lens 102, a charge-coupled device (CCD) 104, an Image Signal Processing (ISP) unit 106, a video encoder 108, and a histogram-based module 110. The histogram-based module 110 includes a scene cut detection (SCD) unit 112 and a picture coding type analysis (PCT) module 114.
- An image projected on a CCD chip by the camera lens 102 goes first to ISP 106 for pre-processing. Usually, the only Inter frame coding type is P. ISP 106 outputs two data types: the pre-processed image, which is sent to video encoder 108, and the histograms of that image, which go to SCD 112. These histograms can be luminance, chrominance or color histograms. Histograms are combined in histogram-based module 110, where a new histogram is created.
- Next, SCD 112 measures the distance between the current and previous histograms. This distance is added to the previously found histogram distance. A threshold value is calculated by an update mechanism and compared against the current measurement. If the current measurement exceeds the threshold value, a scene cut is detected and, depending on the current encoding settings, that frame will either be coded as an Intra frame (I-frame) or remain as it is (as an I- or P-frame). The method of the histogram-based module is discussed in detail with reference to FIG. 5.
- The first step is to create a new feature vector that can best represent variations in the image. A histogram essentially shows the number of occurrences of intensity values in a given image. The ISP 106 can provide histograms of different color and illumination components of the observed scene; thus, this embodiment utilizes a new weighted histogram feature that combines different characteristics of these channels for the SCD problem. For instance, scene changes that include illumination changes can be detected by using the luminance component (Y). However, if a scene cut has dominant color changes rather than illumination changes, using just the luminance histogram will not suffice. Thus, the following weighting scheme is proposed as a new feature vector for the observed images
- where K is the number of channels (luminance, chrominance, color etc.) and M is the number of bins in the histograms. Throughout the document, a histogram is a histogram obtained utilizing Eq. (1).
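Eq. (1) is not reproduced in this text, so the following is only a minimal sketch of one way K channel histograms of M bins each could be merged into a single feature vector. It assumes a convex weighted sum of unit-sum channel histograms; the function name `combine_histograms` and the choice of weights are hypothetical and are not taken from the patent.

```python
import numpy as np

def combine_histograms(channel_hists, weights):
    """Merge K per-channel histograms (M bins each) into one M-bin vector.

    ASSUMPTION: a convex weighted sum; the patent's Eq. (1) may differ.
    """
    hists = np.asarray(channel_hists, dtype=float)   # shape (K, M)
    w = np.asarray(weights, dtype=float)             # shape (K,)
    if hists.ndim != 2 or w.shape != (hists.shape[0],):
        raise ValueError("expected K histograms of M bins and K weights")
    # Normalize each channel to unit sum so channels are comparable.
    sums = hists.sum(axis=1, keepdims=True)
    hists = hists / np.where(sums == 0, 1.0, sums)
    # Normalize the weights so the combined histogram also has unit sum.
    w = w / w.sum()
    return w @ hists                                 # shape (M,)
```

With equal weights, a luminance histogram `[1, 1]` and a chrominance histogram `[0, 2]` combine to `[0.25, 0.75]`.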
- Next is determining the distance metric that enables robust differentiation between histograms. There are three possible distance metrics that can be used to measure the difference between histograms: Correlation Coefficient (CC), Sum of Absolute Differences (SAD) and Sum of Squared Differences (SSD). For two vectors v1 and v2, these metrics can be written as
-
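The three candidate metrics can be sketched with their textbook definitions. The patent's own formulas are not reproduced in this text, so the Pearson form of the correlation coefficient used here is an assumption, as are the function names.

```python
import numpy as np

def sad(v1, v2):
    """Sum of Absolute Differences between two vectors."""
    d = np.asarray(v1, dtype=float) - np.asarray(v2, dtype=float)
    return float(np.abs(d).sum())

def ssd(v1, v2):
    """Sum of Squared Differences between two vectors."""
    d = np.asarray(v1, dtype=float) - np.asarray(v2, dtype=float)
    return float(np.dot(d, d))

def cc(v1, v2):
    """Correlation coefficient (ASSUMED to be the Pearson form)."""
    a = np.asarray(v1, dtype=float)
    b = np.asarray(v2, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Constant vectors have no defined correlation; return 0.0 by convention here.
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0
```

Note that SAD and SSD grow with dissimilarity, while CC shrinks, so a CC-based detector would threshold in the opposite direction.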
- SSD between histograms of consecutive frames exhibits more stable statistical characteristics than any other candidate distance metrics. If each histogram is defined as a vector in M dimensional space where M is the number of bins, we have the following formulation as the distance metric between two histograms,
-
- In one embodiment, a new measure for scene cut is defined as the absolute difference between consecutive histogram distances. This metric, called ‘Change’ measures the accumulated variations observed in consecutive histogram distances. The metric is defined as,
-
- Change(k): Metric that measures the dissimilarity between current frame and previous N frames.
- Hist[k]: Histogram of kth frame
- Dist(a,b): Euclidean Distance between vectors a and b.
-
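Eq. (3) is not reproduced in this text. The sketch below implements one possible reading of the prose description: the absolute differences between consecutive histogram distances, accumulated over a window of N steps. This reading, the helper names and the indexing are all assumptions, not the patent's confirmed formula.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two histograms."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.dot(d, d))

def change_metric(hists, k, n_window=1):
    """ASSUMED reading of 'Change' at frame k.

    Sums |D(j) - D(j-1)| over the last n_window steps, where
    D(j) = ssd(hists[j], hists[j-1]) is the consecutive histogram distance.
    """
    if k - n_window - 1 < 0:
        raise ValueError("need at least n_window + 2 histograms up to frame k")

    def dist(j):
        return ssd(hists[j], hists[j - 1])

    return sum(abs(dist(k - n) - dist(k - n - 1)) for n in range(n_window))
```

For a static sequence every distance is zero and the metric stays at zero; a sudden histogram change at frame k makes the metric jump at that frame.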
- In one embodiment, histograms are normalized to have unit sum. Shown in FIG. 2A is the distance between consecutive histograms measured by Eq. (1) and in FIG. 2B the measure ‘Change’ (Eq. (3)) in sequence dlp_352x288_cif.yuv. As shown in FIGS. 2A and 2B, the variation within the first 200 frames is reduced by using the ‘Change’ metric. Also, it is important to note that at the beginning of coding the first N frames have to arrive before an SCD can be signaled to the encoder. Therefore, the chosen N should be small. In this embodiment, N is set to five (5) and the results are reasonable.
- After measuring the dissimilarity between consecutive frames, a threshold is utilized to separate scene cuts from the regular flow of video content. Therefore, having a statistically stable metric for frame differentiation makes sense. Nevertheless, applying a fixed threshold usually may not work. In one embodiment, an adaptive threshold is utilized that is drawn from learned statistics of the Change metric. It should be noted that any metric may be utilized. Therefore, SAD, CC, SSD or any difference metric between frames or histograms may be used to locate a scene change by the following adaptive threshold.
- Decision for a scene cut for the i'th frame can be given as follows;
-
- where label ‘1’ denotes a scene change at that frame. μi is the moving average of the measured observations before the i'th frame. Similarly, σi is the moving standard deviation (σi² being the moving variance). Here, k defines the confidence interval. Adaptation of the threshold value is accomplished by a simple update procedure for the moving average and moving variance. The update is controlled by a parameter such that the rate of adaptation of the threshold to the observations is managed. Schemes of this kind, which keep the learning and forgetting rates under control, can be implemented as follows:
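- $$\mu_i = (1-\alpha_i)\,\mu_{i-1} + \alpha_i\,\mathrm{Change}(i-1) \qquad (4)$$
- $$\sigma_i^2 = (1-\alpha_i)\,\sigma_{i-1}^2 + \alpha_i\,\bigl(\mathrm{Change}(i-1)-\mu_i\bigr)^2$$
- where αi is called the learning rate. The learning rate can either be selected as a fixed number or be made adaptive as well, as given by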
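- $$\alpha_i = \eta\bigl(\mathrm{Change}(i)\mid\mu_{i-1},\sigma_{i-1}\bigr) \qquad (5)$$
- where η is a Gaussian distribution with the given mean and variance,
- $$\eta(x\mid\mu_x,\sigma_x) = \frac{1}{\sqrt{2\pi}\,\sigma_x}\,e^{-\frac{1}{2\sigma_x^2}(x-\mu_x)^2} \qquad (6)$$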
- The advantage of having an adaptive learning rate is the ability to control adaptively the effect of outlier observations on the moving average and variance. For instance, an outlier value has a low likelihood under the Gaussian of Eq. (5). Thus, the learning rate, α, will have a small value. A small α value affects the update mechanism of Eq. (4).
- Note that in such a case the current observation (i.e., Change(i)) will have less influence on the moving average and moving variance. On the other hand, if too many outliers are detected successively, these values pull up the moving average, which actually signals a change in the statistics of the observations. Essentially, a small α value slows down the adaptation of the threshold. Hence, the threshold learns and forgets the observations slowly. Conversely, the threshold adapts faster to the observations if the α value is large.
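The adaptive learning rate of Eq. (5) can be sketched as a Gaussian likelihood of the current observation under the previous moving statistics. Two safeguards below are assumptions added for the sketch and are not stated in the patent: a small floor on the standard deviation to avoid division by zero, and clipping the rate to at most one, since a Gaussian density can exceed one when the standard deviation is small.

```python
import math

def gaussian_pdf(x, mean, std):
    """Gaussian density with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2)) / (math.sqrt(2.0 * math.pi) * std)

def adaptive_learning_rate(change_i, mean_prev, std_prev, eps=1e-6):
    """Learning rate as the likelihood of Change(i) under the previous statistics.

    ASSUMPTIONS: std is floored at eps, and the result is clipped to 1.0.
    """
    std = max(std_prev, eps)
    return min(1.0, gaussian_pdf(change_i, mean_prev, std))
```

An observation far from the moving average yields a rate near zero, so an outlier barely moves the threshold, which matches the behavior described above.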
- The use of an adaptive learning rate and the choice of the N value in Eq. (3) are closely related. If N is greater than one, a scene change will affect subsequent observations of the measure defined in Eq. (3) while the threshold remains unchanged, and a false detection would occur. Therefore, if an adaptive learning rate is used, N has to be kept at one (1). On the other hand, in some applications evaluating Eq. (5), or keeping a look-up table for it, would be undesirable; in such cases, the learning rate can be fixed to a value between zero (0) and one (1). Adaptation gets faster as the learning rate gets closer to one. In our experiments, we adopt 0.6 as the learning rate of the algorithm for such cases.
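A minimal sketch of the fixed-learning-rate case, assuming a list of precomputed Change values, is given below. The moving average is updated in the usual exponential moving-average form. The function name `detect_scene_cuts`, the value `k=3.0`, the initial variance `init_var`, and the choice to initialize the moving average from the first observation are all assumptions of this sketch; only the learning rate of 0.6 comes from the text.

```python
import math

def detect_scene_cuts(changes, k=3.0, alpha=0.6, init_var=0.25):
    """Label each Change observation as scene cut (1) or not (0).

    The statistics used for frame i are learned only from observations
    before frame i, so the decision is made first and the update follows.
    """
    if not changes:
        return []
    mean, var = changes[0], init_var  # assumed initialization
    labels = [0]                      # first frame only seeds the statistics
    for change in changes[1:]:
        threshold = mean + k * math.sqrt(var)
        labels.append(1 if change > threshold else 0)
        # Moving-average and moving-variance updates with a fixed rate.
        mean = (1.0 - alpha) * mean + alpha * change
        var = (1.0 - alpha) * var + alpha * (change - mean) ** 2
    return labels
```

For example, `detect_scene_cuts([1.0, 1.1, 0.9, 1.0, 1.05, 5.0, 1.0])` flags only the sixth observation. Because the spike also inflates the moving variance, the threshold rises right after it and then decays at a pace set by `alpha`.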
- Another consideration that we address is limiting the number of consecutive Intra frames. A cluster of consecutive Intra frames would increase the bit-rate, so we limit the number of consecutive frames labeled as scene cuts. This can be accomplished by enforcing an additional constraint of a single scene cut in L frames; the equation is as follows:
-
- In the case of a static scene, where little changes in the content, the proposed Change metric tends toward zero (0) as time goes on. Therefore, even a subtle change in color or illumination may be detected as a scene cut, because the threshold value (μi+k·σi) approaches zero (0). To reduce such false detections, a bottom limit for the Change metric is used in this work. This bottom limit, called the limiting threshold (Tlim), is a constant obtained from experimental observations. In our experiments, Tlim is set to 0.01*N, where N is defined in Eq. (3). The final decision is given by the following formulation:
-
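Since the formulas for the two constraints are not reproduced in this text, the following Python fragment is only one plausible reading of them: a scene cut is signaled when the Change value exceeds both the adaptive threshold and the limiting threshold Tlim, and no cut was signaled within the last L frames. The function name `final_decision` and its parameter names are hypothetical.

```python
def final_decision(change, adaptive_threshold, t_lim, frames_since_last_cut, min_gap):
    """Return 1 for a scene cut, 0 otherwise.

    change:                Change value of the current frame
    adaptive_threshold:    mu_i + k * sigma_i for the current frame
    t_lim:                 limiting threshold (0.01 * N in the text)
    frames_since_last_cut: frames elapsed since the last signaled cut
    min_gap:               L, the window that may hold only one cut
    """
    # Assumed form of the spacing constraint: at most one cut per L frames.
    if frames_since_last_cut < min_gap:
        return 0
    # Assumed form of the floor: the effective threshold never drops below t_lim.
    return 1 if change > max(adaptive_threshold, t_lim) else 0
```

With N = 5, Tlim is 0.05, so a Change value of 0.02 in a static scene is ignored even when the adaptive threshold has decayed to zero.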
- Using only image histograms for the SCD decision has several disadvantages. Primarily, two totally different images can have very similar histograms. Moreover, even if the content of two images is the same, changes in global or local illumination might hinder an accurate SCD decision. Detecting scene changes created artificially by fade-ins and fade-outs is also challenging because of the smooth transitions between consecutive histograms. Although these issues pose a difficult problem in general, the proposed method provides detection performance sufficient to improve video quality.
- Another important factor that affects video coding quality is the selection of the reference frame, i.e., for H.264-like video coders that provide such flexibility. The proposed solution for determining the best reference frame among the previously encoded frames uses frame histograms. The general case of locating the reference frame is explained with a single-frame scene cut example. Locating a single-frame scene cut before encoding with the SCD algorithm solves the first part of the video coding efficiency problem by encoding that frame as an I-frame for the given bit-allocation. However, the frame following the scene cut will have less correlation with the frame labeled as the scene cut than with the frames preceding it.
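The histogram-based reference selection described above can be sketched as picking, among the previously encoded frames, the one whose histogram is closest to that of the current frame. This is a hedged illustration: the sum of absolute bin differences stands in for the distance of Eq. (1), and the names `histogram_distance` and `select_reference` are hypothetical.

```python
def histogram_distance(h1, h2):
    """Sum of absolute bin differences (an assumed stand-in for Eq. (1))."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def select_reference(current_hist, candidates):
    """Pick the previously encoded frame with the closest histogram.

    candidates: non-empty list of (frame_index, histogram) pairs.
    Returns the frame_index of the best match.
    """
    best_index, _ = min(
        candidates,
        key=lambda item: histogram_distance(current_hist, item[1]),
    )
    return best_index
```

For a frame that follows a single-frame scene cut, the frame before the cut would typically win over the intra-coded cut frame, because its histogram is closer to that of the current frame.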
-
FIG. 4 is an embodiment of a reference frame selection after a scene cut. FIG. 4 shows such a case, in which the illumination changes in just one frame because of a nearby camera's flash. In this case, only the scene cut frame disturbs the continuity of video coding. Fortunately, H.264-like video coding strategies enable one to control the reference frame for the current frame. Therefore, the intra-coded scene cut frame is not required as the reference frame for the frames trailing it. Note that in FIG. 4, after the scene cut frame, we do not detect another scene cut because of the limit in Eq. (6) and Eq. (7) on the number of consecutive scene cuts.
-
FIG. 5 is a flow diagram depicting an embodiment of a method 500 for a histogram-based module. The method 500 starts at step 502 and proceeds to step 504. At step 504, the method 500 retrieves a current histogram. At step 506, the method 500 determines the distance between the current histogram and the previous histogram. At step 508, the method 500 calculates the adaptive threshold. At step 510, the method 500 determines the picture coding type. At step 512, the method 500 determines whether the frame is an Intra frame (I-frame). If the frame is an I-frame, the method 500 proceeds to step 514, wherein the method 500 encodes it as an I-frame for bit-allocation. Otherwise, the method 500 proceeds to step 516. At step 516, the method 500 selects a reference frame and proceeds to step 518. At step 518, the method 500 encodes the frame as a non-I-frame. From the encoding steps, the method 500 proceeds to the end step and ends.
- Since the proposed method and apparatus use image histograms, which may come from an ISP chip, and since a histogram has much smaller dimensionality than a frame, the proposed method and apparatus are low in complexity and do not introduce delay. Consequently, a fast, on-the-fly decision about the existence of a scene cut and the reference frame selection for the current frame is made without using extra memory bandwidth.
- While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims (4)
1. A method for selecting a reference frame for producing an encoded image, comprising:
retrieving a histogram for a current frame;
determining the difference between the histogram and a previous histogram; and
calculating adaptive threshold utilizing the determined difference and encoding the frame as intra frame if it is an intra frame, and selecting a reference frame and encoding the frame as non-intra frame if the frame is a non-intra frame.
2. The method of claim 1, wherein the threshold is determined by an adaptive thresholding mechanism that detects statistical outliers in the observations.
3. An apparatus for selecting a reference frame for producing an encoded image, comprising:
means for retrieving a histogram for a current frame;
means for determining the difference between the histogram and a previous histogram; and
means for calculating adaptive threshold utilizing the determined difference and means for encoding the frame as intra frame if it is an intra frame, and means for selecting a reference frame and means for encoding the frame as non-intra frame if the frame is a non-intra frame.
4. The apparatus of claim 3, wherein the threshold is determined by an adaptive thresholding mechanism that detects statistical outliers in the observations.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/478,213 US20100309976A1 (en) | 2009-06-04 | 2009-06-04 | Method and apparatus for enhancing reference frame selection |
Publications (1)
Publication Number | Publication Date |
---|---|
US20100309976A1 (en) | 2010-12-09 |
Family
ID=43300730
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/478,213 Abandoned US20100309976A1 (en) | 2009-06-04 | 2009-06-04 | Method and apparatus for enhancing reference frame selection |
Country Status (1)
Country | Link |
---|---|
US (1) | US20100309976A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110051010A1 (en) * | 2009-08-27 | 2011-03-03 | Rami Jiossy | Encoding Video Using Scene Change Detection |
US20140301486A1 (en) * | 2011-11-25 | 2014-10-09 | Thomson Licensing | Video quality assessment considering scene cut artifacts |
US20150288973A1 (en) * | 2012-07-06 | 2015-10-08 | Intellectual Discovery Co., Ltd. | Method and device for searching for image |
CN112351278A (en) * | 2020-11-04 | 2021-02-09 | 北京金山云网络技术有限公司 | Video encoding method and device and video decoding method and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5592226A (en) * | 1994-01-26 | 1997-01-07 | Btg Usa Inc. | Method and apparatus for video data compression using temporally adaptive motion interpolation |
US6738099B2 (en) * | 2001-02-16 | 2004-05-18 | Tektronix, Inc. | Robust camera motion estimation for video sequences |
US7751473B2 (en) * | 2000-05-15 | 2010-07-06 | Nokia Corporation | Video coding |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR100901904B1 (en) | Video content understanding through real time video motion analysis | |
US6834080B1 (en) | Video encoding method and video encoding apparatus | |
US20180139456A1 (en) | Analytics-modulated coding of surveillance video | |
JP5969389B2 (en) | Object recognition video coding strategy | |
US8036263B2 (en) | Selecting key frames from video frames | |
CN111670580B (en) | Progressive compressed domain computer vision and deep learning system | |
CN101072342B (en) | Situation switching detection method and its detection system | |
US7986847B2 (en) | Digital video camera with a moving image encoding feature and control method therefor, that selectively store decoded images as candidate reference images | |
US7551234B2 (en) | Method and apparatus for estimating shot boundaries in a digital video sequence | |
CN109104609B (en) | Shot boundary detection method fusing HEVC (high efficiency video coding) compression domain and pixel domain | |
US20100322300A1 (en) | Method and apparatus for adaptive feature of interest color model parameters estimation | |
US20100303150A1 (en) | System and method for cartoon compression | |
US20100302453A1 (en) | Detection of gradual transitions in video sequences | |
CN101352029A (en) | Randomly sub-sampled partition voting(RSVP) algorithm for scene change detection | |
KR20060075204A (en) | The apparatus for detecting the homogeneous region in the image using the adaptive threshold value | |
US8421928B2 (en) | System and method for detecting scene change | |
US20200380290A1 (en) | Machine learning-based prediction of precise perceptual video quality | |
US20100309976A1 (en) | Method and apparatus for enhancing reference frame selection | |
US20130155228A1 (en) | Moving object detection method and apparatus based on compressed domain | |
JP3714871B2 (en) | Method for detecting transitions in a sampled digital video sequence | |
WO2020248715A1 (en) | Coding management method and apparatus based on high efficiency video coding | |
KR20060132977A (en) | Video processing method and corresponding encoding device | |
CN109769120B (en) | Method, apparatus, device and medium for determining skip coding mode based on video content | |
JP2002281508A (en) | Skip area detection type moving image encoder and recording medium | |
KR20020040503A (en) | Shot detecting method of video stream |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: TEXAS INSTRUMENTS INCORPORATED, TEXAS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SEZER, OSMAN G.; ZHOU, MINHUA; REEL/FRAME: 022780/0862. Effective date: 20090601 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |