US10230955B2 - Method and system to detect and utilize attributes of frames in video sequences - Google Patents
- Publication number: US10230955B2
- Authority
- US
- United States
- Legal status (assumed, not a legal conclusion): Active, expires
Classifications (all under H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals; H04N19/10 — using adaptive coding)
- H04N19/142 — Detection of scene cut or scene change
- H04N19/115 — Selection of the code volume for a coding unit prior to coding
- H04N19/172 — the coding unit being a picture, frame or field
- H04N19/176 — the coding unit being a block, e.g. a macroblock
- H04N19/179 — the coding unit being a scene or a shot
- H04N19/182 — the coding unit being a pixel
Definitions
- the present invention relates to the field of video segmentation, and in particular, relates to methods and systems that detect and utilize attributes of frames in video sequences.
- This video data can be in the form of commercial DVDs (digital versatile disks) and VCDs (video compact discs), personal camcorder recordings, off-air recordings onto HDD and DVR systems, video downloads on a personal computer, mobile phone, PDA or portable player, and the like.
- new automatic video management technologies are being developed that allow users efficient access to their video content and functionalities such as video categorization, summarization, searching and the like.
- The first step of the analysis is segmentation of the video into its constituent shots.
- a shot can be defined as a sequence of video frames obtained by one camera without being interrupted.
- a video is generally composed of many connected shots, and various video editing effects are used to connect them. For example, an hour of a TV program typically contains 1000 shots.
- the video editing effects include an abrupt shot transition and a gradual shot transition.
- the abrupt shot transition (generally referred to as a hard cut) is a technique in which the current picture changes abruptly into another picture.
- the gradual shot transition is a technique in which a picture changes gradually into another picture, as in fades, dissolves, wipes and other special effects.
- a common example of the gradual shot transition is the fade, whereby the intensity of a shot gradually drops, ending at a black monochromatic frame (fade-out), or the intensity of a black monochromatic frame gradually increases until the actual shot becomes visible at its normal intensity (fade-in). Fades to and from black are more common, but fades involving monochromatic frames of other colors are also used.
- Another example of the gradual shot transition is the dissolve, which can be envisaged as a combined fade-out and fade-in. The dissolve involves two shots, overlapping for a number of frames, during which time the first shot gradually dims and the second shot becomes gradually more distinct.
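The fade and dissolve described above can be expressed directly as pixel operations. The following is a minimal sketch (function names and the 2×2 test frames are illustrative, not from the patent): a fade-out scales intensity toward zero, and a dissolve is a cross-fade between two overlapping shots.

```python
import numpy as np

def fade_out(frame, t):
    """Linearly scale intensity toward black; t runs from 0 (full) to 1 (black)."""
    return (frame.astype(np.float32) * (1.0 - t)).astype(np.uint8)

def dissolve(frame_a, frame_b, t):
    """Cross-fade: frame A dims while frame B appears; t runs from 0 to 1."""
    blended = (1.0 - t) * frame_a.astype(np.float32) + t * frame_b.astype(np.float32)
    return blended.astype(np.uint8)

# A 2x2 example: halfway through a dissolve, pixels are the average of both shots.
a = np.full((2, 2), 200, dtype=np.uint8)
b = np.full((2, 2), 100, dtype=np.uint8)
print(dissolve(a, b, 0.5))  # all pixels 150
```

A fade-in is the same operation as `fade_out` with `t` run in reverse, starting from the monochromatic frame.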
- various methods and systems are used for identifying and utilizing different frame characteristics of a video sequence, such as scene change, fade-in, fade-out and dissolve.
- Some of the present methods and systems use comparisons of segmentation mask maps between two successive video frames.
- an object-tracking technique is employed as a complement to handle situations of scene rotation without any extra overhead.
- Some methods and systems use a two-phase reject-to-refine strategy. According to this strategy, the frames are first tested against mean absolute frame differences (MAFD) with a relaxed threshold. Then, the surviving frames are further examined by combined metrics of signed difference of mean absolute frame difference (SDMAFD), absolute difference of frame variance (ADFV), and MAFD computed after a normalization step such as histogram equalization.
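The reject phase of such a strategy can be sketched as follows. This is a minimal illustration, not the patent's method: the MAFD definition is standard, but the threshold value and function names are assumptions.

```python
import numpy as np

def mafd(prev, curr):
    """Mean absolute frame difference over pixel values."""
    return np.mean(np.abs(curr.astype(np.int16) - prev.astype(np.int16)))

def reject_phase(frames, relaxed_threshold=10.0):
    """Phase one of a reject-to-refine strategy: keep only frame indices
    whose MAFD against the previous frame exceeds a relaxed threshold.
    (The threshold value here is illustrative.)"""
    candidates = []
    for i in range(1, len(frames)):
        if mafd(frames[i - 1], frames[i]) > relaxed_threshold:
            candidates.append(i)
    return candidates
```

Frames that pass this cheap test would then be examined with the more expensive SDMAFD/ADFV metrics in the refine phase.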
- Some other methods and systems combine intensity and motion information to detect scene changes. Most of these approaches incur higher overhead. Further, these methods and systems are complex and not very effective at detecting scene changes.
- a method for determining one or more attributes of each frame of a plurality of frames of a video includes evaluating a first set of pre-defined values for the each frame of the plurality of frames, determining a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion, computing a third pre-defined value for the each frame based on a third pre-determined criterion, and identifying the one or more attributes of the each frame.
- the identified one or more attributes is utilized for stabilizing a rate control model of an encoder.
- the evaluating is based on a first pre-determined criterion.
- the first pre-determined criterion includes categorizing one or more pixel values of each macroblock of the each frame based on one or more components of the each frame and calculating the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion.
- the second pre-determined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components.
- the third pre-determined criterion includes subtracting the first set of predefined values of a previous frame from the first set of pre-defined values of a present frame, and the second set of pre-defined values of a previous frame from the second set of predefined values of a present frame.
- the method further includes storing the identified one or more attributes of the each frame.
- the method further includes transmitting data corresponding to the identified one or more attributes to the rate control model of the encoder.
- the one or more attributes include scene change, fading-in, fading-out, dissolve and the like.
- the one or more components include one or more luma components and one or more chroma components.
- the pre-defined criterion includes finding an average value of the one or more pixel values of each macroblock of the each frame, subtracting the average value from each of the one or more pixel values of the macroblock, and determining the sum of the average-subtracted values of each macroblock.
- the determined sum of the average-subtracted values of each macroblock is stored in one or more arrays.
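The per-macroblock calculation above can be sketched as follows. Note one assumption: the text says "sum of the average subtracted values", but a sum of signed deviations from the block mean is identically zero, so absolute deviations are assumed here; the function name and 8×8 block size are also illustrative.

```python
import numpy as np

def sand_blocks(plane, block=8):
    """Per-block value: subtract each block's mean from its pixels and sum
    the absolute deviations (absolute values assumed, since signed
    deviations from the mean sum to zero)."""
    h, w = plane.shape
    out = np.zeros((h // block, w // block), dtype=np.float64)
    for by in range(h // block):
        for bx in range(w // block):
            blk = plane[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(np.float64)
            out[by, bx] = np.sum(np.abs(blk - blk.mean()))
    return out
```

A flat block yields zero, so this value acts as a cheap per-block activity/texture measure; the resulting array corresponds to the one-or-more arrays mentioned above.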
- the component of the one or more components is one or more luma components.
- a video segmentation system for determining one or more attributes of each frame of a plurality of frames of a video.
- the system includes an evaluation module to evaluate a first set of pre-defined values for the each frame of the plurality of frames, a determination module to determine a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion, a computational module to compute a third pre-defined value for the each frame based on a third pre-determined criterion, and an identification module to identify the one or more attributes of the each frame.
- the identified one or more attributes is utilized for stabilizing a rate control model of an encoder.
- the first set of predefined values is evaluated based on a first pre-determined criterion.
- the second predetermined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components.
- the third pre-determined criterion includes subtracting the first set of pre-defined values of a previous frame from the first set of pre-defined values of a present frame, and the second set of pre-defined values of a previous frame from the second set of pre-defined values of a present frame.
- the evaluation module is further configured to categorize one or more pixel values of each macroblock of the each frame based on one or more components of the each frame.
- the evaluation module is further configured to calculate the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion.
- the video segmentation system further includes a storage module to store the one or more attributes of the each frame.
- the video segmentation system further includes a transmission module to transmit data corresponding to the identified one or more attributes to the rate control model of the encoder.
- the one or more attributes include scene change, fading-in, fading-out, dissolve and the like.
- the one or more components include one or more luma components and one or more chroma components.
- the pre-defined criterion includes finding an average value of the one or more pixel values of each macroblock of the each frame, subtracting the average value from each of the one or more pixel values of the macroblock, and determining the sum of the average-subtracted values of each macroblock.
- the determined sum of the average-subtracted values of each macroblock is stored in one or more arrays.
- the component of the one or more components is one or more luma components.
- a computer system for determining one or more attributes of each frame of a plurality of frames of a video.
- the computer system includes one or more processors and a non-transitory memory containing instructions that, when executed by the one or more processors, cause the one or more processors to perform a set of steps.
- the set of steps includes evaluating a first set of predefined values for the each frame of the plurality of frames, determining a second set of predefined values for the each frame of the plurality of frames based on a second pre-determined criterion, computing a third pre-defined value for the each frame based on a third pre-determined criterion and identifying the one or more attributes of the each frame.
- the identified one or more attributes is utilized for stabilizing a rate control model of an encoder.
- the evaluating is based on a first pre-determined criterion.
- the first pre-determined criterion includes categorizing one or more pixel values of each macroblock of the each frame based on one or more components of the each frame and calculating the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion.
- the second pre-determined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components.
- the third pre-determined criterion includes subtracting the first set of pre-defined values of a previous frame from the first set of pre-defined values of a present frame, and the second set of predefined values of a previous frame from the second set of pre-defined values of a present frame.
- the non-transitory memory containing instructions that, when executed by the one or more processors, cause the one or more processors to perform a further step of transmitting data corresponding to the identified one or more attributes to the rate control model of the encoder.
- FIG. 1 illustrates a video segmentation system for determining one or more attributes of each frame, in accordance with various embodiments of the present disclosure
- FIG. 2 illustrates a block diagram of the video segmentation system, in accordance with various embodiments of the present disclosure
- FIGS. 3A, 3B, 3C, and 3D illustrate a flowchart showing calculations for determining the one or more attributes, in accordance with various embodiments of the present disclosure
- FIG. 4 illustrates a flowchart for calculating the SAND value of an 8×8 block, in accordance with various embodiments of the present disclosure.
- FIG. 5 is a flowchart for determining the one or more attributes of the each frame, in accordance with various embodiments of the present disclosure.
- FIG. 6 illustrates a block diagram of a communication device, in accordance with various embodiments of the present disclosure.
- FIG. 1 illustrates a video segmentation system 100 for determining one or more attributes of each frame of a plurality of frames of a video, in accordance with various embodiments of the present disclosure.
- the video segmentation system 100 identifies and utilizes the one or more attributes of the each frame of the plurality of frames.
- the one or more attributes are different frame characteristics including scene change, fading in, fading out, dissolve or any other attribute which is essential to stabilize a rate control model in an encoder.
- the video segmentation system 100 includes a video image data source 102 , a segmented video data sink 104 , a first data link 106 , a second data link 108 , an I/O interface 110 , a discontinuous cut detector 112 , a gradual change detector 114 , a frame capture device 116 , a controller 118 , a memory 120 and a data bus 122 .
- the video image data source 102 inputs a video signal to the video segmentation system 100 .
- the video image data source 102 provides the video signal to the video segmentation system 100 over the first data link 106 .
- Examples of the video image data source 102 include, but are not limited to, a video camera, a video recorder, a camcorder, a video cassette player/recorder, a digital video disk player/recorder, a video decoder, or any device suitable for storing and/or transmitting electronic video image data, such as a client or server of a network, a cable television network, or the Internet, and especially the World Wide Web.
- the video segmentation system 100 receives the video image data via the I/O interface 110 .
- the I/O interface 110 is a device or module that receives and/or transmits the data.
- the frame capture device 116 captures the each frame of the plurality of frames of the video signal.
- the discontinuous cut detector 112 detects discontinuous cuts in the each frame of the plurality of frames of the video signal.
- the gradual change detector 114 detects changes in the each frame of the plurality of frames by identifying the one or more attributes of the video signal (as described in detailed description of FIG. 2 and FIGS. 3A, 3B, 3C , and 3 D).
- the discontinuous cut detector 112 detects the discontinuous cuts and the gradual change detector 114 detects the changes at the direction of the controller 118 .
- the memory 120 stores the identified one or more attributes of the each frame.
- the discontinuous cut detector 112 , gradual change detector 114 , the controller 118 and other components work in combination to determine the one or more attributes of the each frame of the plurality of frames of the video signal.
- the data bus 122 connects the I/O interface 110 , the discontinuous cut detector 112 , the gradual change detector 114 , the frame capture device 116 , the controller 118 and the memory 120 .
- the data bus 122 is a communication system that transfers data between components inside a device or between two or more devices.
- the video segmentation system 100 transmits the segmented video data signal to the segmented video data sink 104 via the second data link 108 .
- the segmented video data sink 104 can be any device capable of receiving the segmented video signal output by the video segmentation system 100 and storing, transmitting or displaying the segmented video image data (segmented video data signal).
- the segmented video data sink 104 can be a channel device for transmitting the segmented video data for display or storage, or a storage device for indefinitely storing the segmented video data until there arises a need to display or further transmit the segmented video data.
- first data link 106 and the second data link 108 can be any known structure or apparatus for transmitting the video image data to or from the video segmentation system 100 to a physically remote or physically collocated storage or display device.
- first data link 106 and the second data link 108 can be a hardwired link, a wireless link, a public switched telephone network, a local or wide area network, an intranet, the Internet, a wireless transmission channel, any other distributed network and the like.
- the memory 120 can be any known structural apparatus/module for storing the segmented video data including RAM, a hard drive and disk, a floppy drive and disk, an optical drive and disk, a flash memory and the like.
- the video segmentation system 100 receives a video V captured by a camera.
- the video segmentation system 100 detects changes and discontinuous cuts in it and further identifies the attributes of every frame of the video; for example, two frames have a scene change, five have fade-in, eight are normal frames, and the rest have fade-out.
- the video segmentation system 100 includes the memory 120 ; however, those skilled in the art will appreciate that the video segmentation system 100 may include more memory modules.
- FIG. 2 illustrates a block diagram 200 of the video segmentation system 100 , in accordance with various embodiments of the present disclosure. It may be noted that to explain the system elements of FIG. 2 , references will be made to the system elements of FIG. 1 .
- the video segmentation system 100 includes an evaluation module 202 , a determination module 204 , a computational module 206 , an identification module 208 , a storage module 210 and a transmission module 212 .
- the evaluation module 202 evaluates a first set of pre-defined values for the each frame of the plurality of frames. The first set of pre-defined values is evaluated based on a first pre-determined criterion.
- the first predetermined criterion includes categorization of one or more pixel values of each macroblock of the each frame based on one or more components of the each frame and calculation of the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion.
- the first set of pre-defined values includes values for the SAND-based adaptive new frame detection (hereinafter SAND) mechanism.
- the pre-defined criterion includes finding an average value of the one or more pixel values of the each macroblock of the each frame, subtracting the average value from each of the one or more pixel values of the each macroblock and determining sum of the average subtracted values of each of the macroblock.
- the first set of pre-defined values is evaluated for each of the one or more components of the each frame. For example, the SAND values are evaluated for Luma (Y), Chroma-Cb (U) and Chroma-Cr (V) frames.
- the categorization/arrangement of the one or more pixel values of each macroblock of the each frame based on the one or more components is illustrated in the detailed description of FIGS. 3A, 3B, 3C , and 3 D.
- the determination module 204 determines a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion.
- the second pre-determined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components.
- the second set of pre-defined values includes histogram values. The histogram values are determined by analyzing the one or more pixel values of the each frame. Moreover, the histogram values are determined for luma pixels only: for each pixel, the corresponding bucket in a 256-bucket array, representing values from 0 to 255, is incremented.
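The 256-bucket luma histogram described above can be sketched in a few lines (the function name is illustrative):

```python
import numpy as np

def luma_histogram(luma):
    """256-bucket luma histogram: each pixel value 0..255 increments its
    corresponding bucket, as described above."""
    return np.bincount(luma.ravel(), minlength=256)
```

Each frame yields one such array; successive frames' histograms are then compared by the computational module.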
- the computational module 206 computes a third pre-defined value for the each frame based on a third pre-determined criterion.
- the third pre-determined criterion includes subtracting the first set of pre-defined values of a previous frame from the first set of pre-defined values of a present frame and the second set of pre-defined values of a previous frame from the second set of pre-defined values of a present frame.
- the computational module 206 computes a difference between the SAND value of the previous frame and the SAND value of the present frame. Further, the computational module 206 computes a difference between the histogram value of the previous frame and the histogram value of the present frame.
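The two differences computed above can be sketched as follows. The exact combination rule is not spelled out in this excerpt, so the choice of a summed SAND delta and a sum of absolute histogram bin differences, like the function name, is an assumption:

```python
import numpy as np

def frame_metric_deltas(sand_prev, sand_curr, hist_prev, hist_curr):
    """Deltas assumed to feed attribute identification: the change in total
    SAND between consecutive frames, and the sum of absolute differences
    between their 256-bucket luma histograms."""
    sand_delta = float(np.sum(sand_curr) - np.sum(sand_prev))
    hist_delta = int(np.sum(np.abs(hist_curr.astype(np.int64) - hist_prev.astype(np.int64))))
    return sand_delta, hist_delta
```

A large histogram delta with a large SAND delta would suggest a hard cut, while small, steady deltas over several frames would suggest a fade or dissolve.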
- the identification module 208 identifies the one or more attributes of the each frame.
- the identified one or more attributes are utilized for stabilizing the rate control model of the encoder.
- the one or more attributes are identified based on the difference between the SAND value of the present frame and the previous frame and the histogram value of the present frame and the previous frame.
- the one or more attributes include the scene change, the fading in, the fading out, the dissolve and the like (as illustrated in detailed description of FIG. 1 ). In the fade-out, intensity of an image decreases and tends to zero over time. The fade-in begins as a blank frame and then, an image begins to appear over time. In the dissolve, while one image disappears, another image simultaneously appears.
- the calculations for determining/identifying the one or more attributes of the each frame are illustrated in the detailed description of FIGS. 3A, 3B, 3C, and 3D . Based on these calculations, the type of each frame is identified.
- the storage module 210 stores the one or more attributes.
- the transmission module 212 transmits data corresponding to the identified one or more attributes to the rate control model of the encoder.
- the transmitted data stabilizes the rate control model (described below).
- the transmission module 212 transmits the segmented video to the rate control model.
- the transmission module 212 transmits the data corresponding to the identified one or more attributes to the segmented video data sink 104 and the segmented video data sink 104 transmits the one or more attributes to the rate control model.
- the transmission module 212 transmits the segmented video to the segmented video data sink 104 and the segmented video data sink 104 transmits the segmented video to the rate control model.
- the discontinuous cut detector 112 , the gradual change detector 114 and the controller 118 of the video segmentation system 100 collectively control the functioning of the evaluation module 202 , the determination module 204 , the computational module 206 and the identification module 208 .
- the memory 120 controls the functioning of the storage module 210 .
- the rate control model dynamically adjusts itself based on the data and/or segmented video transmitted by the transmission module 212 .
- the frames may have the one or more attributes including scene change, fading in frame, fading out and dissolve in a video sequence. These one or more attributes may change the quality/stability of the rate control model.
- the identification of these one or more attributes stabilizes the rate control model. For example, if the present frame is not an intra frame, it is announced as an intra frame and the QStep for this intra frame is obtained from the bitrate model (the rate control model). However, if the present frame is already an intra frame, the steps of an open GOP structure (bitrate adaptation model) are followed. In another embodiment of the present disclosure, if the present frame is not an intra frame, the QStep found by the bitrate adaptation model for the present frame is decreased by 25%. If the present frame is an intra frame, the bitrate adaptation model continues unchanged.
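The second embodiment above reduces to a small decision rule; a minimal sketch follows (the function name and interface are illustrative, not from the patent):

```python
def adjust_qstep(is_intra, qstep_from_model):
    """Sketch of the embodiment above: for a detected attribute frame that
    is not an intra frame, decrease the model's QStep by 25%; for an intra
    frame, keep following the bitrate adaptation model unchanged."""
    if not is_intra:
        return qstep_from_model * 0.75
    return qstep_from_model
```

Lowering QStep on such frames spends more bits where the content is changing, which is what keeps the rate control model stable across transitions.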
- no scene change is entertained if a scene change comes within five frames of the previous scene change.
- an I-picture is inserted and a new GOP is started for flexible GOP.
- QP for this I-picture is fetched through an I-frame model.
- no I-frame is inserted and QP of P-frame is reduced by 25%.
- the rate control model may not be reset in any of these cases (Scene Change or Fading).
- bitrate window is increased up to 2-5 seconds.
- the QP is allocated based on the SAND from the previous I-frame to the current I-frame rather than using the 75 percent metric. In an embodiment of the present disclosure, the previous/upcoming I-frame is checked for the QP decision.
- the video segmentation system 100 includes the evaluation module 202 , the determination module 204 , the computational module 206 , the identification module 208 , the storage module 210 and the transmission module 212 ; however those skilled in the art would appreciate that the video segmentation system 100 may include more modules for determining the one or more attributes of the each frame of the video.
- FIGS. 3A, 3B, 3C, and 3D illustrate a flowchart 300 showing calculations for determining the one or more attributes, in accordance with various embodiments of the present disclosure. It may be noted that to explain the process steps of FIGS. 3A, 3B, 3C, and 3D , references will be made to the system elements of FIG. 1 and FIG. 2 . It may also be noted that to explain the process steps of the flowchart 300 , following assumptions are made. The assumptions are described below:
- following arrays are used to store the SAND values of the luma and the chroma components at the macroblock level.
- the arrays are illustrated as:
SANDsMBY → array of (luma_wd/8 × luma_ht/8)
SANDMBY → array of (luma_wd/16 × luma_ht/16)
SANDMBCb → array of (chroma_wd/8 × chroma_ht/8)
SANDMBCr → array of (chroma_wd/8 × chroma_ht/8)
- the flowchart 300 initiates at step 302 .
- the discontinuous cut detector 112 and the gradual change detector 114 of the video segmentation system 100 read the one or more pixel values of the present frame. Further, the discontinuous cut detector 112 and the gradual change detector 114 arrange the Luma pixel values in orgluma, the Chroma (Cb) pixel values in orgCb and the Chroma (Cr) pixel values in orgCr (illustrated above as assumptions). If the resolution is not a multiple of 16, the pixel values of the last macroblock column/row of the frame are repeated through to the last remaining column/row respectively.
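The edge repetition for non-multiple-of-16 resolutions can be sketched as below. The plane is a plain list-of-lists of pixel values; the function name is illustrative, not from the source.

```python
def pad_plane_to_multiple_of_16(plane):
    """Repeat the last column and the last row of a pixel plane so that
    its width and height become multiples of 16 (a sketch of the edge
    repetition described above)."""
    height, width = len(plane), len(plane[0])
    pad_w = (-width) % 16
    pad_h = (-height) % 16
    # repeat the last pixel of each row to fill the remaining columns
    padded = [row + [row[-1]] * pad_w for row in plane]
    # repeat the last row to fill the remaining rows
    padded += [list(padded[-1]) for _ in range(pad_h)]
    return padded

# e.g. a 3-wide, 5-high luma plane becomes 16x16
org_luma = pad_plane_to_multiple_of_16([[10, 20, 30]] * 5)
```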
- the evaluation module 202 evaluates the SAND values from the Luma components of the present frame.
- the steps that are followed for each 8×8 block of the Luma components are as follows: At the first step, the evaluation module 202 evaluates an average of the 8×8 block from the orgluma. At the second step, the evaluation module 202 subtracts the average from each value of the 8×8 block. At the third step, the evaluation module 202 evaluates a sum of the absolute values of the average-subtracted 8×8 block.
- the evaluation module 202 evaluates the SAND values from the chroma components of the present frame.
- the steps that are followed for each 8×8 block of the chroma components are as follows: At the first step, the evaluation module 202 evaluates the average of the 8×8 block from orgCb/orgCr. At the second step, the evaluation module 202 subtracts the average from each value of the 8×8 block. At the third step, the evaluation module 202 evaluates a sum of the absolute values of the average-subtracted 8×8 block.
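The three steps above (applied identically to the luma and chroma planes) amount to a sum of absolute deviations from the block mean. A sketch for one 8×8 block:

```python
def sand_8x8(block):
    """SAND of one 8x8 block: average the 64 values, subtract the average
    from each value, and sum the absolute results (a sketch of the steps
    described above)."""
    values = [v for row in block for v in row]
    avg = sum(values) / 64.0
    return sum(abs(v - avg) for v in values)

# a flat block has zero SAND; any variation within the block increases it
flat = [[128] * 8 for _ in range(8)]
```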
- the storage module 210 stores the SAND values from the luma components in the SANDsMBY array and the SAND values from the chroma components in the SANDMBCb/SANDMBCr arrays.
- the step 304 , the step 306 and the step 308 are repeated for the remaining three 8×8 blocks of the Luma macroblock. Accordingly, all four SAND values are added and stored in the SANDMBY array.
- the evaluation module 202 determines the sum of all values in the SANDMBY array
- the storage module 210 stores the sum in SAND[0][frame] for the present frame.
- the evaluation module 202 determines the sum of all values in the SANDMBCb array and the storage module 210 stores the sum in SAND[1][frame] for the present frame.
- the evaluation module 202 determines the sum of all values in the SANDMBCr array and the storage module 210 stores the sum in SAND[2][frame].
- the determination module 204 determines the histogram values of the present frame.
- the number of pixels in the present frame having the same value is identified.
- the pixel values vary from 0 to 255.
- the numbers of pixels are stored in N array. This process is performed for the Luma values only.
- Let N[present frame] be the histogram of the present frame. It has 256 values.
- the 0th value is the number of pixels in the frame which are ‘0’; the 1st value is the number of pixels in the frame which are ‘1’, and the like.
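The 256-bin luma histogram described above can be sketched as:

```python
def luma_histogram(org_luma):
    """Count, for each pixel value 0..255, how many luma pixels of the
    frame take that value (the N array described above)."""
    n = [0] * 256
    for row in org_luma:
        for pixel in row:
            n[pixel] += 1
    return n
```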
- the computational module 206 computes the difference between the histogram of the present frame and the previous frame.
- the calculation for computing the difference is as follows: N[present frame] − N[previous frame], present frame > 0
- the computational module 206 adds the absolute values of these differences. Further, the computational module 206 computes a normalized value by dividing the accumulated sum by (luma_wd × luma_ht).
- the storage module 210 stores the normalized value in the h array for the present frame, by performing the below calculation:
h[present frame] = Σ |N[present frame] − N[previous frame]| / (luma_wd × luma_ht)
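The normalized histogram difference h, and the accelerated histogram value histacc derived from it at step 318, can be sketched as below (function names are illustrative):

```python
def normalized_hist_diff(n_present, n_previous, luma_wd, luma_ht):
    """h[present frame]: sum of absolute bin-wise histogram differences,
    normalized by the luma frame area (a sketch of the calculation above)."""
    total = sum(abs(a - b) for a, b in zip(n_present, n_previous))
    return total / float(luma_wd * luma_ht)

def accelerated_hist(h_present, h_previous):
    """histacc: h of the present frame minus h of the previous frame."""
    return h_present - h_previous
```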
- the computational module 206 checks if the present frame is the first frame. If the present frame is the first frame, then the computational module 206 repeats the steps 306 , 308 , 310 and 312 for each macroblock of the frame. If the present frame is not the first frame, then the computational module 206 employs a list of certain conditions to determine the SAND values after one frame.
- the computational module 206 checks if h[frame] > 0.1. If h[frame] > 0.1, then at step 318 , the computational module 206 computes an accelerated histogram value, histacc, by subtracting the h of the previous frame from the h of the present frame. The calculation is described below:
- histacc = h[present frame] − h[previous frame], where h[present frame] = Σ |N[present frame] − N[previous frame]| / (luma_wd × luma_ht)
- the computational module 206 declares the frame as normal.
- the computational module 206 checks if h[frame] > 0.26 &
- the computational module 206 checks if any flag of the SANDY, SANDCb and SANDCr is set. If no flag of the SANDY, SANDCb and SANDCr is set, then the computational module 206 declares the frame as normal. However, if any flag of the SANDY, SANDCb and SANDCr is set, then at step 328 , the computational module 206 computes SANDoverall.
- the SANDoverall is computed as:
- the computational module 206 employs a list of certain conditions to declare a frame as a ‘Scene Change’ or ‘Fading’. These conditions are hereinafter stated as Filter 1 , Filter 2 , Filter 3 and so on.
- the calculations to declare the frame as the ‘Scene Change’, the ‘Fading’ and the like are as follows:
If SANDYflag | SANDCbflag | SANDCrflag
- the computational module 206 checks if the filter 1 is satisfied.
- the calculation of filter 1 is described below:
If SANDoverall ≥ 1 & (histacc > 0.1 or h[frame] > 0.1)
- the computational module 206 declares the frame as normal. However, if the filter 1 is satisfied, then at step 332 , the computational module 206 checks if the filter 2 is satisfied. The calculation of filter 2 is described below:
If SAND_Y > 60 or
If (SAND_Y > 90 & SAND_Cb > 90 & SAND_Cr > 90) or
If (SAND_Y < 0 or SAND_Cb < 0 or SAND_Cr < 0) or
If (SAND_Cb > 40 & SAND_Cr > 40)
- the computational module 206 checks if SAND_Y > 10. If SAND_Y > 10, then the computational module 206 declares the frame as the fade-out. However, if SAND_Y ≤ 10, then the computational module 206 declares the frame as normal. However, if the filter 2 is satisfied, then at step 336 , the computational module 206 checks if the filter 3 is satisfied.
- the calculation of filter 3 is described below:
If histacc > 0.8 or
If (SAND_Y > 60) or
If (SAND_Y > 90 and SAND_Cb > 90 and SAND_Cr > 90) or
If (h[frame] > 0.25 and SAND_Y < 0) or
If (h[frame] > 0.25 and SAND_Cb < 0) or
If (h[frame] > 0.25 and SAND_Cr < 0) or
If (ABS(histacc) > 0.7) or
If (h[frame] > 0.7 &
- the computational module 206 declares the frame as the scene change. However, if the filter 3 is not satisfied, then at step 338 , the computational module 206 checks if the filter 4 is satisfied. The calculation of filter 4 is described below:
If (SAND_Y > 40 & (SAND_Cb > 60 or SAND_Cr > 60)) or
If (SAND_Y > 60) or
If (SAND_Cb > 97 & SAND_Cr > 97) or
If (SAND_Cb > 200) or
If (SAND_Cr > 200)
- the computational module 206 declares the frame as the fade-in. However, if the filter 4 is not satisfied, then at step 340 , the computational module 206 checks if the filter 5 is satisfied. The calculation of filter 5 is described below:
If SAND_Y < 10 & SAND_Cb < 10 & SAND_Cr < 10,
- the computational module 206 declares the frame as the fade-out. However, if the filter 5 is not satisfied, then at step 342 , the computational module 206 checks if the filter 6 is satisfied. The calculation of filter 6 is described below:
If histacc > 0.16
- If the filter 6 is satisfied, then the computational module 206 declares the frame as normal. However, if the filter 6 is not satisfied, then at step 344 , the computational module 206 checks if the filter 7 is satisfied. The calculation of filter 7 is described below:
If ABS(SAND_Y) > 10
- the computational module 206 declares the frame as the fade-out. However, if the filter 7 is not satisfied, the computational module 206 declares the frame as the normal.
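The cascade of filters above can be condensed into a single classification routine. The sketch below transcribes the thresholds from the description; some comparison operators are not fully legible in the source text and one filter-3 condition is truncated, so the inequalities marked in the comments are assumptions.

```python
def classify_frame(sand_y, sand_cb, sand_cr, h, hist_acc, sand_overall):
    """Condensed sketch of filters 1-7 (thresholds from the description;
    illegible or truncated conditions in the source are approximated)."""
    # filter 1 (the comparator on sand_overall is an assumption)
    if not (sand_overall >= 1 and (hist_acc > 0.1 or h > 0.1)):
        return "normal"
    # filter 2
    if not (sand_y > 60
            or (sand_y > 90 and sand_cb > 90 and sand_cr > 90)
            or sand_y < 0 or sand_cb < 0 or sand_cr < 0
            or (sand_cb > 40 and sand_cr > 40)):
        return "fade-out" if sand_y > 10 else "normal"
    # filter 3 (the last condition of the source is truncated and omitted)
    if (hist_acc > 0.8 or sand_y > 60
            or (sand_y > 90 and sand_cb > 90 and sand_cr > 90)
            or (h > 0.25 and (sand_y < 0 or sand_cb < 0 or sand_cr < 0))
            or abs(hist_acc) > 0.7):
        return "scene change"
    # filter 4
    if ((sand_y > 40 and (sand_cb > 60 or sand_cr > 60)) or sand_y > 60
            or (sand_cb > 97 and sand_cr > 97)
            or sand_cb > 200 or sand_cr > 200):
        return "fade-in"
    # filter 5 (comparators are assumptions)
    if sand_y < 10 and sand_cb < 10 and sand_cr < 10:
        return "fade-out"
    # filter 6
    if hist_acc > 0.16:
        return "normal"
    # filter 7
    return "fade-out" if abs(sand_y) > 10 else "normal"
```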
- the flowchart 300 terminates at step 346 .
- the histogram and histogram gradients of the present frame are stored back as previous frame's values and serve as a feedback loop.
- the present frame's data are treated as previous frame's data when next frame comes.
- the video segmentation system 100 performs perceptual video quality control by Adaptive QP control over macroblock approach.
- MBQS[index] = ((4 × SAND[index] + 2 × SANDavg) × currentslice_qs) / (2 × SAND[index] + 3 × SANDavg + 1)
- the MBQS [index] is clipped to a minimum of 1.25 or a maximum of 63.4375.
- the above stated steps are repeated for all macroblocks. Once the stated steps are performed, each macroblock has its own QStep value and each macroblock is separately quantized using this parameter. This helps in using the bits efficiently in the bitrate control model of the encoder.
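The per-macroblock QStep calculation and its clipping can be sketched as below (the formula as reconstructed from the source; names are illustrative):

```python
def mb_qstep(sand, sand_avg, current_slice_qs):
    """Adaptive per-macroblock QStep: macroblocks with activity (SAND)
    above the frame average get a larger QStep, quieter ones a smaller
    one, clipped to [1.25, 63.4375] as stated above."""
    qs = ((4.0 * sand + 2.0 * sand_avg) * current_slice_qs
          / (2.0 * sand + 3.0 * sand_avg + 1.0))
    return min(63.4375, max(1.25, qs))
```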
- FIG. 4 illustrates a flowchart 400 for calculating the SAND value of the 8×8 block, in accordance with various embodiments of the present disclosure. It may be noted that to explain various process steps of FIG. 4 , references will be made to the system elements of FIG. 1 , FIG. 2 and the process steps of FIGS. 3A, 3B, 3C, and 3D .
- the flow chart 400 initiates at step 402 . Following step 402 , at step 404 , the evaluation module 202 receives each of the luma frames and the chroma frames (both Cb and Cr). At step 406 , the evaluation module 202 adds all the pixel values of the 8×8 block and divides the sum by 64 to find an average.
- the evaluation module 202 subtracts the average from each pixel value in the 8×8 block and accumulates the absolute values. This accumulated value is referred to as the SAND value of the 8×8 block. Further, at step 410 , the evaluation module 202 verifies whether the particular 8×8 block belongs to the last macroblock. If the particular macroblock is not the last macroblock, then the steps 404 - 408 are followed for the next macroblock. If the particular macroblock is the last macroblock, the flow chart 400 terminates at step 412 .
- FIG. 5 is a flowchart 500 for determining the one or more attributes of the each frame, in accordance with various embodiments of the present disclosure. It may be noted that to explain various process steps of FIG. 5 , references will be made to the system elements of FIG. 1 , FIG. 2 and the process steps of FIGS. 3A, 3B, 3C, and 3D and FIG. 4 .
- the flowchart 500 initiates at step 502 .
- the evaluation module 202 evaluates the first set of pre-defined values for the each frame of the plurality of frames.
- the determination module 204 determines the second set of pre-defined values for the each frame of the plurality of frames based on the second pre-determined criterion.
- the computational module 206 computes the third pre-defined value for the each frame based on the third pre-determined criterion.
- the identification module 208 identifies the one or more attributes of the each frame. The flowchart 500 terminates at step 512 .
- FIG. 6 illustrates a block diagram of a communication device 600 , in accordance with various embodiments of the present disclosure.
- the communication device 600 includes a control circuitry module 602 , a storage module 604 , an input/output circuitry module 606 , and a communication circuitry module 608 .
- the communication device 600 includes any suitable type of portable electronic device. Examples of the communication device 600 include but may not be limited to a personal e-mail device (e.g., a Blackberry™ made available by Research in Motion of Waterloo, Ontario), a personal data assistant (“PDA”), a cellular telephone, a Smartphone, a handheld gaming device, a digital camera, a laptop computer, and a tablet computer. In another embodiment of the present disclosure, the communication device 600 can be a desktop computer.
- control circuitry module 602 includes any processing circuitry or processor operative to control the operations and performance of the communication device 600 .
- the control circuitry module 602 may be used to run operating system applications, firmware applications, media playback applications, media editing applications, or any other application.
- the control circuitry module 602 drives a display and processes inputs received from a user interface.
- the storage module 604 includes one or more storage mediums including a hard-drive, solid state drive, flash memory, permanent memory such as ROM, any other suitable type of storage component, or any combination thereof.
- the storage module 604 may store, for example, media data (e.g., music and video files), application data (e.g., for implementing functions on the communication device 600 ).
- the I/O circuitry module 606 may be operative to convert (and encode/decode, if necessary) analog signals and other signals into digital data.
- the I/O circuitry module 606 may also convert the digital data into any other type of signal and vice-versa.
- the I/O circuitry module 606 may receive and convert physical contact inputs (e.g., from a multi-touch screen), physical movements (e.g., from a mouse or sensor), analog audio signals (e.g., from a microphone), or any other input.
- the digital data may be provided to and received from the control circuitry module 602 , the storage module 604 , or any other component of the communication device 600 .
- the I/O circuitry module 606 is illustrated in FIG. 6 as a single component of the communication device 600 ; however those skilled in the art would appreciate that several instances of the I/O circuitry module 606 may be included in the communication device 600 .
- the communication device 600 may include any suitable interface or component for allowing a user to provide inputs to the I/O circuitry module 606 .
- the communication device 600 may include any suitable input mechanism. Examples of the input mechanism include but may not be limited to a button, keypad, dial, a click wheel, and a touch screen.
- the communication device 600 may include a capacitive sensing mechanism, or a multi-touch capacitive sensing mechanism.
- the communication device 600 may include specialized output circuitry associated with output devices such as, for example, one or more audio outputs.
- the audio output may include one or more speakers built into the communication device 600 , or an audio component that may be remotely coupled to the communication device 600 .
- the one or more speakers can be mono speakers, stereo speakers, or a combination of both.
- the audio component can be a headset, headphones or ear buds that may be coupled to the communication device 600 with a wire or wirelessly.
- the I/O circuitry module 606 may include display circuitry for providing a display visible to the user.
- the display circuitry may include a screen (e.g., an LCD screen) that is incorporated in the communication device 600 .
- the display circuitry may include a movable display or a projecting system for providing a display of content on a surface remote from the communication device 600 (e.g., a video projector).
- the display circuitry may include display driver circuitry, circuitry for driving display drivers or both.
- the display circuitry may be operative to display content.
- the display content can include media playback information, application screens for applications implemented on the electronic device, information regarding ongoing communications operations, information regarding incoming communications requests, or device operation screens under the direction of the control circuitry module 602 .
- the display circuitry may be operative to provide instructions to a remote display.
- the communication device 600 includes the communication circuitry module 608 .
- the communication circuitry module 608 may include any suitable communication circuitry operative to connect to a communication network and to transmit communications (e.g., voice or data) from the communication device 600 to other devices within the communications network.
- the communications circuitry 608 may be operative to interface with the communication network using any suitable communication protocol. Examples of the communication protocol include but may not be limited to Wi-Fi, Bluetooth®, radio frequency systems, infrared, LTE, GSM, GSM plus EDGE, CDMA, and quadband.
- the communications circuitry module 608 may be operative to create a communications network using any suitable communications protocol.
- the communication circuitry module 608 may create a short-range communication network using a short-range communications protocol to connect to other devices.
- the communication circuitry module 608 may be operative to create a local communication network using the Bluetooth® protocol to couple the communication device 600 with a Bluetooth® headset.
- the computing device is shown to have only one communication operation; however, those skilled in the art would appreciate that the communication device 600 may include one or more instances of the communication circuitry module 608 for simultaneously performing several communication operations using different communication networks.
- the communication device 600 may include a first instance of the communication circuitry module 608 for communicating over a cellular network, and a second instance of the communication circuitry module 608 for communicating over Wi-Fi or using Bluetooth®.
- the same instance of the communications circuitry module 608 may be operative to provide for communications over several communication networks.
- the communication device 600 may be coupled to a host device for data transfers, synching the communication device 600 , software or firmware updates, providing performance information to a remote source (e.g., providing riding characteristics to a remote server) or performing any other suitable operation that may require the communication device 600 to be coupled to a host device.
- Several computing devices may be coupled to a single host device using the host device as a server.
- the communication device 600 may be coupled to the several host devices (e.g., for each of the plurality of the host devices to serve as a backup for data stored in the communication device 600 ).
- the above stated method and system involve calculation of only two components (the SAND calculation and the histogram calculation) for determining the one or more attributes of the frame, which makes this approach significantly less complex. Moreover, the above stated method and system are more effective in detecting scene changes. Further, the above stated method and system transmit the one or more attributes of the each frame to the rate control model, which dynamically adjusts itself according to the one or more attributes, thereby resulting in enhanced quality of the video.
Abstract
The present disclosure relates to a method and a system for determining one or more attributes of each frame of a plurality of frames of a video. The method includes evaluating a first set of pre-defined values for the each frame of the plurality of frames, determining a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion, computing a third pre-defined value for the each frame based on a third pre-determined criterion, and identifying the one or more attributes of the each frame. The identified one or more attributes is utilized for stabilizing a rate control model of an encoder. The evaluating is based on a first pre-determined criterion.
Description
The present invention relates to the field of video segmentation, and in particular, relates to methods and systems that detect and utilize attributes of frames in video sequences.
In this Internet era, the amount of digital video data that consumers have access to is increasing by leaps and bounds. This video data can be in the form of commercial DVDs and VCDs, personal camcorder recordings, off-air recordings onto HDD and DVR systems, video downloads on a personal computer, mobile phone, PDA or portable player, and the like. To manage these video libraries, new automatic video management technologies are being developed that allow users efficient access to their video content and functionalities such as video categorization, summarization, searching and the like.
The realization of such functionalities relies on analysis and understanding of individual videos. The first step of the analysis is segmentation of the video into its constituent shots. A shot can be defined as a sequence of video frames obtained by one camera without interruption. The video is generally configured as a connection of many shots, and various video editing effects are used according to the methods of connecting the shots. For example, an hour of a TV program typically contains 1000 shots. The video editing effects include an abrupt shot transition and a gradual shot transition. The abrupt shot transition (generally referred to as a hard cut) is a technique in which the current picture is abruptly changed into another picture. The gradual shot transition is a technique in which a picture is gradually changed into another picture, such as fade, dissolve, wipe and other special effects.
A common example of the gradual shot transition is the fade, whereby the intensity of a shot gradually drops, ending at a black monochromatic frame (fade-out), or the intensity of a black monochromatic frame gradually increases until the actual shot becomes visible at its normal intensity (fade-in). Fades to and from black are more common, but fades involving monochromatic frames of other colors are also used. Another example of the gradual shot transition is the dissolve, which can be envisaged as a combined fade-out and fade-in. The dissolve involves two shots, overlapping for a number of frames, during which time the first shot gradually dims and the second shot becomes gradually more distinct.
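A dissolve as described above can be modeled per pixel as a linear cross-fade between the two overlapping shots. The sketch below is an illustration under that assumption (8-bit pixel values, a blend parameter t running from 0 to 1 across the overlap), not a formula from the source.

```python
def dissolve_pixel(first_shot, second_shot, t):
    """Linear cross-fade: at t=0 only the first shot is visible, at t=1
    only the second. A fade-out to black is the special case where the
    second shot is all zeros."""
    return round((1.0 - t) * first_shot + t * second_shot)
```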
Presently, various methods and systems are used for identifying and utilizing different frame characteristics of a video sequence such as Scene Change, Fading in, Fading out and Dissolve. Some of the present methods and systems use comparisons of segmentation mask maps between two successive video frames. In addition, an object tracking technique is employed as a complement to handle situations of scene rotation without any extra overhead. Some methods and systems use a two-phase reject-to-refine strategy. According to this strategy, the frames are tested against mean absolute frame differences (MAFD) with a relaxed threshold. Then, these frames are further examined by combined metrics of signed difference of mean absolute frame difference (SDMAFD), absolute difference of frame variance (ADFV), and MAFD after normalization. This approach can be referred to as a histogram equalization process. Some other methods and systems combine intensity and motion information to detect scene changes. Most of these approaches have high overhead. Further, these methods and systems are complex and not very effective in detecting scene changes.
In light of the above discussion, there is a need for a method and system which overcomes all the above stated disadvantages.
In an aspect of the present disclosure, a method for determining one or more attributes of each frame of a plurality of frames of a video is provided. The method includes evaluating a first set of pre-defined values for the each frame of the plurality of frames, determining a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion, computing a third pre-defined value for the each frame based on a third pre-determined criterion, and identifying the one or more attributes of the each frame. The identified one or more attributes is utilized for stabilizing a rate control model of an encoder. The evaluating is based on a first pre-determined criterion. The first pre-determined criterion includes categorizing one or more pixel values of each macroblock of the each frame based on one or more components of the each frame and calculating the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion. The second pre-determined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components. The third pre-determined criterion includes subtracting the first set of predefined values of a previous frame from the first set of pre-defined values of a present frame, and the second set of pre-defined values of a previous frame from the second set of predefined values of a present frame.
In an embodiment of the present disclosure, the method further includes storing the identified one or more attributes of the each frame.
In an embodiment of the present disclosure, the method further includes transmitting data corresponding to the identified one or more attributes to the rate control model of the encoder.
In an embodiment of the present disclosure, the one or more attributes includes scene change, fading-in, fading out, dissolve and the like.
In an embodiment of the present disclosure, the one or more components include one or more luma components and one or more chroma components.
In an embodiment of the present disclosure, the pre-defined criterion includes finding an average value of the one or more pixel values of the each macroblock of the each frame, subtracting the average value from each of the one or more pixel values of the each macroblock and determining sum of the average subtracted values of each of the macroblock. The determined sum of the average subtracted values of each of the macroblock is stored in one or more arrays.
In an embodiment of the present disclosure, the component of the one or more components is one or more luma components.
In another aspect of the present disclosure, a video segmentation system for determining one or more attributes of each frame of a plurality of frames of a video is provided. The system includes an evaluation module to evaluate a first set of pre-defined values for the each frame of the plurality of frames, a determination module to determine a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion, a computational module to compute a third pre-defined value for the each frame based on a third pre-determined criterion, and an identification module to identify the one or more attributes of the each frame. The identified one or more attributes is utilized for stabilizing a rate control model of an encoder. The first set of predefined values is evaluated based on a first pre-determined criterion. The second predetermined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components. The third pre-determined criterion includes subtracting the first set of pre-defined values of a previous frame from the first set of pre-defined values of a present frame, and the second set of pre-defined values of a previous frame from the second set of pre-defined values of a present frame.
In an embodiment of the present disclosure, the evaluation module is further configured to categorize one or more pixel values of each macroblock of the each frame based on one or more components of the each frame.
In another embodiment of the present disclosure, the evaluation module is further configured to calculate the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion.
In an embodiment of the present disclosure, the video segmentation system further includes a storage module to store the one or more attributes of the each frame.
In an embodiment of the present disclosure, the video segmentation system further includes a transmission module to transmit data corresponding to the identified one or more attributes to the rate control model of the encoder.
In an embodiment of the present disclosure, the one or more attributes includes scene change, fading-in, fading out, dissolve and the like.
In an embodiment of the present disclosure, the one or more components include one or more luma components and one or more chroma components.
In an embodiment of the present disclosure, the pre-defined criterion includes finding an average value of the one or more pixel values of the each macroblock of the each frame, subtracting the average value from each of the one or more pixel values of the each macroblock and determining sum of the average subtracted values of each of the macroblock. The determined sum of the average subtracted values of each of the macroblock is stored in one or more arrays.
In an embodiment of the present disclosure, the component of the one or more components is one or more luma components.
In yet another aspect of the present disclosure, a computer system for determining one or more attributes of each frame of a plurality of frames of a video is provided. The computer system includes one or more processors and a non-transitory memory containing instructions that, when executed by the one or more processors, causes the one or more processors to perform a set of steps. The set of steps includes evaluating a first set of predefined values for the each frame of the plurality of frames, determining a second set of predefined values for the each frame of the plurality of frames based on a second pre-determined criterion, computing a third pre-defined value for the each frame based on a third pre-determined criterion and identifying the one or more attributes of the each frame. The identified one or more attributes is utilized for stabilizing a rate control model of an encoder. The evaluating is based on a first pre-determined criterion. The first pre-determined criterion includes categorizing one or more pixel values of each macroblock of the each frame based on one or more components of the each frame and calculating the first set of pre-defined values from each of the one or more components of the each frame based on a pre-defined criterion. The second pre-determined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components. The third pre-determined criterion includes subtracting the first set of pre-defined values of a previous frame from the first set of pre-defined values of a present frame, and the second set of predefined values of a previous frame from the second set of pre-defined values of a present frame.
In an embodiment of the present disclosure, the non-transitory memory contains instructions that, when executed by the one or more processors, cause the one or more processors to perform a further step of transmitting data corresponding to the identified one or more attributes to the rate control model of the encoder.
Having thus described the invention in general terms, reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, and wherein:
It should be noted that the terms “first”, “second”, and the like, herein do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. Further, the terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced item.
The video segmentation system 100 includes a video image data source 102, a segmented video data sink 104, a first data link 106, a second data link 108, an I/O interface 110, a discontinuous cut detector 112, a gradual change detector 114, a frame capture device 116, a controller 118, a memory 120 and a data bus 122. The video image data source 102 inputs a video signal to the video segmentation system 100. The video image data source 102 provides the video signal to the video segmentation system 100 over the first data link 106. Examples of the video image data source 102 include, but are not limited to, a video camera, a video recorder, a camcorder, a video cassette player/recorder, a digital video disk player/recorder, a video decoder, or any device suitable for storing and/or transmitting electronic video image data, such as a client or server of a network, a cable television network, or the Internet, and especially the World Wide Web.
In an embodiment of the present disclosure, the video segmentation system 100 receives the video image data via the I/O interface 110. The I/O interface 110 is a device or module that receives and/or transmits the data. The frame capture device 116 captures the each frame of the plurality of frames of the video signal. The discontinuous cut detector 112 detects discontinuous cuts in the each frame of the plurality of frames of the video signal. The gradual change detector 114 detects changes in the each frame of the plurality of frames by identifying the one or more attributes of the video signal (as described in the detailed description of FIG. 2 and FIGS. 3A, 3B, 3C, and 3D). The discontinuous cut detector 112 detects the discontinuous cuts and the gradual change detector 114 detects the changes at the direction of the controller 118. The memory 120 stores the identified one or more attributes of the each frame. In an embodiment of the present disclosure, the discontinuous cut detector 112, the gradual change detector 114, the controller 118 and other components work in combination to determine the one or more attributes of the each frame of the plurality of frames of the video signal. In an embodiment of the present disclosure, the data bus 122 connects the I/O interface 110, the discontinuous cut detector 112, the gradual change detector 114, the frame capture device 116, the controller 118 and the memory 120. The data bus 122 is a communication system that transfers data between components inside a device or between two or more devices.
The video segmentation system 100 transmits the segmented video data signal to the segmented video data sink 104 via the second data link 108. The second data link 108 can be a device that is capable of receiving the segmented video signal from the video segmentation system 100 and storing, transmitting or displaying the segmented video image data (segmented video data signal). The segmented video data sink 104 can be a channel device for transmitting the segmented video data for display or storage, or a storage device for indefinitely storing the segmented video data until there arises a need to display or further transmit the segmented video data. In addition, the first data link 106 and the second data link 108 can be any known structure or apparatus for transmitting the video image data to or from the video segmentation system 100 to a physically remote or physically collocated storage or display device. Moreover, the first data link 106 and the second data link 108 can be a hardwired link, a wireless link, a public switched telephone network, a local or wide area network, an intranet, the Internet, a wireless transmission channel, any other distributed network and the like. Similarly, the memory 120 can be any known structural apparatus/module for storing the segmented video data, including RAM, a hard drive and disk, a floppy drive and disk, an optical drive and disk, a flash memory and the like. For example, the video segmentation system 100 receives a video V captured by a camera. The video segmentation system 100 detects changes and discontinuous cuts in it and further identifies attributes of every frame of the video, according to which two frames have a scene change, five have a fade-in, eight are normal frames and the rest have a fade-out. It may be noted that in FIG. 1, the video segmentation system 100 includes the memory 120; however, those skilled in the art would appreciate that the video segmentation system 100 may include more memory modules.
The first set of pre-defined values includes values for the SAND based Adaptive New Frame Detection (hereinafter SAND) mechanism. The pre-defined criterion includes finding an average value of the one or more pixel values of the each macroblock of the each frame, subtracting the average value from each of the one or more pixel values of the each macroblock and determining a sum of the average-subtracted values of each macroblock. The first set of pre-defined values is evaluated for each of the one or more components of the each frame. For example, the SAND values are evaluated for Luma (Y), Chroma-Cb (U) and Chroma-Cr (V) frames. The categorization/arrangement of the one or more pixel values of the each macroblock of the each frame based on the one or more components is illustrated in the detailed description of FIGS. 3A, 3B, 3C, and 3D.
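As an illustrative sketch (not the claimed implementation), the per-block SAND computation described above — average the block, subtract the average from every pixel, and sum the absolute average-subtracted values — can be written in Python as follows; the function name and the example blocks are assumptions for illustration only:

```python
def sand_block(block):
    """Compute the SAND value of one block of pixel values.

    Steps, as described above: find the average of the block,
    subtract it from every pixel, and sum the absolute values
    of the average-subtracted pixels.
    """
    flat = [p for row in block for p in row]
    avg = sum(flat) / len(flat)
    # Sum of absolute average-subtracted values.
    return sum(abs(p - avg) for p in flat)

# A flat (uniform) 8x8 block has SAND 0; a high-contrast block does not.
uniform = [[128] * 8 for _ in range(8)]
varied = [[0] * 8 for _ in range(4)] + [[255] * 8 for _ in range(4)]
```

A uniform block yields a SAND of zero, which is why low-detail frames contribute little to the per-frame sums accumulated later in the flowchart.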
The determination module 204 determines a second set of pre-defined values for the each frame of the plurality of frames based on a second pre-determined criterion. The second pre-determined criterion includes calculation of the second set of pre-defined values for the each frame for a component of the one or more components. The second set of pre-defined values includes histogram values. The histogram values are determined by analyzing the one or more pixel values of the each frame. Moreover, the histogram values are determined for luma pixels only, and the corresponding bucket in a 256-bucket array, each bucket representing a value from 0 to 255, is incremented. The computational module 206 computes a third pre-defined value for the each frame based on a third pre-determined criterion. The third pre-determined criterion includes subtracting the first set of pre-defined values of a previous frame from the first set of pre-defined values of a present frame and the second set of pre-defined values of a previous frame from the second set of pre-defined values of a present frame. In simpler terms, the computational module 206 computes a difference between the SAND value of the previous frame and the SAND value of the present frame. Further, the computational module 206 computes a difference between the histogram value of the previous frame and the histogram value of the present frame.
The identification module 208 identifies the one or more attributes of the each frame. The identified one or more attributes are utilized for stabilizing the rate control model of the encoder. Moreover, the one or more attributes are identified based on the difference between the SAND value of the present frame and the previous frame and the histogram value of the present frame and the previous frame. The one or more attributes include the scene change, the fading in, the fading out, the dissolve and the like (as illustrated in the detailed description of FIG. 1). In the fade-out, the intensity of an image decreases and tends to zero over time. The fade-in begins as a blank frame and then an image begins to appear over time. In the dissolve, while one image disappears, another image simultaneously appears. The calculations for determining/identifying the one or more attributes of the each frame are illustrated in the detailed description of FIGS. 3A, 3B, 3C, and 3D. Based on these calculations, the type of frame is identified. The storage module 210 stores the one or more attributes.
The transmission module 212 transmits data corresponding to the identified one or more attributes to the rate control model of the encoder. The transmitted data stabilizes the rate control model (described below). In another embodiment of the present disclosure, the transmission module 212 transmits the segmented video to the rate control model. In an embodiment of the present disclosure, the transmission module 212 transmits the data corresponding to the identified one or more attributes to the segmented video data sink 104 and the segmented video data sink 104 transmits the one or more attributes to the rate control model. In another embodiment of the present disclosure, the transmission module 212 transmits the segmented video to the segmented video data sink 104 and the segmented video data sink 104 transmits the segmented video to the rate control model.
In an embodiment of the present disclosure, the discontinuous cut detector 112, the gradual change detector 114 and the controller 118 of the video segmentation system 100 collectively control the functioning of the evaluation module 202, the determination module 204, the computational module 206 and the identification module 208. In an embodiment of the present disclosure, the memory 120 controls the functioning of the storage module 210.
The rate control model dynamically adjusts itself based on the data and/or the segmented video transmitted by the transmission module 212. For example, the frames may have the one or more attributes including scene change, fade-in, fade-out and dissolve in a video sequence. These one or more attributes may change the quality/stability of the rate control model. Thus, the identification of these one or more attributes stabilizes the rate control model. For example, if the present frame is not an intra frame, the present frame is announced as an intra frame and the QStep for this intra frame is found from the bitrate model (the rate control model). However, if the present frame is the intra frame, then the steps of an open GOP structure (bitrate adaptation model) are followed. In another embodiment of the present disclosure, if the present frame is not the intra frame, the QStep found by the bitrate adaptation model for the present frame is decreased by 25%. If the present frame is the intra frame, the bitrate adaptation model is continued.
In an embodiment of the present disclosure, no scene change is entertained if the scene change comes within five frames of the previous scene change. In an embodiment of the present disclosure, once the scene is decided as a changed scene, an I-picture is inserted and a new GOP is started for a flexible GOP. Further, the QP for this I-picture is fetched through an I-frame model. In addition, for a strict GOP, no I-frame is inserted and the QP of the P-frame is reduced by 25%. In an embodiment of the present disclosure, the rate control model may not be reset in any of these cases (Scene Change or Fading). In an embodiment of the present disclosure, the bitrate window is increased up to 2-5 seconds. In an embodiment of the present disclosure, the QP is allocated based on the previous I-frame to current I-frame SAND rather than using the 75 percent metric. In an embodiment of the present disclosure, the previous/upcoming I-frame is checked for the QP decision.
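A minimal sketch of the scene-change suppression rule above (ignore a scene change arriving within five frames of the previous one); the function name and the exact boundary handling are assumptions, not from the source:

```python
MIN_SCENE_GAP = 5  # frames; "within five frames" per the rule above

def accept_scene_change(frame_idx, last_scene_change_idx):
    """Return True if a detected scene change should be honoured,
    i.e. it does not come within five frames of the previous one.
    last_scene_change_idx is None when no scene change has occurred yet."""
    if last_scene_change_idx is None:
        return True
    return frame_idx - last_scene_change_idx > MIN_SCENE_GAP
```

Suppressing closely spaced scene changes avoids inserting I-pictures in bursts, which would destabilize the bitrate model the section aims to protect.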
It may be noted that in FIG. 2, the video segmentation system 100 includes the evaluation module 202, the determination module 204, the computational module 206, the identification module 208, the storage module 210 and the transmission module 212; however, those skilled in the art would appreciate that the video segmentation system 100 may include more modules for determining the one or more attributes of the each frame of the video.
- lumawd: Number of pixels in a Luma row
- lumaht: Number of rows in a Luma frame
- chromawd: Number of pixels in a Chroma row
- chromaht: Number of rows in a Chroma frame
- frame: Present frame
- frame−1: Previous frame
- orgluma: Luma array of lumawd×lumaht (ensure that the array size is a multiple of 16)
- orgCb: Chroma (Cb) array of chromawd×chromaht (ensure that the array size is a multiple of 8)
- orgCr: Chroma (Cr) array of chromawd×chromaht (ensure that the array size is a multiple of 8)
- N: Histogram array, having 256 cells in a row and the total number of frames as rows
- h: Normalized absolute difference of histograms array, having the total number of frames as rows with a single cell in a row.
In an embodiment of the present disclosure, the following arrays are used to store the SAND values of the luma and the chroma components at the macroblock level. The arrays are illustrated as:
SANDsMBY: array of (lumawd/8 × lumaht/8)
SANDMBY: array of (lumawd/16 × lumaht/16)
SANDMBCb: array of (chromawd/8 × chromaht/8)
SANDMBCr: array of (chromawd/8 × chromaht/8)
The flowchart 300 initiates at step 302. At step 304, the discontinuous cut detector 112 and the gradual change detector 114 of the video segmentation system 100 read the one or more pixel values of the present frame. Further, the discontinuous cut detector 112 and the gradual change detector 114 arrange the Luma pixels in orgluma, the Chroma (Cb) pixel values in orgCb and the Chroma (Cr) pixel values in orgCr (illustrated above as assumptions). If the resolution is a non-multiple of 16, the pixel values of the frame's last macroblock column/row are repeated through to the last remaining column/row, respectively.
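The row/column repetition for non-multiple-of-16 resolutions can be sketched as follows; the function and variable names are illustrative assumptions, not from the source:

```python
def pad_to_multiple(plane, mult=16):
    """Pad a 2-D list of pixel values so both dimensions are a
    multiple of `mult`, repeating the last column and then the
    last row, as described above."""
    # Repeat the last pixel of each row to fill the width.
    padded = [row + [row[-1]] * (-len(row) % mult) for row in plane]
    # Repeat the (already padded) last row to fill the height.
    while len(padded) % mult:
        padded.append(list(padded[-1]))
    return padded

# A 3x3 plane padded to 16x16, as for a tiny Luma plane.
plane = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
padded = pad_to_multiple(plane)
```

The same helper would apply to the chroma planes with `mult=8`, matching the 8-pixel chroma block size assumed above.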
At step 306, the evaluation module 202 evaluates the SAND values from the Luma components of the present frame. The steps that are followed for each 8×8 block of the Luma components are as follows: At the first step, the evaluation module 202 evaluates an average of the 8×8 block from the orgluma. At the second step, the evaluation module 202 subtracts the average from each value of the 8×8 block. At the third step, the evaluation module 202 evaluates a sum of the absolute values of the average-subtracted 8×8 block.
In addition, the evaluation module 202 evaluates the SAND values from the chroma components of the present frame. The steps that are followed for each 8×8 block of the chroma components (Cb and Cr, separately) are as follows: At the first step, the evaluation module 202 evaluates the average of the 8×8 block from orgCb/orgCr. At the second step, the evaluation module 202 subtracts the average from each value of the 8×8 block. At the third step, the evaluation module 202 evaluates a sum of the absolute values of the average-subtracted 8×8 block.
At step 308, the storage module 210 stores the SAND values from the luma components in the SANDsMBY array and the SAND values from the chroma components in the SANDMBCb/SANDMBCr array. In an embodiment of the present disclosure, the step 304, the step 306 and the step 308 are repeated for the remaining three 8×8 blocks of the Luma macroblock. Accordingly, all the four SAND values are added and stored in the SANDMBY array.
At step 310, the evaluation module 202 determines the sum of all values in the SANDMBY array and the storage module 210 stores the sum in SAND[0][frame] for the present frame. In addition, the evaluation module 202 determines the sum of all values in the SANDMBCb array and the storage module 210 stores the sum in SAND[1][frame] for the present frame. Moreover, the evaluation module 202 determines the sum of all values in the SANDMBCr array and the storage module 210 stores the sum in SAND[2][frame].
At step 312, the determination module 204 determines the histogram values of the present frame. In an embodiment, the number of pixels in the present frame having the same value is identified. The pixel values vary from 0 to 255. The numbers of pixels are stored in the N array. This process is performed for the Luma values only. Let N[present frame] be the histogram of the present frame. It has 256 values. The 0th value is the number of pixels in the frame which are ‘0’; the 1st value is the number of pixels in the frame which are ‘1’, and the like.
Further, the computational module 206 computes the difference between the histograms of the present frame and the previous frame. The difference is computed as follows:
N[present frame] − N[previous frame], present frame > 0
In addition, the computational module 206 adds the absolute values of these differences. Further, the computational module 206 computes a normalized value by dividing the sum by (lumawd×lumaht).
The storage module 210 stores the normalized value in the h array for the present frame, by performing the below calculation:
h[present frame] = Σ|N[present frame] − N[previous frame]| / (lumawd × lumaht)
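The luma histogram and the normalized absolute histogram difference h described above can be sketched as follows; the function names and the tiny example frames are illustrative assumptions:

```python
def luma_histogram(pixels):
    """256-bucket histogram of 8-bit luma values, as described above."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1  # increment the bucket for this pixel value
    return hist

def normalized_hist_diff(hist_now, hist_prev, lumawd, lumaht):
    """h[present frame]: sum of absolute per-bucket differences,
    normalized by the frame size (lumawd x lumaht)."""
    total = sum(abs(a - b) for a, b in zip(hist_now, hist_prev))
    return total / (lumawd * lumaht)

# Two tiny 2x2 'frames': identical frames give h = 0; when every
# pixel moves to a different bucket, h reaches its maximum of 2.
prev = luma_histogram([0, 0, 255, 255])
curr = luma_histogram([10, 10, 200, 200])
h = normalized_hist_diff(curr, prev, 2, 2)
```

Because each changed pixel leaves one bucket and enters another, h ranges from 0 (identical histograms) to 2 (completely disjoint histograms), which is consistent with thresholds such as 0.26 and 0.7 used in the filters below.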
At step 314, the computational module 206 checks if the present frame is a first frame. If the present frame is the first frame, then the computational module 206 repeats the steps 306, 308, 310 and 312 for each macroblock of the frame. If the present frame is not the first frame, then the computational module 206 employs a list of certain conditions to determine the SAND values after one frame.
At step 316, the computational module 206 checks if h[frame] > 0.1. If h[frame] > 0.1, then at step 318, the computational module 206 computes an accelerated histogram value, histacc, by subtracting h of the previous frame from h of the present frame. The calculation is described below:
histacc = h[frame] − h[frame−1]
However, if h[frame] < 0.1, then the computational module 206 declares the frame as normal. At step 320, the computational module 206 checks if h[frame] > 0.26 and |histacc| > 0.18. If the condition is satisfied, then the frame is declared as the scene change. However, if the stated condition is not met, then at step 322, the computational module 206 checks if histacc > 0.1. If the condition is not satisfied, then the computational module 206 declares the frame as normal. However, if the condition is satisfied, then at step 324, a set of calculations is performed in a sequential manner. The set of calculations includes computations and is stated below:
At step 326, the computational module 206 checks if any flag of the SANDY, SANDCb and SANDCr is set. If no flag of the SANDY, SANDCb and SANDCr is set, then the computational module 206 declares the frame as normal. However, if any flag of the SANDY, SANDCb and SANDCr is set, then at step 328, the computational module 206 computes SANDoverall. The SANDoverall is computed as:
In an embodiment of the present disclosure, the computational module 206 employs a list of certain conditions to declare a frame as a ‘Scene Change’ or ‘Fading’. These conditions are hereinafter stated as Filter 1, Filter 2, Filter 3 and so on. The calculations to declare the frame as the ‘Scene Change’, the ‘Fading’ and the like are as follows:
If SANDYflag ∥ SANDCbflag ∥ SANDCrflag, calculate SANDoverall
At step 330, the computational module 206 checks if the filter 1 is satisfied. The calculation of filter 1 is described below:
If SANDoverall ≥ 1 and (histacc > 0.1 or h[frame] > 0.1)
If the filter 1 is not satisfied, then the computational module 206 declares the frame as normal. However, if the filter 1 is satisfied, then at step 332, the computational module 206 checks if the filter 2 is satisfied. The calculation of filter 2 is described below:
If SANDY > 60, or
If (SANDY > 90 and SANDCb > 90 and SANDCr > 90), or
If (SANDY < 0 or SANDCb < 0 or SANDCr < 0), or
If (SANDCb > 40 and SANDCr > 40)
If the filter 2 is not satisfied, then at step 334, the computational module 206 checks if SANDY > 10. If SANDY > 10, then the computational module 206 declares the frame as the fade-out. However, if SANDY < 10, then the computational module 206 declares the frame as normal. However, if the filter 2 is satisfied, then at step 336, the computational module 206 checks if the filter 3 is satisfied. The calculation of filter 3 is described below:
If histacc > 0.8, or
If (SANDY > 60), or
If (SANDY > 90 and SANDCb > 90 and SANDCr > 90), or
If (h[frame] > 0.25 and SANDY < 0), or
If (h[frame] > 0.25 and SANDCb < 0), or
If (h[frame] > 0.25 and SANDCr < 0), or
If (ABS(histacc) > 0.7), or
If (h[frame] > 0.7 and |SANDCb| > 40 and |SANDCr| > 40)
If the filter 3 is satisfied, then the computational module 206 declares the frame as the scene change. However, if the filter 3 is not satisfied, then at step 338, the computational module 206 checks if the filter 4 is satisfied. The calculation of filter 4 is described below:
If (SANDY > 40 and (SANDCb > 60 or SANDCr > 60)), or
If (SANDY > 60), or
If (SANDCb > 97 and SANDCr > 97), or
If (SANDCb > 200), or
If (SANDCr > 200)
If the filter 4 is satisfied, then the computational module 206 declares the frame as the fade-in. However, if the filter 4 is not satisfied, then at step 340, the computational module 206 checks if the filter 5 is satisfied. The calculation of filter 5 is described below:
If SANDY ≥ −10 and SANDCb ≤ −10 and SANDCr ≤ −10
If the filter 5 is satisfied, then the computational module 206 declares the frame as the fade-out. However, if the filter 5 is not satisfied, then at step 342, the computational module 206 checks if the filter 6 is satisfied. The calculation of filter 6 is described below:
If histacc > 0.16
If the filter 6 is satisfied, then the computational module 206 declares the frame as normal. However, if the filter 6 is not satisfied, then at step 344, the computational module 206 checks if the filter 7 is satisfied. The calculation of filter 7 is described below: if ABS(SANDY) > 10
If the filter 7 is satisfied, then the computational module 206 declares the frame as the fade-out. However, if the filter 7 is not satisfied, the computational module 206 declares the frame as normal. The flowchart 300 terminates at step 346.
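Under the stated thresholds, the filter chain of steps 330 through 344 can be sketched as a sequence of checks. The SANDoverall, flag and SANDY/SANDCb/SANDCr computations themselves are not reproduced in the text above, so this sketch takes them as already-computed inputs; all names are illustrative assumptions:

```python
def classify_frame(sand_overall, sand_y, sand_cb, sand_cr, histacc, h_frame):
    """Walk the filter chain described above and return the frame label.

    Filters 1-7 are applied in order; a frame failing filter 1 is
    'normal', and later filters pick between 'scene change',
    'fade-in', 'fade-out' and 'normal'.
    """
    # Filter 1
    if not (sand_overall >= 1 and (histacc > 0.1 or h_frame > 0.1)):
        return "normal"
    # Filter 2 (failing it falls through to the SANDY > 10 check of step 334)
    if not (sand_y > 60
            or (sand_y > 90 and sand_cb > 90 and sand_cr > 90)
            or (sand_y < 0 or sand_cb < 0 or sand_cr < 0)
            or (sand_cb > 40 and sand_cr > 40)):
        return "fade-out" if sand_y > 10 else "normal"
    # Filter 3
    if (histacc > 0.8 or sand_y > 60
            or (sand_y > 90 and sand_cb > 90 and sand_cr > 90)
            or (h_frame > 0.25 and (sand_y < 0 or sand_cb < 0 or sand_cr < 0))
            or abs(histacc) > 0.7
            or (h_frame > 0.7 and abs(sand_cb) > 40 and abs(sand_cr) > 40)):
        return "scene change"
    # Filter 4
    if ((sand_y > 40 and (sand_cb > 60 or sand_cr > 60))
            or sand_y > 60
            or (sand_cb > 97 and sand_cr > 97)
            or sand_cb > 200 or sand_cr > 200):
        return "fade-in"
    # Filter 5
    if sand_y >= -10 and sand_cb <= -10 and sand_cr <= -10:
        return "fade-out"
    # Filter 6
    if histacc > 0.16:
        return "normal"
    # Filter 7
    if abs(sand_y) > 10:
        return "fade-out"
    return "normal"
```

For instance, a frame with a large histogram acceleration (histacc above 0.8) is labelled a scene change by filter 3, while moderate chroma activity without a histogram jump reaches filter 4 and is labelled a fade-in.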
In an embodiment of the present disclosure, the histogram and histogram gradients of the present frame are stored back as the previous frame's values and serve as a feedback loop. In simpler terms, the present frame's data is treated as the previous frame's data when the next frame comes.
N[previous frame] ← N[present frame]
h[previous frame] ← h[present frame]
In an embodiment of the present disclosure, the video segmentation system 100 performs perceptual video quality control by an Adaptive QP control over macroblock approach. In this approach, the highest SANDY (SANDmax) among all macroblocks in the Luma frame is found out from the SANDMBY array in the SAND module. If SANDmax < 1000, then the complete MBQS array is updated with the current SLICEQS value. If this condition is not met, the following steps are followed. Firstly, the average
SANDavg = (SANDmax / number of MBs)
is determined. Then, MBQS[index] is calculated according to the below given formula. The index represents each macroblock.
Furthermore, the MBQS [index] is clipped to a minimum of 1.25 or a maximum of 63.4375. The above stated steps are repeated for all macroblocks. Once the stated steps are performed, the each macroblock has its own QStep value and the each macroblock is separately quantized using this parameter. This helps in using the bits efficiently in the bitrate control model of the encoder.
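The parts of this computation that are stated explicitly — the SANDavg average and the [1.25, 63.4375] clipping range — can be sketched as follows; the per-macroblock MBQS formula itself is not reproduced in the text, so it is deliberately omitted, and the names are illustrative:

```python
QSTEP_MIN, QSTEP_MAX = 1.25, 63.4375  # clipping range stated above

def sand_avg(sand_max, num_mbs):
    """SANDavg = SANDmax / number of macroblocks, as stated above."""
    return sand_max / num_mbs

def clip_qstep(qstep):
    """Clip a per-macroblock QStep into the stated [1.25, 63.4375] range,
    so each macroblock can be quantized with its own bounded QStep."""
    return min(max(qstep, QSTEP_MIN), QSTEP_MAX)
```

Each macroblock's clipped QStep would then be used to quantize that macroblock separately, which is how the bits are spent adaptively in the bitrate control model.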
It may be noted that the flowchart 300 is explained to have the above stated process steps; however, those skilled in the art would appreciate that the flowchart 300 may have more or fewer process steps which may enable all the above stated embodiments of the present disclosure.
It may be noted that the flowchart 400 is explained to have the above stated process steps; however, those skilled in the art would appreciate that the flowchart 400 may have more or fewer process steps which may enable all the above stated embodiments of the present disclosure.
It may be noted that the flowchart 500 is explained to have the above stated process steps; however, those skilled in the art would appreciate that the flowchart 500 may have more or fewer process steps which may enable all the above stated embodiments of the present disclosure.
From the perspective of this disclosure, the control circuitry module 602 includes any processing circuitry or processor operative to control the operations and performance of the communication device 600. For example, the control circuitry module 602 may be used to run operating system applications, firmware applications, media playback applications, media editing applications, or any other application. In an embodiment, the control circuitry module 602 drives a display and processes inputs received from a user interface.
From the perspective of this disclosure, the storage module 604 includes one or more storage mediums including a hard drive, a solid state drive, flash memory, permanent memory such as ROM, any other suitable type of storage component, or any combination thereof. The storage module 604 may store, for example, media data (e.g., music and video files) and application data (e.g., for implementing functions on the communication device 600).
From the perspective of this disclosure, the I/O circuitry module 606 may be operative to convert (and encode/decode, if necessary) analog signals and other signals into digital data. In an embodiment, the I/O circuitry module 606 may also convert the digital data into any other type of signal and vice-versa. For example, the I/O circuitry module 606 may receive and convert physical contact inputs (e.g., from a multi-touch screen), physical movements (e.g., from a mouse or sensor), analog audio signals (e.g., from a microphone), or any other input. The digital data may be provided to and received from the control circuitry module 602, the storage module 604, or any other component of the communication device 600.
It may be noted that the I/O circuitry module 606 is illustrated in FIG. 6 as a single component of the communication device 600; however those skilled in the art would appreciate that several instances of the I/O circuitry module 606 may be included in the communication device 600.
The communication device 600 may include any suitable interface or component for allowing a user to provide inputs to the I/O circuitry module 606. The communication device 600 may include any suitable input mechanism. Examples of the input mechanism include, but are not limited to, a button, a keypad, a dial, a click wheel, and a touch screen. In an embodiment, the communication device 600 may include a capacitive sensing mechanism, or a multi-touch capacitive sensing mechanism.
In an embodiment, the communication device 600 may include specialized output circuitry associated with output devices such as, for example, one or more audio outputs. The audio output may include one or more speakers built into the communication device 600, or an audio component that may be remotely coupled to the communication device 600.
The one or more speakers can be mono speakers, stereo speakers, or a combination of both. The audio component can be a headset, headphones or ear buds that may be coupled to the communication device 600 with a wire or wirelessly.
In an embodiment, the I/O circuitry module 606 may include display circuitry for providing a display visible to the user. For example, the display circuitry may include a screen (e.g., an LCD screen) that is incorporated in the communication device 600.
The display circuitry may include a movable display or a projecting system for providing a display of content on a surface remote from the communication device 600 (e.g., a video projector). The display circuitry may include display driver circuitry, circuitry for driving display drivers or both. The display circuitry may be operative to display content. The display content can include media playback information, application screens for applications implemented on the electronic device, information regarding ongoing communications operations, information regarding incoming communications requests, or device operation screens under the direction of the control circuitry module 602. Alternatively, the display circuitry may be operative to provide instructions to a remote display.
In addition, the communication device 600 includes the communication circuitry module 608. The communication circuitry module 608 may include any suitable communication circuitry operative to connect to a communication network and to transmit communications (e.g., voice or data) from the communication device 600 to other devices within the communications network. The communication circuitry module 608 may be operative to interface with the communication network using any suitable communication protocol. Examples of the communication protocol include, but are not limited to, Wi-Fi, Bluetooth®, radio frequency systems, infrared, LTE, GSM, GSM plus EDGE, CDMA, and quadband.
In an embodiment, the communication circuitry module 608 may be operative to create a communications network using any suitable communications protocol. For example, the communication circuitry module 608 may create a short-range communication network using a short-range communications protocol to connect to other devices. For example, the communication circuitry module 608 may be operative to create a local communication network using the Bluetooth® protocol to couple the communication device 600 with a Bluetooth® headset.
It may be noted that the computing device is shown to have only one communication operation; however, those skilled in the art would appreciate that the communication device 600 may include one or more instances of the communication circuitry module 608 for simultaneously performing several communication operations using different communication networks. For example, the communication device 600 may include a first instance of the communication circuitry module 608 for communicating over a cellular network, and a second instance of the communication circuitry module 608 for communicating over Wi-Fi or using Bluetooth®.
In an embodiment, the same instance of the communication circuitry module 608 may be operative to provide for communications over several communication networks. In an embodiment, the communication device 600 may be coupled to a host device for data transfers, synching the communication device 600, software or firmware updates, providing performance information to a remote source (e.g., providing riding characteristics to a remote server) or performing any other suitable operation that may require the communication device 600 to be coupled to a host device. Several computing devices may be coupled to a single host device using the host device as a server. Alternatively or additionally, the communication device 600 may be coupled to several host devices (e.g., for each of the plurality of the host devices to serve as a backup for data stored in the communication device 600).
The above-stated method and system involve the calculation of only two components (the SAND calculation and the histogram calculation) for determining the one or more attributes of the frame, which makes this approach significantly less complex. Moreover, the above-stated method and system are more effective in detecting scene changes. Further, the above-stated method and system transmit the one or more attributes of each frame to the rate control model, which dynamically adjusts itself according to the one or more attributes, thereby resulting in enhanced quality of the video.
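The two-component evaluation described above can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the function names, the 256-bin Luma histogram, and the reading of the SAND component as the sum of per-macroblock averages of mean-removed absolute residuals (the "first average", "second pixel values" and "sum of average" of the claims) are assumptions made for the sketch.

```python
import numpy as np

MB = 8  # macroblock size stated in the claims (8x8 pixels)

def luma_histogram(y_plane, bins=256):
    """256-bin histogram of the Luma (Y) plane, normalized by pixel count."""
    hist, _ = np.histogram(y_plane, bins=bins, range=(0, 256))
    return hist / y_plane.size

def sand_component(plane):
    """Per-macroblock mean-removed activity: subtract each 8x8 block's
    average (the 'first average') from its pixels to get the 'second
    pixel values', then sum the per-block averages of the absolute
    residuals over the whole plane (the 'sum of average')."""
    h, w = plane.shape
    h, w = h - h % MB, w - w % MB  # drop partial blocks at the edges
    blocks = plane[:h, :w].reshape(h // MB, MB, w // MB, MB).swapaxes(1, 2)
    means = blocks.mean(axis=(2, 3), keepdims=True)   # first average
    residuals = np.abs(blocks - means)                # second pixel values
    return residuals.mean(axis=(2, 3)).sum()          # sum of average

def frame_metrics(prev_yuv, curr_yuv):
    """Histogram distance on Y, plus SAND change summed over Y, U and V.
    Each argument is a (Y, U, V) tuple of 2-D arrays."""
    hist_diff = np.abs(
        luma_histogram(curr_yuv[0]) - luma_histogram(prev_yuv[0])).sum()
    sand_diff = sum(abs(sand_component(c) - sand_component(p))
                    for p, c in zip(prev_yuv, curr_yuv))
    return hist_diff, sand_diff
```

Comparing hist_diff against a large threshold flags abrupt scene changes, while a sustained moderate sand_diff between consecutive frames suggests a gradual transition such as a fade or dissolve.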
While the disclosure has been presented with respect to certain specific embodiments, it will be appreciated that many modifications and changes may be made by those skilled in the art without departing from the spirit and scope of the disclosure. It is intended, therefore, by the appended claims to cover all such modifications and changes as fall within the true spirit and scope of the disclosure.
Claims (10)
1. A method for identifying one or more attributes of each frame of a plurality of frames of a video for stabilizing a rate control of an encoder encoding each frame, the method comprising:
capturing, by a processor, a plurality of frames of a video signal, wherein each frame of the plurality of frames comprises a Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component, wherein each frame comprises macroblocks, each macroblock having dimension of 8×8 pixels;
evaluating, by the processor, pixel values of each macroblock of each frame;
categorizing, by the processor, the pixel values of each macroblock of each frame in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
calculating, by the processor, a histogram for the pixel values in each frame, wherein the histogram is calculated for previous frame and current frame, and wherein the histogram is calculated for the pixel values in each frame in the Luma (Y) component;
calculating, by the processor, a first average of the pixel values of each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
subtracting, by the processor, the first average of the pixel values of each macroblock of each previous frame and current frame from the pixel values of each macroblock of each previous frame and current frame, respectively, in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component to obtain second pixel values for each macroblock of each previous frame and current frame in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
determining, by the processor, a sum of average for the second pixel values for each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
and
identifying, by the processor, change in one or more attributes in each macroblock of each previous frame and current frame, wherein the one or more attributes are identified based on change in the histogram of the pixel values in each previous frame and current frame in the Luma (Y) component, and the sum of average for the second pixel values for each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component, and wherein the one or more attributes are utilized for stabilizing a rate control of an encoder encoding each frame for processing the video.
2. The method of claim 1, further comprising transmitting the one or more attributes to the encoder.
3. The method of claim 1, wherein the one or more attributes comprise at least one of scene change, fading-in, fading-out and dissolve.
4. The method of claim 1, further comprising storing the sum of average for the second pixel values in one or more arrays.
5. A video segmentation system for identifying one or more attributes of each frame of a plurality of frames of a video for stabilizing a rate control of an encoder encoding each frame, the video segmentation system comprising:
a processor; and
a memory coupled to the processor, wherein the processor is configured to execute program instructions stored in the memory to:
capture a plurality of frames of a video signal, wherein each frame of the plurality of frames comprises a Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component, wherein each frame comprises macroblocks, each macroblock having dimension of 8×8 pixels;
evaluate pixel values of each macroblock of each frame;
categorize the pixel values of each macroblock of each frame in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
calculate a histogram for the pixel values in each frame, wherein the histogram is calculated for previous frame and current frame, and wherein the histogram is calculated for the pixel values in each frame in the Luma (Y) component;
calculate a first average of the pixel values of each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
subtract the first average of the pixel values of each macroblock of each previous frame and current frame from the pixel values of each macroblock of each previous frame and current frame, respectively, in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component to obtain second pixel values for each macroblock of each previous frame and current frame in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
determine a sum of average for the second pixel values for each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
and
identify change in one or more attributes in each macroblock of each previous frame and current frame, wherein the one or more attributes are identified based on change in the histogram of the pixel values in each previous frame and current frame in the Luma (Y) component, and the sum of average for the second pixel values for each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component, and wherein the one or more attributes are utilized for stabilizing a rate control of an encoder encoding each frame for processing the video.
6. The video segmentation system of claim 5, wherein the memory is configured to store the one or more attributes of each frame.
7. The video segmentation system of claim 5, wherein the processor is configured to transmit the one or more attributes to the encoder.
8. The video segmentation system of claim 5, wherein the sum of average for the second pixel values is stored in one or more arrays.
9. A non-transitory computer readable medium embodying a program executable in a computing device for identifying one or more attributes of each frame of a plurality of frames of a video for stabilizing a rate control of an encoder encoding each frame, the program comprising:
a program code for capturing a plurality of frames of a video signal, wherein each frame of the plurality of frames comprises a Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component, wherein each frame comprises macroblocks, each macroblock having dimension of 8×8 pixels;
a program code for evaluating pixel values of each macroblock of each frame;
a program code for categorizing the pixel values of each macroblock of each frame in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
a program code for calculating a histogram for the pixel values in each frame, wherein the histogram is calculated for previous frame and current frame, and wherein the histogram is calculated for the pixel values in each frame in the Luma (Y) component;
a program code for calculating a first average of the pixel values of each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
a program code for subtracting the first average of the pixel values of each macroblock of each previous frame and current frame from the pixel values of each macroblock of each previous frame and current frame, respectively, in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component to obtain second pixel values for each macroblock of each previous frame and current frame in each of Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
a program code for determining a sum of average for the second pixel values for each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component;
and
a program code for identifying change in one or more attributes in each macroblock of each previous frame and current frame, wherein the one or more attributes are identified based on change in the histogram of the pixel values in each previous frame and current frame in the Luma (Y) component, and the sum of average for the second pixel values for each macroblock of each previous frame and current frame in each of the Luma (Y) component, Chroma-Cb (U) component and Chroma-Cr (V) component, and wherein the one or more attributes are utilized for stabilizing a rate control of an encoder encoding each frame for processing the video.
10. The program of claim 9, further comprising a program code for transmitting the one or more attributes to the encoder.
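Claim 3 above lists scene change, fading-in, fading-out and dissolve as the detectable attributes. One plausible way to separate an abrupt cut from a gradual transition is to look at how the per-frame Luma-histogram difference evolves over time: a cut shows as a single large spike, while a fade or dissolve shows as a sustained run of moderate differences. The sketch below is a hypothetical heuristic; the function name, the thresholds and the minimum run length are illustrative values, not taken from the disclosure.

```python
def classify_transition(hist_diffs, cut_thresh=0.5, fade_thresh=0.1, min_run=5):
    """Label each frame index in a sequence of per-frame Luma-histogram
    differences. A single spike at or above cut_thresh marks a scene
    change; a run of at least min_run consecutive moderate differences
    (>= fade_thresh) marks a fade or dissolve."""
    labels = ["none"] * len(hist_diffs)
    run_start = None  # index where the current moderate-difference run began
    for i, d in enumerate(hist_diffs):
        if d >= cut_thresh:
            labels[i] = "scene_change"
            run_start = None  # a cut interrupts any gradual transition
        elif d >= fade_thresh:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                # the whole run, once long enough, is a gradual transition
                for j in range(run_start, i + 1):
                    labels[j] = "fade_or_dissolve"
        else:
            run_start = None  # quiet frame ends the run
    return labels
```

An encoder's rate control could then, for example, force an intra frame at a "scene_change" label and relax bit allocation across a "fade_or_dissolve" run.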
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IN5842CH2013 | 2013-12-16 | ||
| IN5842/CHE/2013 | 2013-12-16 | ||
| PCT/IB2014/066939 WO2015092665A2 (en) | 2013-12-16 | 2014-12-16 | Method and system to detect and utilize attributes of frames in video sequences |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20160309157A1 (en) | 2016-10-20 |
| US10230955B2 (en) | 2019-03-12 |
Family
ID=53403836
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/105,558 (US10230955B2, active, adjusted expiration 2035-05-18) | Method and system to detect and utilize attributes of frames in video sequences | 2013-12-16 | 2014-12-16 |
Country Status (2)
| Country | Link |
|---|---|
| US (1) | US10230955B2 (en) |
| WO (1) | WO2015092665A2 (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9799373B2 (en) * | 2015-11-05 | 2017-10-24 | Yahoo Holdings, Inc. | Computerized system and method for automatically extracting GIFs from videos |
| US11568527B2 (en) * | 2020-09-24 | 2023-01-31 | Ati Technologies Ulc | Video quality assessment using aggregated quality values |
| US20220400244A1 (en) * | 2021-06-15 | 2022-12-15 | Plantronics, Inc. | Multi-camera automatic framing |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050025361A1 (en) | 2003-07-25 | 2005-02-03 | Sony Corporation And Sony Electronics Inc. | Video content scene change determination |
| US20090109341A1 (en) | 2007-10-30 | 2009-04-30 | Qualcomm Incorporated | Detecting scene transitions in digital video sequences |
| US20100253835A1 (en) | 2009-04-03 | 2010-10-07 | Samsung Electronics Co., Ltd. | Fade in/fade-out fallback in frame rate conversion and motion judder cancellation |
| US20100321584A1 (en) | 2004-07-28 | 2010-12-23 | Huaya Microelectronics (Shanghai), Inc. | System and Method For Accumulative Stillness Analysis of Video Signals |
| US20120281757A1 (en) | 2011-05-04 | 2012-11-08 | Roncero Izquierdo Francisco J | Scene change detection for video transmission system |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2015092665A3 (en) | 2015-09-17 |
| WO2015092665A2 (en) | 2015-06-25 |
| US20160309157A1 (en) | 2016-10-20 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: RIVERSILICA TECHNOLOGIES PVT LTD, INDIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MUTHU, ESSAKI P; KAMATH, JAGADISH K; BUDIHAL, JAYASHREE. REEL/FRAME: 038957/0899. Effective date: 20160615 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, MICRO ENTITY (ORIGINAL EVENT CODE: M3551); ENTITY STATUS OF PATENT OWNER: MICROENTITY. Year of fee payment: 4 |