WO2007057868A2 - Method and device for detecting pulldown operations performed on a video sequence - Google Patents

Publication number
WO2007057868A2
WO2007057868A2 PCT/IB2006/054342 IB2006054342W WO2007057868A2 WO 2007057868 A2 WO2007057868 A2 WO 2007057868A2 IB 2006054342 W IB2006054342 W IB 2006054342W WO 2007057868 A2 WO2007057868 A2 WO 2007057868A2
Authority
WO
WIPO (PCT)
Prior art keywords
field
block
pulldown
video sequence
fields
Prior art date
Application number
PCT/IB2006/054342
Other languages
French (fr)
Other versions
WO2007057868A3 (en
Inventor
Stephan Mietens
Original Assignee
Koninklijke Philips Electronics N.V.

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0112 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level one of the standards corresponding to a cinematograph film standard
    • H04N7/0115 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level one of the standards corresponding to a cinematograph film standard with details on the detection of a particular field or frame pattern in the incoming video signal, e.g. 3:2 pull-down pattern
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142 Detection of scene cut or scene change
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/147 Scene change detection

Definitions

  • The present invention relates to the field of pulldown detection in a video sequence and, more particularly, to a pulldown detection method for detecting pulldown operations performed on a video sequence. It also relates to a pulldown detection device for carrying out said method, and to a compression device for compressing a video sequence and comprising such a pulldown detection device.
  • A television image signal generally employs an interlaced video format. Interlaced video is based on a decomposition of each picture frame into two fields commonly known as the top field and the bottom field.
  • A display monitor comprises a series of horizontal lines and vertical lines. The top fields are displayed on odd-numbered horizontal lines, i.e. 1, 3, 5, ..., and the bottom fields are displayed on even-numbered horizontal lines, i.e. 2, 4, ....
  • NTSC: National Television System Committee
  • PAL: Phase Alternating Line
  • Video signals may originate from various sources.
  • Video material may have originated from a film source, or may have been recorded using an interlaced video camera.
  • Film data may be shot at 24 frames per second, or at 25 frames per second in Europe. The film data are therefore scaled in frequency from 24 or 25 frames per second to the NTSC or PAL standard, i.e. 60 or 50 fields per second.
  • 2-2 pulldown is a method for transferring film material that is at 25 frames per second to PAL video at 25 frames per second, i.e. 50 fields per second, each film frame being mapped into two fields.
  • 3-2 pulldown and 2-3 pulldown are methods for transferring film material that is at 24 frames per second to NTSC video at 30 frames per second. Fitting 24 film frames into 30 video frames requires that four film frames be converted to five video frames.
  • FIG. 1 is an illustration of a 3-2 pulldown conversion method known in the art.
  • Film data 10 are made of film frames (A, B, C, D). Twenty-four film frames are recorded each second.
  • The pulldown operation converts four film frames (A, B, C, D) into five interlaced video frames (I1, I2, I3, I4, I5) of video data 12.
  • In a 3-2 pulldown operation, one film frame (A, C) is mapped into three fields (A1, A2, A3, C1, C2, C3) and the following film frame (B, D) is mapped into two fields (B1, B2, D1, D2).
  • In a 2-3 pulldown operation, a first film frame is mapped into two fields and the following film frame is mapped into three fields.
  • A 2-3 pulldown may be considered as a 3-2 pulldown with a different pulldown phase.
  • Each field may be a top field (At, Bt, Ct, Dt) or a bottom field (Ab, Bb, Cb, Db).
  • The top fields are to be displayed on odd horizontal lines and the bottom fields on even horizontal lines.
  • When a film frame is mapped into three fields, two of the three fields are similar. Two successive fields provide a video frame.
  • A first video frame I1 comprises a first top field 1t and a first bottom field 1b.
  • The first video frame I1 comprises the top and bottom fields (At, Ab) of the first film frame A.
  • A second video frame I2 comprises a second top field 2t from the first film frame A and a second bottom field 2b from a second film frame B.
  • Four film frames (A, B, C, D) are mapped into five video frames (I1, I2, I3, I4, I5).
  • The four film frames (A, B, C, D) and the five video frames (I1, I2, I3, I4, I5) may both be displayed during one sixth of a second (4/24 = 5/30), so as to preserve the length of the material to be converted.
  • Each video frame (I1, I2, I3, I4, I5) consequently lasts one thirtieth of a second.
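The 3-2 mapping described for FIG. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pulldown_32` and `pair_into_video_frames` are hypothetical, and fields are modelled as `(film_frame, parity)` tuples rather than pixel data.

```python
def pulldown_32(film_frames):
    """Map film frames to interlaced fields with a 3-2 cadence.

    Returns a list of (film_frame, parity) tuples. Parity alternates
    between 't' (top) and 'b' (bottom) over the whole field sequence.
    """
    fields = []
    for index, frame in enumerate(film_frames):
        # Frames at even positions yield three fields, the others two.
        repeat = 3 if index % 2 == 0 else 2
        for _ in range(repeat):
            parity = "t" if len(fields) % 2 == 0 else "b"
            fields.append((frame, parity))
    return fields


def pair_into_video_frames(fields):
    """Group successive fields two by two into video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_32(["A", "B", "C", "D"])
video_frames = pair_into_video_frames(fields)
```

With four film frames the sketch produces ten fields and five video frames; the second video frame mixes a top field from A with a bottom field from B, and the third field repeats the first one.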
  • Video data at a video standard may for example be broadcast, downloaded or stored on a medium, e.g. a DVD (Digital Versatile Disk). It may be desirable to compress a video sequence. For that purpose, it may be useful to detect whether those video data result from a pulldown operation. Taking the example of a 3-2 pulldown, some video fields, e.g. the first top field 1t of the first video frame I1 and the second top field 2t of the second video frame I2, are similar. A volume reduction operation comprising removing some duplicated fields, e.g. the second top field 2t, would save bandwidth or memory without loss of information. A pulldown flag may subsequently be set to indicate the removal of the duplicated fields.
  • FIG. 2 illustrates an example of a method for processing picture data.
  • A film sequence made of film frames (A, B, ...) is converted (step 21, or PLD) into a video sequence of video fields (1t, 1b, 2t, 2b, ...).
  • The video sequence complies with a television standard, e.g. NTSC, and is broadcast (step 22, or BRD).
  • The video sequence of video fields (1t, 1b, 2t, 2b, ...) is received (step 23, or RVD).
  • A detection of a pulldown operation is performed (step 24, or PDT). If a pulldown is detected, duplicated fields are removed (step 25, or DFR). If no pulldown is detected, one can presume that the video sequence has been recorded with an interlaced video camera. In both cases, an encoding step is performed (step 26, or CMP).
  • The compressed video data may subsequently be stored (step 27, or STO).
  • The pulldown conversion may be followed by a setting of a flag to provide information about the source type.
  • On the receiver side, no pulldown detection operation is theoretically required, since the flag indicates whether a pulldown operation has been performed.
  • However, the knowledge about the source type may be unavailable at the receiver side.
  • Information about the source type is unavailable with analog broadcast.
  • With digital broadcast, the presence of a set flag may fail to prevent false film to video conversion in case of phase misalignment. The absence of the flag does not necessarily mean that the source was not film. As a consequence, a pulldown detection may be performed on unknown data.
  • Pulldown detection may be followed by a removal of the duplicated fields, as shown in FIG. 2.
  • The pulldown detection may make it possible to restore an original 24 frames per second sequence by using field interleaving, thus providing a relatively good image quality.
  • The pulldown detection may also make it possible to detect video frames comprising fields originating from distinct film frames. The effect caused by the pulldown conversion may subsequently be corrected.
  • US Patent application 2004/0189877 describes a method for detecting a pulldown operation based on motion detection.
  • US Patent application 2005/0062891 describes a method for detecting a 3-2 pulldown operation based on pixel comparisons. There is a need for a method and an apparatus allowing a faster pulldown detection.
  • The present invention provides a pulldown detection method for detecting pulldown operations performed on a video sequence.
  • For a current field of the video sequence, a plurality of block parameters are calculated, each block parameter being calculated from pixel values of a corresponding block of pixels of the current field.
  • Each block parameter of the plurality of block parameters of the current field is then compared to a corresponding block parameter of a plurality of block parameters of a previous field.
  • A content-change strength (CCS) is computed from results of the comparisons, and duplicated fields are detected based at least on the computed content-change strength.
  • The pulldown detection is performed by comparing fields via block parameters, i.e. each field is represented by a plurality of block parameters. Each block parameter is obtained from pixels of the corresponding block.
  • The blocks of pixels may for example comprise 64 or 256 pixels.
  • In the Prior Art, by contrast, the pulldown detection is performed by comparing fields on a pixel-by-pixel basis.
  • The method according to an aspect of the present invention hence makes it possible to perform comparisons between fields, and therefore pulldown detection, at a faster speed than in the Prior Art.
  • The pulldown detection methods of the Prior Art are based on pixel-by-pixel comparisons. It is therefore necessary to store at least two fields to allow the comparison between those two fields. At least two fields need to be accessed at once from memory in order to compute differences.
  • In the present invention, the comparisons are performed on two pluralities of block parameters. Since each block parameter corresponds to one block, i.e. a relatively high number of pixels, the present invention requires less data to be stored than the Prior Art. Typically, a first plurality of block parameters corresponding to the previous field, the current field, and a second plurality of block parameters corresponding to the current field may be stored.
  • In the Prior Art, the compared frames or fields need to be accessed at once. Furthermore, each frame or field is accessed twice: a first time as a previous field and a second time as a current field.
  • The method according to an aspect of the invention requires a single access to each field, for calculating the block parameters. As a consequence, with the same memory as used in the Prior Art, several pluralities of block parameters may be stored, thus making it possible to perform different kinds of field comparisons, e.g. a current field is compared with the penultimate field immediately preceding the current field and with the field prior to the penultimate field.
  • The present invention hence makes it possible to extract more information from the video sequence.
  • Each block parameter may comprise a single bit, called "block bit" in the present description.
  • A feature, e.g. edge detection, may be computed per block.
  • The resulting block bit indicates the existence or absence of the tested feature in the block.
  • The block parameters may be re-used for other purposes, e.g. content analysis.
  • Edge detection may be performed by an already existing device such as a sharpness enhancer device.
  • Each plurality of block parameters hence comprises a bit plane.
  • Bit planes are simple to compare.
  • The comparison between two bit planes may yield a difference bit plane.
  • A bit of the difference bit plane having a value equal to '1' indicates a block bit change between the two compared fields, i.e. a certain amount of pixels of the block corresponding to the non-zero bit have different values in the current field and in the previous field.
  • The content-change strength (CCS) may for example be computed by counting a number of non-zero bits within the difference bit plane. If the resulting sum is below a predetermined threshold, e.g. eight non-zero bits, the CCS may be equal to zero, thus indicating little change between the two fields.
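As a minimal sketch of the CCS computation just described, bit planes can be modelled as flat lists of 0/1 block bits. The function name `content_change_strength` is hypothetical, and returning the raw count when the threshold is reached is an assumption; the text only states that the CCS may be zero below the threshold.

```python
def content_change_strength(current_plane, previous_plane, threshold=8):
    """Count changed block bits between two bit planes.

    Both planes are flat sequences of 0/1 block bits of equal length.
    Counts below `threshold` are reported as 0 (little change).
    Returning the raw count otherwise is an assumption of this sketch.
    """
    if len(current_plane) != len(previous_plane):
        raise ValueError("bit planes must have the same number of block bits")
    # XOR of two block bits is 1 exactly when the bit changed,
    # so the sum is the number of non-zero bits of the difference bit plane.
    changed = sum(c ^ p for c, p in zip(current_plane, previous_plane))
    return 0 if changed < threshold else changed


previous = [0] * 100
slightly_changed = [1] * 3 + [0] * 97
strongly_changed = [1] * 10 + [0] * 90
ccs_small = content_change_strength(slightly_changed, previous)
ccs_large = content_change_strength(strongly_changed, previous)
```

Here three changed block bits fall below the example threshold of eight, whereas ten changed bits are reported as such.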
  • Alternatively, each block parameter may provide more information than a single bit and be encoded over several digits.
  • The pulldown detection method makes it possible to detect duplicated fields. Fields may be repeated in the video sequence because of the pulldown, or because the video sequence shows little movement. The video sequence may even show still pictures.
  • The detected duplicated fields may all be removed, thus allowing an efficient volume reduction.
  • A flag may be set to indicate a removal of a given field or a removal of a given sequence of fields.
  • Alternatively, a pattern of a pulldown may be detected, e.g. a 3-2 pulldown, and the field removal may be automatic, e.g. one field is removed every five fields.
  • A 3-2 flag may for example be set to indicate such an automatic field removal.
  • Such volume reductions cause no loss of information.
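The field removal can be illustrated as follows. This is a sketch only: the helper `remove_duplicated_fields` and the use of a list of removed indices in place of a flag are assumptions, and the field labels follow the A1, A2, A3 naming used for FIG. 1.

```python
def remove_duplicated_fields(fields, duplicated):
    """Drop the fields marked as duplicated.

    `fields` is the received field sequence, `duplicated` a parallel
    sequence of booleans produced by a detector. The list of removed
    indices stands in for the flag mentioned in the text (assumption).
    """
    kept = [field for field, dup in zip(fields, duplicated) if not dup]
    removed_indices = [i for i, dup in enumerate(duplicated) if dup]
    return kept, removed_indices


# Ten fields of a 3-2 cadence: A3 repeats A1 and C3 repeats C1.
received = ["A1", "A2", "A3", "B1", "B2", "C1", "C2", "C3", "D1", "D2"]
flags = [False, False, True, False, False, False, False, True, False, False]
kept, removed_indices = remove_duplicated_fields(received, flags)
```

One field out of every five is dropped, leaving eight fields for four film frames.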
  • The comparing steps may comprise field-by-field comparisons, i.e. each block parameter is obtained from pixel values of a portion of a single field.
  • Field-by-field comparisons make it possible to extract more information than frame-by-frame comparisons.
  • The previous field to which the current field is compared may comprise a penultimate field immediately preceding the current field.
  • Such comparisons provide detailed information about content changes. False cuts, i.e. video frames made of fields from distinct film frames, may be easily detected.
  • When the pulldown detection is followed by an encoding for compression of the fields and a false cut is detected, the relevant fields may be separated before encoding, thus allowing a more efficient compression.
  • The encoding is indeed particularly efficient for frames having fields originating from a same film frame.
  • Any kind of pulldown, e.g. 2-2 pulldown, 2-3 pulldown etc., and any pulldown phase may be detected without having to make assumptions as in the Prior Art.
  • Data relating to two fields only, e.g. the current field and a bit plane from the previous field, need to be stored.
  • In that case, the previous field is for example a top field and the current field is a bottom field. Since the comparing of those fields is performed via block parameters, each block comprising pixels from several lines, the comparison between fields is relatively easy.
  • Alternatively, the previous field may comprise the field prior to the penultimate field immediately preceding the current field. The compared fields hence have a same nature.
  • Alternatively, the comparing steps may comprise frame-by-frame comparisons.
  • Block parameters for blocks of the current field are obtained from pixels of a portion of the frame the current field belongs to.
  • Block parameters for blocks of the previous field are obtained from pixels of a portion of the frame the previous field belongs to.
  • The computed CCSs may be binary. For example, the computed CCSs return a '0' value when little or no content change is detected between two frames, and a '1' value when substantial content change is detected.
  • Alternatively, the computed CCSs are digital numbers having a number of digits greater than unity, thus providing relatively accurate information about content change between two frames or two fields.
  • Particularly substantial content changes, such as a scene cut made by an editor, may subsequently be detected.
  • The editor's cut may be made on an already video-converted sequence, i.e. the editor may end a scene at any frame.
  • The detection of such scene cuts makes it possible to prepare the sequence before a further processing. For example, a flag indicating an editor cut makes it possible to avoid encoding two scenes together.
  • The present invention provides a pulldown detection device for detecting pulldown operations performed on a video sequence, comprising calculating means provided for calculating a plurality of block parameters for a current field of the video sequence, each from pixel values of a corresponding block of pixels of the current field. Comparing means are provided for comparing each block parameter of the plurality of block parameters of the current field to a corresponding block parameter of a plurality of block parameters of a previous field. Computing means, connected to the comparing means, are provided for computing a content-change strength (CCS), and detecting means, connected to the computing means, are provided for detecting duplicated fields.
  • Said pulldown detection device according to an aspect of the invention makes it possible to perform a relatively fast pulldown detection and duplicated fields detection.
  • The present invention provides a compression device for compressing a video sequence, comprising a pulldown detection device according to the second aspect of the present invention.
  • The pulldown detection device is provided for detecting duplicated fields of the video sequence.
  • Removing means then make it possible to remove the detected duplicated fields, so as to obtain a video sequence with reduced field rate.
  • Encoding means make it possible to encode the video sequence with reduced field rate.
  • The pulldown detection may make it possible to compress the video sequence without loss of information, by removing the detected duplicated fields. This field removal may be followed by an encoding.
  • The encoding means may be connected to the computing means of the pulldown detection device, for using CCSs during an encoding of a video sequence.
  • MPEG encoding may be performed with use of CCSs so as to define group-of-pictures structures.
  • Using the computed CCSs to detect a pulldown operation is relatively simple and makes it possible to perform the pulldown detection at a relatively high speed.
  • The pulldown detection method according to an aspect of the invention may be easily implemented in an existing compression device.
  • The present invention provides a video recorder comprising a compression device according to the third aspect of the invention.
  • The pulldown detection device may alternatively be connected to another device.
  • The present invention provides a video display comprising a pulldown detection device according to the second aspect of the invention.
  • A restoring device connected to an output of the pulldown detection device may make it possible to restore an original 24 frames per second sequence by using field interleaving, thus providing a relatively good image quality.
  • The pulldown detection may also make it possible to detect video frames comprising fields originating from distinct film frames. The effect caused by the pulldown conversion may subsequently be corrected by correcting means connected to an output of the pulldown detection device.
  • The video sequence may for example comply with a PAL or NTSC standard, but other standards may be used.
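  • The present invention provides a computer program product comprising a computer readable medium having computer program code embodied therein for delivering byte code, the computer readable medium comprising computer program code configured to cause a computer to calculate, for a current field of a video sequence, a plurality of block parameters from pixel values of corresponding blocks of pixels of the current field, compare each block parameter to a corresponding block parameter of a previous field, compute a content-change strength (CCS) from results of the comparisons, and detect duplicated fields, the detecting being based at least on the computed content-change strengths.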
  • The computer program product according to an aspect of the invention contains instructions for executing the method according to the first aspect of the invention.
  • The method according to the first aspect of the invention may therefore be implemented within a computer or embedded into video processing devices such as a video recorder or a video display.
  • FIG. 1 is an illustration of a 3-2 pulldown conversion method according to Prior Art.
  • FIG. 2 illustrates an example of a method for processing picture data with a pulldown detection step, according to Prior Art.
  • FIG. 3 illustrates an example of an algorithm to be executed by a pulldown detection device according to the present invention.
  • FIG. 4 schematically illustrates an example of a method according to an embodiment of the present invention.
  • FIG. 5 schematically illustrates an example of a method according to an embodiment of the present invention.
  • FIG. 6 illustrates an example of a compression device according to an embodiment of the present invention.
  • FIG. 7 illustrates an example of a video display according to an embodiment of the present invention.
  • An example of a pulldown detection method to be executed by a pulldown detection device according to the present invention is illustrated in FIG. 3.
  • A plurality of block parameters is calculated (step 31, or CBP) for a current field of a video sequence.
  • Each block parameter is calculated from values of pixels of a corresponding block of pixels of the current field.
  • The current field is then compared (step 32, or CFF) to a previous field via the corresponding pluralities of block parameters, thus allowing a relatively fast comparison.
  • A content-change strength (CCS) is computed (step 33, or CMU) from results of the comparisons.
  • The CCS may be computed to return a '0' value only if the plurality of block parameters of the current field and the plurality of block parameters of the previous field are identical. Alternatively, the CCS may be computed to return a '0' value even if the pluralities of block parameters are slightly different.
  • Duplicated fields may subsequently be detected (step 34, or DDF) based at least on the computed CCS. For example, if a CCS is below a predetermined threshold, the corresponding current field is detected as duplicated. Alternatively, the detecting may be based on a sequence of computed CCSs. If the sequence of computed CCSs exhibits a periodic characteristic, a determined kind of pulldown may be detected.
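One possible reading of the "periodic characteristic" test in step 34 is sketched below. It assumes that each CCS compares a field with the field two positions earlier, so that a 3-2 pulldown yields one low value every five fields. The function name `detect_cadence`, its parameters and the requirement of two full periods are all assumptions made for illustration.

```python
def detect_cadence(ccs_sequence, period=5, low_threshold=1):
    """Return the phase at which low CCS values repeat, or None.

    A cadence is reported only if exactly one position per period
    carries a CCS below `low_threshold` and all other positions do not.
    At least two full periods are required (assumption of this sketch).
    """
    if len(ccs_sequence) < 2 * period:
        return None
    for phase in range(period):
        if all((ccs < low_threshold) == (i % period == phase)
               for i, ccs in enumerate(ccs_sequence)):
            return phase
    return None


pulldown_like = [40, 38, 0, 41, 39, 42, 37, 0, 40, 41]
still_picture = [0] * 10
phase = detect_cadence(pulldown_like)
still_phase = detect_cadence(still_picture)
```

A still picture gives low values everywhere and is therefore not reported as a cadence, which matches the remark that fields may also repeat because the sequence shows little movement.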
  • FIG. 4 schematically illustrates an example of a method according to an embodiment of the present invention.
  • In this example, a 3-2 pulldown may have been detected on a video sequence, and an assumption is made concerning the pattern of the pulldown.
  • The current fields for which a plurality of block parameters is calculated comprise only one field every five fields.
  • Each current field (F_n, F_{n+5}) is compared to a previous field.
  • The previous field (F_{n-2}, F_{n+3}) is the field prior to the penultimate field. If the comparison shows little or no difference, it may be assumed that the current field and the previous field originate from a same film field and that the video sequence still results from a pulldown.
  • For each current field, a plurality of block parameters (bp_n, bp_{n+5}) is calculated.
  • Each block parameter is calculated from pixel values of a corresponding block of pixels of the current field (F_n, F_{n+5}).
  • A PAL video frame may comprise 720 columns of pixels and 576 lines, i.e. each field comprises 720 columns of pixels and 288 lines of pixels.
  • Each block may for example comprise pixels of 16 columns and 16 lines.
  • Each block parameter may then be computed from pixels of 16 columns and 8 lines of the current field, i.e. 128 pixels.
  • Alternatively, each block parameter is computed from pixels of 13 columns and 10 lines of the current field, i.e. 130 pixels.
  • Alternatively, each block parameter is computed from pixels of 13 columns and 10 lines of the current field and from pixels of 13 columns and 10 lines of the other field of the frame, i.e. 260 pixels.
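The arithmetic behind these block sizes can be checked with a short sketch. The helper `bit_plane_size` is hypothetical, and ignoring partial blocks at the field borders (integer division) is an assumption; for the 16 by 8 case the division is exact.

```python
def bit_plane_size(columns, lines, block_columns, block_lines):
    """Number of whole blocks, i.e. block bits, covering one field.

    Partial blocks at the right and bottom borders are ignored,
    which is an assumption of this sketch.
    """
    return (columns // block_columns) * (lines // block_lines)


FIELD_COLUMNS, FIELD_LINES = 720, 288  # one field of a 720 x 576 PAL frame

block_bits = bit_plane_size(FIELD_COLUMNS, FIELD_LINES, 16, 8)
pixels_per_field = FIELD_COLUMNS * FIELD_LINES
pixels_per_block_bit = pixels_per_field // block_bits
```

With 16 columns by 8 lines per block, a PAL field of 207,360 pixels is summarised by 45 x 36 = 1,620 block bits, i.e. one bit per 128 pixels.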
  • Each block parameter may be binary.
  • The pluralities of block parameters may be bit planes (bp_n, bp_{n+5}).
  • When each field of the video sequence is compared to a previous field, each calculated plurality of block parameters of a current field may be stored to be used for a further comparison as a plurality of block parameters of a previous field. Each plurality of block parameters is then compared twice.
  • In the example of FIG. 4, each calculated bit plane, in case of binary block parameters, is compared only once, as a bit plane of a previous field or as a bit plane of a current field.
  • The block parameters may be obtained by a block-wise counting of non-constant features like detected edges, e.g. a sky to house transition.
  • Each block parameter may for instance be a simple block classification that detects horizontal or vertical edges.
  • Other measures may be based for example on luminance, motion vectors etc.
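A block bit based on edge counting could look like the sketch below. The text does not specify an edge detector, so the horizontal luminance difference, the `edge_threshold` and `min_edges` values and the function name `block_bit_plane` are all assumptions chosen for illustration.

```python
def block_bit_plane(field, block_columns=16, block_lines=8,
                    edge_threshold=32, min_edges=4):
    """Compute one block bit per block of a field.

    `field` is a list of lines, each a list of luminance values.
    A block bit is 1 when at least `min_edges` horizontally adjacent
    pixel pairs inside the block differ by `edge_threshold` or more.
    This simple detector is an assumption, not the patented one.
    """
    lines, columns = len(field), len(field[0])
    plane = []
    for top in range(0, lines - block_lines + 1, block_lines):
        row_bits = []
        for left in range(0, columns - block_columns + 1, block_columns):
            edges = 0
            for y in range(top, top + block_lines):
                # Only pairs lying entirely inside the block are tested.
                for x in range(left, left + block_columns - 1):
                    if abs(field[y][x + 1] - field[y][x]) >= edge_threshold:
                        edges += 1
            row_bits.append(1 if edges >= min_edges else 0)
        plane.append(row_bits)
    return plane


# Two blocks side by side: a flat one and one with a vertical edge.
line = [100] * 16 + [100] * 8 + [200] * 8
tiny_field = [list(line) for _ in range(8)]
tiny_plane = block_bit_plane(tiny_field)
```

The flat block yields a 0 bit and the block containing the luminance step yields a 1 bit.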
  • The current field is compared to a previous field via the bit planes. For example, each block bit of the bit plane of the previous field is subtracted from a corresponding block bit of the bit plane of the current field, thus providing a difference bit plane (dbp_n, dbp_{n+5}).
  • A CCS may be obtained by counting a number of non-zero bits in the difference bit plane. If, as shown in FIG. 4, the difference bit plane (dbp_n, dbp_{n+5}) comprises only zero bits, the CCS returns a zero.
  • The illustrated pulldown detection method further comprises detecting duplicated fields based on the computed CCS.
  • The value of the CCS is compared to zero.
  • A non-zero value of the CCS would indicate that the video sequence no longer originates from a 3-2 pulldown.
  • A zero value supports the assumption that the current field and the previous field originate from a same film field and that the video sequence still results from a pulldown operation.
  • A flag 3-2_detected may be set so as to indicate that a 3-2 pulldown is detected.
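The check of FIG. 4 can be sketched as a loop that visits one field in five and compares its bit plane with the one two fields earlier. The function name `verify_32_pulldown` is hypothetical, and for simplicity the sketch receives a bit plane for every field, although only the planes of the compared fields are actually read.

```python
def verify_32_pulldown(bit_planes, first_checked, period=5):
    """Check that an assumed 3-2 cadence still holds.

    `bit_planes[k]` is the flat 0/1 bit plane of field k. Starting at
    field `first_checked`, every `period`-th field n is compared with
    field n - 2. Returns the value of the 3-2_detected flag.
    """
    for n in range(first_checked, len(bit_planes), period):
        if n < 2:
            continue  # no field two positions earlier
        difference_plane = [c ^ p for c, p in
                            zip(bit_planes[n], bit_planes[n - 2])]
        ccs = sum(difference_plane)  # number of non-zero bits
        if ccs != 0:
            return False  # sequence no longer looks like a 3-2 pulldown
    return True


# Ten distinct 4-bit planes, then fields 2 and 7 made repeats of 0 and 5.
planes = [[(k >> b) & 1 for b in range(4)] for k in range(1, 11)]
planes[2] = list(planes[0])
planes[7] = list(planes[5])
flag_ok = verify_32_pulldown(planes, first_checked=2)

broken = [list(p) for p in planes]
broken[7] = [1 - bit for bit in broken[5]]
flag_broken = verify_32_pulldown(broken, first_checked=2)
```

The strict comparison with zero follows the text; a noise-tolerant variant would compare the CCS with a small threshold instead.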
  • FIG. 5 schematically illustrates an example of a method according to an embodiment of the present invention.
  • In this example, each field of a video sequence is compared to a previous field.
  • Each field (1t, 1b, 2t, 2b, ...) is processed so as to obtain a bit plane (bp1, bp2, bp3, bp4, ...).
  • Each bit plane has a relatively small volume compared to the corresponding field.
  • Each block bit of a bit plane is indeed obtained from a relatively high number of pixel values.
  • For each field, a first content-change strength CCS1 and a second content-change strength CCS2 are computed.
  • The first content-change strength CCS1 is obtained by comparing a current field, e.g. a second top field F_n, to the penultimate field immediately preceding the current field, e.g. a first bottom field F_{n-1}.
  • The second content-change strength CCS2 is obtained by comparing a current field, e.g. the second top field F_n, to the field prior to the penultimate field, e.g. a first top field F_{n-2}.
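Computing CCS1 and CCS2 for a stream of fields only requires the bit planes of the two preceding fields. The sketch below keeps them in a bounded `collections.deque`; the names `count_changes` and `ccs_stream` are hypothetical, and `None` is used where no earlier field exists (an assumption).

```python
from collections import deque


def count_changes(plane_a, plane_b):
    """Number of differing block bits between two flat bit planes."""
    return sum(a ^ b for a, b in zip(plane_a, plane_b))


def ccs_stream(bit_planes):
    """Return a (CCS1, CCS2) pair for each field of the sequence.

    CCS1 compares the current field with the field just before it,
    CCS2 with the field two positions earlier. Only the last two bit
    planes are retained, not the fields themselves.
    """
    history = deque(maxlen=2)  # holds the planes of F_{n-2} and F_{n-1}
    results = []
    for plane in bit_planes:
        ccs1 = count_changes(plane, history[-1]) if len(history) >= 1 else None
        ccs2 = count_changes(plane, history[-2]) if len(history) >= 2 else None
        results.append((ccs1, ccs2))
        history.append(plane)
    return results


example_planes = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
ccs_pairs = ccs_stream(example_planes)
```

In this toy sequence the third field has CCS2 equal to zero, i.e. it repeats the field two positions earlier, while the fourth field differs strongly from both predecessors.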
  • the video sequence originates from a film sequence and an editor cut has been performed on the video sequence.
  • the origin of each field of the video sequence is represented for a clarity purpose.
  • the first content-change strength CCSl of the first bottom field Ib has a relatively low value, i.e. 3, since the first bottom field Ib and the previous field to which it is compared, i.e. the first top field l t , originate from a same film frame A.
  • the first content-change strength CCSl of the second top field 2 t also has a relatively low value, i.e. 3, since the second top field 2 t and the previous field to which it is compared, i.e. the first bottom field Ib, originates from a same film frame A.
  • the second content-change strength CCS2 of the second top field 2 t has a particularly low value, i.e. zero in this example, thus indicating that the second top field 2 t is a duplicated field.
  • the second top field 2 t and the previous field to which it is compared, i.e. the first top field l t indeed come from a same film field.
  • the detected duplicated field may possibly be removed, thus allowing a compression without loss of information.
  • By processing the CCSs, it is therefore possible to detect that the first top field 1t, the first bottom field 1b and the second top field 2t originate from a same film frame A.
  • the first content-change strength CCS1 of a second bottom field 2b has a relatively high value, i.e. 5, that may indicate a film frame change.
  • the second bottom field 2b and the previous field to which it is compared, i.e. the second top field 2t, come from distinct film frames. It is therefore possible to prepare a distinct encoding for the second bottom field 2b and the second top field 2t, thus allowing a more efficient encoding.
  • the first content-change strength CCS1 of a third bottom field 3b has a relatively high value, i.e. 6, that may indicate a film frame change.
  • the relatively high values of the second content-change strengths CCS2 of the fields 2b, 3t, 3b, 4t between the second bottom field 2b and the fourth top field 4t indicate that film frames (B, C) are mapped into two video fields.
  • the particularly high values of the first content-change strength CCS1 of a fourth bottom field 4b, i.e. 60, the second content-change strength CCS2 of the fourth bottom field 4b, i.e. 58, and the second content-change strength CCS2 of the fifth top field 5t, i.e. 61, make it possible to detect an editor cut between the fourth top field 4t and the fourth bottom field 4b.
  • the fourth frame comprising the fourth top field 4 t and the fourth bottom field 4b should therefore be split before any encoding.
  • the first content-change strengths CCS1 of the fifth fields 5t, 5b have relatively low values. Furthermore, the second content-change strength CCS2 of the fifth bottom field 5b has a particularly low value, i.e. 1. It may subsequently be assumed that the fourth bottom field 4b and the fifth fields 5t, 5b originate from a same film frame C.
  • the computed second content-change strength CCS2 of the fifth bottom field 5b may have a non-zero value, due to noise. Appropriate thresholds should therefore be set for an efficient detection.
  • each CCS is compared to several thresholds. For example, a CCS value below 2 indicates that the two corresponding fields correspond to a same film field. A CCS below 4 indicates that the two compared fields originate from a same film frame. A CCS above 30 indicates that the two compared fields belong to distinct scenes.
  • FIG. 6 illustrates an example of a compression device according to an embodiment of the present invention.
  • the compression device 60 which may comprise a computer or another video processing device, e.g. a DVD recorder, comprises a pulldown detection device 61, removing means 62 and encoding means 63.
  • the pulldown detection device 61 detects duplicated fields. Each detected duplicated field may be removed by the removing means 62, thus providing an efficient compression without loss of information.
  • a flag may be set for each removed field.
  • a duplicated field may for example be due to a pulldown, or may occur because the video sequence shows little motion. In this example, each duplicated field is removed. Alternatively, only duplicated fields resulting from a pulldown, e.g. one field every five fields, may be removed.
  • the removing means 62 produce a volume-reduced video sequence.
  • the encoding means 63 encode the volume-reduced video sequence.
  • the encoding further reduces the volume of the already volume-reduced video sequence.
  • the encoding may or may not cause a loss of information.
  • the encoding may be for example an MPEG encoding.
  • a CCS may be computed for each field of the video sequence, the CCS being used both in pulldown detection and MPEG encoding.
  • Calculating means 67 calculate a plurality of block parameters for each field (1t, 1b, 2t, ...) of the video sequence. Once a current plurality of block parameters is calculated, it is transferred to comparing means 66 and to a memory 68.
  • the comparing means 66 compare block parameters of the transferred plurality of block parameters with corresponding block parameters of a previously stored plurality of block parameters.
  • the previously stored plurality of block parameters, received from the memory 68, corresponds to the previous field to which the current field is to be compared via block parameters.
  • Computing means 64 compute a content-change strength (CCS) for each current field, from results of the block parameter comparisons received from comparing means 66.
  • the computed CCSs may be sent to and used by the encoding means 63 to sort the frames of the video sequence as reference frames or as bidirectionally predicted frames.
  • the computed CCSs may also be sent to and used by processing means 65 to detect the duplicated fields.
  • the CCSs may for example be compared to one or more thresholds.
  • a sequence of CCSs may also be analyzed to detect a pulldown.
  • the pulldown detection method according to an aspect of the present invention may be implemented into existing compression devices with a negligible hardware or software effort, since the existing compression devices already comprise the calculating means 67, the comparing means 66 and the computing means 64.
  • the compression device 60 may be comprised in one or more processors or units, e.g. microcontrollers, DSPs, FPGAs, etc.
  • FIG. 7 illustrates an example of video display according to an embodiment of the present invention.
  • the illustrated video display comprises a home-theater projector 70 comprising a pulldown detection device 71 and a restoring device 72.
  • the pulldown detection device 71 and the restoring device 72 make it possible to provide a sequence having a 24 Hz progressive format.

Abstract

The invention relates to a pulldown detection method and device for detecting pulldown operations performed on a video sequence. The detecting is based on frame to frame or field to field comparisons, each frame or each field being represented by a plurality of block parameters. Each block parameter is calculated (31) from pixel values of a corresponding block of pixels. Comparing (32) two frames or two fields via the corresponding pluralities of block parameters makes it possible to compute (33) a content-change strength (CCS). The pulldown detection method further comprises detecting (34) duplicated fields based at least on the computed content-change strength.

Description

METHOD AND DEVICE FOR DETECTING PULLDOWN OPERATIONS PERFORMED ON A VIDEO SEQUENCE
FIELD OF THE INVENTION
The present invention relates to the field of pulldown detection in a video sequence and, more particularly, to a pulldown detection method for detecting pulldown operations performed on a video sequence. It also relates to a pulldown detection device for carrying out said method, and to a compression device for compressing a video sequence and comprising such a pulldown detection device.
DESCRIPTION OF PRIOR ART
A television image signal generally employs an interlaced video format. Interlaced video is based on a decomposition of each picture frame into two fields commonly known as the top field and the bottom field. A display monitor comprises a series of horizontal lines and vertical lines. The top fields are displayed on odd numbered, i.e. 1, 3, 5,..., horizontal lines and the bottom fields are displayed on even numbered, i.e. 2, 4,..., horizontal lines.
In North America, video displayed on television complies with the NTSC standard (National Television System Committee), i.e. a 60 Hz interlaced format. The interlaced video is displayed at 30 frames, i.e. 60 fields, per second. In Europe, TV systems complying with the PAL standard (Phase Alternating Line) are based on a 50 Hz interlaced format.
A video signal may originate from various sources. For example, video material may have originated from a film source, or may have been recorded using an interlaced video camera. In the former case, film data may be shot at 24 frames per second, or at 25 frames per second in Europe. The film data are therefore scaled in frequency from 24 or 25 frames per second to the NTSC or PAL standard, i.e. 60 or 50 fields per second. 2-2 pulldown is a method for transferring film material that is at 25 frames per second to PAL video at 25 frames per second. 3-2 pulldown and 2-3 pulldown are methods for transferring film material that is at 24 frames per second to NTSC video at 30 frames per second. Fitting 24 film frames into 30 video frames requires that every four film frames be converted to five video frames.
Figure 1 is an illustration of a 3-2 pulldown conversion method known in the art. Film data 10 are made of film frames (A, B, C, D). Twenty-four film frames are recorded each second. The pulldown operation converts four film frames (A, B, C, D) into five interlaced video frames (I1, I2, I3, I4, I5) of video data 12. To achieve this, one film frame (A, C) is mapped into three fields (A1, A2, A3, C1, C2, C3) and the following film frame (B, D) is mapped into two fields (B1, B2, D1, D2). Ten fields (A1, A2, A3, B1, B2, C1, C2, C3, D1, D2) are therefore output from four film frames (A, B, C, D), thus providing five video frames (I1, I2, I3, I4, I5). In a 2-3 pulldown operation (not represented), a first film frame is mapped into two fields and the following film frame is mapped into three fields. A 2-3 pulldown may be considered as a 3-2 pulldown with a different pulldown phase.
Each field (A1, A2, A3, B1, B2, C1, C2, C3, D1, D2) may be a top field (At, Bt, Ct, Dt) or a bottom field (Ab, Bb, Cb, Db). The top fields are to be displayed on odd horizontal lines and the bottom fields are to be displayed on even horizontal lines. When three fields are output from a single frame, two of the three fields are similar. Two successive fields provide a video frame. For example, a first video frame I1 comprises a first top field 1t and a first bottom field 1b. The first video frame I1 comprises the top and bottom fields (At, Ab) of the first film frame A. A second video frame I2 comprises a second top field 2t from the first film frame A and a second bottom field 2b from a second film frame B. Four film frames (A, B, C, D) are mapped into five video frames (I1, I2, I3, I4, I5). The four film frames (A, B, C, D) and the five video frames (I1, I2, I3, I4, I5) are both displayed during one sixth of a second, which preserves the length of the material to be converted. Each video frame (I1, I2, I3, I4, I5) consequently lasts one thirtieth of a second.
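The 3-2 mapping described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `pulldown_3_2` and `group_into_video_frames`, and the representation of a field as a (film frame, parity) pair, are assumptions made for the example and do not come from the source.

```python
def pulldown_3_2(film_frames):
    """Map film frames to interlaced fields with a 3-2 cadence.

    Each output field is a (film_frame, parity) pair, where parity is
    't' (top) or 'b' (bottom). Parity alternates strictly in the output.
    """
    fields = []
    for index, frame in enumerate(film_frames):
        # Frames A, C, ... yield three fields; frames B, D, ... yield two.
        repeat = 3 if index % 2 == 0 else 2
        for _ in range(repeat):
            parity = "t" if len(fields) % 2 == 0 else "b"
            fields.append((frame, parity))
    return fields


def group_into_video_frames(fields):
    """Pair successive fields (top, bottom) into video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_3_2(["A", "B", "C", "D"])
video_frames = group_into_video_frames(fields)
for number, frame in enumerate(video_frames, start=1):
    print(f"I{number}:", frame)
```

In this sketch the second video frame pairs a top field from film frame A with a bottom field from film frame B, and the third field of the output repeats the first one, which is the duplicated field a detector looks for.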
Video data at a video standard, e.g. NTSC or PAL, may for example be broadcast, downloaded or stored on a medium, e.g. a DVD (Digital Versatile Disk). It may be desirable to compress a video sequence. For that purpose, it may be useful to detect whether those video data result from a pulldown operation. Taking the example of a 3-2 pulldown, some video fields, e.g. the first top field 1t and the second top field 2t, respectively of the first video frame I1 and the second video frame I2, are similar. A volume reduction operation comprising removing some duplicated fields, e.g. the second top field 2t, would save bandwidth or memory without loss of information. A pulldown flag may subsequently be set to indicate the removal of the duplicated fields.
FIG. 2 illustrates an example of a method for processing picture data. A film sequence made of film frames (A, B, ...) is converted (step 21, or PLD) into a video sequence of video fields (1t, 1b, 2t, 2b, ...). The video sequence complies with the television standards, e.g. NTSC, and is broadcast (step 22, or BRD). On a receiver side, the video sequence of video fields (1t, 1b, 2t, 2b, ...) is received (step 23, or RVD). A detection of a pulldown operation is performed (step 24, or PDT). If a pulldown is detected, duplicated fields are removed (step 25, or DFR). If no pulldown is detected, one can presume that the video sequence has been recorded with an interlaced video camera. In both cases, an encoding step is performed (step 26, or CMP). The compressed video data may subsequently be stored (step 27, or STO).
The pulldown conversion may be followed by a setting of a flag to provide information about the source type. On the receiver side, no pulldown detection operation is theoretically required, since the flag indicates if a pulldown operation has been performed. However, the knowledge at the receiver side about the source type may be unavailable. In particular, information about the source type is unavailable with analog broadcast. With digital broadcast, the presence of a set flag may fail to prevent false film to video conversion in case of phase misalignment. The absence of the flag does not necessarily mean that the source was not film. As a consequence, a pulldown detection may be performed on unknown data.
Pulldown detection may be followed by a removal of the duplicated fields, as shown in figure 2. Alternatively, the pulldown detection may make it possible to restore an original 24 frames sequence by using field interleaving, thus providing a relatively good image quality. The pulldown detection may also make it possible to detect video frames comprising fields originating from distinct film frames. The effect caused by the pulldown conversion may subsequently be corrected. The US Patent application 2004/0189877 describes a method for detecting a pulldown operation based on motion detection. The US Patent application 2005/0062891 describes a method for detecting a 3-2 pulldown operation based on pixel comparisons. There is a need for a method and an apparatus allowing a faster pulldown detection.
SUMMARY OF THE INVENTION
In a first aspect, the present invention provides a pulldown detection method for detecting pulldown operations performed on a video sequence. For a current field of the video sequence, a plurality of block parameters are calculated, each block parameter being calculated from pixel values of a corresponding block of pixels of the current field. Each block parameter of the plurality of block parameters of the current field is then compared to a corresponding block parameter of a plurality of block parameters of a previous field. A content-change strength (CCS) is computed from results of the comparisons, and duplicated fields are detected based at least on the computed content-change strength.
The pulldown detection is performed by comparing fields via block parameters, i.e. each field is represented by a plurality of block parameters. Each block parameter is obtained from pixels of the corresponding block. The blocks of pixels may for example comprise 64 or 256 pixels. In Prior Art, the pulldown detection is performed by comparing fields on a pixel-by-pixel basis. The method according to an aspect of the present invention hence allows comparisons between fields, and therefore pulldown detection, to be performed faster than in Prior Art. Furthermore, because the pulldown detection methods of the Prior Art are based on pixel-by-pixel comparisons, at least two fields must be stored to allow the comparison between them, and at least two fields need to be accessed at once from memory in order to compute differences.
With the method according to an aspect of the present invention, the comparisons are performed from two pluralities of block parameters. Since each block parameter corresponds to one block, i.e. a relatively high number of pixels, the present invention requires less data to be stored than in Prior Art. Typically, a first plurality of block parameters corresponding to the previous field, the current field and a second plurality of block parameters corresponding to the current field may be stored.
In Prior Art, two frames or two fields need to be accessed at once. Furthermore, each frame or field is accessed twice: a first time as a previous field and a second time as a current field. The method according to an aspect of the invention requires a single access, for calculating the block parameters. As a consequence, for a same memory as used in Prior Art, several pluralities of block parameters may be stored, which makes different kinds of field comparisons possible, e.g. a current field is compared with the penultimate field immediately preceding the current field and with the field prior to the penultimate field. The present invention hence extracts more information from the video sequence.
Advantageously, each block parameter comprises a single bit, called "block bit" in the present description. A feature, e.g. edge detection, may be computed per block. The resulting block bit indicates the existence or absence of the tested feature in the block.
It should be noted that the block parameters may be re-used for other purposes, e.g. content analysis. For example, edge detection may be performed by an already existing device such as a sharpness enhancer device.
With single bit block parameters, each plurality of block parameters hence comprises a bit plane. Bit planes are simple to compare. The comparison between two bit planes may provide a difference bit plane. A bit of the difference bit plane having a value equal to '1' indicates a block bit change between the two compared fields, i.e. a certain number of pixels of the block corresponding to the non-zero bit have different values in the current field and in the previous field. The content-change strength (CCS) may for example be computed by counting a number of non-zero bits within the difference bit plane. If the resulting sum is below a predetermined threshold, e.g. eight non-zero bits, the CCS may be equal to zero, thus indicating little change between the two fields. The way the comparisons are performed and the way the computing of the CCS is performed do not limit the scope of the present invention. Alternatively, each block parameter may provide more information than a single bit and be encoded over several digits.
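A minimal Python sketch of this bit-plane comparison follows. Bit planes are modelled as lists of rows of 0/1 values; the names `difference_bit_plane`, `content_change_strength` and the `noise_floor` parameter are illustrative assumptions, not terms from the source, and the sample planes are invented.

```python
def difference_bit_plane(current, previous):
    """XOR two bit planes of equal size, block bit by block bit."""
    return [
        [cur_bit ^ prev_bit for cur_bit, prev_bit in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


def content_change_strength(current, previous, noise_floor=0):
    """Count changed block bits; counts below `noise_floor` are reported as 0."""
    changed = sum(sum(row) for row in difference_bit_plane(current, previous))
    return 0 if changed < noise_floor else changed


# Invented 2x3 bit planes, for illustration only.
previous_plane = [[0, 1, 0], [1, 1, 0]]
current_plane = [[0, 1, 1], [1, 0, 0]]

raw_ccs = content_change_strength(current_plane, previous_plane)
clamped_ccs = content_change_strength(current_plane, previous_plane, noise_floor=8)
print(raw_ccs, clamped_ccs)
```

With `noise_floor=8`, the sketch mirrors the example in the text where fewer than eight non-zero bits yield a CCS of zero.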
The pulldown detection method allows to detect duplicated fields. Fields may be repeated in the video sequence because of the pulldown, or because the video sequence shows little movement. The video sequence may even show still pictures.
The detected duplicated fields may all be removed, thus allowing an efficient volume reduction. A flag may be set to indicate a removal of a given field or a removal of a given sequence of fields. Alternatively, a pattern of a pulldown may be detected, e.g. a 3-2 pulldown, and the field removing may be automatic, e.g. one field is removed every five fields. A 3-2 flag may for example be set to indicate such an automatic field removing. Such volume reductions cause no loss of information.
Advantageously, the comparing steps comprise field by field comparisons, i.e. each block parameter is obtained from pixel values of a portion of a single field. Field by field comparisons extract more information than frame by frame comparisons.
Advantageously, the previous field to which the current field is compared comprises a penultimate field immediately preceding the current field. Such comparisons allow to provide detailed information about content changes. False cuts, i.e. a video frame made of fields from distinct film frames, may be easily detected. When the pulldown detection is followed by an encoding for compression of the fields, if a false cut is detected, the relevant fields may be separated before encoding, thus allowing a more efficient compression. The encoding is indeed particularly efficient for frames having fields originating from a same film frame.
Furthermore, if the current field immediately follows the previous field, any kind of pulldown, e.g. 2-2 pulldown, 2-3 pulldown etc., and pulldown phase may be detected without having to make assumptions as in Prior Art.
Also, data relating to two fields only, e.g. the current field and a bit plane from the previous field, need to be stored.
Fields that immediately follow one another have different natures. For example, the previous field is a top field and the current field is a bottom field. Since the comparing of those fields is performed via block parameters, each block comprising pixels from several lines, the comparison between fields is relatively easy. Alternatively, the previous field comprises the field prior to the penultimate field immediately preceding the current field. The compared fields hence have a same nature.
Alternatively, the comparing steps comprise frame by frame comparisons. Block parameters for blocks of the current field are obtained from pixels of a portion of the frame the current field belongs to. Block parameters for blocks of the previous field are obtained from pixels of a portion of the frame the previous field belongs to.
The computed CCSs may be binary. For example, the computed CCSs return a '0' value when little or no content change is detected between two frames, and a '1' value when substantial content change is detected.
Advantageously, the computed CCSs are digital numbers having a number of digits greater than unity, thus providing relatively accurate information about content change between two frames or two fields. Particularly substantial content changes, such as a scene cut made by an editor, may subsequently be detected. The editor's cut may be made on an already video converted sequence, i.e. the editor may end a scene at any frame. The detection of such scene cuts makes it possible to prepare the sequence before a further processing. For example, a flag indicating an editor cut avoids encoding two scenes together.
In a second aspect, the present invention provides a pulldown detection device, for detecting pulldown operations performed on a video sequence, comprising calculating means, provided for calculating a plurality of block parameters for a current field of the video sequence, from pixel values of a corresponding block of pixels of the current field. Comparing means are then provided for comparing block parameters of the plurality of block parameters of the current field to a corresponding block parameter of a plurality of block parameters of a previous field. Computing means, connected to the comparing means, are provided for computing a content-change strength (CCS), and detecting means connected to the computing means are provided for detecting duplicated fields. Said pulldown detection device according to an aspect of the invention allows to perform a relatively fast pulldown detection and duplicated fields detection.
In a third aspect, the present invention provides a compression device for compressing a video sequence, comprising a pulldown detection device according to the second aspect of the present invention. The pulldown detection device is provided for detecting duplicated fields of the video sequence. Removing means then allow to remove the detected duplicated fields, so as to obtain a video sequence with reduced field rate. Encoding means allow to encode the video sequence with reduced field rate.
The pulldown detection may allow to compress the video sequence without loss of information, by removing the detected duplicated fields. This field removal may be followed by an encoding.
Advantageously, the encoding means are connected to the computing means of the pulldown detection device. Indeed, it is known in the art to compute CCSs for using CCSs during an encoding of a video sequence. Typically MPEG encoding may be performed with use of CCSs so as to define groups of pictures structures. Using the computed CCSs to detect a pulldown operation is relatively simple and allows to perform the pulldown detection at a relatively high speed. Furthermore, the pulldown detection method according to an aspect of the invention may be easily implemented in an existing compression device.
In a fourth aspect, the present invention provides a video recorder comprising a compression device according to the third aspect of the invention.
The pulldown detection device may alternatively be connected to another device. In a fifth aspect, the present invention provides a video display comprising a pulldown detection device according to the second aspect of the invention. A restoring device connected to an output of the pulldown detection device may restore an original 24 frames sequence by using field interleaving, thus providing a relatively good image quality. The pulldown detection may also make it possible to detect video frames comprising fields originating from distinct film frames. The effect caused by the pulldown conversion may subsequently be corrected by correcting means connected to an output of the pulldown detection device.
The video sequence may for example comply with a PAL or NTSC standard, but other standards may be used.
In a sixth aspect, the present invention provides a computer program product comprising a computer readable medium having computer program code embodied therein for delivering byte code, the computer readable medium comprising computer program code configured to cause a computer to :
- calculate, for a current field of the video sequence, a plurality of block parameters, each block parameter being calculated from pixel values of a corresponding block of pixels of the current field;
- compare each block parameter of the plurality of block parameters of the current field to a corresponding block parameter of a plurality of block parameters of a previous field;
- compute a content-change strength (CCS) from results of the comparisons; and
- detect duplicated fields, the detecting being based at least on the computed content-change strengths.
The computer program product according to an aspect of the invention contains instructions for executing the method according to the first aspect of the invention. The method according to the first aspect of the invention may therefore be implemented within a computer or embedded into video processing devices such as a video recorder or a video display.
These and other aspects of the invention will be described in greater detail hereinafter with reference to the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an illustration of a 3-2 pulldown conversion method according to Prior Art.
FIG. 2 illustrates an example of a method for processing picture data with a pulldown detection step, according to Prior Art.
FIG. 3 illustrates an example of an algorithm to be executed by a pulldown detection device according to the present invention.
FIG. 4 schematically illustrates an example of a method according to an embodiment of the present invention.
FIG. 5 schematically illustrates an example of a method according to an embodiment of the present invention.
FIG. 6 illustrates an example of a compression device according to an embodiment of the present invention.
FIG. 7 illustrates an example of video display according to an embodiment of the present invention.
DETAILED DESCRIPTION
An example of a pulldown detection method to be executed by a pulldown detection device according to the present invention is illustrated in FIG. 3. A plurality of block parameters is calculated (step 31, or CBP) for a current field of a video sequence. Each block parameter is calculated from values of pixels of a corresponding block of pixels of the current field. The current field is then compared (step 32, or CFF) to a previous field via the corresponding pluralities of block parameters, thus allowing a relatively fast comparison.
A content-change strength (CCS) is computed (step 33, or CMU) from results of the comparisons. The CCS may be computed to return a '0' value only if the plurality of block parameters of the current field and the plurality of block parameters of the previous field are similar. Alternatively, the CCS may be computed to return a '0' value even if the pluralities of block parameters are slightly different.
Duplicated fields may subsequently be detected (step 34, or DDF) based at least on the computed CCS. For example, if a CCS is below a predetermined threshold, the corresponding current field is detected as duplicated. Alternatively, the detecting may be based on a sequence of computed CCSs. If the sequence of computed CCSs exhibits a periodic characteristic, a determined kind of pulldown may be detected.
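Both detection variants, a per-field threshold and a periodicity check on a sequence of CCSs, can be sketched as below. This is a hedged illustration rather than the claimed method: the function names, the strict periodicity criterion and the sample CCS values are all assumptions made for the example. The CCS values are taken to come from comparing each field with the field two positions earlier.

```python
def duplicated_fields(ccs_values, threshold):
    """Indices of fields whose CCS falls below the threshold."""
    return [n for n, ccs in enumerate(ccs_values) if ccs < threshold]


def cadence_phase(ccs_values, threshold, period=5):
    """Return the phase of a periodic duplicate pattern, or None.

    The pattern is accepted only if low-CCS fields occur exactly once
    every `period` fields over the whole sequence (an assumed criterion).
    """
    low = duplicated_fields(ccs_values, threshold)
    if len(low) < 2 or low[0] >= period:
        return None
    expected = list(range(low[0], len(ccs_values), period))
    return low[0] if low == expected else None


# Invented CCS sequences, for illustration only.
ccs2_sequence = [40, 38, 0, 41, 39, 42, 40, 1, 39, 41, 40, 38, 0]
still_picture = [0] * 10

print(duplicated_fields(ccs2_sequence, threshold=2))
print(cadence_phase(ccs2_sequence, threshold=2))
print(cadence_phase(still_picture, threshold=2))
```

A still picture produces duplicated fields everywhere but no five-field period, so the sketch reports duplicates without reporting a 3-2 cadence.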
FIG. 4 schematically illustrates an example of a method according to an embodiment of the present invention. In this embodiment, a 3-2 pulldown for example may have been detected on a video sequence and an assumption is made concerning the pattern of pulldown. The current fields for which a plurality of block parameters is calculated comprise only one field every five fields. Each current field (Fn, Fn+5) is compared to a previous field. The previous field (Fn-2, Fn+3) is the field prior to a penultimate field. If the comparison shows little or no difference, it may be assumed that the current field and the previous field originate from a same film field and that the video sequence still results from a pulldown.
For each current field (Fn, Fn+5), a plurality of block parameters (bpn, bpn+5) is calculated. Each block parameter is calculated from pixel values of a corresponding block of pixels of the current field (Fn, Fn+5). Typically, a PAL video frame may comprise 720 columns of pixels and 576 lines, i.e. each field comprises 720 columns of pixels and 288 lines of pixels. Each block may for example comprise pixels of 16 columns and 16 lines. When field to field comparison is performed, each block parameter is computed from pixels of 16 columns and 8 lines of the current field, i.e. 128 pixels.
It should be understood, however, that the sizes of the blocks do not limit the scope of the present invention. For example, the blocks may comprise pixels of 13 columns and 20 lines. Therefore, for a field to field comparison, each block parameter is computed from pixels of 13 columns and 10 lines of the current field, i.e. 130 pixels. For a frame to frame comparison (not represented), each block parameter is computed from pixels of 13 columns and 10 lines of the current field and from pixels of 13 columns and 10 lines of the other field of the frame, i.e. 260 pixels. In the represented embodiment, each block parameter is binary. The pluralities of block parameters may be bit planes (bpn, bpn+5).
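The block sizes above determine how many block parameters represent one field. The following sketch works this out for the PAL dimensions quoted in the text; the helper name `blocks_per_field` and the choice to count partially covered blocks at the picture edges (ceiling division) are assumptions, since the source does not say how edge blocks are handled.

```python
def blocks_per_field(columns, field_lines, block_columns, block_field_lines):
    """Number of blocks across, down and in total for one field.

    Ceiling division is used so that partially covered blocks at the
    right and bottom edges are still counted (an assumed convention).
    """
    across = -(-columns // block_columns)
    down = -(-field_lines // block_field_lines)
    return across, down, across * down


# PAL field: 720 columns by 288 lines.
pal_16x8 = blocks_per_field(720, 288, 16, 8)     # 16x16 frame blocks, 8 lines per field
pal_13x10 = blocks_per_field(720, 288, 13, 10)   # 13x20 frame blocks, 10 lines per field
pixels_per_field = 720 * 288

print(pal_16x8, pal_13x10, pixels_per_field)
```

With binary block parameters and 16 by 8 field blocks, a field of 207,360 pixels is thus represented by a bit plane of 1,620 bits.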
In a preferred embodiment, illustrated in FIG. 5, each field of the video sequence is compared to a previous field. Therefore, each calculated plurality of block parameters of a current field may be stored to be used for a further comparison as a plurality of block parameters of a previous field. Each plurality of block parameters is compared twice.
In the embodiment illustrated in FIG. 4, only one field every five fields is compared to a previous field, i.e. previous fields (Fn-2, Fn+3) also need to be processed so as to provide corresponding pluralities of block parameters. Each calculated bit plane, in case of binary block parameters, is compared only once, as a bit plane of a previous field or as a bit plane of a current field.
The block parameters may be obtained by a block-wise counting of non- constant features like detected edges, e.g. a sky to house transition. Each block parameter may be for instance a simple block classification that detects horizontal or vertical edges. Other measures may be based for example on luminance, motion vectors etc.
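One possible way to derive a block bit from edges is sketched below. The source does not specify the edge detector, so this example substitutes a very simple one: it counts strong luminance steps between horizontally adjacent pixels (i.e. vertical edges) inside each block. The names `block_bit` and `bit_plane` and the parameters `edge_threshold` and `min_edges` are hypothetical.

```python
def block_bit(field, top, left, height, width, edge_threshold=32, min_edges=4):
    """Return 1 if the block contains at least `min_edges` strong luminance
    steps between horizontally adjacent pixels, else 0."""
    edges = 0
    for y in range(top, min(top + height, len(field))):
        row = field[y]
        # Only pixel pairs that lie entirely inside the block are compared.
        for x in range(left, min(left + width, len(row)) - 1):
            if abs(row[x + 1] - row[x]) >= edge_threshold:
                edges += 1
    return 1 if edges >= min_edges else 0


def bit_plane(field, block_height=8, block_width=16, **kwargs):
    """Compute one block bit per block of a field (list of rows of luminance)."""
    return [
        [block_bit(field, top, left, block_height, block_width, **kwargs)
         for left in range(0, len(field[0]), block_width)]
        for top in range(0, len(field), block_height)
    ]


# Invented 8-line, 32-column field: a flat left block and a right block
# containing a dark-to-bright vertical edge.
flat = [100] * 16
edge = [0] * 8 + [255] * 8
sample_field = [flat + edge for _ in range(8)]
sample_plane = bit_plane(sample_field)
print(sample_plane)
```

Other block measures mentioned in the text, such as luminance or motion vectors, would replace the body of `block_bit` while leaving `bit_plane` unchanged.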
The current field is compared to a previous field via the bit planes. For example, each block bit of the bit plane of the previous field is subtracted from a corresponding block bit of the bit plane of the current field, thus providing a difference bit plane (dbpn, dbpn+5). A CCS may be obtained by counting a number of non-zero bits in the difference bit plane. If, as shown in FIG. 4, the difference bit plane (dbpn, dbpn+5) comprises only zero bits, the CCS returns a zero.
The illustrated pulldown detection method further comprises detecting duplicated fields based on the computed CCS. In the illustrated embodiment, the value of the CCS is compared to zero. A non-zero value of the CCS would indicate that the video sequence no longer originates from a 3-2 pulldown. A zero value makes it possible to assume that the current field and the previous field originate from a same film field and that the video sequence still results from a pulldown operation. A flag 3-2_detected may be set so as to indicate that a 3-2 pulldown is detected.
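The one-field-in-five check of FIG. 4 can be sketched as follows. Bit planes are flattened to plain lists for brevity, and a CCS of zero is modelled as exact equality of the two bit planes; the function name `still_3_2`, its parameters and the sample bit planes are assumptions for illustration.

```python
def still_3_2(bit_planes, first_duplicate, period=5):
    """Check one field every `period` fields against the field two positions
    earlier. Returns the value an implementation might store in the
    3-2_detected flag: True while every checked pair is identical."""
    if first_duplicate < 2:
        raise ValueError("the first checked field needs a field two positions earlier")
    for n in range(first_duplicate, len(bit_planes), period):
        if bit_planes[n] != bit_planes[n - 2]:   # non-zero CCS: cadence broken
            return False
    return True


# Invented flattened bit planes for the top and bottom fields of frames A to D.
At, Ab = [1, 0, 1, 0], [1, 0, 1, 1]
Bt, Bb = [0, 1, 1, 0], [0, 1, 0, 0]
Ct, Cb = [1, 1, 0, 0], [1, 1, 0, 1]
Dt, Db = [0, 0, 1, 1], [0, 0, 0, 1]

cadence = [At, Ab, At, Bb, Bt, Cb, Ct, Cb, Dt, Db]   # 3-2 pulldown order
broken = cadence[:7] + [Db] + cadence[8:]            # cadence interrupted at field 7

print(still_3_2(cadence, first_duplicate=2))
print(still_3_2(broken, first_duplicate=2))
```

Only two comparisons are made over ten fields here, which is the saving this embodiment obtains by assuming the pulldown pattern.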
FIG. 5 schematically illustrates an example of a method according to an embodiment of the present invention. In the illustrated embodiment, each field of a video sequence is compared to a previous field. Each field (1t, 1b, 2t, 2b, ...) is processed so as to obtain a bit plane (bp1, bp2, bp3, bp4, ...). Each bit plane has a relatively small volume compared to the corresponding field. Each block bit of a bit plane is indeed obtained from a relatively high number of pixel values.
In this example, a first content-change strength CCS1 and a second content-change strength CCS2 are computed. The first content-change strength CCS1 is obtained by comparing a current field, e.g. a second top field Fn, to a penultimate field immediately preceding the current field, e.g. a first bottom field Fn-1. The second content-change strength CCS2 is obtained by comparing a current field, e.g. the second top field Fn, to the field prior to the penultimate field, e.g. a first top field Fn-2.
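The two comparisons may be sketched in Python as follows. The helper names `ccs` and `ccs1_ccs2` and the sample bit planes are assumptions made for illustration; the bit planes are flattened into simple lists of block bits.

```python
def ccs(previous, current):
    """Number of block bits that differ between two flattened bit planes."""
    return sum(1 for p, c in zip(previous, current) if p != c)


def ccs1_ccs2(bit_planes, n):
    """Return (CCS1, CCS2) for field n, or None where no earlier field exists."""
    ccs1 = ccs(bit_planes[n - 1], bit_planes[n]) if n >= 1 else None
    ccs2 = ccs(bit_planes[n - 2], bit_planes[n]) if n >= 2 else None
    return ccs1, ccs2


# Made-up bit planes: 2t repeats 1t, 1b differs slightly from both.
bit_planes = [
    [1, 0, 1, 0, 0, 1],  # 1t (Fn-2)
    [1, 1, 1, 0, 0, 0],  # 1b (Fn-1)
    [1, 0, 1, 0, 0, 1],  # 2t (Fn)
]

result = ccs1_ccs2(bit_planes, 2)  # (2, 0): low CCS1, zero CCS2
```

Here CCS1 compares fields of opposite parity, so a small non-zero value is to be expected even within one film frame, whereas CCS2 compares fields of the same parity and drops to zero for a repeated field.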
In the represented example, the video sequence originates from a film sequence and an editor cut has been performed on the video sequence. The origin of each field of the video sequence is represented for clarity. The first content-change strength CCS1 of the first bottom field 1b has a relatively low value, i.e. 3, since the first bottom field 1b and the previous field to which it is compared, i.e. the first top field 1t, originate from a same film frame A. The first content-change strength CCS1 of the second top field 2t also has a relatively low value, i.e. 3, since the second top field 2t and the previous field to which it is compared, i.e. the first bottom field 1b, originate from a same film frame A.
Furthermore, the second content-change strength CCS2 of the second top field 2t has a particularly low value, i.e. zero in this example, thus indicating that the second top field 2t is a duplicated field. The second top field 2t and the previous field to which it is compared, i.e. the first top field 1t, indeed come from a same film field. The detected duplicated field may possibly be removed, thus allowing a compression without loss of information.
By processing the CCSs, it is therefore possible to detect that the first top field 1t, the first bottom field 1b and the second top field 2t originate from a same film frame A. One can assume that at least a portion of the video sequence results from a pulldown.
The first content-change strength CCS1 of a second bottom field 2b has a relatively high value, i.e. 5, that may indicate a film frame change. The second bottom field 2b and the previous field to which it is compared, i.e. the second top field 2t, come from distinct film frames. It is therefore possible to prepare a distinct encoding for the second bottom field 2b and the second top field 2t, thus allowing a more efficient encoding.
Similarly, the first content-change strength CCS1 of a third bottom field 3b has a relatively high value, i.e. 6, that may indicate a film frame change. The third bottom field 3b and the previous field to which it is compared, i.e. the third top field 3t, come from distinct film frames.
The relatively high values of the second content-change strengths CCS2 of the fields 2b, 3t, 3b, 4t between the second bottom field 2b and the fourth top field 4t indicate that the film frames (B, C) are each mapped into two video fields.
The particularly high values of the first content-change strength CCS1 of a fourth bottom field 4b, i.e. 60, the second content-change strength CCS2 of the fourth bottom field 4b, i.e. 58, and the second content-change strength CCS2 of the fifth top field 5t, i.e. 61, allow an editor cut to be detected between the fourth top field 4t and the fourth bottom field 4b. The fourth frame comprising the fourth top field 4t and the fourth bottom field 4b should therefore be split before any encoding.
The first content-change strengths CCS1 of the fifth fields 5t, 5b, have relatively low values. Furthermore, the second content-change strength CCS2 of the fifth bottom field 5b has a particularly low value, i.e. 1. It may subsequently be assumed that the fourth bottom field 4b and the fifth fields 5t, 5b originate from a same film frame C.
Even if the fourth bottom field 4b and the fifth bottom field 5b actually come from a same film field, the computed second content-change strength CCS2 of the fifth bottom field 5b may have a non-zero value, due to noise. Appropriate thresholds should therefore be set for an efficient detection.
For example, experiments show that a delta of 1% of the bits of the difference bit plane seems sufficient to identify identical fields and frames. Therefore, when comparing bit planes, a CCS returning a zero value means that less than 1% of the bits of the difference bit plane show non-constant features. For example, a field-based analysis of PAL video using 16x16 blocks leads to a maximum of 720*288/256, i.e. 810, blocks. For each block, one feature, e.g. "block contains edge", is counted. Counting fewer than 8 non-constant features when comparing two fields means that the two fields correspond to a same film field.
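The arithmetic of this example may be checked with the following Python sketch. The function names `block_count` and `thresholded_ccs` are hypothetical; rounding the 1% threshold down to an integer is an assumption chosen so that the result matches the figure of 8 given above, and returning the raw count above the threshold is likewise an assumption.

```python
def block_count(width, height, block=16):
    """Number of block x block blocks in a field of the given size."""
    return (width * height) // (block * block)


def thresholded_ccs(changed_blocks, total_blocks, percent=1):
    """Return 0 when fewer than `percent` % of the blocks changed.

    Above the threshold the raw count is returned (an assumption; the
    source only specifies the zero case).
    """
    threshold = total_blocks * percent // 100
    return 0 if changed_blocks < threshold else changed_blocks


pal_blocks = block_count(720, 288)            # 810 blocks per PAL field
same_field = thresholded_ccs(7, pal_blocks)   # 0: fewer than 8 changed blocks
other_field = thresholded_ccs(8, pal_blocks)  # 8: threshold reached
```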
Alternatively, as illustrated in FIG. 5, no threshold is used to compute the CCSs, but each CCS is compared to several thresholds. For example, a CCS value below 2 indicates that the two corresponding fields correspond to a same film field. A CCS below 4 indicates that the two compared fields originate from a same film frame. A CCS above 30 indicates that the two compared fields belong to distinct scenes.
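The comparison of a CCS to several thresholds may be sketched as follows. The function name `classify_ccs`, its parameter names and the returned labels are assumptions made for illustration. The category returned for values between the second and third thresholds is not stated explicitly above; it is inferred here from the FIG. 5 discussion, where values of 5 and 6 are said to possibly indicate a film frame change.

```python
def classify_ccs(ccs, same_field=2, same_frame=4, scene_change=30):
    """Map a CCS value to a relation between the two compared fields."""
    if ccs < same_field:
        return "same film field"
    if ccs < same_frame:
        return "same film frame"
    if ccs > scene_change:
        return "distinct scenes"
    # Assumed label for the remaining range (see the lead-in).
    return "film frame change"


# CCS values quoted in the FIG. 5 example.
labels = [classify_ccs(value) for value in (0, 1, 3, 5, 60)]
```

With the example values, 0 and 1 are classified as a same film field, 3 as a same film frame, 5 as a film frame change and 60 as distinct scenes.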
FIG. 6 illustrates an example of a compression device according to an embodiment of the present invention. The compression device 60, which may comprise a computer or another video processing device, e.g. a DVD recorder, comprises a pulldown detection device 61, removing means 62 and encoding means 63. The pulldown detection device 61 allows duplicated fields to be detected. Each detected duplicated field may be removed by the removing means 62, thus providing an efficient compression without loss of information. A flag may be set for each removed field. A duplicated field may for example be due to a pulldown, or may occur because the video sequence shows little motion. In this example, each duplicated field is removed. Alternatively, only duplicated fields resulting from a pulldown, e.g. one field every five fields, may be removed.
The removing means 62 provide a volume-reduced video sequence. The encoding means 63 encode the volume-reduced video sequence. The encoding further reduces the volume of the already volume-reduced video sequence. The encoding may or may not cause a loss of information. The encoding may be for example an MPEG encoding.
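The removal of duplicated fields, with a flag set for each removed field, may be sketched in Python as follows. The function name `remove_duplicated_fields`, the default threshold of 2 and the CCS2 values in the example are illustrative assumptions; fields are represented by their labels only.

```python
def remove_duplicated_fields(fields, ccs2_values, threshold=2):
    """Drop fields whose CCS2 is below `threshold`.

    Returns the kept fields and one flag per input field telling whether
    that field was removed. A CCS2 of None means no comparison was possible.
    """
    kept, removed_flags = [], []
    for field, ccs2 in zip(fields, ccs2_values):
        is_duplicate = ccs2 is not None and ccs2 < threshold
        removed_flags.append(is_duplicate)
        if not is_duplicate:
            kept.append(field)
    return kept, removed_flags


# Field labels with made-up CCS2 values; only 2t repeats an earlier field.
fields = ["1t", "1b", "2t", "2b", "3t"]
ccs2_values = [None, None, 0, 12, 11]

kept, removed_flags = remove_duplicated_fields(fields, ccs2_values)
```

In this sketch the flags keep track of where fields were removed, so that the original field sequence could be rebuilt by repeating the corresponding earlier field.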
A CCS may be computed for each field of the video sequence, the CCS being used both in pulldown detection and MPEG encoding.
Calculating means 67 calculate a plurality of block parameters for each field (1t, 1b, 2t, ...) of the video sequence. Once a current plurality of block parameters is calculated, it is transferred to comparing means 66 and to a memory 68. The comparing means 66 compare block parameters of the transferred plurality of block parameters with corresponding block parameters of a previously stored plurality of block parameters. The previously stored plurality of block parameters, received from the memory 68, corresponds to the previous field to which the current field is to be compared via block parameters.
Computing means 64 compute a content-change strength (CCS) for each current field from the results of the block parameter comparisons received from the comparing means 66. The computed CCSs may be sent to and used by the encoding means 63 to sort the frames of the video sequence as reference frames or as bidirectionally predicted frames. The computed CCSs may also be sent to and used by processing means 65 to detect the duplicated fields. The CCSs may for example be compared to one or more thresholds. A sequence of CCSs may also be analyzed to detect a pulldown.
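The data flow between the comparing means, the memory and the computing means may be sketched as a small streaming class. This Python fragment is only an illustration: the class name `CcsPipeline`, the method `push` and the use of a two-entry memory are assumptions, and the bit planes are flattened lists of block bits.

```python
from collections import deque


class CcsPipeline:
    """Keeps the bit planes of the two most recent fields in a memory."""

    def __init__(self):
        self.memory = deque(maxlen=2)  # holds the bit planes of Fn-2 and Fn-1

    def push(self, bit_plane):
        """Compare the current bit plane with the stored ones, then store it."""
        def compare(previous):
            return sum(1 for p, c in zip(previous, bit_plane) if p != c)

        ccs1 = compare(self.memory[-1]) if len(self.memory) >= 1 else None
        ccs2 = compare(self.memory[-2]) if len(self.memory) >= 2 else None
        self.memory.append(bit_plane)  # the oldest entry is dropped automatically
        return ccs1, ccs2


pipeline = CcsPipeline()
outputs = [pipeline.push(bp) for bp in ([1, 0], [1, 1], [1, 0])]
```

Only the small bit planes are stored, not the fields themselves, and the same CCS values can be handed both to the detection and to the encoding stage.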
The pulldown detection method according to an aspect of the present invention may be implemented into existing compression devices with a negligible hardware or software effort, since the existing compression devices already comprise the calculating means 67, the comparing means 66 and the computing means 64. The compression device 60 may be comprised in one or more processors or units, e.g. microcontrollers, DSPs, FPGAs, etc.
FIG. 7 illustrates an example of a video display according to an embodiment of the present invention. The illustrated video display comprises a home-theater projector 70 comprising a pulldown detection device 71 and a restoring device 72. The pulldown detection device 71 and the restoring device 72 allow a sequence having a 24 Hz progressive format to be provided.

Claims

1. A pulldown detection method for detecting pulldown operations performed on a video sequence, comprising for a current field of the video sequence, calculating a plurality of block parameters, each block parameter being calculated from pixel values of a corresponding block of pixels of the current field; comparing each block parameter of the plurality of block parameters of the current field to a corresponding block parameter of a plurality of block parameters of a previous field; computing a content-change strength from results of the comparisons; and detecting duplicated fields based at least on the computed content-change strength.
2. The method according to Claim 1, wherein the previous field (Fn-1) comprises a penultimate field immediately preceding the current field (Fn).
3. The method according to Claim 1, wherein the previous field (Fn-2) comprises the field prior to a penultimate field immediately preceding the current field (Fn).
4. The method according to Claim 1, wherein the computed content-change strengths are digital numbers having a number of digits greater than unity.
5. A pulldown detection device, for detecting pulldown operations performed on a video sequence, comprising: calculating means, provided for calculating a plurality of block parameters for a current field of the video sequence, from pixel values of a corresponding block of pixels of the current field; comparing means, provided for comparing block parameters of the plurality of block parameters of the current field to a corresponding block parameter of a plurality of block parameters of a previous field; computing means connected to the comparing means, provided for computing a content-change strength (CCS); and detecting means connected to the computing means, provided for detecting duplicated fields.
6. A compression device for compressing a video sequence, comprising: a pulldown detection device according to claim 5, provided for detecting duplicated fields of the video sequence; removing means to remove the detected duplicated fields, so as to obtain a video sequence with reduced field rate; and encoding means to encode the video sequence with reduced field rate.
7. The compression device according to claim 6, wherein the encoding means are coupled to the computing means of the pulldown detection device.
8. A video recorder comprising a compression device according to claim 7.
9. A video display comprising a pulldown detection device according to claim 5.
10. A computer program product comprising a computer readable medium having computer program code embodied therein for delivering byte code, the computer readable medium comprising computer program code configured to cause a computer to calculate, for a current field of a video sequence, a plurality of block parameters, each block parameter being calculated from pixel values of a corresponding block of pixels of the current field; compare each block parameter of the plurality of block parameters of the current field to a corresponding block parameter of a plurality of block parameters of a previous field; compute a content-change strength (CCS) from results of the comparisons; and detect duplicated fields based at least on the computed content-change strengths.
PCT/IB2006/054342 2005-11-21 2006-11-20 Method and device for detecting pulldown operations performed on a video sequence WO2007057868A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05300952.8 2005-11-21
EP05300952 2005-11-21

Publications (2)

Publication Number Publication Date
WO2007057868A2 true WO2007057868A2 (en) 2007-05-24
WO2007057868A3 WO2007057868A3 (en) 2007-09-07

Family

ID=37946735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/054342 WO2007057868A2 (en) 2005-11-21 2006-11-20 Method and device for detecting pulldown operations performed on a video sequence

Country Status (1)

Country Link
WO (1) WO2007057868A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0994626A1 (en) * 1998-10-12 2000-04-19 STMicroelectronics S.r.l. Detection of a 3:2 pulldown in a motion estimation phase and optimized video compression encoder
US6058140A (en) * 1995-09-08 2000-05-02 Zapex Technologies, Inc. Method and apparatus for inverse 3:2 pulldown detection using motion estimation information
US20040227852A1 (en) * 2003-03-05 2004-11-18 Darren Neuman Pulldown field detector
US20050078213A1 (en) * 2003-08-22 2005-04-14 Masatoshi Sumiyoshi Television/cinema scheme identification apparatus and identification method

Also Published As

Publication number Publication date
WO2007057868A3 (en) 2007-09-07

Legal Events

Date Code Title Description
NENP Non-entry into the national phase in:

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 06821506

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 06821506

Country of ref document: EP

Kind code of ref document: A2