WO2001013647A9

WO2001013647A9 - Method and apparatus for telecine detection

Info

Publication number: WO2001013647A9
Application number: PCT/US2000/040598
Authority: WO
Inventors: Steven W Rodgers; Yendo Hu; Bryan Willson
Original assignee: Tiernan Communications Inc
Priority date: 1999-08-17
Filing date: 2000-08-08
Publication date: 2002-08-08
Also published as: AU7758300A; WO2001013647A1

Abstract

A method for detecting telecine mode in a sequence of video fields includes comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field and declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern. The comparing includes comparing the first field and the second field to generate a difference indication and declaring a repeat field if the difference indication is less than a threshold. The declaration of telecine mode is monitored and the threshold is increased unless stable telecine mode detection occurs or a threshold ceiling is reached. An apparatus for detecting telecine mode in a sequence of video fields includes repeated field detection logic for comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field and telecine pattern detection logic for declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern.

Description

METHOD AND APPARATUS FOR TELECINE DETECTION

BACKGROUND OF THE INVENTION

Most films and movies are shot at a frame rate of 24 frames per second, noninterlaced while conventional television and digital video exists primarily at 60 fields per second. For progressive formats, the word "field" is virtually interchangeable with the word "frame" in that each field is a separate picture, while for interlaced formats, two fields are merged together to form a single frame. In either type of format, a fundamental incompatibility exists between film-based video and conventional television or digital video. When film-based video is to be encoded for transmission, the 24 frame per second rate of the incoming video is "upconverted" to the 60 field per second rate of most North American transmission standards (e.g., NTSC) by duplicating fields in a known pattern. The resulting video is referred to as telecine mode video.

FIGS. 1A and 1B show the telecine pattern of field duplication for interlaced formats and progressive formats, respectively. As shown, both types of formats create 5 fields for every 2 frames of the input film-based material. However, the manner in which the fields are created is fundamentally different.

In the interlaced format shown in FIG. 1A, four film frames (FA, FB, FC, FD) are upconverted to ten video fields (top field 1, bottom field 1, top field 2, bottom field 2, top field 2, bottom field 3, top field 3, bottom field 4, top field 4, bottom field 4). This conversion is referred to as 3:2 pulldown. Each field contains only one half of the total information needed to represent the entire picture. Therefore, without frame rate conversion, two input film frames could be expected to create 4 fields, for a total unconverted field rate of 48 fields per second. To upconvert to 60 fields per second means that out of every ten fields, two are redundant. In this case, the redundant fields are top field 2 and bottom field 4. This gives a 20% field redundancy for interlaced formats. In FIG. 1A, the labels U, S, and R stand for unique, source, and repeat fields, respectively, and are used to indicate that for correct field polarity, the duplicated field is actually a repeat of the field before last, rather than the immediately prior field. Note that the S field is also unique, but is labeled differently from the other unique fields to show that it is the source of the R field. Fields must be duplicated in this way or else two bottom or top fields in a row would be produced, which would alter the field polarity of the incoming video stream.

For the progressive format shown in FIG. 1B, there is no splitting of the incoming film-based frames into separate fields, so that each incoming frame becomes an outgoing field. Therefore, without frame rate conversion, a direct translation from film to a progressive format would result in a field rate of 24 fields per second. To upconvert to 60 fields per second now requires that out of every 5 fields, 3 are redundant, for a total of 60% redundancy. Therefore, even though both interlaced and progressive methods produce the same output field to incoming frame ratio, there is significantly more redundancy when converting film to progressive formats than there is for interlaced formats.
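The redundancy figures above can be checked with a short sketch. This is only an illustration, not part of the disclosed apparatus: the function names are hypothetical, fields are modeled as labels rather than pixel data, and the choice of which film frames contribute three fields (the second and fourth) follows the field list given for FIG. 1A and is assumed to apply to the progressive case as well.

```python
def interlaced_pulldown(frames):
    """Map film frames to (frame, polarity) fields, 2 and 3 fields alternately.

    Polarity strictly alternates top/bottom, so the third field taken from a
    frame is a repeat of the field before last.
    """
    fields = []
    for i, frame in enumerate(frames):
        count = 2 if i % 2 == 0 else 3
        for _ in range(count):
            polarity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, polarity))
    return fields


def progressive_pulldown(frames):
    """Map film frames to progressive fields, 2 and 3 copies alternately."""
    fields = []
    for i, frame in enumerate(frames):
        fields.extend([frame] * (2 if i % 2 == 0 else 3))
    return fields


def redundancy(fields, lag):
    """Fraction of fields identical to the field `lag` positions earlier."""
    repeats = sum(1 for k in range(lag, len(fields)) if fields[k] == fields[k - lag])
    return repeats / len(fields)


frames = ["FA", "FB", "FC", "FD"]
interlaced = interlaced_pulldown(frames)
progressive = progressive_pulldown(frames)
print(redundancy(interlaced, lag=2))   # 0.2 (repeat of the field before last)
print(redundancy(progressive, lag=1))  # 0.6 (repeat of the previous field)
```

The `lag` argument captures the difference between the two formats: an interlaced repeat field duplicates the field before last, while a progressive repeat field duplicates the immediately previous field.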

While the above schemes allow film-based video to exist at a field rate compatible with transmission standards, they are wasteful, since unnecessary bandwidth is used to transmit the redundant fields. Of course, once the video has been compressed using MPEG (Moving Picture Experts Group) or any other compression standard, much of the redundant information is dropped; however, a significant amount of bandwidth may still be used to encode the duplicate fields. Although this is more true of progressive formats due to the higher redundancy, interlaced formats may also use a larger than expected bandwidth, since 40% of the reconstructed video frames contain fields from two separate input film frames (e.g., video frame NC comprises top field 2, bottom field 3; video frame ND comprises top field 3, bottom field 4). The fields of these frames are less correlated than would be true for a standard interlaced two-field frame, rendering the subsequent MPEG compression less efficient.

Many MPEG encoders include the capability to detect telecine mode for SDTV (standard definition television). This detection is fairly straightforward for an encoder, since it already has access to external field buffers for performing a pixel-by-pixel comparison of current and prior fields. However, encoding for higher video data rates such as HDTV (high-definition television) may entail using multiple encoders to encode the entire picture. Such a multiple encoder approach is disclosed in U.S. Patent Application No. 09/054,427 which is incorporated herein by reference. In that approach, the video images are divided into overlapping regions with each region being assigned a dedicated encoder. There is a danger that if each encoder is left to detect telecine sequences individually, some of the encoders could fall out of sync with each other in terms of telecine detection. This can happen because the portions of an HDTV image covered by some encoders may be stationary, while for others, there may be motion in the picture. Consider, for example, an HDTV image of an airplane flying across the sky. Most of the multiple encoders will encode only the stationary sky portion of the image, but some will encode the airplane. In that case, the encoders with motion in their respective portion of the image would detect telecine first, while the others would not detect telecine until later. This has the potential of producing an unsatisfactory image which is jerky. Similarly, the occurrence of non-telecine mode, e.g., a video-based commercial occurring in the middle of film material, can result in multiple encoders falling out of sync with each other.

SUMMARY OF THE INVENTION

The present approach attempts to identify whether the incoming video is in telecine mode, and if so, to lock to the telecine pattern, so that redundant fields can be identified. These identified fields can then be dropped from the video bit stream.

Accordingly, a method for detecting telecine mode in a sequence of video fields includes comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field and declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern. The comparing includes comparing the first field and the second field to generate a difference indication and declaring a repeat field if the difference indication is less than a threshold. The declaration of telecine mode is monitored and the threshold is increased unless stable telecine mode detection occurs or a threshold ceiling is reached.
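The threshold adjustment described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions and does not reproduce the flow of FIG. 14: the function name `auto_window`, the fixed step size, and the `is_stable` callback (standing in for whatever monitoring of the telecine declaration the system performs) are all hypothetical.

```python
def auto_window(is_stable, start, step, ceiling):
    """Raise the repeat-field threshold until detection is stable or the
    ceiling is reached.

    is_stable(window) is a hypothetical callback that reports whether telecine
    mode was detected stably while `window` was in use.
    """
    window = start
    while not is_stable(window) and window < ceiling:
        window = min(window + step, ceiling)
    return window


# Example: detection becomes stable once the window reaches 12.
print(auto_window(lambda w: w >= 12, start=4, step=4, ceiling=32))  # 12
# Example: detection never becomes stable, so the ceiling is returned.
print(auto_window(lambda w: False, start=4, step=4, ceiling=32))    # 32
```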

An apparatus for detecting telecine mode in a sequence of video fields includes repeated field detection logic for comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field and telecine pattern detection logic for declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern.

According to an aspect of the invention, the repeated field detection logic includes an accumulator for summing the pixels of the first field to generate a first pixel sum and for summing the pixels of the second field to generate a second pixel sum. The repeated field detection logic further includes a comparator for comparing the first and second pixel sums to generate a pixel sum difference indication and for declaring a repeat field if the pixel sum difference indication is less than a pixel sum threshold. In a preferred embodiment, the video fields are divided into subfields and a subfield difference indication is generated for each subfield. A repeat field is declared if all subfield difference indications are less than a corresponding subfield threshold.

According to another aspect of the invention, each pixel includes a luma value and a chroma value. First and second luma pixel values are summed to generate respective first and second luma pixel sums and first and second chroma pixel values are summed to generate respective first and second chroma pixel sums. The first and second luma pixel sums are compared to generate a luma difference indication and the first and second chroma pixel sums are compared to generate a chroma difference indication. A repeat field is declared if the luma difference indication is less than a luma threshold and the chroma difference indication is less than a chroma threshold.

According to yet another aspect of the invention, a first pixel grid is applied to the first video field to provide a selection of first pixels; the pixel grid is applied to the second video field to provide a selection of second pixels; each pixel of the selection of first pixels is compared with the corresponding pixel of the selection of second pixels to generate a corresponding first pixel difference indication; and a repeat field is declared if all of the first pixel difference indications are less than a pixel threshold. In a preferred embodiment, a second pixel grid that is centered about a field and is smaller than the first pixel grid is applied in parallel to the video fields along with the first pixel grid to provide corresponding second pixel difference indications. A repeat field is declared if all of the first and second pixel difference indications are less than the pixel threshold. It should be understood that additional pixel grids can be employed in alternate embodiments.

The telecine pattern detection logic includes a first state machine for tracking occurrence of repeat fields to generate a first signal indicative of acquisition of telecine mode and a second state machine for tracking occurrence of repeat fields to generate a second signal indicative of loss of telecine mode. The telecine pattern detection logic further includes a switch responsive to the first and second signals to set and reset, respectively, a telecine detection signal indicative of telecine mode status.

According to an aspect, the system looks at a full HDTV image and instructs multiple encoders simultaneously when to drop fields. This helps ensure a seamless transition into and out of telecine mode.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A illustrates telecine field duplication for interlaced scan video format.

FIG. 1B illustrates telecine field duplication for progressive scan video format.

FIG. 2 is a schematic block diagram of a high-definition television encoding system which includes a telecine detection circuit.

FIG. 3 is a schematic block diagram of the telecine detection circuit of FIG. 2 which includes repeated field detection and telecine pattern detection logic.

FIG. 4 illustrates an example of a repeated field pattern in a progressive scan video format.

FIGs. 5A and 5B are schematic block diagrams of an embodiment of the repeated field detection logic of FIG. 3.

FIG. 6 is a schematic block diagram of an embodiment of the telecine pattern detection logic of FIG. 3.

FIGs. 7A and 7B show processing of input luma and chroma streams in accumulators provided in the repeated field detection logic of FIG. 5A.

FIG. 8 shows a main pixel grid and a middle pixel grid provided in the repeated field detection logic of FIG. 5A.

FIG. 9 shows a window comparison block provided in the repeated field detection logic of FIG. 5A.

FIG. 10 is a state diagram for an acquisition state machine for progressive scan video format.

FIG. 11 is a state diagram for a tracking state machine for progressive scan video format.

FIG. 12 is a state diagram for an acquisition state machine for interlaced scan video format.

FIG. 13 is a state diagram for a tracking state machine for interlaced scan video format.

FIG. 14 is a flow diagram of an automatic windowing approach.

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.

DETAILED DESCRIPTION OF THE INVENTION

An exemplary HDTV encoding system embodying the principles of the invention is shown in FIG. 2. The system includes a telecine detection circuit 10, video splitter 11, microprocessor 12, and encoders 14-1 to 14-N. The telecine detection circuit has inputs for external digital video signal 22 and a window signal 16 from microprocessor 12. The telecine detection circuit attempts to identify whether the incoming video is in telecine mode, and if so, to lock to the telecine pattern, so that redundant fields can be identified. These identified fields can then be dropped from the video bit stream.

The telecine detection is performed on the input video stream 22 prior to encoding. That is, the input video stream as seen by the telecine detection circuit represents the source video before compression, and before it is split by video splitter 11 into independent video streams which are sent to the individual encoders. A field_drop signal 18 and telecine_history signal 20 are output from the telecine detection circuit 10 and fed into microprocessor 12. The telecine_history signal 20 and the window_value signal 16 are used to enable automatic telecine detection in varying quality video sequences. This aspect is discussed in more detail further herein. The field_drop signal 18 can be used to create an interrupt or the microprocessor can poll this signal periodically, e.g., at the start of every field. An interrupt is preferred, since the microprocessor only needs to sample this signal once every five fields (interlaced) or three times out of five fields (progressive) if the video is truly in telecine mode. Seeing this signal high, the microprocessor can then write to registers within the MPEG encoders 14-1 to 14-N which instruct the encoders to perform the task of dropping and/or reordering the incoming fields. The field_drop signal 18 is output only when the telecine detection circuit 10 has determined that the incoming video is film-based, i.e., the video is in telecine mode. Therefore, only repeat fields from a valid telecine sequence which have the proper phase will cause a field to be dropped.

It should be noted that the terms "redundant", "repeat" and "duplicate" are used interchangeably throughout the specification.

As shown in FIG. 3, the telecine detection circuit 10 for detecting incoming telecine video includes two major functional blocks: repeated field detection logic 30 and telecine pattern detection logic 32. The repeated field detection logic 30 receives window_value signal 16 and the digital video input 22 (FIG. 2) as composite video signals (luma signal 22Y and chroma signal 22C) and synchronization signals (vertical sync 22V and horizontal sync 22H). The detection logic 30 provides a field_match signal 31 to the pattern detection logic 32 which generates the field_drop signal 18. The pattern detection logic also provides the telecine_history signal 20.

Of the two functions, the detection of repeated fields is by far the most problematic, especially for the case where the incoming video stream is noisy or has been encoded and decoded multiple times. In such cases, fields which are supposed to be repeats of each other may actually have a fair number of differences between them. Determining how much difference to allow before declaring a field match is an iterative process that may have to be adjusted for different quality input video. The following describes several approaches to repeated field and telecine pattern detection.

Repeated Field Detection

An optimal method for repeated field detection, referred to herein as the field buffer approach, uses one or more field buffers to store all of the pixels within given fields. The contents of the current field are then compared to the contents of the previous or next to last field (depending on whether the target format is progressive or interlaced) on a pixel-by-pixel basis. Then, a sum of pixel differences or low pass filtering is used to reduce inter-field noise. This noise can be produced from a variety of sources, one of which is prior multiple encode-decode passes on the input video as noted herein. However, for fields which are true repeats, the differences should be small, since large pixel differences would actually become visible in the video as it is displayed.

The optimal method is impractical for HDTV images due to the large field buffer requirements. A less accurate method that is realizable in programmable logic, e.g., a field programmable gate array (FPGA), and does not require external field buffers comprises adding together all pixels in the current field, storing the sum in an accumulator, and then comparing it with the sum from the last field. This is referred to herein as the pixel accumulation method. Fields which are duplicates should have somewhat similar sums. However, a problem with this method is that small pixel differences can add up very quickly (e.g., there are over 2 million luma (Y) and chroma (C) values in a 1080i field), thereby producing large variations between fields, even those which are supposed to be repeats.

One way around this problem is to low-pass filter the pixels before they are fed into the accumulator. However, in an FPGA with limited space, the multipliers and adders needed to implement even a simple filter can use a prohibitive amount of logic. With some experimentation, it is possible to verify the amount of difference which can be tolerated between repeat fields without requiring low pass filtering. The difference can be set as a window value. This windowing approach can provide repeated field detection that is about 95% accurate. The accuracy of this method increases as the field is divided into smaller subgroups or sections, with pixel accumulation for each subgroup occurring independently. In the limit, each subgroup reduces to the individual pixel size, which returns to the full field buffer approach.
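The pixel accumulation method with subgroups and a window value can be sketched as follows. This is a simplified illustration only: the function names are hypothetical, a field is modeled as a flat list of pixel values, a single window value is applied to every subgroup, and pixels are assigned to subgroups by index modulo the number of subgroups, which is an assumption made for brevity rather than the partition used in the embodiment.

```python
def subgroup_sums(field, groups=4):
    """Accumulate pixel values into `groups` independent sums."""
    sums = [0] * groups
    for i, pixel in enumerate(field):
        sums[i % groups] += pixel
    return sums


def is_repeat_by_accumulation(current, previous, window, groups=4):
    """Declare a repeat field only if every subgroup sum differs by less
    than the window value."""
    cur = subgroup_sums(current, groups)
    prev = subgroup_sums(previous, groups)
    return all(abs(c - p) < window for c, p in zip(cur, prev))


previous = [16, 32, 48, 64] * 8
# Same picture with a little inter-field noise on a few pixels.
noisy = [p + (1 if i % 7 == 0 else 0) for i, p in enumerate(previous)]
# Same picture with one strongly changed pixel (motion).
moved = list(previous)
moved[5] += 100

print(is_repeat_by_accumulation(noisy, previous, window=4))  # True
print(is_repeat_by_accumulation(moved, previous, window=4))  # False
```

Because each subgroup is compared independently, a change confined to one part of the pixel stream cannot be masked by offsetting noise elsewhere, which is the sense in which accuracy improves as the field is divided more finely.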

Another method realizable in an FPGA is to use the limited RAM available in the FPGA to perform a partial pixel comparison that in effect is a scaled down version of the field buffer approach. This partial pixel comparison method attempts to strategically pick the pixels which are most likely to be in the path of motion, and compares those pixels with the associated pixels from the previous field. Since most motion will occur in the center of an image, the pixels to be compared might typically be in a grid which spans a large percentage of the image. An additional grid can be used that is smaller and located in the middle of the image where most of the motion will occur. In a widely spaced grid (which may be necessary, due to the limited amount of RAM), it is possible that motion of small objects may not cross one of the grid pixels, in which case a non-duplicate field may be falsely detected as a duplicate. Therefore, it is important to pick the grid boundaries so that motion has a maximum likelihood of hitting a grid pixel, given a limited number of grid pixels. When performing the task of partial pixel comparison, several choices arise for handling noise between pixels from different fields. A common way to handle this task is to use the sum of differences approach:

$$\sum_{i=1}^{N} |x_i - y_i| < \text{Threshold}$$

where N is the number of pixels in a field, and $x_i$ and $y_i$ are corresponding pixels from the current and prior fields. While this is an effective method for statistically comparing all pixels within a field, it is not the optimum method when only a limited pixel grid is available. Consider the situation where a video sequence shows a man staring at a computer screen intently, slowly tapping his nose with his index finger. In this case, the differences between fields in progressive mode or alternate fields in interlaced mode are minimal, consisting of only the slow motion of a finger. For this type of video sequence, a sum of differences approach may indicate all fields as matched, since the amount of motion is small. This small accumulated pixel difference may fall well within the sum of differences threshold value, especially if there is little inter-field noise in the rest of the image.

A more accurate method in this case is to assign each pixel a threshold, and if any of the pixels exceeds its assigned threshold, the field match test fails. This approach can detect very small amounts of motion as in the case cited above, because with almost any type of motion, there will be at least one pixel which is significantly altered from its value in the previous field. In the case cited above, a pixel in the path of the moving finger would indicate a large change from its value in the prior field. Thus, the field match test fails even if every other pixel in the grid indicates a perfect match.
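The contrast between the two tests on a sparse pixel grid can be sketched as follows. This is an illustration under assumptions: the function names, the grid spacing, the field dimensions, and the threshold values are all hypothetical, and a single threshold is shared by all grid pixels for simplicity.

```python
def grid_points(width, height, step_x, step_y):
    """Coordinates of a regularly spaced pixel grid."""
    return [(x, y) for y in range(0, height, step_y) for x in range(0, width, step_x)]


def sum_of_differences_match(cur, prev, points, threshold):
    """Field match if the accumulated absolute difference is below threshold."""
    total = sum(abs(cur[y][x] - prev[y][x]) for x, y in points)
    return total < threshold


def per_pixel_match(cur, prev, points, pixel_threshold):
    """Field match only if every grid pixel difference is below its threshold."""
    return all(abs(cur[y][x] - prev[y][x]) < pixel_threshold for x, y in points)


width, height = 16, 8
prev = [[50] * width for _ in range(height)]
cur = [row[:] for row in prev]
cur[4][8] = 90  # one grid pixel lies in the path of a small moving object

points = grid_points(width, height, step_x=4, step_y=4)  # 8 grid pixels
print(sum_of_differences_match(cur, prev, points, threshold=64))  # True (motion missed)
print(per_pixel_match(cur, prev, points, pixel_threshold=8))      # False (motion caught)
```

Here the summed threshold allows an average difference of 8 per grid pixel, so a single change of 40 is absorbed, whereas the per-pixel test fails as soon as one grid pixel exceeds its limit.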

A particular embodiment of the repeated field detection logic 30 uses both the pixel accumulation and partial pixel comparison methods to increase the likelihood of an accurate field match comparison. In this embodiment, the pixel accumulation includes breaking the field into four subgroups or sections and storing four different sets of accumulator values in order to increase the accuracy of the comparison. The partial pixel comparison uses two pixel grids, one large and widely spaced pixel grid which attempts to cover a large percentage of the field, and a separate, smaller grid centered about the middle of the field to increase motion detecting capability in the center of the image. Taken together, the two methods can realize over 99.9% accuracy in detecting repeat fields within telecine sequences.

Telecine Pattern Detection

The telecine pattern detection logic 32 determines whether the input video is derived from film by searching for the proper telecine pattern among the repeated fields detected by the repeated field detection logic 30. The pattern detection logic tests to see first that several valid telecine patterns are received before allowing fields to be dropped. The field_drop signal 18 is enabled only for fields which fall into the proper phase of the established telecine pattern, so that fields from stationary video sequences are not also dropped inadvertently unless they have the proper telecine phase.

As described with reference to FIGs. 1A and 1B above, the telecine pattern is different for interlaced and progressive formats, since it is governed by the difference between the 24 field per second input rate and the output field rate. The two patterns are now described further.

60 Hz Progressive Format

For 60Hz progressive formats, the telecine pattern to track in the incoming sequence of video fields is as follows: repeat unique repeat unique repeat repeat unique repeat unique repeat....

If the field_match signal 31 takes on a high value when the field is a duplicate, then the pattern can be represented more concisely as follows:

10101101011010110101101011010110101....

where each five field sequence of 10101 represents one full telecine pattern.
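The gating of the field_drop signal by telecine phase, described earlier for the pattern detection logic, can be sketched for this progressive pattern as follows. The function name, the explicit `phase` counter, and the boolean `in_telecine` flag are assumptions introduced for illustration; the actual logic is implemented with the state machines described with reference to the drawings.

```python
PATTERN = (1, 0, 1, 0, 1)  # expected field_match values over one telecine cycle


def field_drop(in_telecine, phase, field_match):
    """Drop a field only when telecine mode is established, the field matched,
    and the current phase of the pattern expects a repeat."""
    return bool(in_telecine and field_match and PATTERN[phase % 5] == 1)


print(field_drop(True, 0, 1))   # True: repeat field at a repeat phase
print(field_drop(True, 1, 1))   # False: stationary field at a unique phase is kept
print(field_drop(True, 0, 0))   # False: field did not match, so it is kept
print(field_drop(False, 0, 1))  # False: not in telecine mode
```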

Unfortunately, two factors complicate the detection of the telecine pattern: stationary video and occasional field dropouts (i.e., mis-identified fields). For the case of stationary video, the pattern may appear as a long string of 1's. For the case of the occasional error in identifying repeat fields, an occasional 0 may occur where a 1 was expected (i.e., the field detected was not a repeat when it should have been). In actuality, with the proper window value, most of the telecine video observed takes on the ideal telecine pattern, but due to the above mentioned non-ideal characteristics, some video may take on the example pattern shown in FIG. 4 which includes repeated fields (stationary video), valid fields, missing repeats and some duplicate fields.

The telecine pattern detection logic 32 (FIG. 3) must be able to handle both stationary video (the most common case), and occasional mis-identified fields. Stationary video is handled by maintaining the current setting during stationary sequences, i.e., if telecine mode has not yet been detected, then do not enter it, and if telecine mode has been detected and entered, do not exit it. In other words, stationary video has no effect on the current state of the system, since it is an indeterminate state which does not provide enough information about the input video to determine whether it is a telecine sequence.

The occasional misidentified field is handled by assuming it is real when not in telecine mode, and allowing for one such field every N valid sequences when in telecine mode. In other words, if the system is not yet in telecine mode but is searching for the pattern, a 0 where a 1 should be located (i.e., a unique field where a repeat was expected) causes the logic to fall out of the current pattern, and the sequence is declared invalid. If a 0 is received where a 1 was expected but the system is in telecine mode, then this error is disregarded if several more valid telecine sequences are subsequently received. In this way, the telecine detection logic is made immune to small errors, but a continual pattern of invalid sequences causes the system to fall out of telecine mode. Note that, even if it is decided to wait for several invalid telecine patterns before declaring loss of telecine mode, fields will not be dropped as long as the field_match signal 31 (FIG. 3) detects that the fields are different. Therefore, for video with normal amounts of motion, a several-sequence delay in exiting telecine mode will not cause video artifacts, since even while telecine mode is on, fields which are detected to be not matched will not be dropped. In the case of stationary video, some non-redundant fields may be dropped when transitioning from telecine to normal video. However, because the video is stationary, these dropped fields do not cause any noticeable change in the image.
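The acquisition and loss behavior described above can be sketched with a simplified tracker. This is not the state machines of FIGs. 10 and 11: it assumes the field_match bits have already been grouped into five-field sequences aligned to the pattern phase, treats any group that is neither stationary nor the ideal pattern as invalid, and uses hypothetical `acquire` and `release` counts for the number of valid sequences needed to enter telecine mode and consecutive invalid sequences needed to leave it.

```python
PATTERN = (1, 0, 1, 0, 1)  # one progressive telecine cycle of field_match bits


def classify(group):
    """Label a five-field group as stationary, valid, or invalid."""
    group = tuple(group)
    if all(group):
        return "stationary"
    if group == PATTERN:
        return "valid"
    return "invalid"


def track(groups, acquire=3, release=3):
    """Return the telecine-mode flag after each five-field group."""
    in_telecine = False
    valid_run = 0
    invalid_run = 0
    history = []
    for group in groups:
        kind = classify(group)
        if kind == "valid":
            valid_run += 1
            invalid_run = 0
            if valid_run >= acquire:
                in_telecine = True
        elif kind == "invalid":
            invalid_run += 1
            if not in_telecine:
                valid_run = 0          # searching: fall out of the pattern
            elif invalid_run >= release:
                in_telecine = False    # continual errors: loss of telecine mode
                valid_run = 0
        # Stationary groups leave every counter and the mode unchanged.
        history.append(in_telecine)
    return history


groups = (
    [PATTERN] * 3              # acquisition
    + [(1, 1, 1, 1, 1)] * 2    # stationary video
    + [(1, 0, 0, 0, 1)]        # one missed repeat
    + [PATTERN]                # valid sequence resumes
    + [(0, 0, 0, 0, 0)] * 3    # video-based material
)
print(track(groups))
```

In this example the mode is entered after the third valid group, is held through the stationary groups and the single missed repeat, and is exited only after three consecutive invalid groups.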

60 Hz Interlaced Format

As mentioned above, there are two major differences between interlaced and progressive formats with regard to telecine pattern detection. First, the duplicate field pattern is different, taking on the pattern of one duplicate in every five fields, rather than three out of five duplicate fields for progressive formats. Second, in interlaced formats, a duplicate field is a repeat of the field before last, as opposed to the immediately previous field in progressive formats.

Referring again to FIG. 1A, which shows the redundant field pattern for interlaced systems, the repeat field pattern can be represented by the following:

UUSURUUSURUUSURUUSUR

where U represents a unique field, S represents the source field, and R represents the repeat field, which is a duplicate of the source field. The source field is also a unique field, but it is designated differently here in order to show that the repeat field is a repeat of the field before last. As mentioned previously, this is done to ensure that the proper field polarity is retained, since if the duplicate field was a repeat of the immediately previous field, there would be two top or bottom fields in a row.

A standard way to detect the telecine pattern and account for the one field skip between repeat fields is to maintain two field buffers, thereby incurring a two field delay. This approach allows downstream logic to compare the contents of the current field with that of two fields previous. However, there are other ways to account for the skip between duplicate fields in interlaced formats without requiring external field buffer memory. A first option is to use two sets of accumulator registers, and to use only half the area covered by the pixel grids, in order to be able to store the accumulator values and pixel grid values from two fields simultaneously. This option essentially provides a two field delay for detection purposes, although the fields are not delayed as they move through the system. A second option is to operate the detection logic such that latching occurs only on alternate fields, thereby ignoring intermediate fields. According to this option, a full telecine pattern would take 10 field periods to manifest itself to the detection logic, since if only alternate fields are latched, it takes 10 fields to repeat the telecine pattern.

A drawback with the first option is that it requires that the area covered by the pixel grids be cut in half, in order to make room for the results from two fields simultaneously. The pixel grids may already be widely spaced due to limited RAM associated with the detection logic and are optimized to detect motion given the amount of RAM available. Cutting such pixel grids in half may degrade the accuracy of the field match logic. On the other hand, the second option would maintain the accuracy of the field match logic, but would take twice as long to lock to the telecine pattern, or to detect loss of lock.

A preferred approach relates to the first option. Note that for completeness, the pixel grids used in an embodiment for progressive formats store both the luma and chroma values at all grid points, essentially requiring two bytes of memory for each pixel (4:2:2 input video). However, since it is true that almost all motion contains a luma component, it is possible to store only the luma portion of the pixel grids with a negligible effect on accuracy. This allows two fields worth of pixel grids having the same original grid dimensions to be entirely stored in the RAM of the detection logic, so that same polarity fields can be compared. It should be noted that storing only the luma components in progressive mode allows for a doubling of the original grid dimensions. However, it has been observed that doubling the grid does not seem to affect the accuracy of the pixel match function in a noticeable manner. The detection logic is also simplified if the grid dimensions for all formats are kept identical, changing only the number of bytes per pixel stored.

In an embodiment described further herein, two sets of accumulator registers are kept in logic, in order to be able to store the accumulator values of all fields, and two sets of pixel grid points are also kept. Top and bottom fields from two preceding field periods can then be compared against the current field. Each polarity field has its own control logic, so that there is a separate state machine for both top and bottom fields, which simplifies the state diagrams and other logic. The telecine detection on both top and bottom fields contributes equally to the acquisition and tracking mode operation.

Given the above discussion, the acquisition and tracking logic in this embodiment searches for the following telecine pattern when the input format is interlaced:

0 0 0 0 1 0 0 0 0 1 0 0 0 0 1.

where 0 indicates no field match, and 1 indicates a field match. It is important to note that the above pattern is in two field-period increments, i.e., the pattern 00001 spans a 10 field duration (166.67 ms at a 60 Hz field rate). This is because the top and bottom field acquisition and tracking state machines only sample every other incoming field. Therefore, a more accurate representation which takes into account both top and bottom fields is as follows:

0 X 0 X 0 X 0 X 0 X 1 X 0 X 0 X 0 X 0 X 1

where the X's represent the intermediary bottom fields which are ignored by the top field logic, and conversely, the intermediary top fields which are ignored by the bottom field logic. Stationary video is handled as it is for the progressive case, i.e., it causes no change in the current state of either the acquisition or tracking state machines.
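The every-other-field sampling described above can be illustrated with a minimal Python sketch. The function names and the assumption that the first sample is a top field are hypothetical and chosen only for illustration; the sketch does not model the state machines themselves.

```python
def split_by_polarity(field_matches):
    """Split a per-field match stream into (top, bottom) sub-streams.

    Assumption for illustration: index 0 is a top field and the
    polarities alternate strictly after that.
    """
    return field_matches[0::2], field_matches[1::2]


def ends_with_pattern(matches, pattern=(0, 0, 0, 0, 1)):
    """Return True if the most recent samples equal the telecine pattern."""
    return tuple(matches[-len(pattern):]) == tuple(pattern)
```

Each sub-stream advances once every two field periods, which is why five same-polarity samples span ten fields.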

FIGs. 5A and 5B show schematic block diagrams of a particular embodiment of the repeated field detection logic 30 (FIG. 3) which can be implemented in an FPGA. The repeated field detection logic includes accumulator control 120, pixel accumulation section 122, partial pixel comparison section 124 and field match section 126. Each of these sections is now described in detail.

The accumulator control 120 provides a latching signal denoted latch_match 121, which is a timing signal derived from the horizontal and vertical sync signals 22H, 22V. The latch_match signal 121 provides a signal transition corresponding to each field period for latching results of the windowing blocks 110-A to 110-D and 118 described further herein.

The pixel accumulation section 122 includes accumulators A to D (102-A to 102-D) which provide pixel accumulation for repeated field detection; registers 104, 106; multiplexers 108 and windowing blocks 110-A to 110-D. As described in further detail hereinafter, accumulator A 102-A takes every other pair of luma pixels starting at luma pixels 1 and 2, and accumulator B 102-B takes the alternate pairs of luma pixels, starting at pixels 3 and 4. Accumulators C and D 102-C, 102-D perform the identical function for the chroma pixels. Providing multiple accumulators and splitting the pixels between them serves two purposes: 1) the accuracy of the field match logic is enhanced by splitting the field into subgroups; and 2) system timing is enhanced by ping-ponging pixels between accumulators, thereby allowing two clock cycles for the accumulator output to become valid.
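The ping-pong split between two accumulators can be sketched in Python as follows. This is a software illustration only, not the FPGA logic; the function name is hypothetical, and the pair-wise addition before accumulation follows the description given later for FIGs. 7A and 7B.

```python
def accumulate_field(pixels):
    """Sum one component's pixels into two sub-group accumulators.

    Pixels are taken in pairs. Each pair is added first, and the pair
    sums alternate between accumulator A (pairs 0, 2, 4, ...) and
    accumulator B (pairs 1, 3, 5, ...). A trailing unpaired pixel is
    ignored in this sketch.
    """
    acc_a = 0
    acc_b = 0
    for pair_index in range(len(pixels) // 2):
        pair_sum = pixels[2 * pair_index] + pixels[2 * pair_index + 1]
        if pair_index % 2 == 0:
            acc_a += pair_sum
        else:
            acc_b += pair_sum
    return acc_a, acc_b
```

The same routine would be applied separately to the luma stream (accumulators A and B) and the chroma stream (accumulators C and D).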

The accumulator results are stored in a two stage pipeline comprising registers 104 and 106, in order to hold the pixel sums from the two previous fields. For interlaced formats (e.g., 1080i), the accumulated output from field N-1 in register 106 and the current field 103 are used as inputs to the windowing block (110-A to 110-D), while for progressive formats (e.g., 720p) the output from field N in register 104 and the current field 103 are used as inputs. The selection between formats is made in multiplexer 108 based on an interlaced/progressive selection signal 101. This selection signal 101 allows the appropriate field match detect signal described further herein to be formed by comparing the current field with the immediately prior field or the field before that, depending on whether the input format is progressive or interlaced, respectively. The windowing block 110-A to 110-D detects whether the pixel sum of the current field falls within +/- Win/2 of the pixel sum of the previous or next to previous field, and if it does, a field match for that particular subgroup is declared. The value Win/2 is provided to the repeated field detection logic 30 (FIG. 3) on window_value signal 16. There are four field match signals denoted as Amatch, Bmatch, Cmatch, and Dmatch which represent the progressive luma even and odd and chroma even and odd field match indicators, respectively. Similar signals (denoted Atopmatch, Btopmatch, Ctopmatch, Dtopmatch and Abotmatch, Bbotmatch, Cbotmatch, Dbotmatch) are used to signal top and bottom field matches, respectively, when the input format is interlaced. All the accumulator match signals are formed in the same manner; the only difference is that the interlaced field match signals are formed by comparing the results from two field times in the past, and are broken up into top and bottom field match signals, so they can be fed into their respective state machines described further herein.

The partial pixel comparison section 124 includes main grid and middle grid pixel compare blocks 126-1 and 126-2, respectively. Each pixel compare block 126-1, 126-2 includes a pixel grid control 112, round block 114, respective grid RAMs 116-1, 116-2 and windowing block 118. Each pixel in the respective grid is compared against the associated pixel from the previous field. If the pixel of the current field falls within +/- Pix_Win/2 of the associated pixel from the previous field, then a pixel match for that pixel is declared. Field match signals denoted FldPixMch and FldMidMch are used to indicate field match for the main and middle grid pixel compare blocks 126-1, 126-2, respectively, for progressive formats. Likewise, for interlaced formats, field match signals TopFldPixMch, BotFldPixMch, TopFldMidMch, BotFldMidMch are used to signal top and bottom field matches, respectively, for the corresponding main and middle grid compare blocks.

In the partial pixel comparison operation, the field match signals are brought high at the beginning of every field period. If any of the pixels in the current field do not match that from the previous field, then the field match signal for the associated grid is brought low (no match) until the beginning of the next field period. Therefore, even one non-matched pixel (after thresholding) will cause the field match test to fail. A rationale for using this approach as opposed to the sum of differences approach is described above. Note that a chroma_enable signal is output from the pixel grid control logic 112 to enable the chroma portion of each grid pixel to be written only for progressive formats. The pixel grid control 112 also provides read and write control of the respective RAMs 116-1, 116-2. Both the main and middle grid blocks 126-1, 126-2 work identically, but use different positioning constants and different RAM banks 116-1 and 116-2, respectively.
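The "one mismatch fails the field" behavior can be sketched in Python as below. The function name and list-based inputs are hypothetical; the parameter `half_window` stands for the Pix_Win/2 value, and the sketch ignores RAM addressing and rounding.

```python
def grid_field_match(current, previous, half_window):
    """Return True only if every grid pixel matches within the window.

    `current` and `previous` are equal-length sequences of grid pixel
    values for the current field and the stored comparison field.
    """
    match = True  # brought high at the beginning of the field period
    for cur, prev in zip(current, previous):
        if abs(prev - cur) > half_window:
            match = False  # stays low for the rest of the field period
    return match
```

A single out-of-window grid pixel is enough to clear the match indication for that grid.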

FIG. 5B shows the field match section 126 which includes AND gates 128-1, 128-2 and 128-3. The final field_match signals denoted FM_prog 130-1, Top_FM 130-2 and Bot_FM 130-3 correspond to the field_match signal 31 (FIG. 3). Each signal is the logical AND of six individual field match signals corresponding to the pixel accumulator section 122 and the partial pixel comparison section 124. The progressive format signals are coupled to gate 128-1 while the top and bottom field signals for interlaced format are coupled to gates 128-2 and 128-3, respectively. For example, the field match signal FM_prog 130-1 is the logical AND of signals Amatch, Bmatch, Cmatch, Dmatch, FldPixMch and FldMidMch. The match signals FM_prog, Top_FM and Bot_FM are used as inputs to the progressive and interlaced acquisition/tracking state machines described further herein.
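As a small illustration of the AND combination for the progressive case, the following Python sketch combines the four accumulator match indicators with the two grid match indicators. The function name and argument layout are hypothetical.

```python
def final_field_match(acc_matches, fld_pix_mch, fld_mid_mch):
    """AND the accumulator matches (A, B, C, D) with both grid matches."""
    if len(acc_matches) != 4:
        raise ValueError("expected four accumulator match indicators")
    return bool(all(acc_matches) and fld_pix_mch and fld_mid_mch)
```

The top and bottom field signals for interlaced formats would be formed the same way from their own six inputs.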

FIG. 6 shows a block diagram of the acquisition and tracking logic. The logic includes acquisition section 202 and tracking section 204. The acquisition section 202 includes interlaced acquisition state machine 206 and progressive acquisition state machine 208. The tracking section 204 includes interlaced tracking state machine 218 and progressive tracking state machine 226. For interlaced formats, the acquisition and tracking state machines are functions of both the top and bottom field logic, which both contribute equally to the decision making process. They are split up in this way in order to simplify the state logic, which then has to look only for the pattern 0 0 0 0 1 to detect telecine mode, where 1 is a field match. The interlaced acquisition state machine 206 includes top and bottom field blocks 210 and 212, respectively, which receive corresponding input field match signals Top_FM 130-2 and Bot_FM 130-3. The output of each field block 210, 212 is input to a counter 214. The output of counter 214 provides an input to multiplexer 216. The progressive acquisition state machine 208 receives input field match signal FM_prog 130-1 and provides an output to multiplexer 216. The multiplexer 216 selects between the respective outputs of the interlaced and progressive acquisition state machines 206, 208 responsive to Int/Prog select signal 101 to provide pattern_det signal 217.

The interlaced tracking state machine 218 includes top and bottom field blocks 220 and 222, respectively, which receive corresponding input field match signals Top_FM 130-2 and Bot_FM 130-3. The output of each field block 220, 222 is input to a counter 224. The output of counter 224 provides an input to multiplexer 228. The progressive tracking state machine 226 receives input field match signal FM_prog 130-1 and provides an output to multiplexer 228. The multiplexer 228 selects between the respective outputs of the interlaced and progressive tracking state machines 218, 226 responsive to Int/Prog select signal 101 to provide pattern_loss signal 229.

The pattern_det signal 217 from the acquisition section 202 and the pattern_loss signal 229 from the tracking section 204 are input to the respective S and R inputs of SR flip-flop device 230. The Q output of SR flip-flop device 230 provides a telecine_det signal 231. The field_drop signal 18 is the logical AND of telecine_det 231 and pattern_sync signal 233. The telecine_det signal 231 when high indicates establishment of telecine mode, i.e., an indication that the input video is film-based, and is set by the acquisition state machine. The pattern_sync signal 233 is only output at the proper phase in the telecine pattern, so that the combination of the telecine_det and pattern_sync signals ensures that only repeat fields of the proper phase in film-based video can be dropped from the stream.
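The set/reset behavior and the gating of the field drop indication can be modeled with a short Python sketch. The class name is hypothetical, and the choice that reset wins when both inputs are asserted at once is an assumption of this sketch, since the text does not state a priority.

```python
class TelecineLatch:
    """Software model of an SR latch driving telecine_det and field_drop."""

    def __init__(self):
        self.telecine_det = False

    def step(self, pattern_det, pattern_loss, pattern_sync):
        """Apply one update and return the resulting field_drop value."""
        if pattern_loss:        # R input (assumed to dominate)
            self.telecine_det = False
        elif pattern_det:       # S input
            self.telecine_det = True
        # field_drop is the AND of telecine_det and pattern_sync
        return bool(self.telecine_det and pattern_sync)
```

A field is flagged for dropping only while the latch is set and the pattern phase indication is active.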

The individual blocks in FIGs. 5A, 5B and 6 are now described in more detail.

Accumulators

FIGs. 7A and 7B show how the pixel streams 22Y, 22C are split among accumulators A to D (102-A to 102-D) in an embodiment. The first pair of luma (Ya, Yb) and chroma (Ca, Cb) pixels is input to accumulators A and C, respectively, whereas the second pair (luma Yc, Yd and chroma Cc, Cd) is input to accumulators B and D, respectively. Likewise, subsequent pairs alternate between accumulators. Pixels are grouped into pairs because upstream interface logic (not shown) moves luma and chroma pixels through the pipeline in groups of two in order to reduce the 74 MHz input clock rate to a more manageable 37 MHz. Each pixel is 10 bits in an embodiment. Each pair of 10 bit pixels is first added together to form an 11 bit result (not shown), and the 11 bit sum is then passed to the accumulators. It should be noted that other ways of forming and processing subgroups of pixels can be used. In a particular embodiment, the accumulators are 32 bits, which is calculated by noting that a 1080i field consists of 1920 * 540 = 1,036,800 pixels. Since there are half this number of luma or chroma pixels per accumulator due to ping-ponging between two accumulators, this number reduces to 518,400. Therefore, the total number of bits needed per accumulator is:

Original Bit Width + log2(Number of Pixels) = 11 + log2(518,400) ≈ 30.
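As a quick numerical check of this estimate, the following Python sketch reproduces the calculation. The function name and parameters are hypothetical, and the sample count per accumulator follows the counting used in the text.

```python
import math


def accumulator_bits(width, height, sum_bits=11, accumulators=2):
    """Bits needed to sum `sum_bits`-wide values without overflow.

    The number of values per accumulator is taken as the field's pixel
    count divided by the number of accumulators sharing the stream.
    """
    samples = (width * height) // accumulators
    return sum_bits + math.ceil(math.log2(samples))
```

For a 1080i field, `accumulator_bits(1920, 540)` uses 518,400 samples, whose base-2 logarithm is slightly below 19.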

This amount can be rounded up to 32 bits to provide extra headroom and operation at a power of 2.

Pixel Comparison Grids

FIG. 8 shows the pixel grids used in the pixel comparison portion of the repeated field detection logic 30 (FIG. 3). There are two grids used: a main grid 302 which occupies a vertical length of 500 lines and a horizontal width of 1000 pixels, and a middle grid 304, which occupies a vertical length of 250 lines and a horizontal width of 1000 pixels. Both grids have a spacing of 20 pixels in the horizontal direction 318 and 25 lines in the vertical direction 320. In other words, both grids contain points every 20 pixels horizontally and every 25 lines vertically. Note that the middle grid 304 is no more dense than the main grid 302; however, the middle grid is offset from the main grid both horizontally and vertically. That is, even though pixels are still spaced 20 pixels apart horizontally, the middle grid starts 10 pixels to the right of the main grid start pixel 306, thereby filling in the horizontal spaces. Likewise, even though the middle grid still has a vertical spacing of 25 lines, it is offset from the main grid by 12 lines. The end result is that the middle 250 lines of the field denoted at 322 are more densely covered due to the overlapping of the main and middle grids. This is desirable, since most motion will occur at or near the center of the image.

Although the horizontal and vertical dimensions as well as grid spacing are the same for both 1080i and 720p formats, the offsets must be different (i.e., the horizontal and vertical start and stop points for the grids), in order to fully center them. The main grid start and end pixels are denoted at 306, 308, respectively, and the start and end lines are denoted at 310, 316. For the middle grid, the start and end pixels are offset by 12 pixels from the main grid as noted above. The start and end lines for the middle grid are denoted at 312, 314. In 1080i format, there are 540 lines in a field, so the 500 line main grid starts at line (540 - 500)/2 = line 20 in order to fully center it. The 250 line middle grid starts at line (540 - 250)/2 = line 145. Since there are 1920 pixels in a 1080i line, both 1000 pixel grids start at pixel (1920 - 1000)/2 = pixel 460. In the 720p format, the line start for the main grid is (720 - 500)/2 = line 110, and the line start for the middle grid is (720 - 250)/2 = line 235. Since there are 1280 pixels in a 720p line, the pixel start boundary for both grids is at (1280 - 1000)/2 = pixel 140.
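The centering arithmetic can be expressed as a short Python sketch. The names below are hypothetical, and the sketch computes only the centered start positions, before any additional offset applied to the middle grid. Evaluating the horizontal formula for 1080i gives (1920 - 1000)/2 = 460.

```python
# (pixels per line, lines per field or frame) for each format
FORMATS = {"1080i": (1920, 540), "720p": (1280, 720)}


def centered_start(total, span):
    """Start position that centers `span` units inside `total` units."""
    return (total - span) // 2


def grid_starts(fmt, grid_width=1000, main_lines=500, middle_lines=250):
    """Return centered start positions for the main and middle grids."""
    pixels_per_line, lines = FORMATS[fmt]
    return {
        "pixel_start": centered_start(pixels_per_line, grid_width),
        "main_line_start": centered_start(lines, main_lines),
        "middle_line_start": centered_start(lines, middle_lines),
    }
```

Keeping the grid dimensions fixed and changing only the start positions matches the stated goal of using identical grids for both formats.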

Before pixels are stored into their associated RAMs 116-1, 116-2 (FIG. 5A), they are rounded from 10 to 8 bits in blocks 114 to minimize RAM utilization and to allow for the largest number of pixels possible to be compared. Note that in order to declare a pixel field match, all the pixels from both grids must match the associated pixels from the previous field plus or minus a user-defined pixel window.

Windowing

An embodiment of the windowing blocks 110-A to 110-D and 118 (FIG. 5A) is shown in FIG. 9. The condition being tested for by the windowing block is the following:

Previous - Window/2 <= Current <= Previous + Window/2

That is, the test is whether the accumulator or pixel value of the current field lies within a certain negative or positive distance from the accumulator or pixel value of the previous field. Since this type of equation is cumbersome to implement in an FPGA, the actual (exactly equivalent) expression implemented is:

ABS (Previous - Current) <= Window / 2.
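The equivalence of the two forms can be checked with a small Python sketch. The function names are hypothetical, `half_window` stands for Window/2, and the exhaustive check covers only a small range of non-negative integers chosen for illustration.

```python
def in_range_form(previous, current, half_window):
    """Two-sided comparison: Previous - W/2 <= Current <= Previous + W/2."""
    return previous - half_window <= current <= previous + half_window


def abs_form(previous, current, half_window):
    """Single comparison on the absolute difference."""
    return abs(previous - current) <= half_window


def forms_agree(max_value=31):
    """Check both forms give the same answer for all small integer inputs."""
    values = range(max_value + 1)
    return all(
        in_range_form(p, c, w) == abs_form(p, c, w)
        for p in values
        for c in values
        for w in values
    )
```

The absolute-value form needs one subtraction and one comparison, which is the structure described for the windowing block.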

That is, the accumulator or pixel value of the current field is subtracted from that of the previous field in block 402. The absolute value of the difference is taken in block 404 and then this value is compared to the user defined window in block 406. Note that the user variable written to the FPGA actually represents Window / 2. There are separate window values for the accumulator and pixel window functions, since the former must be 32 bits wide, and the latter is only 8 bits wide (the pixels are rounded to 8 bits before being stored in RAM 116-1, 116-2 in FIG. 5A).

Telecine Pattern Detection Logic

As described above with reference to FIG. 6, the telecine pattern detection logic 32 (FIG. 3) is broken up into two separate functions: acquisition and tracking. This is done because even though both functions are looking for the same telecine sequence, they take different actions if the pattern is not seen. Since the telecine pattern is different for progressive and interlaced formats, the acquisition and tracking logic is further split into interlaced and progressive logic modules. The following describes acquisition and tracking for both types of formats.

Acquisition in Progressive Formats

FIG. 10 shows a state diagram for the acquisition state machine 208 (FIG. 6) used for progressive formats. The acquisition state machine determines whether the incoming video is in telecine mode and increments a counter (not shown) each time a valid telecine sequence is seen. If the counter reaches a user-defined value, then the pattern_det signal 217 (FIG. 6) is brought high which in turn sets the telecine_det signal 231. The notation for transitions between states is shown as Field_match/Inc_cntr, Clr_cntr. As discussed previously, the progressive telecine pattern follows the sequence:

1 0 1 0 1 1 0 1 0 1 1 0 1 0 1....

where a 1 indicates a positive field match signal, and a full telecine sequence is the pattern 10101 (five fields). In FIG. 10, states S0 through S4 represent the case in which a valid telecine sequence is seen. That is, to get from state S0 to S1, the field match signal FM_prog 130-1 (FIG. 5B) has to be 1, to get from S1 to S2 it has to be 0, to get from S2 to S3 it has to be 1, to get from S3 to S4 it has to be 0, and to increment the valid sequence counter in S4, it has to be a 1. This represents an entire telecine pattern of 10101. The other states in FIG. 10 represent deviations from the ideal telecine pattern. States S5 through S7 represent the case where one of the fields which was supposed to be unique (i.e., non-repeat) was detected to be a repeat field, in which case there may be stationary video. In that case, there is not enough information to determine if this group of fields belongs to a telecine sequence. Therefore, the valid counter is not cleared if it has already been incremented, nor is it incremented. From this operation it can be seen that telecine mode will not be entered into until enough motion occurs in the video to produce several valid sequences.

States S8 through S10 are entered into if a non-repeat field is detected where a repeat field should be located. Unlike the opposite case which can be explained by a stationary video sequence, there is no explanation for receiving a non-duplicate field where a duplicate should be other than that the video is not in telecine, or that the logic is locked onto the wrong phase of the sequence. On the assumption that the latter is true, the acquisition state machine "inserts" extra states into the sequence (states S8 through S10), in order to try to force a pattern shift. The operation of this logic can be seen by considering the following sequence, where the acquisition state machine starts by being locked to the wrong phase:

FieldMatch  1  0  1  0  1  1  0  1  0  1  1  0  1  0  1  1  0  1  0  1
Acq. State  X  0  0  1  2  3  7  8  9 10  0  1  2  3  4  0  1  2  3  4

As can be understood, the logic originally assumed that the third 1 in the sequence was actually the first, but when it got to state S7 and the FM_prog signal was a 0, it entered the "invalid path" of states S8 through S10, so that when it got back to state S0, it was actually at the proper synchronization point for the real telecine pattern. The last two telecine patterns have the proper state sequencing, i.e., 0, 1, 2, 3, 4, 0, .... This resynchronization also works if the logic has locked onto the last 1 in the sequence as shown in the following sequence:

FieldMatch  1  0  1  0  1  1  0  1  0  1  1  0  1  0  1  1  0  1  0  1
Acq. State  X  X  X  0  0  1  5  8  9 10  0  1  2  3  4  0  1  2  3  4
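The three outcomes distinguished by the progressive acquisition logic (valid, uncertain, invalid) can be illustrated with a simplified Python sketch. It classifies a five-field window that is assumed to be already aligned to the pattern phase, so it does not reproduce the state transitions of FIG. 10 or the phase-shifting behavior. The function names are hypothetical, and two choices are assumptions of this sketch: a missing repeat takes precedence over an extra repeat, and an invalid window clears the counter.

```python
EXPECTED = (1, 0, 1, 0, 1)  # one full progressive telecine sequence


def classify_window(bits, expected=EXPECTED):
    """Classify one phase-aligned five-field window of match bits."""
    if len(bits) != len(expected):
        raise ValueError("window length must match the pattern length")
    if tuple(bits) == tuple(expected):
        return "valid"
    # A non-repeat where a repeat was expected cannot be explained by
    # stationary video.
    if any(e == 1 and b == 0 for e, b in zip(expected, bits)):
        return "invalid"
    # Only extra repeats remain: possibly stationary video.
    return "uncertain"


def count_valid(windows):
    """Run the valid-sequence counter over a list of windows."""
    counter = 0
    for window in windows:
        outcome = classify_window(window)
        if outcome == "valid":
            counter += 1
        elif outcome == "invalid":
            counter = 0  # assumed clear on an invalid window
        # "uncertain" neither increments nor clears the counter
    return counter
```

Comparing the returned count against a user-defined value would correspond to the condition for raising the pattern detect indication.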

Tracking in Progressive Formats

FIG. 11 shows the state diagram for the progressive telecine tracking state machine 226 (FIG. 6). This machine is only started once the system is in telecine mode, and has the task of verifying that valid telecine sequences continue to be received. The field_sync signal indicated above is used to synchronize the tracking state machine with the start of the telecine pattern, and will only go high at the beginning of the first telecine sequence to be received in telecine mode (i.e., the beginning of the first telecine sequence when the telecine detect bit is high). The notation for transitions between states is shown as Telecine, FieldSync, FieldMatch/IncInvalid, IncValid. Again, there is a separate path for stationary video, which produces a pattern of all 1's on the field_match signal FM_prog. This state machine stays in state S0 as long as the telecine signal is low, and can only transition to state S1 once the acquisition state machine has set the telecine signal. Once it goes to state S1, it will not return to state S0 until the telecine signal again drops to 0. Therefore, in progressive telecine mode, the "normal" sequence of events is state transitions S1, S2, S3, S4, S5, S1 .... This represents a valid telecine sequence. If a repeat field is ever detected where a non-repeat was expected, the state machine enters states S6 through S8, which represent the stationary video path. If a repeat was expected but a non-repeat is received, the state machine enters states S9 through S12, which represent the invalid telecine sequence path.

When an invalid telecine sequence is received, a counter is incremented. If four valid telecine sequences in a row are subsequently received, the counter is cleared, on the assumption that it was an aberration. Otherwise, if another invalid sequence is seen before having seen four consecutive valid ones, the telecine signal is cleared and acquisition mode is re-entered. Note that if the input stream has truly gone back to normal video, the delay in transitioning out of telecine mode will not cause video artifacts. This is because even while the system is still in telecine mode, a field match indication of 0 (not matched) will cause the field not to be dropped. However, once the system has transitioned out of telecine mode, no fields will be dropped regardless of the status of the field match signal FM_prog.
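The loss-of-lock rule described above can be sketched in Python as a pair of counters. The class and attribute names are hypothetical. How an uncertain (stationary) sequence affects the run of consecutive valid sequences is not stated in the text, so this sketch assumes it leaves both counters unchanged.

```python
class LockMonitor:
    """Track invalid sequences and decide when telecine lock is lost."""

    def __init__(self, clear_after=4, loss_after=2):
        self.clear_after = clear_after  # consecutive valid sequences to forgive
        self.loss_after = loss_after    # invalid sequences that drop lock
        self.invalid = 0
        self.valid_run = 0
        self.locked = True

    def update(self, outcome):
        """Feed one sequence outcome: 'valid', 'invalid' or 'uncertain'."""
        if not self.locked:
            return False
        if outcome == "invalid":
            self.invalid += 1
            self.valid_run = 0
            if self.invalid >= self.loss_after:
                self.locked = False
        elif outcome == "valid":
            self.valid_run += 1
            if self.valid_run >= self.clear_after:
                self.invalid = 0
                self.valid_run = 0
        # 'uncertain' is assumed to leave both counters unchanged
        return self.locked
```

With the default settings, a single invalid sequence followed by four valid ones is forgiven, while two invalid sequences without four valid ones in between drop lock.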

Note that the state of the telecine detect signal is only sampled in states S5, S8, or S12, which represent the end of a valid, uncertain, or invalid sequence, respectively. This means that when dropping out of telecine mode, the tracking state machine will increment the invalid sequence counter one last time, which several clock cycles later has the effect of dropping the telecine detect signal low. However, by that time, the tracking state machine has already transitioned back to state S1 to start the next sequence. It is only when the tracking state machine again reaches state S5, S8, or S12 with the telecine detect signal low that it will return to state S0 and re-enter acquisition mode. Although this has the effect of adding a five field delay for the tracking state machine, the telecine detect signal still transitions immediately (without the five field delay).

To reduce the complexity of the state diagram, the pattern_sync signal is not shown in FIG. 11. This signal takes on the value of field match when the tracking state machine is in one of the terminal states S5 or S8. It is used externally to form the field_drop signal 18 (FIG. 6) which is only brought high at the proper point in the telecine pattern.

Acquisition in Interlaced Formats

FIG. 12 shows the state diagram for the acquisition mode state machine 202 (FIG. 6) used for interlaced formats. There are actually two identical acquisition state machines, one for top fields 210, and one for bottom fields 212, both of which contribute equally to the acquisition function. Only one state machine diagram is shown in FIG. 12. The notation for transitions between states is shown as Field_match/Inc_cntr, Clr_cntr. The acquisition state logic for interlaced formats is somewhat less complicated than for progressive formats, due to the fact that locking to the wrong phase of the sequence is not possible. This is true because there is only one repeated field in every five field sequence, so if a repeat is seen and the incoming sequence is not stationary video (i.e., only one field out of five is a repeat), the pattern must be at the last field in the five field sequence. It should be noted that the states in FIG. 12 correspond to top or bottom fields only, so that it actually takes ten field periods to cycle through states S1 through S5.

In FIG. 12, a transition from state S0 to S1 occurs only when a repeat field is detected. The state machine then looks for a pattern of no repeats for the next four top fields, followed by a repeat (states S1 through S5). If this pattern is seen, a counter 214 (FIG. 6) is incremented, and a transition is made back to state S1 to look for the next pattern. If field match is 0 in state S5, this is not a valid telecine pattern, so the state logic transitions back to state S0 to await the next repeat field. As with progressive mode, stationary video has its own separate path, as represented by states S6 - S9. Stationary video sequences do not increment the valid pattern counter, since there is not enough information to determine if this is a telecine pattern. However, if in state S9 a field match value of 0 is received, the valid pattern counter is cleared, since it is determined that the current sequence is not a valid telecine pattern.

Tracking In Interlaced Formats

FIG. 13 shows the state diagram for the interlaced tracking state machine 204 (FIG. 6). The state machine diagram shown is actually duplicated for both top and bottom fields; however, only one state diagram is shown in FIG. 13. As with the acquisition state machine for interlaced formats, each state transition actually represents two field periods, since each state machine will only handle either top or bottom fields. If either state machine detects a non-repeat field where a repeat field is expected, a counter increment signal is output. There are two counter increment signals, one each from the top and bottom field tracking state machines. Either signal will increment the pattern loss counter, and it is only cleared after four valid telecine patterns have been detected from both the top and bottom tracking state machines. Therefore, if two unmatched fields are detected without at least four valid sequences detected in between, telecine loss of lock is declared, and acquisition mode is re-entered. The notation for transitions between states is shown as Telecine, FieldSync, FieldMatch/IncInvalid, IncValid.

The tracking state machine remains in state S0 as long as the telecine signal is 0, i.e., during acquisition mode. When the telecine_det signal goes high and the field_sync signal goes high (indicating that the system is in telecine mode and this field is the first one in the pattern), then there is a transition to state S1 to start the pattern. States S1 through S5 represent the valid telecine pattern of 0 0 0 0 1. If the field match signal Top_FM or Bot_FM 130-2, 130-3 (FIG. 5B) is 0 when state S5 is reached, the invalid counter is incremented; otherwise, the valid counter is incremented. The invalid pattern counter is only cleared when the valid counter reaches 4, i.e., when four valid consecutive sequences are then seen. If the invalid counter reaches 2, then the telecine detect signal is cleared, and the next time the tracking state machine hits state S5, it will transition back to state S0, which represents acquisition mode.

If during any point in states S1 through S4 the field match signal Top_FM or Bot_FM (FIG. 5B) is 1, then stationary video is assumed, as represented by states S6 through S9. If in state S9 the field match signal is 1, then the logic transitions back to state S1 without incrementing the counter, since the sequence is uncertain. If in state S9 the field match signal is 0, then the invalid counter is incremented, just as it is in state S5. If the telecine detect signal is low in state S9, the sequence transitions back to state S0 to begin acquisition mode. To reduce the complexity of the diagram, the pattern_sync signal is not shown in FIG. 13. This signal takes on the value of field match when the tracking state machine is in one of the terminal states S5 or S9. It is used externally to form the field_drop signal 18 (FIG. 6), so that field drop is only brought high at the proper point in the telecine pattern.

Automatic Windowing Using Feedback

A common factor in each of the detection approaches described above is that allowances must be made for small differences between fields, even those which are supposed to be duplicates of each other. Unfortunately, the amount of difference which can be tolerated before declaring a field match is not a constant, but is directly dependent on the quality of the video. This presents a problem for an automatic telecine detection system, since it is not necessarily known beforehand what amount of difference between repeat fields should be considered tolerable. If too large a window is programmed, then most fields will appear to be repeated, and the proper telecine sequence will not be detected. If too small a window is programmed, few fields will be detected as repeats, and again the proper telecine sequence will not be detected. It has been observed that given the proper window value, telecine sequences can be detected very easily and the telecine pattern clearly stands out to the pattern detection logic. However, the amount of this window can vary significantly between video sequences from different sources. Referring to FIG. 2 again, an automatic system includes a feedback from the telecine detection circuit 10 to the microprocessor so that the microprocessor can increase the window value 16 until a stable telecine pattern is detected.

The automatic windowing approach samples transitions on the telecine detect signal every ten fields. This ten field sample period is derived from the fact that if the system is already in telecine mode, it takes two invalid sequences to fall out of telecine mode, which at 5 fields per sequence, yields a total of ten fields. When transitioning into telecine mode, the number of valid sequences necessary to declare telecine mode is a user-programmable number; however, it is assumed that this number should not be less than 2. Therefore, the telecine bit should not change more often than every ten fields.

The automatic windowing approach is now described with reference to the flow diagram shown in FIG. 14. Upon initialization, the microprocessor 12 at block 502 writes a nominal window value to the telecine detection circuit 10, which begins to increment an internal high counter (not shown). The high count represents the number of ten-field increments for which the telecine detect signal was high (telecine detected). After a period of time, e.g., 10 seconds at block 504, the telecine detect circuit 10 latches the counter value into a holding register, and interrupts the microprocessor (while clearing the counter in preparation for the next 10 second interval). The microprocessor 12 then reads the counter value. If the telecine high counter is 60 at block 506 (meaning every 10 field increment had the telecine signal high), then there is no need to increment the window any further, and the system can just start tracking to make sure the telecine signal does not transition. Otherwise, if there were some sequences where the telecine signal was not set, the microprocessor increases the window value by some amount at block 512 and waits for the next interrupt. This continues until one of two events happens: 1) the system detects a stable telecine pattern (i.e., the telecine high counter reads 60) or 2) a predetermined window ceiling is reached at block 508 in which case, the microprocessor "rolls" the window value back to the bottom at block 510, and starts over. If the incoming video is in telecine mode, some window value will eventually be reached which will result in a stable pattern. If the video is not in telecine mode, the microprocessor will end up continually cycling through a range of window values without seeing the telecine detect bit high, since no window value will result in a telecine pattern being detected if the video is truly not in telecine mode.
Note that it is very important that the "rollover" window value is chosen carefully, and should not be too high. If a very high window value is written while the telecine detect bit is high, then most subsequent fields will appear to be repeated. If the video then transitions to non-telecine mode (such as during a commercial), the telecine detection circuit may not detect the transition quickly, since most fields will falsely appear to be repeated. This will result in unsatisfactory video. The proper "rollover" window value can be determined by careful testing.
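The microprocessor side of this loop can be sketched in Python as follows. This is a minimal sketch, not the actual firmware: the function names, the `measure` callback standing in for the interrupt and counter read, the fixed `step`, and the assumption that the window rolls back to the nominal value are all hypothetical.

```python
def next_window(window, high_count, nominal, step, ceiling, full_count=60):
    """Return (new_window, stable) after one measurement interval."""
    if high_count >= full_count:
        # Block 506: every ten-field sample was high, so keep the window.
        return window, True
    if window >= ceiling:
        # Blocks 508 and 510: ceiling reached, roll back to the bottom
        # (assumed here to be the nominal value).
        return nominal, False
    # Block 512: increase the window and wait for the next interrupt.
    return window + step, False


def settle(measure, nominal, step, ceiling, max_intervals=100):
    """Adjust the window until detection is stable or max_intervals pass.

    measure(window) is a hypothetical stand-in for reading the latched
    high counter after one interval at the given window value.
    """
    window = nominal
    for _ in range(max_intervals):
        window, stable = next_window(
            window, measure(window), nominal, step, ceiling
        )
        if stable:
            return window
    return None  # no stable telecine detection (e.g., non-telecine video)
```

For example, if a made-up source only yields a full count once the window reaches 12, `settle(lambda w: 60 if w >= 12 else 41, nominal=4, step=4, ceiling=32)` steps through 4 and 8 and settles at 12. A source that never yields a full count keeps cycling between the nominal value and the ceiling, and `settle` returns `None`.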

From the above description, if the window value required is the nominal value (as determined from experimentation with various video sources), a steady telecine mode will be entered into very quickly. However, if the video is especially noisy, it may take a few seconds for the software to settle on the correct window value.

The embodiments described herein are applicable to both HDTV and SDTV formats, with modification of system parameters for pixel comparison functions which keep pixel grids centered about the middle of the field.

A primary design consideration is that the telecine detection be provided entirely within programmable logic using no external memory. While this requires some significant optimizations and tradeoffs from the traditional multiple field buffer approach, it nevertheless is quite accurate, having a better than 99.9% accuracy in detecting matched fields within a telecine sequence (given the proper threshold window, as determined from software feedback). The cost savings incurred by using only a single programmable logic device to perform the telecine detection are especially apparent with HDTV systems, where two 4:2:2 1080i field buffers require over 4 Mbytes of memory. Performing statistical analysis across multiple fields requires even more memory, which may become prohibitive in HDTV systems. In addition, using field buffers as delay elements has the effect of adding some amount of latency to the system.
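The field buffer figure can be checked with a short calculation. The sketch below assumes a 1920 x 1080 active picture (540 lines per field) and 8-bit 4:2:2 sampling, which averages 2 bytes per pixel; these parameters are assumptions and are not stated in the text.

```python
def field_buffer_bytes(width, field_lines, bytes_per_pixel):
    """Bytes needed to store one field at the given size and sample depth."""
    return width * field_lines * bytes_per_pixel


# Assumed: 1920 active pixels per line, 540 lines per 1080i field,
# 8-bit 4:2:2 sampling (1 luma byte + 1 shared chroma byte per pixel).
one_field = field_buffer_bytes(1920, 540, 2)
two_fields = 2 * one_field
print(one_field, two_fields)
```

Under these assumptions, one field occupies 2,073,600 bytes and two fields occupy 4,147,200 bytes, which is consistent with the "over 4 Mbytes" figure when a megabyte is counted as one million bytes.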

While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims

What is claimed is:

1. A method for detecting telecine mode in a sequence of video fields, comprising: comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field; declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern.

2. The method of Claim 1 wherein the first video field includes first pixels each having a first pixel value and the second video field includes second pixels each having a second pixel value and wherein comparing includes: summing the first pixel values to generate a first pixel sum; summing the second pixel values to generate a second pixel sum; comparing the first and second pixel sums to generate a pixel sum difference indication; and declaring a repeat field if the pixel sum difference indication is less than a pixel sum threshold.

3. The method of Claim 2 wherein the first pixels each include a first luma pixel value and a first chroma pixel value and the second pixels each include a second luma pixel value and a second chroma pixel value; wherein summing the first pixel values includes summing the first luma pixel values to generate a first luma pixel sum and summing the first chroma pixel values to generate a first chroma pixel sum; summing the second pixel values includes summing the second luma pixel values to generate a second luma pixel sum and summing the second chroma pixel values to generate a second chroma pixel sum; comparing the first and second pixel sums includes comparing the first and second luma pixel sums to generate a luma difference indication and comparing the first and second chroma pixel sums to generate a chroma difference indication; declaring includes declaring a repeat field if the luma difference indication is less than a luma threshold and the chroma difference indication is less than a chroma threshold.

4. The method of Claim 1 wherein the first video field includes first pixels each having a first pixel value and the second video field includes second pixels each having a second pixel value and further comprising dividing the first and second video fields into plural first and second subfields, respectively, the number and position of first pixels for each first subfield corresponding to the number and position of second pixels for each second subfield; for each first subfield, summing the first subfield pixel values to generate a first subfield pixel sum; for each second subfield, summing the second subfield pixel values to generate a second subfield pixel sum; comparing each first subfield pixel sum with a corresponding second subfield pixel sum to generate a corresponding subfield difference indication; and declaring a repeat field if all subfield difference indications are less than a corresponding subfield threshold.

5. The method of Claim 1 wherein the first and second fields are consecutive fields of the video sequence and the video sequence is in accordance with a progressive scan format.

6. The method of Claim 1 wherein the first and second fields are separated from each other in the video sequence by a single field and the video sequence is in accordance with an interlaced scan format.

7. The method of Claim 1 wherein the first video field includes first pixels and the second video field includes second pixels; wherein the first video field precedes the second video field in the video sequence and wherein comparing includes: applying a first pixel grid to the first video field to provide a first selection of first pixels; storing the first selection of first pixels in memory; applying the first pixel grid to the second video field to provide a first selection of second pixels; comparing each pixel of the first selection of first pixels with the corresponding pixel of the first selection of second pixels to generate a corresponding first pixel difference indication; and declaring a repeat field if all of the first pixel difference indications are less than a pixel threshold.

8. The method of Claim 7 further comprising: applying a second pixel grid to the first video field to provide a second selection of first pixels; storing the second selection of first pixels in memory; applying the second pixel grid to the second video field to provide a second selection of second pixels; comparing each pixel of the second selection of first pixels with the corresponding pixel of the second selection of second pixels to generate a corresponding second pixel difference indication; and declaring a repeat field if all of the first and second pixel difference indications are less than the pixel threshold.

9. The method of Claim 8 wherein the second pixel grid is centered about a field and is smaller than the first pixel grid.

10. The method of Claim 9 wherein the second pixel grid is offset from the first pixel grid.

11. The method of Claim 1 wherein the first video field includes first pixels and the second video field includes second pixels and the first video field precedes the second video field in the video sequence, and wherein comparing includes: summing the first pixels to generate a first pixel sum; summing the second pixels to generate a second pixel sum; comparing the first and second pixel sums to generate a pixel sum difference indication; applying a first pixel grid to the first video field to provide a first selection of first pixels; applying the first pixel grid to the second video field to provide a first selection of second pixels; comparing each pixel of the first selection of first pixels with the corresponding pixel of the first selection of second pixels to generate a corresponding first pixel difference indication; and declaring a repeat field if all of the first pixel difference indications are less than a pixel threshold and the pixel sum difference indication is less than a pixel sum threshold.

12. The method of Claim 11 wherein declaring telecine mode includes: tracking occurrence of repeat fields in a first state machine to generate a first signal indicative of acquisition of telecine mode; tracking occurrence of repeat fields in a second state machine to generate a second signal indicative of loss of telecine mode; and using the first and second signals to set and reset, respectively, a telecine detection signal indicative of telecine mode status.

13. The method of Claim 1 wherein comparing includes comparing the first field and the second field to generate a difference indication and declaring a repeat field if the difference indication is less than a threshold.

14. The method of Claim 13 further comprising monitoring the declaration of telecine mode and increasing the threshold unless stable telecine mode detection occurs or a threshold ceiling is reached.

15. The method of Claim 13 wherein declaring telecine mode includes generating a telecine detection signal having a telecine state indicative of telecine mode and a non-telecine state indicative of non-telecine mode, the method further comprising: setting the threshold to an initial threshold value; at a first interval, sampling the telecine detection signal and incrementing a counter if the signal sample equals the telecine state; at a second interval greater than the first interval, reading the counter to provide a counter value and incrementing the threshold if the counter value is less than a counter level.

16. The method of Claim 15 further comprising resetting the threshold to the initial value if the current threshold value is greater than or equal to a threshold ceiling value.

17. The method of Claim 1 wherein declaring telecine mode includes: tracking occurrence of repeat fields in a first state machine to generate a first signal indicative of acquisition of telecine mode; tracking occurrence of repeat fields in a second state machine to generate a second signal indicative of loss of telecine mode; and using the first and second signals to set and reset, respectively, a telecine detection signal indicative of telecine mode status.

18. The method of Claim 17 further comprising generating a pattern signal representative of the telecine pattern and generating a field drop signal responsive to the telecine detection signal and the pattern signal to indicate repeat fields of the video sequence that can be dropped.

19. The method of Claim 1 wherein declaring telecine mode includes generating a telecine detection signal indicative of telecine mode status, the method further comprising generating a pattern signal representative of the telecine pattern and generating a field drop signal responsive to the telecine detection signal and the pattern signal to indicate repeat fields of the video sequence that can be dropped.

20. A method for detecting telecine mode in a sequence of video fields, comprising: receiving a sequence of video fields; comparing a current field and a prior field in the video sequence to generate a difference indication; declaring a field match if the difference indication is less than a threshold, otherwise declaring a field mismatch, thereby generating a match sequence; declaring telecine mode if the match sequence corresponds to a telecine pattern.

21. Apparatus for detecting telecine mode in a sequence of video fields, comprising: repeated field detection logic for comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field; telecine pattern detection logic for declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern.

22. The apparatus of Claim 21 wherein the first video field includes first pixels each having a first pixel value and the second video field includes second pixels each having a second pixel value and wherein the repeated field detection logic includes: an accumulator for summing the first pixel values to generate a first pixel sum and for summing the second pixel values to generate a second pixel sum; and a comparator for comparing the first and second pixel sums to generate a pixel sum difference indication and for declaring a repeat field if the pixel sum difference indication is less than a pixel sum threshold.

23. The apparatus of Claim 21 wherein the first video field includes first pixels each having a first pixel value and the second video field includes second pixels each having a second pixel value and wherein the repeated field detection logic includes: means for dividing the first and second video fields into plural first and second subfields, respectively, the number and position of first pixels for each first subfield corresponding to the number and position of second pixels for each second subfield; a plurality of subfield accumulators each for summing the first subfield pixel values to generate a first subfield pixel sum and for summing the second subfield pixel values to generate a second subfield pixel sum; a plurality of subfield comparators for comparing each first subfield pixel sum with a corresponding second subfield pixel sum to generate a corresponding subfield difference indication; and logic means for declaring a repeat field if all subfield difference indications are less than a corresponding subfield threshold.

24. The apparatus of Claim 21 wherein the first and second fields are consecutive fields of the video sequence and the video sequence is in accordance with a progressive scan format.

25. The apparatus of Claim 21 wherein the first and second fields are separated from each other in the video sequence by a single field and the video sequence is in accordance with an interlaced scan format.

26. The apparatus of Claim 21 wherein the first video field includes first pixels and the second video field includes second pixels; wherein the first video field precedes the second video field in the video sequence and wherein the repeated field detection logic includes: means for applying a first pixel grid to the first video field to provide a first selection of first pixels; means for applying the first pixel grid to the second video field to provide a first selection of second pixels; means for comparing each pixel of the first selection of first pixels with the corresponding pixel of the first selection of second pixels to generate a corresponding first pixel difference indication; and means for declaring a repeat field if all of the first pixel difference indications are less than a pixel threshold.

27. The apparatus of Claim 26 further comprising: means for applying a second pixel grid to the first video field to provide a second selection of first pixels; means for applying the second pixel grid to the second video field to provide a second selection of second pixels; means for comparing each pixel of the second selection of first pixels with the corresponding pixel of the second selection of second pixels to generate a corresponding second pixel difference indication; and means for declaring a repeat field if all of the first and second pixel difference indications are less than the pixel threshold.

28. The apparatus of Claim 27 wherein the second pixel grid is centered about a field and is smaller than the first pixel grid.

29. The apparatus of Claim 28 wherein the second pixel grid is offset from the first pixel grid.

30. The apparatus of Claim 21 wherein the telecine pattern detection logic includes: a first state machine for tracking occurrence of repeat fields to generate a first signal indicative of acquisition of telecine mode; a second state machine for tracking occurrence of repeat fields to generate a second signal indicative of loss of telecine mode; and a switch responsive to the first and second signals to set and reset, respectively, a telecine detection signal indicative of telecine mode status.

31. The apparatus of Claim 21 wherein the repeated field detection logic and the telecine pattern detection logic are implemented in a programmable logic device.

32. An encoding system for encoding a video signal comprising a sequence of video fields, the system comprising: at least one encoder having an input for receiving the video sequence for encoding and a control input for receiving a control signal; a telecine detection circuit comprising: repeated field detection logic for comparing a first video field and a second video field of the video sequence to determine if one of the first and second fields is a repeat field; and telecine pattern detection logic for declaring telecine mode if a sequence of repeat fields corresponds to a telecine pattern and for generating a field drop signal to indicate repeat fields that can be dropped; and a controller for controlling the at least one encoder, the controller being responsive to the field drop signal to provide a control signal to the at least one encoder to indicate repeat fields of the video sequence that are to be dropped.

33. The system of Claim 32 wherein the telecine detection circuit is implemented in a programmable logic device.

34. The system of Claim 32 wherein the video signal is an HDTV signal.