MOVING IMAGE RESTORATION
This invention relates to processes and apparatus for the restoration of moving images and in the most important example to the restoration of video archive material.
There exists a very considerable amount of material in video archives and there are commercial imperatives in making as much as possible of this material available for broadcast or other distribution. Archive material tends, however, to suffer from a range of picture quality defects which are visually unacceptable when judged against current display standards. There exists therefore a real need for video archive restoration. Manual techniques exist for the correction of archive defects and for certain defects - which are generally rare - manual correction is the optimum solution. It is neither practicable nor economic, however, for entire archives to be restored manually. A degree of automation is essential. The volume of material requiring restoration demands that the bulk of defects are corrected automatically and at rates which are real-time or close to real-time. It is then possible for sufficient time to be devoted to the manual repair of the most heavily damaged sections.
The categories of defects that affect video archives vary considerably in nature and in the strategies required for their detection and correction. The defects include:
Dirt
Sparkle or digital bit errors
Video-tape dropouts
Noise
Film grain
Film and video-tape scratches
Unsteadiness
Brightness flicker
Accurate restoration will in the case of many of these defects demand temporal processing with motion estimation. This will enable advantage to be taken of the temporal redundancy in successive images and will benefit the correction of both impulsive events (such as dirt, sparkle and dropouts) and continuous distortions (such as noise). The video processing burden of motion estimation is high, however, and generally too high for reliance to be placed on general purpose data or video processors. Dedicated hardware is necessary for the more intensive processing functions. Practical and cost constraints demand, however, that the hardware is as simple as possible and is employed as efficiently as possible, consistent always with maintaining the highest levels of performance.
It is an object of certain aspects of the present invention to provide improved processes and apparatus for use in the restoration of video archive or other moving image material. Accordingly, the present invention consists in one aspect in a process for the restoration of a defective video signal, comprising the steps of: identifying large area motion vectors; utilising said large area motion vectors in a first processing round to remove unsteadiness from the video signal and thereby generate a steadied video signal; identifying and assigning pixel motion vectors in the steadied video signal; and in a second processing round conducting motion compensated noise reduction.
Advantageously, the first processing round further comprises removal of brightness flicker, and the second processing round further comprises one or both of scratch removal and dirt concealment. In a preferred example, the first block will perform flicker and unsteadiness correction. Considerable hardware economies can be made through the realisation that both flicker and unsteadiness can be corrected with global (or at least block-based) motion vectors. The assignment of motion vectors to individual pixels is therefore not required. This allows a very significant reduction in the motion estimation hardware. Video corrected for flicker and unsteadiness is then passed to a second processing block which performs scratch removal, dirt
concealment and noise reduction. These corrections are applied in parallel, in that detection of both scratches and dirt is conducted on the signal entering the processing block. Spatial and temporal processing is conveniently separated with a spatially restored video signal being presented to a temporal processing unit.
The arrangements according to the preferred embodiments of this invention provide a number of important advantages.
Providing unsteadiness correction in a first processing round, with a dedicated motion estimator, allows the independent selection of modes (such as field and frame mode) in motion estimation for unsteadiness correction, with no regard to the motion estimation requirements for scratch, dirt and noise removal. The saving in hardware through the avoidance of vector assignment in the first processing round has already been noted. Moreover, since motion measurement in the second processing block is conducted on a video signal which has been steadied, there is no need for correction of those motion vectors to compensate for unsteadiness offsets.
Conducting flicker removal in the first processing round, means that the assignment of vectors in the second motion estimation process is more reliable. The matching process will not (or is much less likely to) be confused by the unnatural variation in brightness levels that is flicker.
The invention will now be described by way of example with reference to the accompanying drawings, in which:-
Figure 1 is a block diagram of apparatus according to this invention for video archive restoration;
Figure 2 is a block diagram showing in more detail the first processing block of the apparatus of Figure 1;
Figure 3 is a block diagram showing in more detail part of the apparatus of Figure 2 designated in Figure 2 by a dotted outline;
Figure 4 is a diagram illustrating the function of the flicker removal unit of the apparatus shown in Figure 2;
Figure 5 is a block diagram showing in more detail the second processing block of the apparatus of Figure 1;
Figure 6 is a block diagram showing in more detail the motion estimator of the apparatus of Figure 5; and
Figure 7 is a block diagram showing in more detail the spatial filter and motion-compensation unit of the apparatus of Figure 5.
Referring initially to Figure 1, an incoming video signal undergoes motion estimation in block 102 and the video signal, together with a motion vector signal from the motion-estimation block 102, is presented to an unsteadiness-and-flicker-removal block 104. The operation of block 104 will be discussed in more detail below. What should be stressed here is that no attempt is made in block 104 to assign the motion vectors to pixels; vectors are taken as global in character or, at most, restricted to block level. The steadied video signal output from the unsteadiness-and-flicker-removal block 104 then undergoes a second motion estimation process in block 106, with motion vectors and the steadied video signal passing to a scratch-removal-dirt-concealment-and-noise-reduction block 108. This block includes a motion compensated process by which motion compensated fields or frames are made available for temporal processing.
Before discussing in more detail the function of processing block 104, it may be helpful to review briefly the nature of the defects to be corrected.
Two forms of unsteadiness are corrected for in apparatus according to this embodiment of the invention: hop/weave unsteadiness and twin-lens unsteadiness. Hop/weave unsteadiness arises from a variety of sources including camera shake, sprocket hole damage, printing mis-registration and mechanical misalignments in telecine equipment used for the conversion of cinematographic film to video. A further problem arises with so-called twin-lens telecine equipment in which separate optical paths are provided for each of the two interlaced video fields to be derived from a single film frame. Misalignment between these two optical paths can lead to severe horizontal and vertical vibrations at the television frame rate, sometimes referred to as twitter. Although twin-lens telecine equipment is no longer in use, a considerable amount of film-originating video archive is believed to have been converted using this technology. Hop/weave unsteadiness and twitter can be very unsettling to a viewer, particularly in cases where film-originating video is displayed alongside "true" video or electronically generated graphics or captions.
Image flicker is a common artefact in old film sequences. It can be defined as unnatural temporal fluctuations in perceived image intensity, whether globally or over regions of the picture. Not only is image flicker disturbing to the viewer, it may also hamper motion estimation or - more particularly - the pixel assignment of motion vectors. Image flicker can have a number of sources including ageing of film stock, dust, and variations in chemical or optical processing.
Reference is now directed to Figure 2. The input video signal is received in pre-processor 202, which performs a number of operations, including interfacing, aperture correction, and synchronisation. It is advantageous for the pre-processor also to have a video analysis function, in which it detects scene changes, monitors whether the video is film originated and, if so, detects the phase of any 2:2 or 3:2 pulldown sequence. The pre-processor 202 may also take measurements of the noise in the signal for use in later noise reduction processing and may itself conduct some initial processing such as echo concealment and programmable horizontal filtering.
The video-format-and-control block 204 serves to pre-filter and format the video signal supplied to the motion-estimation block 206 which, in the preferred example, employs phase correlation. Thus, luminance video is vertically and horizontally filtered, sub-sampled and formatted into 128 by 64 overlapping blocks. The video-format-and-control block 204 also provides a video output to the unsteadiness-removal block 208, this video signal being delayed so as to be co-timed with the vector output from the motion-estimation block 206.
Additionally, the video format and control block 204 sends and receives control
data to and from the other blocks in the diagram and interfaces - where appropriate - with external equipment.
The preferred form of the motion-estimation block 206 performs a two dimensional Fast Fourier Transform (FFT) on the block-formatted video and correlates against the 2D FFT data generated from the previous field. The correlation data is subject to an inverse 2D FFT before peak hunting and interpolation to derive one or more motion vectors for each block, to sub-pixel accuracy. The height of the associated peak is also measured as an indicator of confidence in the motion vector. (The phase correlation motion estimation technique is well published and reference is directed, for example, to US-A-4,890,160.) The block based motion vectors are further processed to extract global motion vectors and to derive control parameters for the correction of unsteadiness (both film unsteadiness and twin-lens twitter).
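The phase correlation principle just described can be sketched briefly. The following is an illustrative reconstruction, not the patented implementation: the block size, the normalisation constant and the peak-wrapping logic are assumptions, and the sub-pixel interpolation around the peak is omitted.

```python
import numpy as np

def phase_correlate(prev_block, curr_block):
    """Estimate the dominant translation between two co-sited blocks by
    phase correlation. Returns (dy, dx) in pixels (integer accuracy) and
    the correlation peak height as a confidence measure, mirroring the
    peak-height confidence indicator described in the text."""
    F1 = np.fft.fft2(prev_block)
    F2 = np.fft.fft2(curr_block)
    # Normalised cross-power spectrum: retain only the phase difference.
    cross = np.conj(F1) * F2
    cross /= np.abs(cross) + 1e-12
    # Inverse transform yields a correlation surface; its peak location
    # gives the displacement ("peak hunting").
    surface = np.real(np.fft.ifft2(cross))
    dy, dx = np.unravel_index(np.argmax(surface), surface.shape)
    # Wrap displacements into the signed range.
    if dy > surface.shape[0] // 2:
        dy -= surface.shape[0]
    if dx > surface.shape[1] // 2:
        dx -= surface.shape[1]
    return (dy, dx), surface.max()
```

In the apparatus, interpolation around the located peak would then refine the vector to sub-pixel accuracy.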
As can be seen more clearly in Figure 3, global vectors from a phase correlation unit 304 are passed to a weave-analysis unit 308. It should be explained that the phase correlation unit 304 can operate in both a frame mode - in which it compares frames or corresponding fields from two frames - and a field mode - in which it compares the two fields from a single frame. The global vectors are frame based. Weave-analysis unit 308 serves to avoid the correction of intentional camera pan and tilt, to prevent motion vectors from exceeding a set measurement range and to ensure that the accumulated control signal converges to zero in the absence of motion. Global motion due to unsteadiness is distinguished from "real" motion through consideration of temporal frequency. "Real" global motion will generally be smoothly varying with time, whilst unsteadiness will generally have high temporal frequencies.
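One way to realise the temporal-frequency distinction drawn above is to treat the smoothly varying component of the global motion as intentional pan/tilt and the high-frequency residual as unsteadiness. The sketch below is hypothetical: the window length, the leak factor (which makes the accumulated control signal converge to zero in the absence of motion) and the clipping limit (the measurement range) are invented parameters, not values taken from the patent.

```python
import numpy as np

def unsteadiness_component(global_motion, window=5, leak=0.9, limit=8.0):
    """From a time series of one global motion component (e.g. vertical
    displacement per frame), estimate the unsteadiness to correct.
    A moving average stands in for the intentional pan/tilt; the
    residual is accumulated with a leak so the correction decays to
    zero when motion ceases, and is clipped to the measurement range."""
    motion = np.asarray(global_motion, dtype=float)
    # Smooth estimate of intentional camera motion (low temporal frequency).
    kernel = np.ones(window) / window
    smooth = np.convolve(motion, kernel, mode='same')
    residual = motion - smooth          # high-temporal-frequency part
    correction = np.zeros_like(motion)
    acc = 0.0
    for i, r in enumerate(residual):
        acc = leak * acc + r            # leaky accumulation -> converges to zero
        correction[i] = np.clip(acc, -limit, limit)
    return correction
```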
Block based vectors, in the field mode of the motion estimation, are provided to a reliability-control unit 306 and then to a model-fitting unit 310. Since the fields under comparison have been derived from the same film frame, any motion vector can be attributed to a distortion. The model-fitting unit 310 attempts to fit a linear transformation to each field in such a way as to remove the assumed distortion and make equal the two fields of the frame. In an interpolate-parameters block 312, the results of the weave analysis and the coefficients of the linear transformations are used to derive a re-positioning map which is applicable to an entire frame. This map effectively comprises one re-positioning vector per pixel. Using this map, the frame is then re-positioned in block 314.
The steadied video signal passes to the flicker-removal block 210 shown in Figure 2. The operation of this block will now be described with reference to the more detailed block diagram of Figure 4.
The amount and distribution of flicker is estimated using the guidelines that flicker is generally of higher frequency than actual luminance and chrominance variation and that flicker is limited in range. The approach taken is to equalise, locally, the mean and the variance in a temporal sense.
The received video signal passes along parallel paths each containing a low pass filter 402 and a compute-variance-and-mean block 404. The upper path contains a field delay 406. The lower path contains a pan-compensation unit 408 which utilises global motion vectors from the phase-correlation unit 304. In the compute-variance-and-mean blocks 404, operating on low pass filtered signals, mean and variance values are computed for overlapping blocks. In the compute-α,β block 410, current mean and variance values are compared with values from the previous field or frame. From this comparison, there are derived an intensity flicker gain parameter α and an intensity flicker offset parameter β. The parameters α,β will not be valid in regions in which there is local motion. Accordingly, for those regions in which motion is detected, the parameter values from surrounding stationary regions are employed. This is achieved in the interpolate-α,β block which receives a control signal from the motion-detect block 414.
The resulting arrays of parameters are smoothed in a filter 416 and then up-sampled in up-sample-bilinear block 418 to full frame resolution using bi-linear interpolation. Each individual pixel of the frame is then corrected in the correct-frame block 420, using the gain and offset parameters α,β, to provide the flicker-removed output.
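The local equalisation of mean and variance described above can be illustrated as follows. The exact estimator used in the apparatus is not specified; matching first and second moments against the previous (pan-compensated) field is one straightforward realisation, and the block up-sampling here uses simple expansion as a stand-in for the bilinear interpolator of block 418. All function names are illustrative.

```python
import numpy as np

def flicker_params(ref_mean, ref_var, cur_mean, cur_var, eps=1e-6):
    """Derive the intensity-flicker gain alpha and offset beta for one
    block by equalising local mean and variance against the reference
    (previous field/frame) statistics."""
    alpha = np.sqrt(ref_var / (cur_var + eps))
    beta = ref_mean - alpha * cur_mean
    return alpha, beta

def correct_frame(frame, alpha_blocks, beta_blocks):
    """Expand the block-based alpha/beta arrays to full resolution and
    correct each pixel: out = alpha * in + beta."""
    by = frame.shape[0] // alpha_blocks.shape[0]
    bx = frame.shape[1] // alpha_blocks.shape[1]
    alpha_full = np.kron(alpha_blocks, np.ones((by, bx)))
    beta_full = np.kron(beta_blocks, np.ones((by, bx)))
    return alpha_full * frame + beta_full
```

Applied to a frame whose intensities have been scaled and offset by flicker, this recovers the reference statistics; in regions with local motion the parameters would instead be interpolated from surrounding stationary regions, as the text explains.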
A description will now be given of the second processing block in Figure 1, which serves to remove scratches, conceal dirt and reduce noise. Reference is directed initially to Figure 5.
The steadied and flicker-reduced video signal is received in 4:2:2 format in pre-processor 502. This serves broadly the same function as pre-processor 202 shown in Figure 2. The pre-processed signal then passes to scratch detection block 504. Film scratches are detected by looking for horizontal discontinuities with a large vertical and temporal correlation. Helical video-tape scratches are detected by looking for vertical discontinuities with a medium horizontal correlation and large temporal correlation. Quadruplex video-tape scratches are detected by looking for vertical discontinuities with a small horizontal correlation, medium temporal correlation and a characteristic periodicity.
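A toy version of the film-scratch criterion - a horizontal luminance discontinuity that is strongly correlated down the picture - is sketched below. The thresholds are invented for illustration, and the temporal-correlation check (and the video-tape scratch variants) are omitted.

```python
import numpy as np

def film_scratch_key(frame, grad_thresh=30.0, column_fraction=0.8):
    """Mark candidate film scratches: columns whose horizontal gradient
    exceeds a threshold in most rows, i.e. a horizontal discontinuity
    with large vertical correlation. Returns a boolean key per column
    boundary."""
    grad = np.abs(np.diff(frame.astype(float), axis=1))
    hit = grad > grad_thresh
    column_support = hit.mean(axis=0)       # fraction of rows flagged per column
    return column_support > column_fraction
```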
Primary scratch detection is conducted prior to the motion estimator 506 so that the results of scratch detection can be used for motion vector repair (or more robust motion estimation). The scratch key signal derived in block 504 is carried forward, through the motion estimator 506, so that the scratch can be repaired in the motion compensation unit 510. Wherever possible, scratches will be repaired by replacement from the motion compensated previous field or frame. Wherever the required part of the previous frame is invalid, due to revealed background or a shot change, spatial interpolation will be used. The function of motion estimator 506 is more easily described with reference to Figure 6 which illustrates the content of block 506 in more detail.
Video-format-and-control unit 602 serves to provide filtered and formatted video data to the phase-correlation unit 604. In particular, unit 602 provides a first signal which is vertically and horizontally filtered and sub-sampled luminance formatted into 128 by 64 overlapping blocks and a second signal which is horizontally and vertically low pass filtered video. Additionally, the video-format- and-control unit 602 provides a delayed video output for connection with the spatial filter 508, this delayed video signal being co-timed with the vector bus output from the vector-output-processor 610. The video-format-and-control block 602 also sends and receives control data to and from blocks and interfaces -
where appropriate - with external equipment.
Phase-correlation unit 604 performs a two dimensional FFT on the block-formatted video and correlates against the 2D FFT data generated from the previous field. The correlation data is subject to an inverse 2D FFT prior to peak hunting and interpolation to derive one or more motion vectors for each block, to a sub-pixel accuracy. The height of the associated peak is also measured as an indicator of confidence in the motion vector. A vector bus is provided in tandem to a forward-assignment unit 606 and a backward-assignment unit 608. The phase-correlation unit 604 also passes on to both the assignment units 606,608 and the vector-output-processor 610, the delayed, sub-sampled video signal generated in video-format-and-control block 602.
The forward assignment unit 606 and the backward assignment unit 608 operate to assign vectors using the candidate vectors supplied from the phase correlation unit and serve also to generate error signals. The forward and backward vectors, together with associated error signals, are passed in the form of video streams to the vector-output-processor 610.
It is the function of the vector-output-processor 610 to generate assigned forward and backward vector and error signals, using the vector and error signals from the forward- and backward-assignment units 606 and 608. The error signals are used to determine which vectors are used prior to three dimensional median filtering. Global vectors may be substituted for the forward and backward vectors if error signals are high. Unreliable vectors can also be replaced by projection from preceding or succeeding frames or by spatial interpolation across small areas. It is also possible to apply a constraint of local motion smoothness which must be satisfied by vectors, in addition to the requirement for low match error.
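One policy of the kind the vector-output-processor applies can be sketched as follows. This is a simplification: the error threshold is invented, a 3x3 spatial median stands in for the three dimensional median filtering mentioned above, and the projection from preceding or succeeding frames is omitted.

```python
import numpy as np

def repair_vectors(vectors, errors, global_vector, err_thresh=10.0):
    """Where the assignment error for a block/pixel vector is high,
    substitute the global vector; then apply a component-wise median
    over each 3x3 neighbourhood to enforce local motion smoothness
    (frame edges are kept as-is)."""
    v = vectors.copy()
    v[errors > err_thresh] = global_vector
    out = v.copy()
    for y in range(1, v.shape[0] - 1):
        for x in range(1, v.shape[1] - 1):
            for c in range(2):
                out[y, x, c] = np.median(v[y-1:y+2, x-1:x+2, c])
    return out
```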
The vector-output-processor 610 receives as an input the scratch key from scratch-detection unit 504. This is used to identify vectors in need of repair. Analysis of error signals also enables the vector-output-processor to generate a "large dirt" key, for subsequent use.
The outputs of the vector-output-processor 610 are: a combined key signal, a confidence signal and a vector bus containing processed vectors ready for use in picture building.
Reference is now directed to Figure 7 which shows in more detail the spatial filter 508 and motion-compensation unit 510 of Figure 5.
The approach adopted in this portion of the apparatus is to generate a spatially filtered signal and to provide this, along with a number of further signals from earlier and later in time, to an arbiter which will select between or blend its inputs in accordance with information from various sources. In more detail, the delayed video signal from the video-format-and-control unit 602 is received by a spatial noise reduction filter 702. The output of this spatial filter 702 is then provided through field/frame delay 706 to arbiter 714. (It should be explained that processing can operate alternatively in field or frame mode and the delays are switched accordingly between field and frame delays.) For convenience, fields are referred to in the following description. The arbiter 714 receives the current field through delay 704 (which matches the delay of the spatial filter 702) and field delay 708. The arbiter 714 also receives two motion compensated fields. One of these motion compensated fields is termed the next field and, through the lack of a field delay matching delays 706 and 708, is a field in advance of the current field. This field is motion compensated using forward pointing vectors in the future-back-projection unit 710. The second motion compensated field is a recursively filtered field. For the recursive loop, a modified output field is created which is the actual output of the arbiter less the "next field" input. This avoids a possibly confusing and unnecessary combination of both forwardly and backwardly motion compensated fields.
The arbiter provides motion compensated recursive noise reduction. In contrast to conventional motion adaptive recursive noise reduction, where recursion is simply turned off in the presence of motion, use is made here of a motion compensated recursive store. It is therefore necessary to turn off the recursion only in the case of shot changes and revealed background. The spatially filtered signal will be included in the temporal averaging, either continuously or as a fallback when the temporal noise reduction is turned off. To ensure the availability of a spatially filtered signal to contribute to the recursive signal, cross-fader 716 enables a contribution from the spatially filtered signal to be added to the current field.
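The core of motion compensated recursive noise reduction can be illustrated with a minimal sketch. This is not the arbiter of Figure 7: per-frame global (dy, dx) shifts stand in for the full vector field, the recursion coefficient k is an invented parameter, and the shot-change and revealed-background handling is not modelled.

```python
import numpy as np

def recursive_nr(frames, shifts, k=0.5):
    """Motion-compensated recursive noise reduction: the recursive
    store is shifted by the measured motion before being blended with
    the incoming field, so recursion need not be switched off merely
    because there is motion. `shifts` holds the (dy, dx) motion between
    each successive pair of frames."""
    store = frames[0].astype(float)
    outputs = [store]
    for frame, (dy, dx) in zip(frames[1:], shifts):
        compensated = np.roll(store, (dy, dx), axis=(0, 1))
        store = k * compensated + (1.0 - k) * frame
        outputs.append(store)
    return outputs
```

On a static noisy sequence this converges towards the clean picture, which is the temporal-redundancy benefit noted earlier in the text.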
This approach to noise reduction, where a selection is made from between a current picture signal, a spatially noise reduced current picture signal and a motion compensated recursively filtered signal, is believed to have particular advantages. It will find application beyond the arrangement of first and second processing blocks which is illustrated in Figure 1.