A Method of and a System for Processing Digital Information
TECHNICAL FIELD
The present invention relates to a method of and a system for processing digital
information. More particularly, the invention is concerned with processing information
representing video signals and/or audio signals where the processing involves compression to
reduce the quantity of information needed to reproduce the signals.
DISCLOSURE OF THE INVENTION
An object of the invention is to provide a system composed of one or more
microprocessors in a convenient plug-in unit, which can be used with any suitable existing
personal computer or network computer (and which may be termed the "host computer"), to
enable such computers to be used to process and compress video signals for storage or
transmission.
In one aspect of the invention there is provided a system for processing digital
information utilizing a unit, conveniently a plug-in unit, composed of one or more
microprocessors that normally require auxiliary memory to operate and simulation means for
creating such auxiliary memory from that of another host computer to which the unit is
connected. Microprocessor address pin connections need not be made between the
microprocessors and memory or other parts of the system but instead the state of the address
pins on the microprocessors can be determined through a test interface built into the
microprocessors, such as the IEEE 1149.1 Standard Test Access Port and Boundary-Scan Architecture.
The system can have one or more video and/or audio inputs which can be either digital or analogue and, in the case of analogue inputs, means is provided for converting the analogue signals into digital form. The system may employ compression means for compressing the video and/or audio digital information. Such a system may be connected to a host computer which may assist in audio and/or video compression and decompression.
Preferably, means to configure the system operate in such an order that later stages of the configuration are assisted by the parts of the system already configured and by the host computer, and in such a way that the system can be reconfigured while it is being used.
In another aspect the invention provides a system for digitizing audio by the use of a multichannel low resolution analogue-to-digital converter and an external amplifier so that one channel of the analogue-to-digital converter can digitize low amplitude audio signals with greater precision than another channel that digitizes the unamplified signal. A technique such as interpolation can be used to estimate the complete waveform.
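By way of illustration only, the channel selection described above might be sketched as follows; the gain factor, full-scale value and clipping test are assumptions for the sketch rather than details taken from the invention:

```python
def reconstruct_sample(raw, amplified, gain=8, full_scale=127):
    """Pick the higher-precision reading from a two-channel converter.

    The amplified channel resolves low-amplitude signals more finely;
    it is used whenever it has not clipped, otherwise the unamplified
    channel is used.  Gain and full-scale values are illustrative.
    """
    if abs(amplified) < full_scale:      # amplified channel still in range
        return amplified / gain          # finer effective resolution
    return float(raw)                    # fall back to unamplified channel
```

A software switch of this kind can run per sample in real time, as the description suggests.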
In a further aspect the invention provides a method of processing digital information utilizing one or more microprocessors which normally require additional memory to operate
which involves simulating memory using a host computer as if the memory were directly
accessible.
A system in accordance with the invention may comprise a combination of the following:
video-input means, video-digitizing means, audio-input means, audio-digitizing means, means
for effecting video compression in hardware or software or both; means for effecting audio
compression in hardware or software or both; means for further compression in hardware or
software or both; transmission means, means for effecting storage, display means, means for
storing program and configuration data; means for controlling the system in hardware or
software or both; means for simulating memory external to a microprocessor of the system in
hardware or software or both; means for communicating information to a host computer;
and/or means for communicating external memory access information to the host computer.
Another aspect of the invention is a system for digitizing and processing video and/or
audio for storage, transmission or processing in a digital computer system. In operation, the
system may process the digitized video to look for moving or changing parts of the image, or
to recognize objects in the image. This aspect of the invention may also compress the video.
Since the invention is intended for processing digital information, there may be additional
features which allow video and audio information to be accessed and processed.
The video compression is achieved by splitting the image up into groups such as rectangular blocks of pixels, called "super blocks". For each of these super blocks a single U
and a single V value for colour, and a Y value for each of minimum and maximum luminance, are coded. The system examines the pixels in the super block, and decides whether each pixel
is nearer the maximum or the minimum luminance, and then codes that information in one bit
called a "shape bit". Groups of pixels that share the same shape bit value can be compressed further by using a single shape bit for the whole group, together with an indication of whether the group is encoded as individual shape bits or as that single bit.
The image can be filtered both spatially and temporally. Spatial filtering removes noise
such as spot noise by comparing the shape bit for each pixel with its neighbours using a small
look-up table; the contents of the table can then be altered to change the filtering behaviour.
Temporal filtering is done by having a counter for each of the four super block components U,
V and minimum and maximum luminance. The counter stores historical information about the
accumulated noise in these values in order to find practical estimates of the expected values for
Y, U or V from the noisy source.
The data is subsequently recompressed into a representation which allows for four
possible shape values for each pixel: undefined, uncertain, maximum and minimum. The
additional uncertain value allows for an extra grey scale in the output (allowing for anti-aliasing
of edges) and reduces the data rate for storage or transmission by encoding pixels which
fluctuate between shape 0 and shape 1. The system also counts the number of pixels in each super block that are at this uncertain value, and if this count reaches a critical level then the complete super block is transmitted or stored.
To lower the production cost of the plug-in unit, the compression can be split into two parts. The first part uses little memory but needs to operate at high speeds (synchronized with the incoming video), whereas the second part needs large tables in memory but has less severe timing constraints. An efficient way of implementing the system is by having a fast microprocessor connected to the video and/or audio inputs to process at high speed and a relatively low-speed connection to another computer which could be thought of as a host computer serving to implement the second part of the compression. Host computers can be personal computers or network computers or other devices and typically would have several megabytes of memory for storing the compression data and programs. Usually the host would also have means for storing the data, for example on disc, or for transmitting the data through network or modem connections.
The receiving computer (which can be any kind of general-purpose computer) preferably reconstructs the images by bilinearly interpolating the colour and luminance values for each pixel from the values for the current super block and its neighbours. The system can additionally enhance the contrast of edges by estimating where these were in the original image and then interpolating around these edges in such a way as to leave the contrast of the edges unchanged.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, with
reference to the accompanying drawings, wherein:
Figure 1 is a block schematic diagram representing the main hardware components of a
unit and at least part of a system constructed in accordance with the invention used to process
both video and audio;
Figures 2 and 3 represent the sequence of elementary processing steps carried out in the
system shown in Figure 1;
Figure 4 represents the sequence of processing steps carried out in the system on the
incoming video data to compress this into key frames;
Figure 5 represents the sequence of processing steps carried out in the system to
recompress the key frames into delta (difference) frames ready for storage or transmission; and
Figure 6 shows the sequence of processing steps carried out in the system or on a
receiving computer to display the stored or transmitted video.
DESCRIPTION OF BEST MODES OF CARRYING OUT THE INVENTION
A possible embodiment of the invention may contain the following functional components.
The first component provides the means to provide digital video to the rest of the
system.
The second component provides the means to provide digital audio to the rest of the
system.
The third component provides the means for processing the digital video information.
The fourth component provides the means for processing the digital audio information.
The fifth component provides the means to interface between the unit of the invention
and the host computer.
The sixth component provides the means to interconnect the components of the unit of the invention.
The seventh component provides the means to establish the state of address pins on the microprocessor.
The eighth component provides the means for storing the code and configuration
information for the various programmable devices in the system.
The ninth component provides the means for configuring the devices in the unit of the
invention.
The tenth component is an external device which provides the means for handling output
from the unit of the invention in digital form.
In one implementation of the invention, the first component or means is a Philips
SAA7110A video digitizer chip.
The second and ninth components or means are a Microchip PIC16C74A microcontroller.
The third component or means is in combination a SRAM-based FPGA, such as Altera
6000 and 8000 series, and a StrongARM SA-110 microprocessor and the host computer.
The fourth component or means is in combination the PIC16C74A microcontroller and
the host computer.
The fifth component or means is in combination an IEEE 1284 compatible parallel
printer port, the PIC16C74A microcontroller and the SRAM-based FPGA.
The sixth component or means is the SRAM-based FPGA.
The seventh component or means is in combination the PIC16C74A microcontroller and
the IEEE 1149.1 test interface on the StrongARM SA-110 microprocessor.
The eighth and tenth components or means are the host computer. The host computer is
typically either a personal computer or a network computer.
The main hardware components of a system and a plug-in unit constructed in accordance
with the invention are laid out in Figure 1. An explanation of the various components in Figure
1 now follows.
101 Video digitizer: this is an integrated circuit, such as a Philips SAA7110A, which allows
analogue video from one or more devices such as a video camera or a video tape
machine to be converted into digital information which can then be processed. In
another implementation of the invention, this is replaced by a digital camera chip, which
takes its input directly from light and so removes the need for an additional video source
such as a camera or a tape machine.
102 FPGA programmable logic: this is a SRAM based FPGA integrated circuit, such as Altera 6000 and 8000 series, which can be programmed to contain a wide range of combinations of logic gates. These logic gates perform certain operations more efficiently than a software system, but the device itself is fully programmable so that the unit of the invention retains the flexibility inherent in a software system. The FPGA performs glue logic functions such as connecting the video digitizer (device 101), StrongARM (device 103), PIC microcontroller (device 104) and the host computer (device 124) via the IEEE 1284 compatible parallel port (signals 117 and 118). In addition, it performs some processing on the data stream to assist in the processing of the digital video and/or audio information.
103 StrongARM microprocessor such as SA-110: this is typical of a new range of embedded microprocessors. These have the following features in common: low cost, fast instruction execution, low power consumption, and large internal cache memory. In this implementation, the StrongARM implements most of the compression.
104 PIC microcontroller such as Microchip PIC16C74A: this digitizes the audio from the audio input via a combination of connections 121 and 122. In addition, it connects to the FPGA (device 102), which it programs on start-up, and to the StrongARM (device 103), from which it reads the addresses of any external memory requests. The microcontroller subsequently requests this information from the host (device 124) via the parallel port (signals 117 and 118) before sending the data to the FPGA (device 102) to forward to the StrongARM (device 103).
105 Audio preamplifier: this amplifies incoming audio signals.
106 Audio amplifier for low amplitude signals: the output from this component can be used
instead of the output from component 105 by means of a real-time software switch
which operates in response to the level of incoming samples.
107 14.3MHz crystal oscillator: this is used by the PIC microcontroller (device 104), the
FPGA (device 102) and indirectly through the FPGA by the StrongARM (device 103)
for their system clocks.
108 26.8MHz crystal: this is used by the video digitizer (device 101) for its system clock.
109 Video input to the SAA7110 digitizer (device 101).
110 14.3MHz clock signal for the FPGA (device 102) and PIC microcontroller (device 104).
111 This is an I2C bus, and allows the microcontroller (device 104) to initialize and control
the video digitizer integrated circuit (device 101).
112 Control signals: information about line and field sync and the pixel clock flows from the digitizer (device 101) to the FPGA (device 102).
113 YUV data: digital pixel information is transferred from the video digitizer (device 101) to the FPGA (device 102) for processing. This information will typically be in a standard format, for example 8 bits accuracy for the Y (luminance) on every pixel, and 8 bits each
of U and V (chrominance) on every pair of pixels.
114 Control signals: this is a bidirectional link. The StrongARM (device 103) reports to the FPGA (device 102) every time it accesses a non-cached location and requires a simulated memory access. The FPGA signals the StrongARM to wait until the information it has requested is available. In addition, as digital video or audio data becomes available, the StrongARM interrupt lines are triggered by the FPGA to signal to the StrongARM to read this data.
115 Data bus: data such as instruction and data cache initial contents and new pixel and audio data is transferred to the StrongARM through this connection.
116 Control signals: this is a bidirectional link. The microcontroller (device 104) programs the FPGA (device 102) with its initial configuration. The FPGA signals the microcontroller to check the address lines on the StrongARM (device 103) when the StrongARM has requested an external access from the FPGA.
117 Control signals: the standard nine control lines on a parallel port are connected so as to allow the FPGA (device 102) and the microcontroller (device 104) to share control of the printer port to the host computer (device 124).
118 Data signals: this allows the FPGA (device 102), the microcontroller (device 104), and the host computer (device 124) to share the 8 parallel port data lines.
119 IEEE 1149.1 test interface: this is a bidirectional link between the StrongARM (device 103) and the microcontroller (device 104). The microcontroller requests information about the state of the I/O connections on the StrongARM, such as the state of its address pins, which is then provided by the StrongARM back to the microcontroller.
120 Clock signal: 3.57 MHz clock for StrongARM timings.
121 Preamplified audio.
122 More highly amplified audio.
123 StrongARM address lines: The system is designed in such a way as to not require any external connections to the StrongARM address lines. This reduces printed circuit board
area and pin count on the FPGA (device 102), reducing electromagnetic interference,
reducing the cost and increasing the reliability of the system.
124 Host computer: This is not part of the plug-in unit, but is necessary for the unit to
perform. The host computer will be a personal computer or a network computer with an
IEEE 1284 parallel port. Such a host computer will typically include some means for
displaying, transmitting or storing the data sent to it through the parallel port interface
(signals 117 and 118). In addition, the host computer will typically contain the data
required for configuration of the FPGA (device 102) and the software for the StrongARM (device 103).
Figure 2 shows the initialisation procedure adopted in an embodiment of the invention.
The four devices labelled at the top, namely StrongARM (device 103), FPGA (device
102), microcontroller (device 104) and host computer (device 124) all have the ability to
process information and react to events. In effect this is a parallel computer system, where
each device waits for the appropriate time to be initialized or to initialize.
Figure 3 shows the memory read cycle of the StrongARM microprocessor 103. As
external memory accesses are simulated and the address lines are not connected, the various
components cooperate to ensure that execution continues smoothly despite the absence of
external memory devices in the invention.
Figures 4 and 5 outline the method for compressing video information. This compression is done in several phases.
Reduce luminance to 6 bits (401, 402 and 403): Luminance is 8 bits after digitizing, of
which 7 bits are used as the index into a look-up table to give a 6-bit luminance value.
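A minimal sketch of this table look-up follows, assuming a simple linear table; the actual table contents are not fixed by the description and could implement gamma correction or any other transfer curve:

```python
# The top 7 bits of the 8-bit digitized luminance index a 128-entry
# table whose entries are 6-bit values.  A linear table is assumed
# here purely for illustration.
LUMA_LUT = [i >> 1 for i in range(128)]   # 7-bit index -> 6-bit value

def reduce_luma(y8):
    """Reduce an 8-bit luminance sample to 6 bits via the table."""
    return LUMA_LUT[y8 >> 1]
```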
Extracting shape (403 and 404): The image is split into 8x8 blocks, called "super
blocks". These are represented as a single U and a single V value for colour, and two Y values
Ymin and Ymax for minimum and maximum luminance. A shape bit is a bit which indicates whether a pixel (or a block of 2x2, 4x4 or 8x8 pixels) is nearer the minimum or the maximum luminance in the super block, and can be thought of as a one-bit luminance value.
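The shape extraction described above can be sketched as follows; the mid-point threshold is an assumption about how "nearer" the minimum or maximum is decided:

```python
def extract_shape(block):
    """Compute Ymin, Ymax and one shape bit per pixel for a luminance
    block (normally 8x8).  A pixel's shape bit is 1 when it is nearer
    the maximum luminance, 0 when nearer the minimum; the mid-point
    threshold is an assumption for this sketch."""
    flat = [y for row in block for y in row]
    ymin, ymax = min(flat), max(flat)
    mid = (ymin + ymax) / 2
    shape = [[1 if y > mid else 0 for y in row] for row in block]
    return ymin, ymax, shape
```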
Temporal filtering (405): The two colour components U and V, and the two luminance
values Ymin and Ymax are filtered in a temporal way. The system uses four bits of memory per super block for each of these values to store historical information about the accumulated noise; these bits are in addition to the 6 bits required to store each Y value and the 6 bits required to store each of U and V.
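One plausible reading of this counter-based temporal filter is sketched below; the saturation threshold, the 4-bit counter range and the update rule are assumptions, as the description does not fix them:

```python
def temporal_filter(stored, counter, sample, limit=7):
    """Sketch of a counter-based temporal filter for one component
    (U, V, Ymin or Ymax).  The stored value only moves after `limit`
    net samples pull in the same direction, so frame-to-frame noise
    is suppressed.  Returns the updated (stored, counter) pair."""
    if sample > stored:
        counter = min(counter + 1, limit)
    elif sample < stored:
        counter = max(counter - 1, -limit)
    if counter == limit:                 # sustained upward pressure
        stored, counter = stored + 1, 0
    elif counter == -limit:              # sustained downward pressure
        stored, counter = stored - 1, 0
    return stored, counter
```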
Spatial filtering (406): Shape, as described above, is filtered to remove spot noise, which
is noise where only one or a few pixels deviate from other local pixels. A look-up table is used
which takes five input bits, being the shape bit for a pixel and four of its nearest neighbours.
This look-up table is stored in a processor register for fast access, and generates a shape bit as output which is then used as the shape for the pixel. The filtering is implemented using a 32-bit look-up table and typically performs a median function. At the edge of each super block, the filtering assumes that all pixels over the super block edge are the same shape as the central value.
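The five-input look-up table can be sketched as a 32-bit word held in a register, here loaded with the majority (median) function the text mentions; the bit packing order of the five inputs is an assumption:

```python
# Build a 32-entry look-up table, one bit per entry, implementing a
# median (majority-of-five) over a pixel's shape bit and its four
# nearest neighbours.  The whole table fits in one 32-bit register,
# and its contents can be changed to alter the filtering behaviour.
MEDIAN_LUT = 0
for index in range(32):
    if bin(index).count("1") >= 3:       # majority of the five bits set
        MEDIAN_LUT |= 1 << index

def filter_shape_bit(centre, up, down, left, right):
    """Return the spatially filtered shape bit for one pixel."""
    index = centre | (up << 1) | (down << 2) | (left << 3) | (right << 4)
    return (MEDIAN_LUT >> index) & 1
```

With this table an isolated set bit (spot noise) is removed because it is outvoted by its neighbours.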
Fractal compression of shape (407): The shape of all the pixels in each super block is compressed in a fractal way: a single "0" bit for a uniform super block in which all the pixels are the same luminance, or a "1" bit followed by four bits indicating the shape for subsets of 4x4 pixels. These four bits then either indicate the subset is all of the same luminance in which case a single bit follows indicating whether that is the maximum or minimum luminance, or that four more bits follow to indicate the luminance for each of the four subblocks of 2x2 pixels. These four bits again then either indicate the subset is all of the same luminance in which case a single bit follows indicating whether that is the maximum or minimum luminance, or that four more bits follow to indicate the luminance for each of the four pixels.
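A sketch of this fractal (quadtree) encoding follows; the exact bit ordering, and the emission of a value bit after the "0" flag for a uniform block, are assumptions made so that the sketch is self-contained and decodable:

```python
def encode_shape(block, size=8):
    """Quadtree encoding of a square block of shape bits.

    A uniform block is emitted as a 0 flag plus one value bit (the
    value bit is an assumption of this sketch); a mixed block is
    emitted as a 1 flag followed by the encodings of its four
    quadrants, down to individual pixels.  Returns a list of 0/1 bits.
    """
    values = {bit for row in block for bit in row}
    if len(values) == 1:                 # uniform: flag + value bit
        return [0, values.pop()]
    bits = [1]                           # mixed: subdivide into quadrants
    half = size // 2
    for r0 in (0, half):
        for c0 in (0, half):
            sub = [row[c0:c0 + half] for row in block[r0:r0 + half]]
            if half == 1:
                bits.append(sub[0][0])   # single pixel: emit its bit
            else:
                bits.extend(encode_shape(sub, half))
    return bits
```

A uniform 8x8 super block thus costs two bits, while detailed blocks pay proportionally more, matching the description above.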
Compression of U and V (408): The two colour components U and V are compressed by taking advantage of spatial similarities of the colours in the image.
Key frames: the unit stores complete frames at the full resolution, e.g. 320x240 pixels, compressed as described above. These key frames are transmitted over the parallel port to the host computer. The unit does not calculate differences between frames' luminance values; this is left to the host with its much larger memory, but the U and V values are compressed spatially to reduce frame size.
Noise reduction on the host (501 to 506): Once the data is received by the host, it is
decompressed and then recompressed giving delta (difference) frames. The source pixels are
all specified as one shape bit, indicating either a maximum or minimum luminance value.
However, after the compression on the host, they are all one of four values: undefined,
uncertain, maximum and minimum. Pixels of undefined state can be switched to either
maximum or minimum luminance by sending or storing a "1" or a "0" bit. Pixels of maximum
or minimum luminance state can be changed to uncertain by sending or storing a "1" bit.
Otherwise, a "0" bit is sent or stored. Thus pixels which fluctuate between maximum and
minimum luminance values are not re-sent if they are considered to be local noise. These
pixels are displayed on the receiving machine as the average luminance of the maximum and
minimum values, so giving the effect of anti-aliasing along noisy edges, with very low data
rate.
If the number of uncertain pixels in any super block reaches a critical level (a level which
can be changed) then all the pixels in the super block are set to the undefined state, which will
cause them to be resent as either maximum or minimum luminance.
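The per-pixel update rules above can be sketched as a small state machine; the state names are illustrative, and the decision of when a maximum/minimum flip should be marked uncertain is simplified to "any flip":

```python
# Four possible shape values per pixel after recompression on the host.
UNDEFINED, UNCERTAIN, MINIMUM, MAXIMUM = range(4)

def encode_pixel(state, new_shape):
    """Emit the delta bits for one pixel and return (bits, next_state).

    Undefined pixels are resolved explicitly with a 1 (maximum) or 0
    (minimum) bit; a defined pixel that flips sends a single 1 bit and
    becomes uncertain; an unchanged pixel sends a single 0 bit.
    """
    if state == UNDEFINED:
        bit = 1 if new_shape == MAXIMUM else 0
        return [bit], new_shape
    if new_shape == state:               # unchanged: cheap "0" bit
        return [0], state
    return [1], UNCERTAIN                # flip treated as local noise
```

A receiver would display an uncertain pixel at the average of the maximum and minimum luminance, giving the anti-aliasing effect described above.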
History compression (505) is a loss-free means for lowering the data rate by looking for exact matches between the encoding of the current shape and the encoding of the shape in a previous frame or frames.
Figure 6 outlines the method to decompress the video on the receiver.
Interpolation on the receiver: The image reconstructed on the receiver (either connected through some network to the transmitting machine, or playing back images that have been stored on disc) would appear quite blocky, as a consequence of the low number of bits per pixel transmitted or stored. Interpolation of the U and V colour values at super block resolution gives an adequate image when each 4x4 pixel quadrant of the super block has its U and V values calculated by bilinear interpolation with the U and V values for the four super blocks neighbouring the super block corner. The Ymin and Ymax luminance values are also interpolated in a similar way; however, a Y value is taken from a neighbouring super block only when it is nearer in luminance to the central Y value than its complement. If this is not the case then the super block is probably on an edge in the image, and because antialiasing with a luminance value taken from the wrong side of the edge is not desirable the central Y value is taken in those cases.