GB1605135A - Variable image display apparatus - Google Patents


Info

Publication number
GB1605135A
GB1605135A GB2326977A
Authority
GB
United Kingdom
Prior art keywords
records
image
distortion
elements
display apparatus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
GB2326977A
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
UK Secretary of State for Industry
Original Assignee
UK Secretary of State for Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by UK Secretary of State for Industry filed Critical UK Secretary of State for Industry

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Description

(54) IMPROVEMENTS IN OR RELATING TO VARIABLE IMAGE DISPLAY APPARATUS (71) I, THE SECRETARY OF STATE FOR INDUSTRY, LONDON, do hereby declare the invention for which I pray that a patent may be granted to me, and the method by which it is to be performed, to be particularly described in and by the following statement:

The present invention relates to variable image display apparatus for producing, for example, a representation of the face of an unknown person.
Known equipment for constructing a pictorial representation of the face of a person seen by a witness is described for instance in U.S. Patent No. 2 974 426. The equipment comprises an arrangement for making a combination of drawings or photographs of facial features selected from a kit. The quality of representations achieved by the known equipment is limited by the fixed combinations of details appearing on the available drawings or photographs in the kit and by the mismatches in tones or shading or gaps which tend to occur at the edges of the drawings or photographs combined together.
The possibility of recording photographic portraits by scanning photographs to form digital records of the relative brightness of elementary areas of the photographs and storing the records in a digital computer store has been disclosed by L D Harmon in an article in the journal Scientific American November 1973 pp 70-82.
An article by Gillenson and Chandrasekaran published in the journal Pattern Recognition vol 7 pp 187-196 (1975) may be considered pertinent prior art; this discloses a system for modifying records of drawings of facial features by computer processing, and the display of a representation formed from the modified records and its adjustment to match a photograph.
However the records of the drawings in this system are digitised records of the individual lines in line drawings of facial features, each represented and recorded as a chain of elementary vectors. Thus it is only applicable to line drawings consisting of black lines on a contrasting background, which limits the effectiveness and realism of the derived displays. Moreover, this disclosure states "in spite of its many similarities to the police artist, the system is not intended for this or any other commercial application. The police artist sketches a face from a witness' memory, while in our system the user has the photograph of the face in front of him. Our work is more properly viewed as an experiment in artificial intelligence; the current inordinate cost of the system would rule out its use for criminological applications." The stated aim of the system described is "to augment the artistic ability of a user by enabling him to create . . . . . images which he could not by himself hope to accomplish." It is described as "a system with which a nonartist can create any male Caucasian facial image from a photograph in front of him."

It is an object of the present invention to provide apparatus for displaying stored images with provision for variation of the appearance of the image by an operator.
The Comptroller considers that the invention described in the specification cannot be performed without substantial risk of infringement of Patent No. 1 546 072. The Applicants have made no investigation to see whether there are, or are not, reasonable grounds for contesting the validity of the cited patent.
According to the present invention there is provided variable image display apparatus including a computer having an information store containing a plurality of sets of digital records derived by a scan of elementary areas of each of a plurality of photographs of an object, face or pattern and/or parts thereof, means for selecting records from the store, means for controllably adjusting or modifying the selected records and combining them together by digital calculations in the computer, and means for applying the combined records to a television set or other visual display device to thereby display a pictorial representation of the object, face or pattern formed from the combination of the selected records of different photographs.
The apparatus may include means for adjusting the portions of the displayed image derived from each set of records. The means for adjusting or modifying the selected records may include means for individually adjusting the tone, contrast, size, shape or the relative position of the image features to be shown in the displayed representation, and may also include means for adjusting the contrast, tone or shape of the displayed image as a whole. The adjustments of shape may provide facilities for causing distortions of individual features or the whole object, face or pattern to be seen in the displayed image.
Where a face is to be represented it may be considered as a composite of various facial features or areas which have to be matched by a series of selections and modifications of the stored records. For instance the face may be considered as a combination of hair, eyes, nose, mouth, chin, ears, and various areas where lines or wrinkles may appear. Part of a stored record will be selected for each of these areas.
The apparatus will be coupled to a television set or other visual display device to present a display of a controllable combination of the selected records, and controls adjusted to modify the records selected for each facial feature or area in turn until the image displayed becomes a good likeness of the unknown person concerned as far as can be judged from the recollections of one or more witnesses or by comparison with any available photographs.
Controls may be provided for adjusting the boundaries of facial areas derived from different sets of records, and in the neighbourhood of such boundaries the display may be derived from a mixture of or a combination of or an interpolation between details from different sets of records. Each set of records is stored as an array of digital data values, and the calculations used simply have to modify selected values from the stored arrays by controllable functions of position co-ordinates and then form a new array of data values by combining the modified values; further modification by a controllable function of position co-ordinates may be applied to the combined values of the new array to form a final array of values specifying the image to be displayed. The controls can be used to adjust parameters of the controllable functions used, so as to improve the likeness of the displayed image.
In order that the invention may be more readily understood, a preferred embodiment thereof and a study relating to the development of the invention will now be described with reference to the accompanying drawings, in which Figure 1 is a schematic illustration of a display apparatus in accordance with the invention.
Figure 2 is a representation of modes which can be set by the computer.
Figure 3 indicates information flow speeds in the apparatus of Figure 1.
Figures 4, 5 and 6 are diagrams showing the effect of truncating.
Figures 7a, 7b and 7c are diagrams showing the use of a point sampling grid.
Figure 8 is a diagram relating to bit sliced overlay methods.
Figure 9 is a diagram relating to a blending method.
Figures 10a and 10b are diagrams relating to filtered images.
Figure 11 is a diagram illustrating a filter weighting function.
Figures 12a and 12b relate to the application of a filter.
Figure 13 illustrates a particular image distortion.
Figure 14 is a series of diagrams which characterise image distortions.
Figure 15 is a diagram which shows the effect of image vertical shear.
Figure 16 is a diagram which illustrates distortions requiring a buffer for the whole of the distorted area.
Figure 17 is a series of three diagrams which show image displacement for a horizontal magnification within a rectangular boundary.
Figure 18 is a series of four diagrams showing the effect of various image distortions.
Figure 19 is a series of three line diagrams relating to interline interpolation.
Figure 20 is a diagram relating to the formation of intensity maps.
Figure 21 is a series of six diagrams relating to intensity maps.
Figure 22 is a diagram of a mouth and its control points.
Figure 23 illustrates networks which define a language for a high level command.
Figures 24 and 25 are diagrams which relate to the use of tablet software.
Digital representations for use in a variable image display apparatus are prepared by data capture which is a process whereby features in photographs or line drawings are converted in a digitiser into digital representations. The digitiser includes a Joyce Loebl Scandig 3 scanner which scans a photograph or drawing as an array of 512 x 512 elements and for each element an intensity in the range 0-255 is recorded on magnetic tape.
A variable image display apparatus developed for the purpose of displaying records of human faces will now be described by way of example only.
Referring to Figure 1, a computer 40, which is a PRIME 300 central processor with 64K words of 16 bit memory, drives a graphics display terminal which is designed to operate as a peripheral through a 16 bit parallel interface and is relatively independent of the computer.
The terminal comprises seven modules, viz: a controller 20 for decoding and implementing computer instructions and feeding back display status information to the computer; a frame buffer 21, which comprises a 32K, 48 bit byte addressable random access MOS store containing picture data, but since it is accessible by the computer 40, via the controller 20, it can be used as a powerful computing adjunct, particularly for computer generated images (any part of a currently stored picture can be read back into the computer 40, modified and rewritten into the frame buffer); a display processor 22 which accesses data held in the frame buffer 21 to produce a compatible 625 line CTV monitor video signal; a data highway 23 which is an addressable highway linking the frame buffer 21, the controller 20 and the display processor 22 and is capable of interconnecting up to eight addressable devices; a high quality colour TV monitor (not shown) for direct viewing in colour or grey level; a colour cine recording equipment 43, consisting of a flat screen high precision monitor, colour filter wheel mechanism and cine camera mounted on a light-proofed optical bench, all of which, except for film changing, are controlled by computer program, thus rendering the production of film sequences an automatic programmable function; and a mono TV monitor 42 which is slaved to the recording equipment 43.
Referring to Figures 2 and 3 the terminal can operate in four modes namely, a 3, 6, 12 and 24-bit mode each of which can be set by the computer 40. In operation, each mode displays a fully interlaced 512 line picture and only the points per raster line change with the mode of operation.
In the 3-bit mode, each raster line consists of 1024 picture elements (approximately 0.5 million elements per frame), each having a colour resolution of 3 bits. In this mode therefore eight colours are possible coupled with high geometric resolution.
The 6-bit mode is intended for the generation of shaded pictures in colour. Each raster line consists of 512 elements, each having a colour resolution of 6 bits, i.e. 64 levels of colour.
The 12-bit mode offers a greater colour resolution of 12 bits per element, but with a reduced geometric resolution of 256 elements per raster line.
The 24-bit mode gives the greatest colour resolution of 24 bits per element, with a correspondingly reduced geometric resolution of 128 elements per raster line.
A colour function table look-up store (TLS) resides in the display processor 22 and is used in the 3 and 6 bit modes. Each element bit pattern (3 or 6 bit) is used to address the TLS, from which corresponding bit patterns are read to generate a video signal. In the 3 bit mode only eight locations can be addressed, but this increases to 64 in the 6 bit mode. The colour or grey levels possible in the modes are controlled by the TLS contents and, since it is an addressable store, its contents can be changed by the computer 40. The number of colours or grey levels possible in each of the modes can therefore be substantially greater than the eight and sixty-four quoted.
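The role of the TLS can be sketched as follows. This is a minimal model, not the terminal's hardware: the table contents chosen here (eight evenly spaced grey levels) are an illustrative assumption, since the real contents are whatever the computer loads into the store.

```python
# Sketch of the colour table look-up store (TLS) described above: each
# 3-bit element value indexes a table whose contents (set by the host
# computer) determine the grey level or colour actually displayed.

def make_tls_3bit(entries):
    """Build an 8-entry look-up table for the 3-bit mode."""
    assert len(entries) == 8
    return list(entries)

def display_line(elements, tls):
    """Map a raster line of 3-bit element values through the TLS."""
    return [tls[e & 0b111] for e in elements]

# Eight grey levels, evenly spaced over 0-255 (an illustrative choice).
grey_tls = make_tls_3bit([round(i * 255 / 7) for i in range(8)])
line = [0, 7, 3, 3, 1]
print(display_line(line, grey_tls))  # [0, 255, 109, 109, 36]
```

Reloading the table changes every displayed colour at once without touching the frame buffer, which is why the number of achievable levels exceeds the eight or sixty-four addressable locations.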
The kind of models and complex scenes that the terminal is capable of handling, requires a considerable computing load and it is not possible to achieve real-time animation when dealing with complex scenes. In the recording equipment 43 each frame of a film sequence, generated at a rate far below that required for real-time viewing, can be recorded on film to be run subsequently to simulate real-time display. The equipment 43 can be fully computer controlled to display in sequence the colour planes, photographically superimpose these on films through appropriate filters and control the camera and its film and shutter mechanisms.
The following relates to the results of experiments carried out using a variable image display in accordance with the invention.
For the purposes of display, the highest resolution used is 512 x 512 elements with 64 grey levels, which requires 1.5M bits of image storage. Several experiments have been performed to assess the minimum amount of image storage which may be used practically. Theoretical considerations suggest that the information content is roughly equivalent to one-half to one bit per element (i.e. up to 1/4M bits per picture). However to generate a picture from this amount of data would involve considerable processing.
The effect of reducing the number of elements and the number of grey levels per element was considered. If 32 grey levels are used then the eye cannot quite distinguish between adjacent intensities; however, with only 16 levels the steps are clearly visible and a contoured effect can be seen. Spatial resolutions of 512 x 512 and 256 x 256 are almost indistinguishable, but at 128 x 128 each element can be seen quite clearly. A picture of 256 x 256 with 32 grey levels requires 5/16M bits of storage.
Referring to Figures 4 and 5, if instead of simply truncating the intensity of each element to one of N levels, adjacent elements are also considered, then the contour visible at 16 levels can be broken up and significant improvements made. Each element was taken in turn and the nearest of the N levels below the level of the element was taken and the element was displayed at this intensity. This is called 'truncation', and generates a small error which is visible as contouring. If instead of discarding the error, the error is added to adjacent elements before they are processed then the average error over any area tends to zero and the contents are broken up.
As can be seen from Figures 4, 5 and 6 adding the error to adjacent elements in this way, termed 'rounding', breaks up the contours generated by truncation without increasing the error at any individual element, and gives close approximation to mean intensities.
The higher the resolution at which rounding is used, the more effective it is. For images of 256 x 256 elements with 16 grey levels 1/4M bits of storage are required, and with 8 levels 3/16M bits are required.
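The truncation-and-error-carry scheme above can be sketched in one dimension. The patent does not fix the error-distribution pattern, so propagating the whole residual to the next element on the raster line is the simplest assumption:

```python
# A minimal one-dimensional sketch of the 'rounding' scheme described
# above: each element is truncated down to the nearest of N levels and
# the resulting error is carried forward to the next element, so the
# average error over a run tends to zero and contours are broken up.

def truncate(value, step):
    """Nearest level at or below value, for levels spaced 'step' apart."""
    return (value // step) * step

def round_line(line, levels, maximum=255):
    step = maximum // (levels - 1)   # spacing between the N levels
    out, error = [], 0
    for v in line:
        q = truncate(min(max(v + error, 0), maximum), step)
        error = (v + error) - q      # carry the residual to the next element
        out.append(q)
    return out

line = [100] * 6                     # a flat mid-grey run
print(round_line(line, 16))          # mixes adjacent levels so the mean stays near 100
```

With 16 levels a flat intensity of 100 comes out as a mixture of the two nearest levels (85 and 102), whose average approximates the original value, rather than as a single visibly wrong level.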
Before the application of rounding adjacent elements are likely to have the same value.
Therefore if instead of storing every element, just the first element of such a run is stored together with the number of elements in the run, then considerable data compression can be achieved. Unfortunately the use of rounding reduces the correlation between adjacent elements, thus the data compression achieved is lower.
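The run-length compression described above can be sketched as a pair of short routines; the list-of-pairs representation is an illustrative choice:

```python
# A minimal run-length encoding of a raster line, as described above:
# each run of equal element values is stored as a (value, count) pair,
# so long runs of identical elements compress well.

def rle_encode(line):
    runs = []
    for v in line:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1        # extend the current run
        else:
            runs.append([v, 1])     # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]

line = [7, 7, 7, 7, 3, 3, 9]
runs = rle_encode(line)
print(runs)                          # [(7, 4), (3, 2), (9, 1)]
assert rle_decode(runs) == line
```

After rounding, adjacent elements alternate between levels more often, so runs are shorter and the compression ratio falls, as the text notes.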
From an experimental study of data capture it was concluded that: i With no enhancement, 32 grey levels are needed; ii With enhancement, 16 or fewer levels are adequate; iii A resolution of 256 x 256 elements with 16 grey levels is adequate; iv A whole face can be stored in from 80K to 256K bits.
A visual representation of a face may be constructed by combining and blending a set of features each of which may have been distorted in a variety of ways. Two general methods of combination may be used: a. Overlay Method.
b. Rubber Mask Method.
These methods operate on a library of features generated from scanned photographs and produce similar looking half-tone digital video pictures. The differences lie in the way the features are manipulated.
The following factors were considered in relation to the above methods, 1 Ease of use and control; 2 Resulting picture quality; 3 Processing and hardware requirements; 4 Range of distortions which may be used; 5 Quality and nature of blending at feature boundaries.
There are at least three ways in which an overlay system may be implemented, viz: i As a hardware multiple layer store.
It is possible to build a display system which contains several layers of memory such that each layer may contain part of a picture and when the layers are read out in parallel only the uppermost part of the picture is displayed.
This method requires a very large store.
ii As a simple overlay.
Each layer is simply written to the store, the lowest one first and so on. The upper layers will thus overwrite the lower ones and a composite will be built up in stages. If a lower layer is modified, all the layers above it temporarily disappear and must be redrawn in order, and if an upper layer changes to expose parts of a lower layer suitable corrections must be made to redraw the lower layers. This problem can be overcome, however, with a small enhancement to the store. If a resolution of n x 256 or less is required then there is enough store to hold two images. Thus while one image is being displayed another image can be constructed.
The new image is displayed when it is ready.
It is then possible to switch between these two images instantaneously. When one is no longer required its space may be used to construct the next image. This enhanced form of the Simple Overlay System will be called the Double Buffered Simple Overlay method.
iii As a computed overlay.
The relevant features can be held in main memory and the overlay masking performed by program, a complete composite being written to the store. This is attractive and fairly easy to implement but slower than the other methods.
iv As a bit-sliced overlay.
The intensity map may be used in rather an elaborate way to simulate the multiple layer hardware of (i) above. Six bits are used to represent each element and are sliced up into sub-elements of less than six bits, each sub-element corresponding to a layer. Each element is sliced to give, for example, three layers as shown in Figure 8, where the layers shown are all at 3 levels of intensity. The top layer includes such items as glasses and hats, the middle layer hair, lips and eyes, and the bottom layer includes skin.
A suitable intensity map is given below:

Elements with value:
0        background
1        skin level 1
2        skin level 2
3        skin level 3
4-7      1st level for hair etc
8-11     2nd level for hair etc
12-15    3rd level for hair etc
16-31    1st level for glasses etc
32-47    2nd level for glasses etc
48-63    3rd level for glasses etc

In building up an overlay picture, initially all elements are set to zero, which is background. The actual order of writing the features to the store is not significant. The features may be written from back to front. First, a skin layer is written. This layer uses the bottom two bits of each element and leaves all other bits unaltered. Thus when writing an element with skin level 3 the bottom two bits are set to 1's. Then to write the hair layer the middle two bits are used, the others remaining unaltered.
Thus to write an element with hair level 3 the middle two bits are set to 1's. Finally to write the hat layer the top two bits are used. To write an element with hat intensity 3 the top two bits are set to 1's.
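The bit-sliced write and the read-out priority can be sketched as follows; the layer names and helper functions are illustrative, but the bit layout matches the three 2-bit fields described above:

```python
# Sketch of the bit-sliced overlay described above: each 6-bit element
# holds three 2-bit layers (skin in the bottom bits, hair in the middle
# bits, glasses/hats in the top bits). Writing a layer alters only its
# own two bits, so each layer can be changed independently.

SKIN, HAIR, HAT = 0, 2, 4            # bit position of each layer's field

def write_layer(element, layer, level):
    """Set a layer's 2-bit level (0-3) without touching the other bits."""
    assert 0 <= level <= 3
    return (element & ~(0b11 << layer)) | (level << layer)

def visible_layer(element):
    """On read-out the uppermost non-zero layer wins."""
    for layer in (HAT, HAIR, SKIN):
        if (element >> layer) & 0b11:
            return layer
    return None                       # value 0: background

e = 0
e = write_layer(e, SKIN, 3)           # skin level 3 -> bottom bits 11
e = write_layer(e, HAIR, 2)           # hair level 2 -> middle bits 10
print(e, visible_layer(e) == HAIR)    # 11 True
```

The intensity map then assigns a display intensity to each of the 64 possible element values so that, for example, any value from 16 to 31 displays as the first glasses level regardless of the skin and hair bits beneath it.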
The simple overlay method is straightforward to use. The picture quality resulting from the simple overlay should be excellent. However, unless the store is enhanced to allow the double buffered form to be used, parts of the picture disappear while the over-writing process is taking place. This prevents the use of smooth or continuous distortions to the features. As regards distortion, the overlay methods allow completely free transformations of the most general kind to be used. As regards blending, the features are overlapped at the boundary and elements are selected from the overlapping images in a suitably random manner. This method is illustrated in Figure 9.
The computed overlay method is very similar to the simple overlay method and differs only in the following respects: i Resulting picture quality. As the overlay is computed before the features are written to the display there is no problem with the disappearance of parts of the picture while distortions are applied.
ii Processing and hardware requirements. This method imposes a large processing load and will consequently run substantially slower.
The bit sliced overlay method has somewhat different characteristics, viz: a. Resulting picture quality. The major problem with this method is the reduced number of grey levels available for modelling each feature layer. Typically two or three levels will be available for each layer.
This is not such a severe constraint as it at first appears, because the intensities spanned by individual features rarely include the whole range of intensities. Thus three levels available for the skin probably correspond to 12 to 16 levels available for the whole face.
b. Processing and hardware requirements. With this method the processing per element is slightly more complex than with the other overlay methods. However, only elements affected by distortion need to be processed.
This saving gives a significant gain in performance over the above methods.
In the rubber mask method, an initial face is constructed out of the selected set of features and then pulled around as if it were a rubber mask. An initial face may be constructed by an overlay method. The face is then distorted by applying a range of local distortions to various parts of the picture. This method should be highly responsive without requiring excessive processing power, and allows a wide range of distortions.
The following methods may be used for blending features: i Weighted Averages. Mathematically the nearest approximation to a smooth blend would be achieved by some form of averaging between the intensities of each feature. This has two disadvantages: firstly it tends to blur texture, and secondly it requires considerable processing at the boundary of each feature.
ii Randomized Overlap. This method is illustrated in Figure 9. An overlapping region is defined on each side of the boundary and within that region elements are selected either from the left-hand or right-hand feature. The selection is such that the density of elements from each feature varies gradually from 100% to 0% as the boundary is traversed. The processing requirements for this method are very small and can mostly be performed before the feature is used interactively.
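The randomized overlap can be sketched for a single raster line; the linear fall-off of selection probability across the overlap region is an illustrative assumption consistent with the 100% to 0% density gradient described:

```python
# Sketch of the randomized-overlap blend described above: inside an
# overlap region each element is taken at random from the left or right
# feature, with the probability of choosing the left feature falling
# from 1 to 0 as the overlap region is traversed.

import random

def blend_row(left_row, right_row, overlap_start, overlap_width, rng):
    out = []
    for x in range(len(left_row)):
        t = (x - overlap_start) / overlap_width   # 0 at left edge, 1 at right
        if t <= 0:
            out.append(left_row[x])               # pure left feature
        elif t >= 1:
            out.append(right_row[x])              # pure right feature
        else:
            out.append(left_row[x] if rng.random() > t else right_row[x])
    return out

rng = random.Random(1)
left = [10] * 12
right = [200] * 12
row = blend_row(left, right, overlap_start=4, overlap_width=4, rng=rng)
print(row[:4], row[8:])              # pure left, then pure right
```

Because the random choices depend only on position, they can be precomputed for each feature boundary before interactive use, which is why the run-time cost is so small.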
The digital images are stored as a set of discrete points. Thus although they are displayed as two-dimensional images they are stored and manipulated as a grid of discrete, discontinuous point samples. If special allowances are not made for this, various undesirable effects occur.
When digitizing data, an image is sampled at a discrete set of points and the intensity of each point is then assigned to the whole of the corresponding element. Thus if the original image of Figure 7a is scanned with the sampling grid shown in Figure 7b then the image shown in Figure 7c is produced. The staircase effect in Figure 7c is due to the errors introduced by displaying the whole element at the same intensity as the original picture had at the mid-point of that element.
Filtering is used to eliminate clear boundaries of light and dark areas so that a better quality image will result when the filtered original is sampled.
Figures 10a and 10b illustrate the blurring effect of filtering. The resultant image no longer displays a staircase effect but now has a slightly blurred boundary. When photographs are digitized, careful design of the scanning mechanism (e.g. spot-size and profile) effectively implements a suitable filtering function. However, when one sampled image is mapped on to another (e.g. when distortion is applied) consideration has to be given to performing this by program. Mathematically, filtering corresponds to taking a weighted average over an area of the picture instead of a sample at a point. A filter is characterized by the form of its weighting function. A simple and effective weighting function is a pyramid 2 elements by 2 elements, as illustrated in Figure 11. Consider the distortion d illustrated in Figure 13. If the intensity at P is used when displaying the element at P' then various effects similar to the staircase effect can occur.
The application of a filter is illustrated in Figures 12a and 12b. A pyramid of base 2 elements by 2 elements is constructed around point P' and mapped back on to the original image using the inverse distortion d⁻¹. The elements covered by this pyramid are then averaged with the weighting function. The resultant intensity is then used for the element at P'. This integration may be described in terms of a weighting function W and intensity I on the original image.
Thus, I'(P') = ∫ W(P', Q') I(d⁻¹(Q')) dA, Q' ∈ pyramid region, where the integration is over the area of the pyramid and Q' is the free variable.
The above equation requires substantial computation and for practical use may be simplified by considering the original image as a set of discrete point samples. The intensity I is then non-zero only at a finite number of points and the integration may be replaced by a summation. Furthermore, if the pyramid region is mapped back on to the original image it becomes straightforward to perform the summation in the original image space rather than the distorted image space.
Thus, I'(P') = Σ I(Q) W*(P', d(Q)), d(Q) ∈ pyramid region. It should be noted that the weighting function W* is derived from W by considering the intersection of the pyramid region with each element as well as the weighting to be applied to that element. However, for simplicity W has been used rather than W*.
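The summation form can be sketched as follows. This is an illustrative model, not the patent's implementation: the simple weight W is used directly, as the text permits, and normalising by the total weight here stands in for the area-corrected W*:

```python
# A small sketch of the summation above: the intensity at a
# distorted-image point P' is a pyramid-weighted sum of the original
# samples Q whose images d(Q) fall within a 2x2-element pyramid
# centred on P'.

def pyramid_weight(px, py, qx, qy):
    """2x2 pyramid: weight falls linearly to zero one element away."""
    return max(0.0, 1 - abs(px - qx)) * max(0.0, 1 - abs(py - qy))

def filtered_intensity(p, image, distortion):
    """image: dict {(x, y): intensity}; distortion maps (x, y) -> (x', y')."""
    px, py = p
    total = weight_sum = 0.0
    for q, intensity in image.items():
        dx, dy = distortion(*q)
        w = pyramid_weight(px, py, dx, dy)
        total += w * intensity
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

# Identity distortion: at a sample point the filtered value is the sample,
# and midway between four samples it is their average.
image = {(0, 0): 10.0, (1, 0): 20.0, (0, 1): 30.0, (1, 1): 40.0}
ident = lambda x, y: (x, y)
print(filtered_intensity((0.5, 0.5), image, ident))  # 25.0 (equal weights)
```

Even this small sketch visits every sample for every output element, which illustrates why the text turns next to cheaper alternatives.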
The above method involves considerable computation. Alternative methods are now described.
The critical problem with filtering is that the contents of each element are computed by averaging several other elements. In the stretch and fill method no computation is performed on the contents of elements. The contents of each element in the distorted image are simply copied from some element in the undistorted image. The way that the copying is performed defines the distortion.
Thus, for example, if each element is copied from its left-hand neighbour the image will move one element to the right, which is a translational distortion.
Consider a distortion d as illustrated in Figure 13. The simplest rule for the copying is, for an element at P', to copy the contents of the element whose centre is most nearly distorted on to P' by d. This is shown in: I'(P') = I(nearest element to d⁻¹(P')). This is the form of distortion used in examples described below. In certain circumstances this method leads to unpleasant patterning. This can be reduced by adding a little random noise to the distortion to remove the pattern: I'(P') = I(nearest element to r + d⁻¹(P')), where r is a small random vector whose components lie in the range of plus or minus half an element width.
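The stretch-and-fill rule can be sketched for one raster line; the half-magnification example and the rounding helper are illustrative choices:

```python
# Sketch of the stretch-and-fill rule above: each distorted element at
# P' is copied from the undistorted element nearest to d^-1(P'), with a
# small random jitter r optionally added to break up patterning.

import random

def stretch_and_fill(src, width, d_inverse, rng=None):
    """src: list of intensities for one raster line, indexed 0..width-1."""
    out = []
    for x_prime in range(width):
        x = d_inverse(x_prime)
        if rng is not None:
            x += rng.uniform(-0.5, 0.5)        # the random vector r
        x = min(max(int(x + 0.5), 0), width - 1)  # nearest element, clamped
        out.append(src[x])
    return out

# A 2x horizontal magnification about the left edge: d(x) = 2x, so
# d^-1(x') = x'/2. Each source element is simply duplicated.
line = [0, 50, 100, 150, 200, 250, 255, 255]
print(stretch_and_fill(line, 8, lambda xp: xp / 2))
# [0, 50, 50, 100, 100, 150, 150, 200]
```

Because no arithmetic is performed on intensities themselves, the cost per element is a single indexed copy, which is what makes the method attractive compared with filtering.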
Clearly the computational complexity of performing these distortions is dependent on the nature of the function d. The rubber mask methods require rath
Video systems require data in a linear stream, scanning a raster line at a time. If adjacent lines are required to perform a distortion then one or two lines of buffer storage are required. However, if serial processing is orthogonal to the scan or more complex 2-D processing is required then the requirement for buffer storage is increased very substantially.
In practice a face will occupy about half of a screen. The largest feature, the hair, occupies about a third. A full face change involves writing about 1/8M elements, and a hair change about 1/24M elements, at the highest resolution of 512 x 512 elements. It is therefore essential to minimise the processing per element. The number of elements to be processed probably precludes the use of filtering and imposes very tight constraints on all other methods.
It is not usually desirable to use the actual frame buffer as working store for computing the distortions, as this causes degradation of the picture while the distortion is being computed. An interlaced scan causes problems with the serial nature of the data. A TV picture is scanned in two passes: the first scan displays the even lines and the second scan displays the odd lines. However, when processing a distortion it is usually desirable to work sequentially through all the lines in one pass. This usually dictates the use of a full frame buffer or the ability to access data via two parallel ports. If a low resolution picture, i.e. 256 lines or less, can be used then several significant savings can be made. The two scans become identical and each line is merely displayed twice, once at each scan, and the store is capable of holding 2 whole pictures. Thus half the store may be used as working store whilst computing a distortion; once the distortion has been computed in that half, the display can be switched to show the new image.
Figures 14 to 16 classify distortions according to their buffer storage requirements, and assume horizontal scan lines. Vertical scan lines would give a different classification, which would in general correspond to swapping pairs of transformations in the classification.
Figure 15 illustrates a vertical shear. Rotations can be computed in a single step but this involves substantial processing.
Figure 16 illustrates distortions requiring buffer for the whole of the distorted area.
Rubber mask distortions differ from overlay distortions in that the distortion is applied to an area of the composite rather than to a feature. Therefore the distortion has a boundary and must be faded out smoothly towards the boundary such that elements at the boundary are not altered. These distortions may be visualised in terms of the displacement which is applied to each element when performing the distortion. Thus if a distortion is defined by a function D over the area to be distorted, such that D(P) is the displacement at P, the intensity at P is given by: I'(P) = I(P − D(P)). Figure 17 shows the displacements for a horizontal magnification within a rectangular boundary. Elements in the central column remain unchanged. To the right of the central column elements are displaced to the right and to the left of it elements are displaced to the left. Around the boundary elements are not displaced. Therefore if one plots D along the central row AA one gets an S shaped curve.
Plotting D down the column BB gives a bell shaped curve. Fairly simple mathematical formulae can be used for the curves. The important properties of the curves are continuity up to the first derivative, their extent, and their maximum amplitude.
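The relation I'(P) = I(P - D(P)) can be sketched directly. The particular S and bell formulae below are merely simple polynomials chosen to have the continuity properties just described; they are illustrative, not the formulae used in the apparatus:

```python
def bell(t):
    # Bell-shaped falloff: 1 at the centre, 0 with zero slope at t = ±1.
    return (1 - t * t) ** 2 if abs(t) < 1 else 0.0

def s_curve(t):
    # S-shaped profile: 0 at the centre and at t = ±1, antisymmetric.
    return t * (1 - t * t) ** 2 if abs(t) < 1 else 0.0

def horizontal_magnify(image, cx, cy, half_w, half_h, amplitude):
    """I'(P) = I(P - D(P)) for a horizontal magnification whose
    displacement fades to zero at a rectangular boundary."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            # D is an S shape across the row, a bell down the column.
            dx = amplitude * s_curve((x - cx) / half_w) * bell((y - cy) / half_h)
            sx = int(round(x - dx))
            if 0 <= sx < w:
                out[y][x] = image[y][sx]
    return out
```

Elements in the central column and on the boundary are left unchanged, exactly as described for Figure 17.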
Translational distortions can be understood by examining the right-hand half of the area distorted in Figure 17. It can be seen that the centre of this area is displaced to the right; thus a magnification is effected by performing two displacements, one to the left and one to the right. To perform a translation effectively, however, the function D should be a bell-shaped curve along both the central column and the central row; the right-hand half of the magnification's S-shaped curve is very similar, but does not have the continuity conditions that one would want.
Distortions such as the horizontal magnification shown in the Figures can be performed using only a single line of buffer. However, a vertical magnification or displacement would require the whole area to be buffered.
The four simple distortions are illustrated in Figure 18. More complex distortions, such as rotations and non-linear transformations, may be defined by suitable functions as well.
The lowest-order polynomials suitable for the displacement functions are quartics for the bell shape and quintics for the S shape. It is not desirable to compute these at each element, so various forms of tabulated function need to be stored. The construction and manipulation of the tables may be carried out prior to the interactive distortion of features.
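Such table construction might be sketched as below, using an illustrative quartic bell and quintic S chosen only to satisfy the stated continuity conditions (the patent does not give explicit polynomials, so these are assumptions):

```python
def make_tables(n, amplitude):
    """Precompute displacement tables so that no polynomial need be
    evaluated per element during interactive distortion.

    bell: quartic, value `amplitude` at the centre, value and first
          derivative both zero at the boundary t = ±1.
    s:    quintic (odd), zero at centre and boundary, zero slope at
          the boundary, scaled so its peak equals `amplitude`.
    """
    bell_tab, s_tab = [], []
    # The peak of t*(1-t^2)^2 occurs at t = 1/sqrt(5).
    t_peak = 5 ** -0.5
    s_peak = t_peak * (1 - t_peak ** 2) ** 2
    for i in range(n + 1):
        t = 2.0 * i / n - 1.0            # t runs from -1 to +1
        bell_tab.append(amplitude * (1 - t * t) ** 2)
        s_tab.append(amplitude * t * (1 - t * t) ** 2 / s_peak)
    return bell_tab, s_tab
```

The tables are built once, before interaction begins; applying a distortion then reduces to table look-ups.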
The distortions described above may all be implemented using three simple procedures: i Displacement - each element is moved by (DX, DY); ii Horizontal stretching/shortening - a segment of a raster line may be expanded or contracted; iii Interline interpolation - new lines are interpolated between, or to replace, several old ones.
Displacement may be implemented with little or no effort. The elements are merely written to a different part of the image.
Stretching or shortening has been mentioned briefly above. If a constant factor is applied along a line very simple methods may be used.
If simple non-randomised stretch and fill is required then an algorithm such as Bresenham's vector generator may be used. This involves only a few additions and tests per element, removing the need for multiplications and divisions when computing the proportions involved in scaling, and so allows fast distortions to be implemented.
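A Bresenham-style stretch can be sketched with an integer error accumulator, as below. This is an illustrative adaptation of the vector-generator idea, not the actual implementation; note that only additions, subtractions and comparisons occur per element:

```python
def stretch_segment(src, m):
    """Expand or contract a raster-line segment to m elements using a
    Bresenham-style error accumulator: a few additions and tests per
    element, and no multiplications or divisions."""
    n = len(src)
    out = []
    acc = 0
    i = 0
    for _ in range(m):
        out.append(src[i])
        acc += n            # accumulate source length against
        while acc >= m:     # destination length, stepping the
            acc -= m        # source index when it overflows,
            i += 1          # like the minor axis of a vector generator
    return out
```

Expansion duplicates elements evenly; contraction skips them evenly, all with integer arithmetic.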
Interline interpolation is the vertical equivalent of stretch and fill. When a vertical magnification is performed, new lines have to be interpolated between the displaced lines. Where horizontal stretching fills from the horizontally adjacent elements, an interpolated line is built from the vertically adjacent elements: it consists of a blend of the two vertically adjacent lines. Conversely, when whole lines are to be dropped, undesirable effects result if they are simply discarded. Therefore the two lines adjacent to a dropped line must be modified, as may be seen from Figure 19.
If line C were simply dropped then unpleasant discontinuities might occur between lines B and D. Therefore B is blended with C and becomes the second line, and C is blended with D and becomes the third line; thus line C has been dispersed into the two adjacent lines.
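The dispersal of a dropped line into its neighbours might be sketched as follows. The equal blend weights are an assumption for illustration; the patent does not specify the weighting:

```python
def blend(line_a, line_b, wa=0.5):
    # Element-wise blend of two vertically adjacent lines.
    return [wa * a + (1 - wa) * b for a, b in zip(line_a, line_b)]

def drop_line(lines, k):
    """Remove line k, dispersing it into its two neighbours so that
    no discontinuity appears between the lines above and below
    (cf. lines B, C and D of Figure 19)."""
    out = []
    for i, line in enumerate(lines):
        if i == k:
            continue                          # line C itself is dropped
        if i in (k - 1, k + 1):
            out.append(blend(line, lines[k])) # B and D each absorb half of C
        else:
            out.append(line)
    return out
```

The same `blend` routine serves for interline interpolation during vertical magnification: each new line is a blend of the two lines it falls between.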
As has been mentioned earlier the data capture introduces various forms of distortion into the intensity range of the pictures. To convert a value digitized for an element into an intensity on the screen two mappings are performed.
Firstly intensity values must be assigned to digitized values, and secondly intensities must be assigned to the elements of the intensity map. The two mappings can be varied independently, as indicated in Figure 20.
It is necessary to write software to control both mappings independently to produce images of the highest quality.
There are two alternative strategies for generating the mappings: they may be defined interactively, or various heuristic functions may be used to define them. The interactive control of such methods is described below. A simple heuristic for the first mapping might be to assign the intensity-map numbers to digitized numbers in such a way that the same number of image elements uses each element of the intensity map. The second mapping presents slightly different problems, since it has a more direct effect on the visual appearance of the image than the first mapping.
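The heuristic just described, giving each element of the intensity map roughly the same number of image elements, is essentially histogram equalisation. A sketch follows; the function name and interface are invented for illustration:

```python
def equalising_map(pixels, levels=64):
    """Assign each digitized value an intensity-map number such that
    each map entry covers about the same number of image elements
    (a histogram-equalisation sketch)."""
    ordered = sorted(set(pixels))
    counts = {v: 0 for v in ordered}
    for p in pixels:
        counts[p] += 1
    mapping, cum, total = {}, 0, len(pixels)
    for v in ordered:
        # Level is determined by the share of elements below v.
        mapping[v] = cum * levels // total
        cum += counts[v]
    return mapping
```

Values that occur often are spread over more map entries, flattening the distribution of map usage.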
It is important, for example, that the displayed intensity range closely corresponds to the intensity range of the original. As the first mapping described above will not fully have corrected for the distortions introduced by the data capture procedure, and may in fact have introduced further distortions, the final correction is applied by the second mapping. Various simple forms of mapping are worth considering: linear ramps, inverted S-shaped curves, concave and convex curves etc. can all be handled very simply and provide a useful range of effects.
Changes in intensity and tone may be performed either by changing the intensity map or by modifying the data in the elements themselves. Changing the intensity map requires more computation than data modification but allows any part of the picture to be modified independently of the rest. If fewer than 64 levels are required then the face can be partitioned so that, for example, the hair uses numbers 0 to 15, the skin 16 to 31, the eyes 32 to 47 and the lips 48 to 63. Each range can be modified independently, and the elements themselves only need to be changed to modify smaller parts of the picture.
256 intensities of digitized data were mapped onto 64 intensities by truncating down to the nearest level below the digitized level, which corresponds to a linear ramp. To set up intensity maps a simple colour-mixing computer program was used; this program was not designed for, and is not really suitable for, modifying the shape of these intensity functions. When the tablet has been commissioned it is intended that it should be used to sketch in functions directly. A menu box would be available on the tablet in which the user could sketch either the whole curve, parts of the curve, or individual points. This allows great flexibility.
Figure 21 illustrates the effect of various shapes of intensity maps, by i lightening the dark areas and wiping out the highlights; ii blacking out the dark areas and compressing the highlights; iii darkening the picture without completely losing detail at each end; iv lightening the picture without completely losing detail at either end; v expanding the middle ranges and compressing the extremes; vi expanding the extremes and compressing the middle ranges.
It is easy to blend smoothly between any pair selected from (i) to (vi). A small range of predefined tonal functions, plus the ability to move between them, provides a powerful tonal control.
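Moving between two predefined tonal functions can be as simple as a weighted blend of the corresponding intensity maps. An illustrative sketch, with invented names:

```python
def blend_maps(map_a, map_b, t):
    """Move smoothly between two predefined tonal functions:
    t = 0 gives map_a, t = 1 gives map_b, intermediate values of t
    give intermediate tonal curves."""
    return [round((1 - t) * a + t * b) for a, b in zip(map_a, map_b)]
```

Sweeping t from 0 to 1 under operator control would realise the "ability to move between" tonal functions described above.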
In an experiment the layout of a screen was manually variable such that it could be divided up into one or more regions, where a region is a rectangular area with edges parallel to the screen edges. It was possible to break up the screen into several small regions for scanning through a library of faces or features.
The screen could also be divided into two regions when trying to fit a new feature, one region containing the face before and the other the face after the new feature had been added. The region boundaries are defined interactively using the tablet, each region having a corresponding work area in core which models the current state of the face in that region. A current region could be selected on command so that its contents could be modified, added to or stored.
Facial features may be selected in a variety of ways, for example: i by alphanumeric name - that is, the name assigned to the feature when it was inserted in the library; ii by position in the library - the library may be scanned sequentially so that features can be examined in turn; iii as part of a face - whole face composites may be retrieved, allowing groupings of features to be examined together; iv by descriptor - if a descriptor data base is implemented then features may be selected by specifying absolute descriptions or differences between features displayed and features desired.
Overlay distortions are controlled by a set of control points which relate to particular features. The control points may be moved in a variety of ways to apply the distortion. Although it is possible to use a keyboard to control these movements, it is more convenient to use a tablet to move the control points. The control points themselves may be moved, or pseudo-control points may be defined and manipulated. The pseudo-control points may be linked in a variety of ways to the actual control points to provide higher-level and more complex controls. Thus for example a pseudo-control point may be linked to four control points of, say, a mouth, in such a way that moving the pseudo-control point to the right widens the mouth, to the left narrows it, upwards makes it longer and downwards makes it shorter.
Rubber mask distortions are rather different in nature from overlay distortions. To define a rubber mask distortion several parameters are needed, viz: the coordinates of the centre of the distortion; the horizontal and vertical extents of the distortion; the degree of the distortion; and the type of distortion. The parameters may be defined using a keyboard or, more conveniently, by using a tablet. Where a tablet is used, a cursor is moved around on the screen such that the centre of the cursor indicates the centre of the distortion and the height and width of the cursor indicate its extent. The degree of a distortion is best defined as the maximum number of elements by which any individual element is displaced when the distortion is applied, and is rather harder to illustrate on the screen.
Figure 22 illustrates possible control points for a mouth feature. The shape, size, and position of the mouth feature are all defined by the positions of four control points. The control points may be manipulated in a variety of ways.
Figure 23 illustrates networks defining a language for high-level commands. The commands are divided into two parts: the first network defines the language used to set up a group of points to be moved, and the second network defines the language by which they are moved. The following is an example of how this can be achieved:

MOVE LINK (0 1) POINTS 1 2
BY (20 30) (25 45) (22 41)

effect: P1 ← P1 + (0,15), P2 ← P2 + (0,15)
then: P1 ← P1 + (0,11), P2 ← P2 + (0,11) (net, relative to the original positions)

The first line states that points 1 and 2 are to be moved, and that each incremental movement defined in the following line is to be multiplied by a linkage of (0,1); that is, the X coordinate is multiplied by 0 and the Y coordinate by 1. Thus points 1 and 2 will move vertically by the same distance as the increment provided, but will not move horizontally. The effect of providing a series of increments in the second line is to move the top two control points of the mouth up or down together. The second line provides three pairs of coordinates; the differences between them are interpreted as incremental moves of the points. The difference between the second and first coordinates defines an increment of (5,15); when this is multiplied by (0,1) it corresponds to a vertical movement of points 1 and 2 by a distance of 15 units. The third coordinate then requests a further incremental change, this time moving points 1 and 2 back towards their original position: the incremental move between the second and third coordinates is four units vertically downwards, giving an overall vertical movement of 11 units up. Used in this way keyboard language commands are not very convenient, but their power becomes evident when the tablet menu-handling software is used as a front end.
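The linkage arithmetic of the MOVE command can be sketched as follows. The function name and data layout are invented for illustration; only the arithmetic described in the text is implemented:

```python
def move(points, link, indices, probes):
    """Apply a MOVE command: successive probe coordinates give
    increments which, multiplied component-wise by the linkage,
    move each of the listed points (a sketch; the patent gives
    only the language, not an implementation)."""
    for prev, cur in zip(probes, probes[1:]):
        dx = (cur[0] - prev[0]) * link[0]
        dy = (cur[1] - prev[1]) * link[1]
        for i in indices:
            x, y = points[i]
            points[i] = (x + dx, y + dy)
    return points

# MOVE LINK (0 1) POINTS 1 2 BY (20 30) (25 45) (22 41):
pts = {1: (100, 200), 2: (140, 200)}
move(pts, (0, 1), [1, 2], [(20, 30), (25, 45), (22, 41)])
```

With the linkage (0,1), the two increments (5,15) and (-3,-4) become purely vertical moves of +15 and -4, a net movement of 11 units up, as in the worked example above.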
Figures 24 and 25 illustrate the use of the tablet software as a front end to the above language. A simple menu has been defined with three menu boxes relating to mouth control. The first one corresponds to widening and narrowing the mouth; the second to lengthening or shortening the mouth; and the third to displacing the whole mouth. The replacement text of this first box states that points 2 and 3 are to be moved horizontally by the same increment as the probe while points 1 and 4 are to be moved horizontally in the opposite direction.
Thus specifying a positive increment (to the right) widens the mouth, while a negative one (to the left) narrows it. Similarly a second menu box defines linkages which allow the mouth to be lengthened and shortened. The third menu box links both the X and Y coordinates of the increment to the probes, thus allowing all four points to be moved together by the specified increment; this has the effect of moving the whole mouth around. To use this menu facility an operator would probe the menu box for the distortion he wishes and would then probe into a face area on the tablet. The difference between the first two probes in the face area provides the amount of distortion required, that is the degree. The operator can continue probing in the face area to re-apply more or less of the same distortion as many times as he wishes. To apply a different distortion he simply probes the appropriate box and then probes back in the face area again. Thus the user is provided with a simple, ergonomic control facility.
The following calculations relate to the performance of the apparatus, and are based on a distortion applied to a single feature covering the whole screen at a resolution of 256 x 256, implemented with each element duplicated four times to give a resolution of 512 x 512. The calculations can only be used as an approximate guide. The method used has been to analyse the innermost loops of the fastest known algorithms and to sum the instruction times of these loops; it is then necessary to extrapolate from this time to the overall time. The code is believed to be correct in number and type of instructions, in control structure, loop counts etc., but not in addresses, data, operations etc.
When interpreting these figures various allowances should be made. First, as the fastest possible algorithms have been used and as these have not yet been fully implemented it is probably reasonable to assume that only 50% of the peak rate calculated will be achievable.
Thus one should probably allow twice the calculated time for the innermost loops. To this it is then necessary to add the time taken to set up data and variables for the execution of the loops. It is extremely hard to arrive at accurate estimates for this time and as a rough rule of thumb it is suggested that equal times be allowed for set-up and execution of these innermost loops. An overall factor of about 4 must be applied to the calculated times. To verify the accuracy of the peak rate calculations one small part of idealised code has been fully implemented and timed in execution. The timings agree to within 8%, which is clearly less than the error inherent in extrapolating from these peak rates, which are calculated below.
i Table Lookup Horizontal Distort The time taken to perform a simple horizontal distort, such as magnification, using a precomputed table of distortion functions has been evaluated. This method corresponds to the simplest of distortions and is a lower limit for all other distortions. The times computed for this and other distortions are all based on a full-screen distortion, though in practice the distortion would only be applied to a small part of the screen. Set-up time for this distortion consists of a constant overhead plus a time proportional to the horizontal width of the feature to be distorted. The time taken to execute the distortion is proportional to the area to be distorted, and has been measured as 1.66 seconds using a simple overlay method.
ii Bit Slicing Overlay If the Bit Slicing Overlay approach is used there is extra computation to be performed on each element and the peak rate is 2.88 seconds per frame.
iii Rubber Mask Distortion Peak rate is 2.38 seconds per frame.
iv Using a disc-operated system with a virtual memory.
When using the system there are three extra delays: firstly, each user only has access to the processor for part of the time; secondly, the actual processor runs slightly slower; and thirdly, due to the indirect nature of the access to the terminal, extra instructions have to be executed for each element.
Peak rate, assuming the central processor unit (CPU) runs at full speed and the user has access to the CPU full time, equals 4.8 seconds (or 9.6 seconds allowing for half-rate use of machine time). v Time to set up Distort Table The time to set up a simple horizontal distort table is proportional to the length of the table, i.e. the horizontal size of the feature to be distorted. For a full-screen distort table the set-up time equals 0.36 seconds.
vi Run-line Encoding If the features are stored in a feature library in a compressed form then it is worth considering whether the distortion should be applied to the compressed form before expansion and display, or to the expanded form. The run-line encoded form of data compression seems to be the most likely candidate for use, and therefore the time taken to apply a simple horizontal distortion to a feature in run-line encoded form has been evaluated so that it may be compared directly with the time calculated in (i). The time taken to perform a horizontal distort equals 1.15 + 2.5/N seconds, where N is the average run length. Thus if N is greater than about 4 it is just worth distorting in the compressed format. However the gain is very small unless the run length is substantial, and even then it is not very large.
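The break-even run length follows directly from the two figures given (the 1.66-second table-lookup time from (i) and the 1.15 + 2.5/N model above); the model puts it just below N = 5, in line with the text's estimate:

```python
def runline_time(n):
    """Time in seconds to apply a simple horizontal distortion to a
    feature held in run-line encoded form, per the model in the text:
    1.15 + 2.5/N, where N is the average run length."""
    return 1.15 + 2.5 / n

DIRECT_TIME = 1.66  # seconds: the uncompressed table-lookup distort of (i)

# Break-even: 1.15 + 2.5/N = 1.66  →  N = 2.5 / 0.51 ≈ 4.9
break_even = 2.5 / (DIRECT_TIME - 1.15)
```

For run lengths above the break-even value it is (just) faster to distort the compressed form, but the saving grows only slowly as N increases, which matches the text's conclusion that the gain is small.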
From the above analysis the time for performing a full-face distortion using a Bit Slicing overlay would appear to be about 12 seconds, assuming an average depth of 1. The depth is the average number of features which impinge on an element: in the region of the nose bridge the depth might be 3 or 4, while in the middle of the cheeks or in the background areas around the face it will only be 1.
Microcode, that is the micro-instructions from which machine code instructions are built, achieves an improvement on the above of between 2 and 12 times depending on the operation; an average of about 4 is a reasonable working figure. Therefore using microcode a full-face distortion would take about 3 seconds. A single feature such as the nose probably occupies about a third of the face width and about a third of the face height, which corresponds to about a twenty-seventh of the screen area.
Therefore the time taken to perform a single feature distort is likely to be about half a second using assembly code, or about an eighth of a second using microcode.
By storing complete records of photographic images, the present invention can provide realistic images, particularly of faces, reproducing many shades of grey. If colour photographs were used, the hue values being recorded as well as the brightnesses, the present system could be arranged to produce a coloured representation on a colour television set or some other suitable display device. By using digital records of the brightness of every relevant elementary area of each photograph used to derive the records, the present system deals with the data in a form which can be stored in available digital data storage devices and can be controllably manipulated in many ways by available minicomputer or microprocessor devices. Modern technology is steadily reducing the cost of such devices. The final array of digital data can readily be retranslated into analogue-signal form if required, so as to produce the display representation on a television set; this only requires a digital to analogue signal converter and means for adding the necessary conventional line synchronisation and frame synchronisation signals.
The present invention is not limited in scope to the production of facial images but includes the production of images of objects such as, for example, motor car bodies, objets d'art and domestic equipment, and also includes the production of images of patterns such as, for example, wallpaper patterns and dress fabric patterns.

Claims (5)

WHAT I CLAIM IS:
1. Variable image display apparatus including a computer having an information store containing a plurality of sets of digital records derived by a scan of elementary areas of each of a plurality of photographs of an object, face or pattern and/or parts thereof, means for selecting records from the store, means for controllably adjusting or modifying the selected records and combining them together by digital calculations in the computer, and means for applying the combined records to a television set or other visual display device to thereby display a pictorial representation of the object, face or pattern formed from the combination of the selected records of different photographs.
2. Display apparatus as claimed in Claim 1, including means for individually adjusting portions of the displayed image derived from each set of records.
3. Display apparatus as claimed in Claim 1, wherein said means for adjusting or modifying the selected records includes means for adjusting the tone, contrast, size, shape or relative position of the image features to be shown in the displayed representation.
4. Display apparatus as claimed in Claim 1, wherein said means for adjusting or modifying the selected records includes means for adjusting the contrast, tone or shape of the displayed image as a whole.
5. Display apparatus substantially as described herein with reference to Figures 1 to 25 of the accompanying drawings.
GB2326977A 1978-05-31 1978-05-31 Variable image display apparatus Expired GB1605135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2326977A GB1605135A (en) 1978-05-31 1978-05-31 Variable image display apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2326977A GB1605135A (en) 1978-05-31 1978-05-31 Variable image display apparatus

Publications (1)

Publication Number Publication Date
GB1605135A true GB1605135A (en) 1982-02-10

Family

ID=10192946

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2326977A Expired GB1605135A (en) 1978-05-31 1978-05-31 Variable image display apparatus

Country Status (1)

Country Link
GB (1) GB1605135A (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2512234A1 (en) * 1981-08-27 1983-03-04 Honeywell Gmbh VISION SIMULATION METHOD AND DEVICE FOR IMPLEMENTING IT
GB2171875A (en) * 1985-02-28 1986-09-03 Rca Corp Superimposing video images
US4827344A (en) * 1985-02-28 1989-05-02 Intel Corporation Apparatus for inserting part of one video image into another video image
GB2181929A (en) * 1985-10-21 1987-04-29 Sony Corp Generation of video signals representing a movable 3-d object
GB2181929B (en) * 1985-10-21 1989-09-20 Sony Corp Methods of and apparatus for video signal processing
GB2196213A (en) * 1986-09-10 1988-04-20 Stoll & Co H Device for displaying and editing knitting patterns
US5057019A (en) * 1988-12-23 1991-10-15 Sirchie Finger Print Laboratories Computerized facial identification system
GB2243970A (en) * 1990-03-16 1991-11-13 Alfio Leotta Electronic apparatus for the composition of photographic images, useful particularly in the definition of new industrial products
EP0584759A2 (en) * 1992-08-24 1994-03-02 Casio Computer Co., Ltd. Image displaying apparatus
EP0584759B1 (en) * 1992-08-24 2003-07-02 Casio Computer Co., Ltd. Image displaying apparatus
EP0609559A2 (en) * 1992-12-25 1994-08-10 Casio Computer Co., Ltd. Object image display devices
EP0609559A3 (en) * 1992-12-25 1994-11-17 Casio Computer Co Ltd Object image display devices.
US5638502A (en) * 1992-12-25 1997-06-10 Casio Computer Co., Ltd. Device for creating a new object image relating to plural object images
US5966137A (en) * 1992-12-25 1999-10-12 Casio Computer Co., Ltd. Device for creating a new object image relating to plural object images
US6219024B1 (en) 1992-12-25 2001-04-17 Casio Computer Co., Ltd. Object image displaying apparatus
EP1098269A2 (en) * 1992-12-25 2001-05-09 Casio Computer Co., Ltd. Object-image displaying apparatus
EP1098269A3 (en) * 1992-12-25 2001-08-08 Casio Computer Co., Ltd. Object-image displaying apparatus
US6348923B2 (en) 1992-12-25 2002-02-19 Casio Computer Co., Ltd. Object image display devices
US6433783B2 (en) 1992-12-25 2002-08-13 Casio Computer Co., Ltd. Object-image displaying apparatus
EP0603892A1 (en) * 1992-12-25 1994-06-29 Casio Computer Co., Ltd. Object-image display apparatus
EP0669600A2 (en) * 1994-02-25 1995-08-30 Casio Computer Co., Ltd. Devices for creating a target image by combining any part images
EP0669600B1 (en) * 1994-02-25 2003-07-23 Casio Computer Co., Ltd. Devices for creating a target image by combining any part images
US9230353B2 (en) 1998-04-29 2016-01-05 Recognicorp, Llc Method and apparatus for encoding/decoding image data

Similar Documents

Publication Publication Date Title
US5459529A (en) Video processing for composite images
US5175806A (en) Method and apparatus for fast surface detail application to an image
US5182548A (en) Method and apparatus for painting on a computer
US5252953A (en) Computergraphic animation system
US5790708A (en) Procedure for image processing in a computerized system
US4602286A (en) Video processing for composite images
US4888713A (en) Surface detail mapping system
JP3830555B2 (en) Method and apparatus for correcting original video
US5974189A (en) Method and apparatus for modifying electronic image data
US5999194A (en) Texture controlled and color synthesized animation process
US6573889B1 (en) Analytic warping
US5680531A (en) Animation system which employs scattered data interpolation and discontinuities for limiting interpolation ranges
US5103217A (en) Electronic image processing
JPH0636040A (en) Method of forming electric signal of video image frame
WO1992021096A1 (en) Image synthesis and processing
GB1605135A (en) Variable image display apparatus
JPS61221974A (en) Image processing method and apparatus
JPH022594A (en) Two-dimensional color display generator
GB2336054A (en) Deriving matte control signal in a chroma-keying system
GB2157122A (en) Image composition system
KR100411760B1 (en) Apparatus and method for an animation image synthesis
Romney Computer assisted assembly and rendering of solids
WO1998021695A1 (en) Imaging system for simulating hair styles
KR100227246B1 (en) Image conversion apparatus
KR20010084996A (en) Method for generating 3 dimension avatar using one face image and vending machine with the same

Legal Events

Date Code Title Description
PS Patent sealed
PE20 Patent expired after termination of 20 years

Effective date: 19980530