AU2012268887A1 - Saliency prediction method - Google Patents


Info

Publication number
AU2012268887A1
Authority
AU
Australia
Prior art keywords
value
colour
luminance
digital image
regions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2012268887A
Inventor
Clement Fredembach
Jue Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2012268887A priority Critical patent/AU2012268887A1/en
Publication of AU2012268887A1 publication Critical patent/AU2012268887A1/en
Abandoned legal-status Critical Current

Abstract

SALIENCY PREDICTION METHOD

A method (100) of creating a saliency map (170) for a digital image (110) having a plurality of pixels represented by at least one colour value and a luminance value, the method comprising: generating (840, 850) at least one colour contrast value from the colour value for pixels within a plurality of regions (211) of predetermined size in the digital image, said regions being downscaled representations of the digital image, the sizes of the regions being dependent upon a size distribution (710) of salient regions in the digital image; generating (830) at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining (860) a colour-luminance result from a weighted combination of the generated colour contrast value and the generated luminance contrast value; and constructing (160) from the colour-luminance result the saliency map for the digital image.

[Fig. 1: the input image is processed by a CLS feature extraction process (see Fig. 2), a sharpness feature extraction process 130 (see Fig. 9) and a position bias calculation process 140 (see Fig. 10a); their outputs (240, 1320, 1030), together with bias data (see Figs. 12a, 12b), are combined by a combination process 160 (see Figs. 13, 14) to produce the saliency map.]

Description

SALIENCY PREDICTION METHOD

TECHNICAL FIELD

This invention relates to the field of image processing and in particular to processes for facilitating image understanding and manipulation.

BACKGROUND

When viewing an image, a human observer will spend varying amounts of time looking at different parts of the image. The portions of the image that most attract the observer's gaze and attention are deemed to be "salient". These salient portions are of particular importance in image processing and computer vision, as they underpin the quality of many digital imaging and computer vision applications such as scene classification, content-based image retrieval, autofocus, or automatic image cropping. Some existing techniques predict saliency with a very limited number of features, making them perform poorly on average. Other methods follow a machine learning approach, which can sometimes improve results but at the expense of high overhead. Moreover, such approaches are predicated on real-world viewing conditions being similar to experimental ones, an often violated assumption.

SUMMARY

It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.

Disclosed are arrangements, referred to as Salient Area Size Distribution (SASD) arrangements, which seek to address the above problems by utilising the size distribution of salient regions of an image as a basis for downsampling the digital image in question, in order to thereafter determine colour and luminance attributes enabling a saliency map to be generated.
According to a first aspect of the present invention, there is provided a method of creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the method comprising the steps of: generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions of predetermined size, said regions being downscaled representations of the digital image, the size of at least one of the regions being different to the size of another one of the regions, the sizes of the regions being dependent upon a size distribution of saliency clusters in the digital image; generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and constructing from the colour-luminance result the at least one value of the saliency map for the digital image.
According to another aspect of the present invention, there is provided an apparatus for creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the apparatus comprising: a processor; and a non-transitory memory storing a computer executable software program for directing the processor to perform a method for creating at least one value of the saliency map, the method comprising the steps of: generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions of predetermined size, said regions being downscaled representations of the digital image, the size of at least one of the regions being different to the size of another one of the regions, the sizes of the regions being dependent upon a size distribution of saliency clusters in the digital image; generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and constructing from the colour-luminance result the at least one value of the saliency map for the digital image.
According to another aspect of the present invention, there is provided a non-transitory computer readable memory medium storing a computer executable software program for directing a processor to perform a method for creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the program comprising: computer executable code for generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions of predetermined size, said regions being downscaled representations of the digital image, the size of at least one of the regions being different to the size of another one of the regions, the sizes of the regions being dependent upon a size distribution of saliency clusters in the digital image; computer executable code for generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; computer executable code for determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and computer executable code for constructing from the colour-luminance result the at least one value of the saliency map for the digital image.
According to another aspect of the present invention, there is provided a method of creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the method comprising the steps of: generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of downscaled representations of the digital image; generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and constructing from the colour-luminance result the at least one value of the saliency map for the digital image, wherein the distance functions are associated with a colour space (1150) producing non-isotropic isosalient contours for chrominance information.

Other aspects of the invention are also disclosed.
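As a rough illustration of the claimed steps, the colour-luminance combination for one downscaled region could be sketched as below. The distance-from-mean contrast measures and the equal 0.5/0.5 weights are assumptions for illustration only; the specification defines the actual distance functions and weights later.

```python
import numpy as np

def colour_luminance_saliency(colour, luminance, w_colour=0.5, w_lum=0.5):
    """Sketch of the claimed combination for one downscaled region.

    `colour` is an (H, W, 2) array of chrominance channels and
    `luminance` an (H, W) array.  The mean-based contrast measures
    and the equal weights are illustrative assumptions.
    """
    # Colour contrast: distance of each pixel from the region's mean colour.
    colour_contrast = np.linalg.norm(colour - colour.mean(axis=(0, 1)), axis=2)
    # Luminance contrast: absolute distance from the region's mean luminance.
    lum_contrast = np.abs(luminance - luminance.mean())
    # The weighted combination yields the colour-luminance result, i.e. one
    # per-pixel value of the saliency map at this scale.
    return w_colour * colour_contrast + w_lum * lum_contrast
```

On a uniform region with one odd-coloured, brighter pixel, this sketch assigns that pixel the largest combined value, which is the behaviour the aspects above describe.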
BRIEF DESCRIPTION OF THE DRAWINGS

One or more embodiments of the invention will now be described with reference to the following drawings, in which:

Fig. 1 is a schematic block diagram of a saliency prediction method according to a SASD arrangement;

Fig. 2 is a schematic flow diagram illustrating a method of extracting Colour-Luminance-Size (CLS) features as used in the method of Fig. 1;

Fig. 3 is a schematic flow diagram illustrating a method of image scale decomposition (also referred to as downsampling) as used in the method of Fig. 2;

Fig. 4 is a schematic flow diagram illustrating a method of colour measurement as used in the method of Fig. 2;

Fig. 5 is a schematic flow diagram illustrating a method of scale recomposition (also referred to as up-sampling) as used in the method of Fig. 2;

Fig. 6a is a schematic flow diagram illustrating a method of size data acquisition as used in the method of Fig. 3;

Fig. 6b is an example illustrating a method of size data acquisition as used in the method of Fig. 3;

Fig. 7 is a graph illustrating the output of a method of size data acquisition according to the method of Fig. 6a;

Fig. 8 is a schematic flow diagram illustrating a method of colour and luminance distance measurement as used in the method of Fig. 4;

Fig. 9 is a schematic flow diagram illustrating a method of sharpness feature extraction as used in the method of Fig. 1;

Fig. 10a is a schematic flow diagram illustrating a method of position bias calculation as used in the method of Fig. 9;

Fig. 10b is an example illustrating a method of position bias calculation as used in the method of Fig. 9;

Fig. 11a is a schematic flow diagram illustrating a method of colour space transformation as used in the method of Fig. 8;

Fig. 11b is an illustration of a specific method of colour space transformation as used in the method of Fig. 8;

Fig.
11c is an illustration of the non-isotropy of colour perceptual scale resulting from the method of Fig. 11a and Fig. 11b;

Fig. 12a is an illustration of experimentally obtained accurate positional data;

Fig. 12b is an illustration of a model of the positional bias of Fig. 12a;

Fig. 13 is an illustration of a method to combine various feature maps into a saliency map according to the method of Fig. 1;

Fig. 14 is an illustration of an alternate SASD arrangement for combining various feature maps into a saliency map according to the method of Fig. 1; and

Figs. 15A and 15B form a schematic block diagram of a general purpose computer system upon which the SASD arrangements described can be practiced.

DETAILED DESCRIPTION INCLUDING BEST MODE

Context

Where reference is made in any one or more of the accompanying drawings to steps and/or features which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.

It is to be noted that the discussions contained in the "Background" section and the section above relating to prior art arrangements relate to discussions of documents or devices which may form public knowledge through their respective publication and/or use. Such discussions should not be interpreted as a representation by the inventors or the patent applicant that such documents or devices in any way form part of the common general knowledge in the art.

When looking at the world, or a pictorial representation thereof in the form of a source image, not all parts of the scene (or the source image, also referred to merely as the image) attract an observer's attention, nor are all parts looked at for a significant period of time by the observer. Indeed, some of the scene's elements may be totally discounted by the observer.
The parts of the image that are attended to by the observer, also referred to as being fixated upon by the observer, are deemed to be "salient". Knowing what parts of an image are salient facilitates improved imaging and analysis techniques. Because the human visual and cognitive system is poorly understood, any attempt to replicate its behaviour requires modelling hypotheses about what it could be. While some aspects of the human visual system are known, such as its ability to perceive brightness differences, the relative importance of these aspects in the visual system is not quantified accurately.

Overview of the SASD arrangement

A method for measuring the saliency of parts of an image is described. In one SASD arrangement, the visual features of colour, luminance, and size are combined to form a CLS feature, which is then combined with visual features such as sharpness and computational features such as positional bias to produce a measure of saliency in an input image. The SASD arrangement is an enabling technology, based upon features that are independent of each other, and provides a feature combination method that is easy to augment. The following features are used in the SASD arrangements: colour, luminance, sharpness, size, and position.

The inventors have discovered a strong hue dependency of colour saliency: variations along the green-magenta direction are about half as salient as corresponding variations along the cyan-red direction. This hue dependency is not necessarily physiological but could be the result of linguistic colour naming and the naturalness of colours. For instance, vegetation covers a large number of shades of green, from blue-green to yellowish, making differences in green not salient, whereas red tends to be quite chromatic and saturated, making small variations in red areas much more salient.
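This hue asymmetry amounts to a non-isotropic chrominance distance. The sketch below scales one opponent-colour axis before taking the Euclidean norm; treating the first component as the green-magenta axis and using a weight of 0.5 are assumptions for illustration, not the colour space the specification later defines.

```python
import numpy as np

def hue_weighted_distance(c1, c2, gm_weight=0.5):
    """Anisotropic distance between two chrominance pairs.

    The first component is taken (by assumption) to lie along the
    green-magenta axis and is down-weighted, so equal numeric steps
    along green-magenta count for about half of steps along cyan-red,
    giving elliptical (non-isotropic) iso-salient contours.
    """
    d = np.asarray(c2, dtype=float) - np.asarray(c1, dtype=float)
    d[0] *= gm_weight  # suppress the less-salient green-magenta direction
    return float(np.hypot(d[0], d[1]))
```

With this sketch, a step of 10 units along the green-magenta axis measures half the distance of the same step along the cyan-red axis, mirroring the observation above.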
The inventors have further discovered that when people look at an image, there is a constancy in the size of the objects being fixated, regardless of their content. For instance, in pictures containing people, faces will be fixated. On the other hand, when shown a close-up portrait of a person, observers will fixate on regions within the face such as the eyes, nose, and lips. In fact, the sizes of the fixated regions peak between 3% and 5% of the image size, and the majority of the fixated regions cover an area smaller than 7.8% of the image size. The size distribution is invariant to image content, and is robust to different clustering thresholds. The constancy of size across all surveyed parameters justifies the use of size as a weight parameter when predicting saliency with low-level features.

SASD arrangement 1

Fig. 1 is a schematic block diagram of a saliency prediction method 100 according to a SASD arrangement. The saliency prediction method 100 generates a saliency map 170 for an input image 110, by analysing the image 110 using visual and computational features. The values of the saliency map 170 reflect the areas of the input image 110 to which an observer is likely to direct their attention.

Figs. 15A and 15B depict a general-purpose computer system 1500, upon which the various SASD arrangements described can be practiced. The SASD arrangements may alternately be implemented using a general purpose electronic device including embedded components incorporated, for example, in a mobile phone, a portable media player or a digital camera, in which processing resources are limited. The SASD arrangements may alternatively be implemented in dedicated hardware such as one or more gate arrays and/or integrated circuits performing the SASD functions or sub-functions. Such dedicated hardware may also include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
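The fixated-region statistic underlying the size distribution discussed above is simply a cluster's area as a percentage of the image area. A minimal sketch, with illustrative function names and the 3-5% peak band from the text:

```python
def region_area_percent(region_pixel_count, image_width, image_height):
    """Area of one fixation cluster as a percentage of the image area,
    the quantity whose distribution peaks between 3% and 5%."""
    return 100.0 * region_pixel_count / (image_width * image_height)

def in_dominant_band(percent, low=3.0, high=5.0):
    """True when a region size falls within the peak of the distribution."""
    return low <= percent <= high
```

For example, a 4000-pixel cluster in a 400 x 250 image covers 4% of the image and falls within the dominant band.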
If gate arrays are used, the process flow charts in Figs. 1-5, 6a, 8, 9, 10a, 11a, 13 and 14 are converted to Hardware Description Language (HDL) form. This HDL description is converted to a device level netlist which is used by a Place and Route (P&R) tool to produce a file which is downloaded to the gate array to program it with the design specified in the HDL description.

As seen in Fig. 15A, the computer system 1500 includes: a computer module 1501; input devices such as a keyboard 1502, a mouse pointer device 1503, a scanner 1526, a camera 1527, and a microphone 1580; and output devices including a printer 1515, a display device 1514 and loudspeakers 1517. An external Modulator-Demodulator (Modem) transceiver device 1516 may be used by the computer module 1501 for communicating to and from a communications network 1520 via a connection 1521. The communications network 1520 may be a wide-area network (WAN), such as the Internet, a cellular telecommunications network, or a private WAN. Where the connection 1521 is a telephone line, the modem 1516 may be a traditional "dial-up" modem. Alternatively, where the connection 1521 is a high capacity (e.g., cable) connection, the modem 1516 may be a broadband modem. A wireless modem may also be used for wireless connection to the communications network 1520.

The computer module 1501 typically includes at least one processor unit 1505, and a memory unit 1506. For example, the memory unit 1506 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM).
The computer module 1501 also includes a number of input/output (I/O) interfaces including: an audio-video interface 1507 that couples to the video display 1514, loudspeakers 1517 and microphone 1580; an I/O interface 1513 that couples to the keyboard 1502, mouse 1503, scanner 1526, camera 1527 and optionally a joystick or other human interface device (not illustrated); and an interface 1508 for the external modem 1516 and printer 1515. In some implementations, the modem 1516 may be incorporated within the computer module 1501, for example within the interface 1508. The computer module 1501 also has a local network interface 1511, which permits coupling of the computer system 1500 via a connection 1523 to a local-area communications network 1522, known as a Local Area Network (LAN). As illustrated in Fig. 15A, the local communications network 1522 may also couple to the wide network 1520 via a connection 1524, which would typically include a so-called "firewall" device or device of similar functionality. The local network interface 1511 may comprise an Ethernet™ circuit card, a Bluetooth™ wireless arrangement or an IEEE 802.11 wireless arrangement; however, numerous other types of interfaces may be practiced for the interface 1511.

The I/O interfaces 1508 and 1513 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 1509 are provided and typically include a hard disk drive (HDD) 1510. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 1512 is typically provided to act as a non-volatile source of data.
Portable memory devices, such as optical disks (e.g., CD-ROM, DVD, Blu-ray Disc™), USB-RAM, portable external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the system 1500.
The components 1505 to 1513 of the computer module 1501 typically communicate via an interconnected bus 1504 and in a manner that results in a conventional mode of operation of the computer system 1500 known to those in the relevant art. For example, the processor 1505 is coupled to the system bus 1504 using a connection 1518. Likewise, the memory 1506 and optical disk drive 1512 are coupled to the system bus 1504 by connections 1519. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun SPARCstations, Apple Mac™ or like computer systems.

The SASD method may be implemented using the computer system 1500 wherein the processes of Figs. 1-5, 6a, 8, 9, 10a, 11a, 13 and 14, to be described, may be implemented as one or more SASD software application programs 1533 executable within the computer system 1500. In particular, the steps of the SASD method are effected by instructions 1531 (see Fig. 15B) in the software 1533 that are carried out within the computer system 1500. The software instructions 1531 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules perform the SASD methods and a second part and the corresponding code modules manage a user interface between the first part and the user.

The SASD software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 1500 from the computer readable medium, and then executed by the computer system 1500. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 1500 preferably effects an advantageous apparatus for performing the SASD methods.
The software 1533 is typically stored in the HDD 1510 or the memory 1506. The software is loaded into the computer system 1500 from a computer readable medium, and executed by the computer system 1500. Thus, for example, the software 1533 may be stored on an optically readable disk storage medium (e.g., CD-ROM) 1525 that is read by the optical disk drive 1512. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer system 1500 preferably effects an apparatus for implementing the SASD arrangements.

In some instances, the SASD application programs 1533 may be supplied to the user encoded on one or more CD-ROMs 1525 and read via the corresponding drive 1512, or alternatively may be read by the user from the networks 1520 or 1522. Still further, the software can also be loaded into the computer system 1500 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 1500 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray™ Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 1501. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computer module 1501 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The second part of the application programs 1533 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 1514. Through manipulation of typically the keyboard 1502 and the mouse 1503, a user of the computer system 1500 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 1517 and user voice commands input via the microphone 1580.

Fig. 15B is a detailed schematic block diagram of the processor 1505 and a "memory" 1534. The memory 1534 represents a logical aggregation of all the memory modules (including the HDD 1509 and semiconductor memory 1506) that can be accessed by the computer module 1501 in Fig. 15A.

When the computer module 1501 is initially powered up, a power-on self-test (POST) program 1550 executes. The POST program 1550 is typically stored in a ROM 1549 of the semiconductor memory 1506 of Fig. 15A. A hardware device such as the ROM 1549 storing software is sometimes referred to as firmware. The POST program 1550 examines hardware within the computer module 1501 to ensure proper functioning and typically checks the processor 1505, the memory 1534 (1509, 1506), and a basic input-output systems software (BIOS) module 1551, also typically stored in the ROM 1549, for correct operation. Once the POST program 1550 has run successfully, the BIOS 1551 activates the hard disk drive 1510 of Fig. 15A. Activation of the hard disk drive 1510 causes a bootstrap loader program 1552 that is resident on the hard disk drive 1510 to execute via the processor 1505.
This loads an operating system 1553 into the RAM memory 1506, upon which the operating system 1553 commences operation. The operating system 1553 is a system level application, executable by the processor 1505, to fulfil various high level functions, including processor management, memory management, device management, storage management, software application interface, and generic user interface.

The operating system 1553 manages the memory 1534 (1509, 1506) to ensure that each process or application running on the computer module 1501 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the system 1500 of Fig. 15A must be used properly so that each process can run effectively. Accordingly, the aggregated memory 1534 is not intended to illustrate how particular segments of memory are allocated (unless otherwise stated), but rather to provide a general view of the memory accessible by the computer system 1500 and how such is used.

As shown in Fig. 15B, the processor 1505 includes a number of functional modules including a control unit 1539, an arithmetic logic unit (ALU) 1540, and a local or internal memory 1548, sometimes called a cache memory. The cache memory 1548 typically includes a number of storage registers 1544-1546 in a register section. One or more internal busses 1541 functionally interconnect these functional modules. The processor 1505 typically also has one or more interfaces 1542 for communicating with external devices via the system bus 1504, using a connection 1518. The memory 1534 is coupled to the bus 1504 using a connection 1519.

The SASD application program 1533 includes a sequence of instructions 1531 that may include conditional branch and loop instructions. The program 1533 may also include data 1532 which is used in execution of the program 1533.
The instructions 1531 and the data 1532 are stored in memory locations 1528, 1529, 1530 and 1535, 1536, 1537, respectively. Depending upon the relative size of the instructions 1531 and the memory locations 1528-1530, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 1530. Alternately, an instruction may be segmented into a number of parts, each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 1528 and 1529.

In general, the processor 1505 is given a set of instructions which are executed therein. The processor 1505 waits for a subsequent input, to which the processor 1505 reacts by executing another set of instructions. Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 1502, 1503, data received from an external source across one of the networks 1520, 1522, data retrieved from one of the storage devices 1506, 1509, or data retrieved from a storage medium 1525 inserted into the corresponding reader 1512, all depicted in Fig. 15A. The execution of a set of the instructions may in some cases result in output of data. Execution may also involve storing data or variables to the memory 1534.

The disclosed SASD arrangements use input variables 1554, which are stored in the memory 1534 in corresponding memory locations 1555, 1556, 1557. The SASD arrangements produce output variables 1561, which are stored in the memory 1534 in corresponding memory locations 1562, 1563, 1564. Intermediate variables 1558 may be stored in memory locations 1559, 1560, 1566 and 1567.

Referring to the processor 1505 of Fig.
15B, the registers 1544, 1545, 1546, the arithmetic logic unit (ALU) 1540, and the control unit 1539 work together to perform sequences of micro-operations needed to perform "fetch, decode, and execute" cycles for every instruction in the instruction set making up the program 1533. Each fetch, decode, and execute cycle comprises:

* a fetch operation, which fetches or reads an instruction 1531 from a memory location 1528, 1529, 1530;

* a decode operation in which the control unit 1539 determines which instruction has been fetched; and

* an execute operation in which the control unit 1539 and/or the ALU 1540 execute the instruction.

Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 1539 stores or writes a value to a memory location 1532.
Each step or sub-process in the processes of Figs. 1-5, 6a, 8, 9, 10a, 11a, 13 and 14 is associated with one or more segments of the program 1533 and is performed by the register section 1544, 1545, 1546, the ALU 1540, and the control unit 1539 in the processor 1505 working together to perform the fetch, decode, and execute cycles for every instruction in the instruction set for the noted segments of the program 1533.

Returning to Fig. 1, the input image 110, preferably in a standard colour encoding format such as sRGB, is analysed separately by a CLS feature extraction process 120, performed by the processor 1505 directed by the software program 1533, a sharpness feature extraction process 130, performed by the processor 1505 directed by the software program 1533, and a position bias calculation process 140, performed by the processor 1505 directed by the software program 1533. The outputs of the processes 120, 130, and 140 (depicted by respective dotted arrows 240, 1320 and 1030) are then combined using pre-determined bias data 150 by a following combination process 160, performed by the processor 1505 directed by the software program 1533, to produce the saliency map 170. Each aforementioned process is described hereinafter in more detail. The bias data 150 is described hereinafter in more detail with reference to Figs. 12a and 12b.

Fig. 2 shows an example of how the CLS feature extraction process 120 can be performed. The input image 110 is processed by a scale decomposition process (also referred to as a downsampling process) 210, performed by the processor 1505 directed by the software program 1533. The process 210, described hereinafter in more detail with reference to Fig. 3, decomposes the input image 110 into a set of downsampled representations (also referred to as downscaled representations) 211 of the input image of predetermined size at different scales based on size data 330.
An example of the size data 330 is depicted in Fig. 7, and the determination of such data is described hereinafter in more detail with reference to Fig. 6a. The multiple downscaled representations 211 that are the output of the scale decomposition process 210 are input to a following colour measurement process 220, described hereinafter in more detail with reference to Fig. 4, performed by the processor 1505 directed by the software program 1533, where the colour and luminance relative salience of each of the multiple representations 211 is measured and quantified according to specific predefined colour space transforms and distances for each scale input. It is however noted that the transform depends on the colour encoding of the input image. An output of the colour measurement process 220 is a set of distance maps (also referred to as multiple scale CLS maps) 221, one for each corresponding downscaled input image 211. These multiple distance maps 221 are input to a following scale recomposition process (also referred to as an upsampling process or an upscaling process) 230, described hereinafter in more detail with reference to Fig. 5, performed by the processor 1505 directed by the software program 1533, which, depending on the size data 330, weights the multiple distance maps 221 and recombines them (ie upscales them) to form a colour-luminance-size (CLS) map 240. The size data 330 is thus used by both the scale decomposition process 210 and the scale recomposition process 230. Fig. 3 illustrates an example of how the scale decomposition process 210 can be performed. A size data acquisition process 320, performed by the processor 1505 directed by the software program 1533, described hereinafter in more detail in regard to Figs. 6a and 6b, generates the size data 330 that represents the relative frequency of occurrence of image region sizes viewed (ie deemed salient) by a variety of observers. 
The size data 330 can be a continuous or quasi-continuous function ranging from 0 to the size of the image. In a particular SASD arrangement depicted in Fig. 7, the size data 330 can be represented as a percentage, in terms of area, that a salient region covers compared to the size of the source image. The size data 330 is then sampled by a following sampling step 340, performed by the processor 1505 directed by the software program 1533, to create a discrete set 341 of sizes that will be used in the rest of the scale decomposition process 210. In a particular SASD arrangement, the size data 330 is sampled at five values that represent 3, 4, 5, 6, and 7 percent of the input image 110 area respectively (depicted in Fig. 7). A following process 350, performed by the processor 1505 directed by the software program 1533, sends, for each of the sizes 341 outputted by the sampling step 340, the input image 110 to an image downsampling step 310, performed by the processor 1505 directed by the software program 1533, which downsamples the input image 110 according to one of the sampled sizes 341. For instance, if a currently selected sample size (from the set 341) is 4 percent of the area of the input image 110, the input image 110 is downsampled by a factor of 5, i.e., h_out = h / 5 and w_out = w / 5, where h and w are the height and width in pixels of the input image 110, and h_out and w_out are the height and width of an output image of the image downsampling step 310 for a particular size in the set 341 of sizes. The downsampling factor used in this SASD arrangement is N, where N = sqrt(100 / S), with S being the size in percent and sqrt being the square root operator. The output of the scale decomposition process 210 is the set 211 of multiple downsampled images, each of which corresponds to a size 341 resulting from the sampling step 340. Fig. 4 illustrates an example of a method for performing the colour measurement process 220. 
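The relation described above between a sampled area percentage S and the linear downsampling factor N = sqrt(100 / S) can be sketched in a few lines of Python. This is a minimal illustration only; the function name and the rounding of the output dimensions to whole pixels are assumptions, not part of the specification.

```python
import math

def downsample_size(h, w, area_percent):
    """Output dimensions for one sampled size.

    An area fraction S (in percent) corresponds to a linear
    downsampling factor N = sqrt(100 / S), so each dimension is
    divided by N. Rounding to integer pixels is an assumption.
    """
    n = math.sqrt(100.0 / area_percent)
    return round(h / n), round(w / n)

# For a 4 percent sample size the factor is sqrt(100 / 4) = 5,
# matching the example in the text (h_out = h / 5, w_out = w / 5).
print(downsample_size(1000, 1500, 4))  # (200, 300)
```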
An input selector step 410, performed by the processor 1505 directed by the software program 1533, selects each one of the multiple downsampled images 211 in turn. For each single scale image 420 selected from the set 211, a colour and luminance distance measurement process 430, performed by the processor 1505 directed by the software program 1533 (described hereinafter in more detail in regard to Fig. 8), produces a single scale CLS map 440 that is a colour-luminance perceptual difference map for a size (from the set 341) corresponding to the single scale image 420. An output selector 450, performed by the processor 1505 directed by the software program 1533, selects each individual output 440 and transfers it to a collection step 460, performed by the processor 1505 directed by the software program 1533, which forms the multiple scale CLS maps 221 (ie the set of distance maps). Fig. 5 illustrates an example of a method for performing the scale recomposition (ie the upsampling) process 230. Given the multiple scale CLS maps 221, the scale recomposition process 230 selects, as depicted by a step 510, performed by the processor 1505 directed by the software program 1533, each distance map from the set 221 and upsamples it in an upsample step 520, performed by the processor 1505 directed by the software program 1533, to form upsampled images 521. The upsampling process of the upsample step 520 can be a bilinear interpolation that upsamples the set 221 of input difference maps to the size of the original input image 110. In addition, for each input from the set 221 a weight determination step 530, performed by the processor 1505 directed by the software program 1533, determines a weight 531 for each individual input based on the size data 330. 
For instance, given the set of inputs 221 of sampled sizes 3, 4, 5, 6, and 7 percent of the input image 110 area, corresponding weights can be 36.25, 31.19, 14.83, 8.38, and 4.72 respectively, these being the values of the graph depicted in Fig. 7. The upsampled images 521 and their corresponding weights 531 are then combined in a weighted combination step 540, performed by the processor 1505 directed by the software program 1533, to produce the CLS feature map 240. In a particular SASD arrangement, the weighted combination step 540 is a linear combination of all of the upsampled multiple inputs 521 with their corresponding respective weights 531. Fig. 6a illustrates an example of a method for performing the size data acquisition process 320. Starting with an image dataset 610 comprising several images, each image in the dataset 610 is selected one at a time by a step 620, performed by the processor 1505 directed by the software program 1533, and displayed in a step 630 on a screen such as 1514. The gaze of human observers looking at the displayed image is tracked by an eye-tracker in a step 640, and the eye-tracking data 641 is analysed by a fixation collection step 650, which removes noisy points and eye saccades (these being rapid eye movements between fixation points). For example, in a particular SASD arrangement, any set of points reflecting a velocity greater than 60 degrees per second is considered a saccade and discarded. Then, temporally consecutive non-saccadic points are considered to constitute a fixation only if they lie within a certain radius for a specific length of time, e.g., 2 degrees for 100 milliseconds. Resultant fixation points 651 are then clustered by a clustering step 660, performed by the processor 1505 directed by the software program 1533, to form fixation clusters 661. Points that belong to the same fixation are clustered together, and if some points belong to different fixations, the fixations are merged together in a larger cluster. 
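The weighted scale recomposition of Fig. 5 described above can be sketched as follows, treating maps as plain Python nested lists. The specification calls for bilinear interpolation in the upsample step 520; this sketch substitutes nearest-neighbour upsampling purely for brevity, and all function names are illustrative assumptions.

```python
def upsample_nearest(m, out_h, out_w):
    """Nearest-neighbour upsampling of a 2-D list to out_h x out_w.
    (The text uses bilinear interpolation; nearest-neighbour is used
    here only to keep the sketch short.)"""
    in_h, in_w = len(m), len(m[0])
    return [[m[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]

def recompose(maps, weights, out_h, out_w):
    """Linear combination of upsampled distance maps, one weight per
    sampled size, as in the weighted combination step 540."""
    total = [[0.0] * out_w for _ in range(out_h)]
    for m, w in zip(maps, weights):
        up = upsample_nearest(m, out_h, out_w)
        for r in range(out_h):
            for c in range(out_w):
                total[r][c] += w * up[r][c]
    return total

# Two toy single-scale maps combined with the first two Fig. 7 weights.
maps = [[[1.0]], [[0.0, 1.0], [1.0, 0.0]]]
cls = recompose(maps, [36.25, 31.19], 2, 2)
```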
The saliency clusters thus obtained are analysed by a subsequent size distribution calculation step 670, performed by the processor 1505 directed by the software program 1533, which calculates the covered area of each cluster of each image of the image dataset 610 viewed by each observer. The distribution of the clusters in terms of their frequency and relative area is the size data 330. An example of size data resulting from the size data acquisition process 320 is shown in Fig. 7. Fig. 6b is an illustrated example of the size data acquisition process 320. Given the dataset of images 610, the image selector step 620 selects an image 633 and displays it. The displayed image 633 is viewed by an observer 631 whose gaze is tracked by the eye tracker 640. The output of the fixation collection step 650 (see Fig. 6a) is a collection 635 of fixation points. The fixations 635 are convolved with an uncertainty function 636, for example a 2-D Gaussian function of mean 0 and standard deviation of 2 degrees. The result of the convolution is clusters 637 whose union forms a large cluster 638 of viewed image content, as opposed to unattended image content 639. Repeating this procedure for all the images in the dataset 610 and a significant number of observers (preferably more than 30, in accordance with the central limit theorem) allows a collection of clusters to be analysed by the analysis step 670 to generate the size data 330. Fig. 7 is an illustrative plot of the size data 330 produced by the size data acquisition process 320 as described with respect to Figs. 6a and 6b. The X axis of the plot represents the distribution of cluster size, described as a percentage of the area of the input image 110. The cluster size is defined as a percentage of the input image, and thus the percentages are relative and are the same regardless of the size of the input image. 
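The saccade filtering and fixation grouping of the fixation collection step 650 described above (a 60 degrees-per-second velocity threshold, and points held within a 2-degree radius for at least 100 milliseconds) can be sketched as below. The 60 Hz sampling rate, the function name, and the exact run-grouping rule are assumptions made for illustration, not details given in the specification.

```python
def collect_fixations(samples, sample_rate_hz=60,
                      saccade_deg_per_s=60.0,
                      radius_deg=2.0, min_duration_s=0.1):
    """Filter raw gaze samples (x, y in degrees of visual angle)
    into fixations.

    Consecutive samples whose implied velocity exceeds the saccade
    threshold are discarded as saccades; remaining runs that stay
    within 'radius_deg' of their first point for at least
    'min_duration_s' are kept, each returned as a list of points.
    """
    dt = 1.0 / sample_rate_hz
    min_len = int(min_duration_s * sample_rate_hz)
    fixations, run = [], []
    for (x, y) in samples:
        if run:
            px, py = run[-1]
            speed = ((x - px) ** 2 + (y - py) ** 2) ** 0.5 / dt
            x0, y0 = run[0]
            inside = ((x - x0) ** 2 + (y - y0) ** 2) ** 0.5 <= radius_deg
            if speed > saccade_deg_per_s or not inside:
                if len(run) >= min_len:
                    fixations.append(run)
                run = []
        run.append((x, y))
    if len(run) >= min_len:
        fixations.append(run)
    return fixations
```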
The Y axis represents, in a histogram manner, the frequency with which clusters of a particular cluster size are identified as being salient throughout the tested image dataset 610. The curve 710 in the plot shows the size distribution of clusters (ie salient regions) as viewed by observers over a large number of images and observers. The frequency of the clusters of each size can be used, for example in the weight determination process 530, as weights to perform the weighting of colour-luminance maps for given sampled sizes, e.g., the input 221 to the scale recombination process 230. Fig. 8 illustrates an example of a method for performing the colour and luminance distance measurement process 430 in Fig. 4. The single scale image 420, selected by the step 410 from the set 211, has its colour encoding modified by a colour space transform step 810, performed by the processor 1505 directed by the software program 1533, for example the decorrelated luminance-chrominance perceptual colour transform described in more detail in Figs. 11a-11c, to form a transformed image 811. The transformed image 811 is then processed by an image element selection step 820, performed by the processor 1505 directed by the software program 1533, that separates the transformed image 811 into a series of elements S (depicted for example by 821). In a preferred SASD arrangement, each image element is a pixel of the transformed image. In addition to the image elements S, the image element selection step 820 also determines nS (depicted for example by 822), the complement of the element S over the input image 420. Thus, for example, if the image element selection step 820 selects each pixel in turn to be the element S, then all the other pixels are nS. 
The preferred SASD arrangement then requires, for each element S (ie 821) and its complement nS (ie 822), determination of a brightness distance by a step 830, a colour C1 distance by a step 840, and a colour C2 distance by a step 850, each aforementioned step performed by the processor 1505 directed by the software program 1533. In a preferred SASD arrangement, the brightness distance step 830 performs an operation using a distance function L(S) - L(nS), with L being the luminance as defined by CIE Lab (see the luminance axis 1133 in Fig. 11b). The operation L(S) - L(nS) can be regarded as a measure of contrast / difference, ie a luminance contrast value, between the value of the luminance of the element S and the value of the luminance of the complement nS of the element S. The colour C1 distance step 840 performs an operation using another distance function C1(S) - C1(nS), with C1 being the perceptual colour axis C1 (ie 1135) as defined in Fig. 11b. The operation C1(S) - C1(nS) can be regarded as a measure of contrast / difference between the value of the perceptual colour axis C1 of the element S and the value of the perceptual colour axis C1 of the complement nS of the element S. The colour C2 distance step 850 performs an operation using another distance function C2(S) - C2(nS), with C2 being the perceptual colour axis C2 (ie 1136) as defined in Fig. 11b. The operation C2(S) - C2(nS) can be regarded as a measure of contrast / difference, ie a colour contrast value, between the value of the perceptual colour axis C2 of the element S and the value of the perceptual colour axis C2 of the complement nS of the element S. While the distances are here defined over a specific colour space and distance function, this should not be construed as an intrinsic limitation of the SASD method. 
Outputs of the distance functions 830, 840 and 850, depicted by reference numerals 831, 841 and 851 respectively, are combined by a colour and brightness combination step 860, performed by the processor 1505 directed by the software program 1533, to form a single distance measure 440 over the single scale image in question 420. Specifically, the combination step 860 calculates, for each element S and its complement nS, a global distance function. For example, the global contrast distance function can be defined as Gd(S,nS) = sqrt( (L(S)-L(nS))^2 + (C1(S)-C1(nS))^2 + (C2(S)-C2(nS))^2 ). The value for each element S of the output image 440 is the result of Gd(S,nS). The output single scale CLS map 440 thus has a value for each of the elements S as determined by the image element selection step 820. Fig. 9 is an example of how the sharpness feature extraction process 130 (see Fig. 1) can be performed. The input image 110 is first decomposed by a channel decomposition step 910, performed by the processor 1505 directed by the software program 1533, into N channels 1...N (depicted by a reference numeral 920) depending on the initial colour encoding of the image 110. For instance, if the input image 110 is a standard Red-Green-Blue encoded image, the number of channels N is equal to three, namely Red, Green, and Blue. Similarly, if the image is encoded in a chrominance-luminance colour space such as CIE Lab, the number of channels is also three, namely L, a, and b. Following a step 930, performed by the processor 1505 directed by the software program 1533, for each channel a wavelet decomposition step 940, performed by the processor 1505 directed by the software program 1533, is performed, for example using a Haar wavelet kernel. After the initial wavelet decomposition by the step 940, a following level check 950, performed by the processor 1505 directed by the software program 1533, checks if a target level has been reached. 
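The global contrast distance function Gd(S, nS) of the combination step 860 described above can be sketched as follows, under the assumption that the element S and its complement nS have each been summarised as a single (L, C1, C2) triple (for example, a pixel value for S and a mean over the rest of the image for nS); how the complement is aggregated is not fixed by the text, and the function name is illustrative.

```python
def global_distance(s, ns):
    """Global colour-luminance contrast Gd(S, nS) between an element S
    and its complement nS, each given as an (L, C1, C2) triple:
    the Euclidean distance over the three channel differences."""
    dl = s[0] - ns[0]
    dc1 = s[1] - ns[1]
    dc2 = s[2] - ns[2]
    return (dl * dl + dc1 * dc1 + dc2 * dc2) ** 0.5

# A 3-4-5 style check: channel differences of 0, 3 and 4 give Gd = 5.
print(global_distance((10.0, 5.0, 1.0), (10.0, 2.0, -3.0)))  # 5.0
```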
The target level can be specified appropriately depending on image content and saliency application, but in the preferred SASD arrangement the target level is three. If the target level is not reached, the LL subband of the wavelet decomposition is selected by a step 960, performed by the processor 1505 directed by the software program 1533, and is decomposed according to the same decomposition technique 940. When the target level is reached, all of the wavelet decomposition coefficients apart from the LL of the last level are combined across all scales and orientations by a coefficient combination step 970, performed by the processor 1505 directed by the software program 1533. Once the coefficients are combined, following a step 980, performed by the processor 1505 directed by the software program 1533, the method 130 averages values from all the N channels' coefficients in a cross-channel combination step 990, performed by the processor 1505 directed by the software program 1533, to form a cross-channel averaged map 991. The cross-channel averaged map 991 is then convolved with a 2-D Gaussian kernel of mean 0 and sigma of 2 degrees in a convolution step 993, performed by the processor 1505 directed by the software program 1533, to produce a sharpness feature output map 1320. The value of sigma of the 2-D kernel can depend on the viewing distance of the image. In one SASD arrangement, the input image 110 is viewed under a standard viewing distance, so it subtends 30 degrees of visual angle horizontally. Fig. 10a is an example of how the position bias calculation process 140 of Fig. 1 can be performed. The input image 110 is analysed by an initial position measurement step 1010, performed by the processor 1505 directed by the software program 1533. The default initial position of the most salient region (eg see 1054 in Fig. 10b) is the centre of the captured image. 
If metadata or additional information is present, the initial position can be determined, for example, as the focus point as defined by a camera AI servo, or as the area selected on a touch-screen enabled image capture device. The initial position and the pre-determined bias data 150 are then transferred to a position bias convolution step 1020, performed by the processor 1505 directed by the software program 1533, where a shape and magnitude of the position bias is defined by the bias data 150 (described in more detail with respect to Figs. 12a and 12b) and the central point for the convolution is the initial position as calculated by the initial position measurement step 1010. The outcome of the position bias convolution step 1020 is the position bias map 1030, which is a map that shows the relative importance of image parts depending on their position within the input image 110. Fig. 10b is an illustrative example of the position bias calculation process 140. An input image 1050 comprising two elements, a tree and a house, over a uniform background is captured in two ways. In 1060, a user input, for instance a user-selected focus box 1056, is selected over the tree of the input image 1050. The centre 1055 of the focus box 1056 is selected as the initial position, as represented by a dot 1055 in 1060. In 1070, no additional information for the image 1050 is given, and so the centre of the image 1054 is chosen by default, as illustrated by a central dot 1054 in the image 1070. The two different outputs 1060 and 1070 are convolved with the pre-determined bias data 150 to produce bias images 1080 and 1090. Corresponding examples of bias data are given in Figs. 12a and 12b. Figs. 11a and 11b are examples, in a preferred SASD arrangement, of how the colour and luminance distance measurement process 430 is performed. Fig. 11a shows a flowchart of the process 430. 
The input image and its associated encoding 420, for example a colour image encoded in the standard sRGB colour space, is processed by a luminance-chrominance decorrelation transform 1111, performed by the processor 1505 directed by the software program 1533, which produces at least one channel that is a correlate of luminance (or brightness) and a plurality of channels that are correlated to chrominance, collectively forming a luminance-chrominance encoded image depicted by a reference numeral 1114. An example of such a decorrelation transform is the standard CIE defined transform between sRGB and CIE Lab. In general, luminance-chrominance transforms are isotropic in the chrominance channels. The luminance-chrominance encoded image 1114 is then modified by a hue-dependent colour space modification process 1112, performed by the processor 1505 directed by the software program 1533, where non-isotropic elements of chrominance are introduced. The step 1112 modifies the encoding of the luminance-chrominance colour encoding of the step 1111 so that its unitary distance is perceptually meaningful for salience. Research performed by the inventors has shown that a significant hue-dependency exists in terms of the salience of isoluminant colours. The output of the hue-dependent colour space modification 1112 is the transformed image 811 (see Fig. 8). Fig. 11b illustrates an example of the process of Fig. 11a in terms of colour encodings. As an example, the colour encoding of the input image to the process is taken to be red-green-blue, as generally represented by three colour space axes forming a colour space cube 1120. The luminance-chrominance decorrelation transform 1111 transforms the RGB encoding 1120 into a CIE Lab encoding 1121 that comprises one luminance channel (L) and two chrominance channels, namely a, which ranges from green to red, and b, which ranges from yellow to blue. 
Transforms from standard RGB colour spaces to CIE Lab are known in the art. Finally, the CIE Lab encoding 1121 is modified by the hue-dependent colour space modification step 1112 to create a colour encoding in which Euclidean distances are a metric of colour and luminance saliency, as per the result of experiments conducted by the inventors. In the present example of the preferred SASD arrangement, an a axis 1131 is rotated by an angle alpha 1130 and rescaled by a factor of 2 to form a C1 axis 1135. A b axis 1132 is rotated by the angle alpha 1130 to form a C2 axis 1136. An L axis 1133 is scaled by a factor of 2 to form a B axis. In the preferred SASD arrangement, alpha is 35 (thirty-five) degrees. Alpha may typically fall between 30 and 45 degrees. Analytically, the values of C1, C2 and B can therefore be determined as follows: C1 = 2*(cos(alpha)*a - sin(alpha)*b) C2 = sin(alpha)*a + cos(alpha)*b B = 2*L The Euclidean distance computed over the B, C1, C2 colour encoding is isotropic for colour and luminance saliency. Fig. 11c shows iso-salient contours in the CIE Lab plane at 1140. The shape of these isosalient contours is circular, as CIE Lab assumes that perceptual differences are hue-independent within CIE Lab. A reference numeral 1150 shows isosalient measures of perceptual scales discovered by the inventors when measuring saliency of luminance and colour. The isosalient contours are modelled as ellipses and exhibit a significant hue dependency. The ellipses typically have an axis length ratio greater than 1:1 and smaller than 5:1. In this example, it means that a red colour cast over a grey background is twice as likely to attract an observer's attention as a green colour of identical chroma and luminance cast over an identical grey background. Figs. 12a and 12b show examples of the bias data 150. The data depicted at 1210 in Fig. 12a is an example of measured bias data over a large collection of images viewed by a large number of observers. 
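Returning to the hue-dependent colour space modification of Fig. 11b, the three equations for B, C1 and C2 can be sketched directly, with alpha fixed at the preferred 35 degrees. The function name is an assumption for illustration.

```python
import math

ALPHA = math.radians(35)  # preferred SASD value; 30-45 degrees typical

def lab_to_bc1c2(L, a, b):
    """Hue-dependent modification of CIE Lab values (L, a, b) into the
    B, C1, C2 encoding: the a and b axes are rotated by alpha, the a
    axis and the L axis are additionally scaled by a factor of 2, so
    that Euclidean distance is taken as a saliency metric."""
    c1 = 2.0 * (math.cos(ALPHA) * a - math.sin(ALPHA) * b)
    c2 = math.sin(ALPHA) * a + math.cos(ALPHA) * b
    B = 2.0 * L
    return B, c1, c2
```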
While the measured data 1210 may be more accurate, it may be impractical to use because the data would need to be incorporated in its entirety, and an approximate model is more robust to slight changes in viewing conditions. In the preferred SASD arrangement, the measured bias 1210 is modelled as modelled bias 1220 as depicted in Fig. 12b. It has been determined by the inventors that the preferred model for prediction is a parametrised 2-D Gaussian function with mean (0.5w, 0.5h) and standard deviation (0.28w, 0.26h), where w and h are the width and the height of the input image 110, respectively. Fig. 13 shows an example of how the feature combination process 160 of Fig. 1 can be performed. The CLS map 240 that is the output of the CLS feature extraction process 120 is normalised by a normalisation step 1340, performed by the processor 1505 directed by the software program 1533, to form a normalised CLS map 1341. The normalisation step 1340 proceeds by dividing the CLS map 240 by its maximal value (i.e., every pixel of the CLS map is divided by the maximal value, over all its pixels, of the CLS map) so that its maximal normalised value is equal to 1 (one). The normalised CLS map is then tested by a testing step 1345, performed by the processor 1505 directed by the software program 1533, to determine whether a median value of the normalised map is greater than 0.5. If it is, the map is inverted by an invertor step 1360, performed by the processor 1505 directed by the software program 1533, to form an inverted normalised CLS map 1342, so that a new map value is 1 minus the normalised CLS map value. The rationale for this is that saliency is more meaningfully defined as a salient object or region over a non-salient background. If the step 1345 returns a logical FALSE, then the map is transmitted to the next step without modification. 
The sharpness map 1320 is the output of the sharpness feature extraction process 130 and is normalised by a normalisation step 1350, performed by the processor 1505 directed by the software program 1533, to form a normalised sharpness map 1343. The normalisation step 1350 normalises the sharpness map 1320 by dividing it by its maximal value so that its maximal value becomes 1 (one). The normalised sharpness map is then tested by a testing step 1355, performed by the processor 1505 directed by the software program 1533, to determine whether its median value is greater than 0.5. If it is, the map is inverted by an invertor step 1365, performed by the processor 1505 directed by the software program 1533, to form a normalised inverted sharpness map 1344, so that the new map value is 1 minus the normalised sharpness map value. If the step 1355 returns a logical FALSE, then the map is transmitted to the next step without modification. The sharpness and CLS maps post normalisation and inversion, respectively depicted by dotted arrows 1344 and 1342, are then compared by a comparison step 1370, performed by the processor 1505 directed by the software program 1533, to determine combination relative weights 1380. In a preferred SASD arrangement, the comparison step 1370 calculates the mean (average) value of the CLS map 1342, mCLS, and the mean (average) value of the sharpness map 1344, mSharp. It then calculates wCLS and wSharp 1380 as (2*mSharp)/(2*mSharp+mCLS) and (mCLS)/(2*mSharp+mCLS), respectively. Finally, the position map 1330, the sharpness map, and the CLS map are combined by the final map combination step 1390. This combination can be additive or point-multiplicative. In a preferred SASD arrangement, the combination function is wSharp*(SharpMap.*PosMap) + wSharp*(SharpMap) + wCLS*(MapCLS) + wCLS*(MapCLS.*PosMap), where ".*" is the point-by-point multiplication operator. 
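The normalisation, conditional inversion, weight calculation and final combination of Fig. 13 can be sketched as follows, treating each map as a flat list of pixel values and assuming the position map is already in [0, 1]. The simple upper-median used here and all names are assumptions of the sketch, not details fixed by the specification.

```python
def combine_features(cls_map, sharp_map, pos_map):
    """Normalise the CLS and sharpness maps, invert each when its
    median exceeds 0.5, derive the relative weights wSharp and wCLS,
    and combine with the position map as in the step 1390."""
    def normalise(m):
        peak = max(m)
        m = [v / peak for v in m]
        med = sorted(m)[len(m) // 2]          # simple (upper) median
        return [1.0 - v for v in m] if med > 0.5 else m

    cls_n = normalise(cls_map)
    sharp_n = normalise(sharp_map)
    m_cls = sum(cls_n) / len(cls_n)
    m_sharp = sum(sharp_n) / len(sharp_n)
    w_sharp = (2 * m_sharp) / (2 * m_sharp + m_cls)
    w_cls = m_cls / (2 * m_sharp + m_cls)
    # wSharp*(Sharp.*Pos) + wSharp*Sharp + wCLS*CLS + wCLS*(CLS.*Pos)
    return [w_sharp * s * p + w_sharp * s + w_cls * c + w_cls * c * p
            for c, s, p in zip(cls_n, sharp_n, pos_map)]
```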
The output of the combination function can then be normalised to construct the saliency map 170 of the input image 110. It is noted that "MapCLS" refers to the output of the test 1345 (the normalised map 1341) if the test is false, or the inverted map 1342 if the test is true; similarly, "SharpMap" refers to the output of the test 1355 or the inverted map 1344 if the test is false or true, respectively. An alternate implementation of the step 1390 uses (MapCLS + beta*SharpMap).*PosMap, where beta is a parameter value that in a SASD arrangement lies between 0.5 and 2. Fig. 14 is an illustration of an alternate example of a method to combine various feature maps into a saliency map according to the method of Fig. 1. First, a map 1410 is generated based on the method described in Fig. 13. Then a quasi-binarisation step 1411, performed by the processor 1505 directed by the software program 1533, modifies the map 1410 to form the final saliency map 170. In the map 1410, all pixels have values between a minimum pixel value of 0 and a maximum pixel value of 1. In the quasi-binarisation step 1411, pixels in the map 1410 are modified to approach either 0 or 1 (ie to approach either the minimum pixel value or the maximum pixel value), and thus less salient regions in the map 1410 become even less salient in the saliency map 170, and more salient regions in the map 1410 become even more salient in the saliency map 170. In one example of the quasi-binarisation step 1411, two control parameter values (p1 and p2, where p1 < p2) are chosen. Pixel values smaller than p1 are set to 0, and pixel values greater than p2 are set to 1. For pixel values in the middle range (p > p1 and p < p2, where p is the pixel value), the pixel values are modified to approach either 0 or 1. In one SASD arrangement, if a value is smaller than (p1 + p2)/2 the value is reduced, and if a value is greater than (p1 + p2)/2 the value is increased. One way of modifying the pixel value is shown in Equation (1). 
P = 2*((p - p1)/(p2 - p1))^2, if p <= (p1 + p2)/2; P = 1 - 2*((p2 - p)/(p2 - p1))^2, if p > (p1 + p2)/2 (1) The modified map is output as the final saliency map 170 of the input image 110. Industrial Applicability The arrangements described are applicable to the computer and data processing industries and particularly for the image processing industry. The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive. In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises", have correspondingly varied meanings.

Claims (10)

1. A method of creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the method comprising the steps of: generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions of predetermined size, said regions being downscaled representations of the digital image, the size of at least one of the regions being different to the size of another one of the regions, the sizes of the regions being dependent upon a size distribution of saliency clusters in the digital image; generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and constructing from the colour-luminance result the at least one value of the saliency map for the digital image.
2. A method according to claim 1, wherein the distance functions are associated with a hue-dependent colour space modification process.
3. A method according to claim 1, comprising, prior to the constructing step, a quasi-binarisation step which modifies pixel values to approach either a minimum pixel value or a maximum pixel value dependent upon at least one control parameter.
4. A method according to claim 1, wherein the distance functions are associated with a colour space producing non-isotropic isosalient contours for chrominance information.
5. A method according to claim 4, wherein colour axes of said colour space are rotated by an angle of 35 - 40 degrees from corresponding axes of a colour space producing isotropic isosalient contours.
6. A method according to claim 4, wherein the non-isotropic isosalient contours are disposed at an angle of 30 - 45 degrees relative to corresponding a* and b* axes of a CIE colour space.
7. A method according to claim 4, wherein the isosalient contours are ellipses wherein a ratio of a length of elliptical axes is greater than 1:1 and smaller than 5:1.
8. An apparatus for creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the apparatus comprising: a processor; and a non-transitory memory storing a computer executable software program for directing the processor to perform a method for creating at least one value of the saliency map, the method comprising the steps of: generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions of predetermined size, said regions being downscaled representations of the digital image, the size of at least one of the regions being different to the size of another one of the regions, the sizes of the regions being dependent upon a size distribution of saliency clusters in the digital image; generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and constructing from the colour-luminance result the at least one value of the saliency map for the digital image.
9. A non-transitory computer readable memory medium storing a computer executable software program for directing a processor to perform a method for creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the program comprising: computer executable code for generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions of predetermined size, said regions being downscaled representations of the digital image, the size of at least one of the regions being different to the size of another one of the regions, the sizes of the regions being dependent upon a size distribution of saliency clusters in the digital image; computer executable code for generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; computer executable code for determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and computer executable code for constructing from the colour-luminance result the at least one value of the saliency map for the digital image.
10. A method of creating at least one value of a saliency map for a digital image, the digital image having a plurality of pixels each represented by at least one colour value and a luminance value, the method comprising the steps of: generating, dependent upon a distance function, at least one colour contrast value, from the at least one colour value, for pixels within a plurality of regions being downscaled representations of the digital image; generating, dependent upon another distance function, at least one luminance contrast value from the luminance value of said pixels within the plurality of regions of the digital image; determining a colour-luminance result from a weighted combination of the at least one generated colour contrast value and the at least one generated luminance contrast value; and constructing from the colour-luminance result the at least one value of the saliency map for the digital image, wherein the distance functions are associated with a colour space producing non-isotropic isosalient contours for chrominance information.

Dated this 21st day of December 2012
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant/Nominated Person
SPRUSON & FERGUSON
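The claimed pipeline can be summarised as: compute colour and luminance contrast within downscaled representations of the image at several region sizes, combine the two with a weight, and accumulate the result into a saliency map. A minimal sketch of that flow follows; the mean-based contrast measure, the scale set, and the 0.6 colour weight are illustrative assumptions, not values taken from the patent (which additionally uses a non-isotropic colour distance and a size distribution of saliency clusters to pick region sizes):

```python
import numpy as np

def saliency_map(lab_image, scales=(1, 2, 4), colour_weight=0.6):
    """Sketch of the claimed pipeline: colour and luminance contrast at
    several downscaled region sizes, combined with a fixed weight.

    lab_image: H x W x 3 float array holding (L*, a*, b*) per pixel.
    """
    h, w, _ = lab_image.shape
    result = np.zeros((h, w))
    for s in scales:
        # Downscaled representation of the image (simple subsampling).
        small = lab_image[::s, ::s]
        # Luminance contrast: deviation from the region's mean L*.
        lum = np.abs(small[..., 0] - small[..., 0].mean())
        # Colour contrast: Euclidean distance from the mean (a*, b*).
        chroma = small[..., 1:] - small[..., 1:].mean(axis=(0, 1))
        col = np.linalg.norm(chroma, axis=-1)
        # Weighted colour-luminance combination.
        combined = colour_weight * col + (1 - colour_weight) * lum
        # Upsample back to full resolution and accumulate.
        result += np.kron(combined, np.ones((s, s)))[:h, :w]
    return result / len(scales)
```

A pixel whose chrominance differs strongly from its surroundings receives a high value at every scale, so it dominates the accumulated map, which matches the intuition behind contrast-based saliency.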
AU2012268887A 2012-12-24 2012-12-24 Saliency prediction method Abandoned AU2012268887A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2012268887A AU2012268887A1 (en) 2012-12-24 2012-12-24 Saliency prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2012268887A AU2012268887A1 (en) 2012-12-24 2012-12-24 Saliency prediction method

Publications (1)

Publication Number Publication Date
AU2012268887A1 true AU2012268887A1 (en) 2014-07-10

Family

ID=51228847

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2012268887A Abandoned AU2012268887A1 (en) 2012-12-24 2012-12-24 Saliency prediction method

Country Status (1)

Country Link
AU (1) AU2012268887A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933738A (en) * 2015-06-16 2015-09-23 中国人民解放军国防科学技术大学 Visual saliency map generation method based on local structure detection and contrast
CN104933738B (en) * 2015-06-16 2017-09-15 中国人民解放军国防科学技术大学 Visual saliency map generation method based on local structure detection and contrast
CN106056579A (en) * 2016-05-20 2016-10-26 南京邮电大学 Saliency detection method based on background contrast
CN107239760A (en) * 2017-06-05 2017-10-10 中国人民解放军军事医学科学院基础医学研究所 Video data processing method and system
CN107239760B (en) * 2017-06-05 2020-07-17 中国人民解放军军事医学科学院基础医学研究所 Video data processing method and system
CN112487225A (en) * 2020-12-11 2021-03-12 联通(浙江)产业互联网有限公司 Saliency image generation method and device and server
CN112487225B (en) * 2020-12-11 2022-07-08 联通(浙江)产业互联网有限公司 Saliency image generation method and device and server

Similar Documents

Publication Publication Date Title
WO2020199931A1 (en) Face key point detection method and apparatus, and storage medium and electronic device
EP3338217B1 (en) Feature detection and masking in images based on color distributions
US8929680B2 (en) Method, apparatus and system for identifying distracting elements in an image
US11481869B2 (en) Cross-domain image translation
WO2018166438A1 (en) Image processing method and device and electronic device
Li et al. Finding the secret of image saliency in the frequency domain
US20170228609A1 (en) Liveness testing methods and apparatuses and image processing methods and apparatuses
US10073602B2 (en) System and method for displaying a suggested luminance adjustment for an image
US20120212573A1 (en) Method, terminal and computer-readable recording medium for generating panoramic images
CN108701355B (en) GPU optimization and online single Gaussian-based skin likelihood estimation
US20110268319A1 (en) Detecting and tracking objects in digital images
CN111275784A (en) Method and device for generating image
US10628999B2 (en) Method and apparatus with grid-based plane estimation
AU2012268887A1 (en) Saliency prediction method
US20130182943A1 (en) Systems and methods for depth map generation
CN113642359B (en) Face image generation method and device, electronic equipment and storage medium
Riche et al. Bottom-up saliency models for still images: A practical review
Zhao et al. Saliency map-aided generative adversarial network for raw to rgb mapping
WO2023273515A1 (en) Target detection method, apparatus, electronic device and storage medium
AU2015258346A1 (en) Method and system of transitioning between images
Mazumdar et al. A content-based approach for saliency estimation in 360 images
Hsu et al. A hybrid algorithm with artifact detection mechanism for region filling after object removal from a digital photograph
Gibson et al. Hazy image modeling using color ellipsoids
Herfet et al. Light field representation: The dimensions in light fields
Solanke et al. GPU Accelerated Computing for Human Skin Colour Detection Using YCbCr Colour Model

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application