AU2015271981A1 - Method, system and apparatus for modifying a perceptual attribute for at least a part of an image - Google Patents

Method, system and apparatus for modifying a perceptual attribute for at least a part of an image

Info

Publication number
AU2015271981A1
Authority
AU
Australia
Prior art keywords
image
depth
depth range
scene
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2015271981A
Inventor
Nicolas Pierre Marie Frederic Bonnier
Steven Richard Irrgang
Timothy Stephen Mason
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2015271981A priority Critical patent/AU2015271981A1/en
Priority to US15/381,466 priority patent/US10198794B2/en
Publication of AU2015271981A1 publication Critical patent/AU2015271981A1/en
Abandoned legal-status Critical Current

Abstract

A method of modifying a perceptual attribute for at least a part of an image. A region of the image representing a first object in a scene captured by the image is selected. The method determines, from a depth map of the scene, an initial depth range for the selected image region. The initial depth range for the selected image region is adjusted to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene. An image process is applied to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.

[Fig. 2: flow diagram — Start; determine stereo disparity (210); refine depth (220); generate aesthetic depth map; display image (240); receive desired level of change; select processes; enhance image; display enhanced image; End]

Description

METHOD, SYSTEM AND APPARATUS FOR MODIFYING A PERCEPTUAL ATTRIBUTE FOR AT LEAST A PART OF AN IMAGE
TECHNICAL FIELD
The present invention is in the field of image processing and, in particular, automatic aesthetic enhancement of images. The present invention also relates to a method, system and apparatus for modifying a perceptual attribute for at least a part of an image, and to a computer program product including a computer readable medium having recorded thereon a computer program for modifying a perceptual attribute for at least a part of an image.
BACKGROUND

The widespread use of digital photography along with the abundance of computing resources has allowed convenient digital enhancement of images. Manual enhancement of images can be time consuming, and amateur photographers may not know the best way to modify their images to achieve a desired effect. Increasingly, more sophisticated and higher-level tools are being created, progressing towards an ideal where a photographer simply specifies a desired aesthetic impact of a change and processing is automatically performed to achieve that effect.
One desirable aesthetic enhancement is to increase the sense of depth in an image. The enhancement can include increasing the sense of shape within objects, increasing the visibility of changes in depth, and increasing the sense of separation between the foreground and background elements.
Methods exist which automatically increase the sense of depth in an image, using a depth map or a small set of depth layers to inform processing of the image. The existing methods may blur the background, sharpen the foreground, or modify other aspects such as colourfulness, brightness or hue to differentiate the foreground and background.

Some existing methods segment an image into a small number of discrete layers such as a “foreground”, “background” and potentially “midground”. However, the segmentation into layers does not preserve the depth information within each object. Furthermore, the segmentation into layers does not allow for the case of multiple foreground objects in different layers.
Thus, a need exists for an improved method of enhancing images.
SUMMARY
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.

Disclosed are arrangements for aesthetically enhancing an image. An image with a depth map is provided. Objects in the scene are segmented based on the combination of the image and the depth map. The importance of each object is evaluated based on clues such as focus, salience and centrality. An aesthetic depth map is generated based on the input depth map and the object importance. The aesthetic depth map has the most important objects in the scene marked as nearby, and separates the foreground from the background as well as different foreground layers from each other, while still preserving depth details within each object. Processing is then performed on the image which emphasises the important objects based on the aesthetic depth map. The result of the processing is an aesthetically enhanced image.
According to one aspect of the present disclosure, there is provided a method of modifying a perceptual attribute for at least a part of an image, said method comprising the steps of: selecting a region of the image representing a first object in a scene captured by the image; determining, from a depth map of the scene, an initial depth range for the selected image region; adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
According to another aspect of the present disclosure, there is provided a system for modifying a perceptual attribute for at least a part of an image, said system comprising: a memory storing data and a computer program; a processor coupled to the memory for executing the computer program, the computer program comprising instructions for: selecting a region of the image representing a first object in a scene captured by the image; determining, from a depth map of the scene, an initial depth range for the selected image region; adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
According to still another aspect of the present disclosure, there is provided an apparatus for modifying a perceptual attribute for at least a part of an image, said apparatus comprising: means for selecting a region of the image representing a first object in a scene captured by the image; means for determining, from a depth map of the scene, an initial depth range for the selected image region; means for adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and means for applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
According to still another aspect of the present disclosure, there is provided a computer readable medium having a computer program stored on the medium for modifying a perceptual attribute for at least a part of an image, said program comprising: code for selecting a region of the image representing a first object in a scene captured by the image; code for determining, from a depth map of the scene, an initial depth range for the selected image region; code for adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and code for applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
Other aspects are also disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS
One or more embodiments of the invention will now be described with reference to the following drawings, in which:
Figs. 1A and 1B form a schematic block diagram of a general purpose computer system upon which arrangements described can be practiced;
Fig. 2 is a schematic flow diagram illustrating a method of aesthetically enhancing an already captured image or image pair according to one embodiment of the invention;

Fig. 3 is a schematic flow diagram illustrating a method of capturing and aesthetically enhancing an image according to one embodiment of the invention;
Fig. 4 is a schematic flow diagram showing a method of aesthetically enhancing an image;
Fig. 5 is a diagram representing objects in an example image to be aesthetically enhanced; and

Fig. 6 is a schematic flow diagram showing a method of generating an aesthetic depth map.
DETAILED DESCRIPTION INCLUDING BEST MODE
As discussed above, one desirable aesthetic enhancement is to increase the sense of depth in an image. The enhancement can include increasing the sense of shape within objects, increasing visibility of changes in depth, and increasing the sense of separation between foreground and background elements. For example, modifications may be made to the image that bring foreground objects perceptually closer or forward in the image, background objects perceptually further or backward in the image, or a combination of both. Perceptually closer, forward, further or backward in this regard refers to people observing the images having more of a sense of those scene elements being closer or further, regardless of whether the geometry implied by the objects in the scene has changed. A faded or blurrier background is perceptually further or backward in the image because the background draws less attention and low level visual cues suggest the background is further away even though the geometry of the scene is unchanged.

A useful input to aid in enhancing the sense of depth of an image is knowledge of the actual depth of the objects in a scene. With increasing interest in three dimensional (3D) stereoscopic viewing, increasingly many images and videos are captured alongside depth information. Technology for measuring or estimating depth is also improving. Depth can be captured or estimated, for example, by directly scanning a scene at the time of capture or by capturing multiple images from different viewpoints. As another example, depth can be captured or estimated by capturing multiple images with different camera parameters such as depth of field or by estimating from the level of blur in a single image. Depth can also be captured or estimated by analysing the scene contents and geometry, or by a combination of the above methods.
As described above, some methods of increasing sense of depth in an image may blur the background, sharpen the foreground, or modify other aspects such as colourfulness, brightness or hue to differentiate the foreground and background. However, the actual depth of objects in the scene may not correspond to an ideal aesthetic enhancement of the image. The distance in depth between objects also may not correspond to the desired difference in processing. Some objects in the scene may be closer to a camera than the subject of the image, and enhancing those closer objects may not be desirable. In many images the closest object is the area of ground at the bottom of the image, and it is not desirable to bring the ground perceptually closer rather than the subject of the photo.

A method 400 of aesthetically enhancing an image will be described below with reference to Fig. 4. The image is enhanced in accordance with the method 400 by modifying a perceptual attribute for at least a part of an image.
Figs. 1A and 1B depict a general-purpose computer system 100, upon which the various arrangements described can be practiced.

As seen in Fig. 1A, the computer system 100 includes: a computer module 101; input devices such as a keyboard 102, a mouse pointer device 103, a scanner 126, a camera 127, and a microphone 180; and output devices including a printer 115, a display device 114 and loudspeakers 117. An external Modulator-Demodulator (Modem) transceiver device 116 may be used by the computer module 101 for communicating to and from a communications network 120 via a connection 121. The communications network 120 may be a wide-area network (WAN), such as the Internet, a cellular telecommunications network, or a private WAN. Where the connection 121 is a telephone line, the modem 116 may be a traditional “dial-up” modem. Alternatively, where the connection 121 is a high capacity (e.g., cable) connection, the modem 116 may be a broadband modem. A wireless modem may also be used for wireless connection to the communications network 120.
The computer module 101 typically includes at least one processor unit 105, and a memory unit 106. For example, the memory unit 106 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM). The computer module 101 also includes a number of input/output (I/O) interfaces including: an audio-video interface 107 that couples to the video display 114, loudspeakers 117 and microphone 180; an I/O interface 113 that couples to the keyboard 102, mouse 103, scanner 126, camera 127 and optionally a joystick or other human interface device (not illustrated); and an interface 108 for the external modem 116 and printer 115. In some implementations, the modem 116 may be incorporated within the computer module 101, for example within the interface 108. The computer module 101 also has a local network interface 111, which permits coupling of the computer system 100 via a connection 123 to a local-area communications network 122, known as a Local Area Network (LAN). As illustrated in Fig. 1A, the local communications network 122 may also couple to the wide network 120 via a connection 124, which would typically include a so-called “firewall” device or device of similar functionality. The local network interface 111 may comprise an Ethernet circuit card, a Bluetooth® wireless arrangement or an IEEE 802.11 wireless arrangement; however, numerous other types of interfaces may be practiced for the interface 111.
The I/O interfaces 108 and 113 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 109 are provided and typically include a hard disk drive (HDD) 110. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 112 is typically provided to act as a non-volatile source of data. Portable memory devices, such as optical disks (e.g., CD-ROM, DVD, Blu-ray Disc™), USB-RAM, portable external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the system 100.

The components 105 to 113 of the computer module 101 typically communicate via an interconnected bus 104 and in a manner that results in a conventional mode of operation of the computer system 100 known to those in the relevant art. For example, the processor 105 is coupled to the system bus 104 using a connection 118. Likewise, the memory 106 and optical disk drive 112 are coupled to the system bus 104 by connections 119. Examples of computers on which the described arrangements can be practised include IBM-PC’s and compatibles, Sun Sparcstations, Apple Mac™ or like computer systems.
The method 400 and other methods described below may be implemented using the computer system 100, wherein the processes of Figs. 2 to 6, to be described, may be implemented as one or more software application programs 133 executable within the computer system 100. In particular, the steps of the described methods are effected by instructions 131 (see Fig. 1B) in the software 133 that are carried out within the computer system 100. The software instructions 131 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules perform the described methods and a second part and the corresponding code modules manage a user interface between the first part and the user.

The software may be stored in a computer readable medium, including the storage devices described below, for example. The software 133 is typically stored in the HDD 110 or the memory 106. The software is loaded into the computer system 100 from the computer readable medium, and then executed by the computer system 100. Thus, for example, the software 133 may be stored on an optically readable disk storage medium (e.g., CD-ROM) 125 that is read by the optical disk drive 112. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 100 preferably effects an advantageous apparatus for implementing the described methods.
In some instances, the application programs 133 may be supplied to the user encoded on one or more CD-ROMs 125 and read via the corresponding drive 112, or alternatively may be read by the user from the networks 120 or 122. Still further, the software can also be loaded into the computer system 100 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 100 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray™ Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 101. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computer module 101 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The second part of the application programs 133 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 114. Through manipulation of typically the keyboard 102 and the mouse 103, a user of the computer system 100 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 117 and user voice commands input via the microphone 180.
Fig. 1B is a detailed schematic block diagram of the processor 105 and a “memory” 134. The memory 134 represents a logical aggregation of all the memory modules (including the HDD 109 and semiconductor memory 106) that can be accessed by the computer module 101 in Fig. 1A.

When the computer module 101 is initially powered up, a power-on self-test (POST) program 150 executes. The POST program 150 is typically stored in a ROM 149 of the semiconductor memory 106 of Fig. 1A. A hardware device such as the ROM 149 storing software is sometimes referred to as firmware. The POST program 150 examines hardware within the computer module 101 to ensure proper functioning and typically checks the processor 105, the memory 134 (109, 106), and a basic input-output systems software (BIOS) module 151, also typically stored in the ROM 149, for correct operation. Once the POST program 150 has run successfully, the BIOS 151 activates the hard disk drive 110 of Fig. 1A. Activation of the hard disk drive 110 causes a bootstrap loader program 152 that is resident on the hard disk drive 110 to execute via the processor 105. This loads an operating system 153 into the RAM memory 106, upon which the operating system 153 commences operation. The operating system 153 is a system level application, executable by the processor 105, to fulfil various high level functions, including processor management, memory management, device management, storage management, software application interface, and generic user interface.
The operating system 153 manages the memory 134 (109, 106) to ensure that each process or application running on the computer module 101 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the system 100 of Fig. 1A must be used properly so that each process can run effectively. Accordingly, the aggregated memory 134 is not intended to illustrate how particular segments of memory are allocated (unless otherwise stated), but rather to provide a general view of the memory accessible by the computer system 100 and how such is used.
As shown in Fig. 1B, the processor 105 includes a number of functional modules including a control unit 139, an arithmetic logic unit (ALU) 140, and a local or internal memory 148, sometimes called a cache memory. The cache memory 148 typically includes a number of storage registers 144 - 146 in a register section. One or more internal busses 141 functionally interconnect these functional modules. The processor 105 typically also has one or more interfaces 142 for communicating with external devices via the system bus 104, using a connection 118. The memory 134 is coupled to the bus 104 using a connection 119.
The application program 133 includes a sequence of instructions 131 that may include conditional branch and loop instructions. The program 133 may also include data 132 which is used in execution of the program 133. The instructions 131 and the data 132 are stored in memory locations 128, 129, 130 and 135, 136, 137, respectively. Depending upon the relative size of the instructions 131 and the memory locations 128-130, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 130. Alternately, an instruction may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 128 and 129.
In general, the processor 105 is given a set of instructions which are executed therein. The processor 105 waits for a subsequent input, to which the processor 105 reacts by executing another set of instructions. Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 102, 103, data received from an external source across one of the networks 120, 122, data retrieved from one of the storage devices 106, 109 or data retrieved from a storage medium 125 inserted into the corresponding reader 112, all depicted in Fig. 1A. The execution of a set of the instructions may in some cases result in output of data. Execution may also involve storing data or variables to the memory 134.
The disclosed arrangements use input variables 154, which are stored in the memory 134 in corresponding memory locations 155, 156, 157. The disclosed arrangements produce output variables 161, which are stored in the memory 134 in corresponding memory locations 162, 163, 164. Intermediate variables 158 may be stored in memory locations 159, 160, 166 and 167.
Referring to the processor 105 of Fig. 1B, the registers 144, 145, 146, the arithmetic logic unit (ALU) 140, and the control unit 139 work together to perform sequences of micro-operations needed to perform “fetch, decode, and execute” cycles for every instruction in the instruction set making up the program 133. Each fetch, decode, and execute cycle comprises: a fetch operation, which fetches or reads an instruction 131 from a memory location 128, 129, 130; a decode operation in which the control unit 139 determines which instruction has been fetched; and an execute operation in which the control unit 139 and/or the ALU 140 execute the instruction.
Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 139 stores or writes a value to a memory location 132.
Each step or sub-process in the processes of Figs. 2 to 6 is associated with one or more segments of the program 133 and is performed by the register section 144, 145, 147, the ALU 140, and the control unit 139 in the processor 105 working together to perform the fetch, decode, and execute cycles for every instruction in the instruction set for the noted segments of the program 133.
The described methods may alternatively be implemented on a general purpose electronic device (not shown) including embedded components. Such an electronic device may include, for example, a mobile phone, a portable media player or a digital camera.

The described methods may also be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub functions of the described methods. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
The method 400 may be implemented as one or more software code modules of the application program 133 resident on the hard disk drive 110 and being controlled in its execution by the processor 105. The method 400 begins at receiving step 410, where an image along with associated depth data is received under execution of the processor 105. The depth data may be captured or estimated, for example, using any one of the methods described above.
The image and associated depth data may be stored in the storage module 109. The depth data may be stored at a different resolution to the image or in a compressed form. However, the depth data should be configured to provide a depth value for any given pixel of the image.
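As a brief editorial illustration (not part of the original specification), the requirement that the depth data provide a value for any image pixel can be met by interpolating a lower-resolution depth map on demand; the sketch below uses bilinear sampling, assumes NumPy, and all names are chosen for illustration only.

```python
import numpy as np

def depth_at_pixel(depth_map, x, y, image_width, image_height):
    """Return a depth value for image pixel (x, y) by bilinearly sampling a
    depth map that may be stored at a lower resolution than the image."""
    dh, dw = depth_map.shape
    # Map image coordinates into depth-map coordinates.
    u = (x + 0.5) * dw / image_width - 0.5
    v = (y + 0.5) * dh / image_height - 0.5
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    fx, fy = u - x0, v - y0
    # Clamp neighbour indices to the valid depth-map range.
    x0, x1 = np.clip([x0, x0 + 1], 0, dw - 1)
    y0, y1 = np.clip([y0, y0 + 1], 0, dh - 1)
    top = (1 - fx) * depth_map[y0, x0] + fx * depth_map[y0, x1]
    bottom = (1 - fx) * depth_map[y1, x0] + fx * depth_map[y1, x1]
    return (1 - fy) * top + fy * bottom
```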
At segmenting step 420, a segmentation process is performed on the image to identify the objects of a scene captured by the image. The captured image is segmented into a plurality of regions with each region representing an object in the scene captured by the image. The output of the segmentation process is a collection of regions each representing a segmented object, and a mask representing the pixel level location of each segmented object in the image and corresponding depth map.

Fig. 5 shows an example image 500 of a scene. In the example of Fig. 5, step 420 is configured to produce a segmentation of the scene captured in the image 500 into five objects: a person 510, the ground 520, another person 530, a tree 540 and the sky 550. In practice, however, the segmented objects may not correspond exactly to the objects in the scene captured in the image 500. For example, the person 530 may consist of a separate segmented object for the arm and the body, or person 510 may be divided into many regions each representing segmented objects based on the clothes that the person 510 is wearing and different parts of the body.
At step 420, the segmentation may be performed on the image data or on the depth data. Alternatively, the segmentation may be performed in a manner informed by both the image and depth data. Many methods of performing object segmentation are known, with varying degrees of complexity, reliability and resources required. Any suitable method may be used at step 420, with the specific selection of the segmentation method used being dependent on the context of the system 100 on which the method 400 is practiced.
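By way of a hedged sketch only, a very rough segmentation informed by both image and depth data could be obtained by clustering per-pixel features; a real implementation of step 420 would use a dedicated segmentation algorithm, and the feature weights below are arbitrary assumptions.

```python
import numpy as np

def segment_with_depth(image, depth, n_segments=5, n_iter=10, seed=0):
    """Cluster pixels on (colour, depth, position) features with k-means.
    image: HxWx3 floats in [0, 1]; depth: HxW floats.
    Returns an HxW label map with one label per segmented region."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.concatenate([
        image.reshape(-1, 3),                                  # colour
        2.0 * (depth / (depth.max() + 1e-6)).reshape(-1, 1),   # depth (weighted)
        0.5 * (xs / w).reshape(-1, 1),                         # position
        0.5 * (ys / h).reshape(-1, 1),
    ], axis=1)
    rng = np.random.default_rng(seed)
    centres = feats[rng.choice(len(feats), n_segments, replace=False)]
    for _ in range(n_iter):
        dists = np.stack([((feats - c) ** 2).sum(axis=1) for c in centres], axis=1)
        labels = np.argmin(dists, axis=1)
        for k in range(n_segments):
            if np.any(labels == k):
                centres[k] = feats[labels == k].mean(axis=0)
    return labels.reshape(h, w)
```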
The method 400 continues at determining step 430, where the “importance” of each object in the image (e.g., image 500) is determined under execution of the processor 105. “Importance” in this context is an aesthetic evaluation, which refers to whether the object is, for example, the subject of the image, a key foreground element, a distractor, or part of the background of the scene. The importance of an object may be determined by selecting a region of the image representing the object. The importance corresponds to how desirable it is to emphasize or deemphasize each object (i.e., emphasize or deemphasize the region of the image representing the object), according to the artistic intention of the photographer. In order to enable automated aesthetic enhancement, in accordance with the described methods, the photographer’s intention is not gleaned from the photographer themselves but is estimated based on available visual clues.
The centrality of each object (i.e. the distance in pixels between the centre of the object and the centre of the image) may be determined at step 430. The more central objects are more likely to be the subject of the image than objects around the edges of the image and are therefore considered to be more important. Also, the “rule of thirds” may be used at step 430 to determine the importance of each object in the image. The “rule of thirds” suggests that important objects may be placed on the lines one third or two thirds of the way across the image either horizontally or vertically.

Another indicator of the intention of the photographer is focus. The level of blur for each pixel may be determined at step 430 using any suitable method known in the art. Importance of objects may be determined based on sharpness, with the objects which are in focus being considered of higher importance than those objects which are out of focus.
Other information which may be used to inform the determination of importance of the objects at step 430 includes (estimated) salience, high level recognition of specifically important objects such as people, animals and faces, and explicitly identifying less important areas such as ground, floor, walls, sky or ceiling. In one arrangement, importance of objects at step 430 is determined based on ground detection. Importance of objects at step 430 may also be determined based on face detection.

The importance of each object as determined at step 430 may be stored in memory 106, for example, as a value associated with the object.
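As an editorial sketch of how such cues might be combined into a single importance value (the sharpness measure, the weights and the normalisation are assumptions, not taken from the specification):

```python
import numpy as np

def object_importance(mask, sharpness_map, image_shape,
                      w_centrality=0.5, w_sharpness=0.5):
    """Estimate the importance of one segmented object.
    mask: HxW boolean mask of the object; sharpness_map: HxW per-pixel
    sharpness (e.g. local gradient energy), higher meaning more in focus."""
    h, w = image_shape
    ys, xs = np.nonzero(mask)
    # Centrality: 1 at the image centre, falling to 0 towards the corners.
    cy, cx = ys.mean(), xs.mean()
    dist = np.hypot((cy - h / 2) / (h / 2), (cx - w / 2) / (w / 2)) / np.sqrt(2)
    centrality = 1.0 - dist
    # Focus: mean sharpness of the object relative to the whole image.
    focus = sharpness_map[mask].mean() / (sharpness_map.mean() + 1e-6)
    return w_centrality * centrality + w_sharpness * min(focus, 2.0) / 2.0
```

Ground detection, face detection or a salience estimate could be added as further weighted terms in the same way.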
In the example of Fig. 5, the image 500 includes the person 510 who is the subject of the image 500. The image 500 also includes the ground 520, an additional person 530 close to the camera (e.g., the camera 127) used to capture the image 500 and partly visible from behind, some background objects such as a tree 540, and the sky 550. Although person 530 is the closest to the camera 127, a number of clues are available to suggest that the person 530 is not as important in the image 500 as person 510. The camera 127 would have been focused on the person 510, making the person 510 much sharper in the image 500. Person 510 is also close to the centre of the image 500, while person 530 is on the edge of the image 500. Person 510 also has their face visible while person 530 does not. All of the above factors lead to a higher importance value being assigned to person 510 than person 530.
The ground 520 may be directly identified as the ground by virtue of being a flat region extending to the bottom of the image 500 and covering a wide range of depth values.
Identifying the flat region as the ground 520 leads to a low importance value being assigned to the flat region. Object 540 is a large distance away, and is not in focus. As such, the object 540 is assigned a low importance value.
In generating step 440, an aesthetic depth map is generated under execution of the processor 105. The aesthetic depth map is formed based on a combination of the input depth data, the segmented objects, and the object importance determined for each object at step 430. The aesthetic depth map of the scene captured by the image may be used for determining an initial depth range for one or more selected image regions representing one or more objects in the scene. The result of step 440 is an aesthetic depth value for each pixel of the image, ranging from zero (0) to one (1). The range zero (0) to nought-point-five (0.5) corresponds to background elements of the scene, and the range nought-point-five (0.5) to one (1) corresponds to the foreground elements of the scene.

The method 400 concludes at enhancing step 450, where the image (e.g., 500) is aesthetically enhanced, based on the aesthetic depth map generated at step 440. The image is aesthetically enhanced by applying an image adjustment process or image adjustment processes to the image. At step 450, image adjustment processes such as sharpening, unsharp-masking, hue shifting, or increasing brightness or saturation may be applied to objects selected as being in the predetermined nought-point-five (0.5) to one (1) range of the aesthetic depth map. The image adjustment processes may be applied to the selected objects as represented within the image to modify a perceptual attribute of the objects, making the objects stand out more and bringing the objects perceptually closer (or forward). The method 400 is thus configured for bringing forward more important objects.

Inverse processes (such as blurring, decreasing brightness or saturation, shifting hue in the other direction and inverted unsharp-masking) are also applied to the background objects at step 450, moving less important objects backward in the image.
The combination of processes to apply and the maximum strength with which to apply the processes may be determined using a number of methods. For example, the processes to apply and the maximum strength with which to apply the processes may be specified as an additional input. Alternatively, a desired level of change may be specified and the processes to achieve that level of change may be determined using a model of the impact of applying the processes on the perceptual sense of depth. Such a model may be formed by performing psychophysical experiments in which observers are asked to rate the sense of depth in images with different combinations and strengths of processes applied.
The aesthetic depth map controls the strength of processing applied at each pixel of the image (e.g., image 500) at step 450. The strength of processing applied at each pixel of the image is proportional to the difference between the aesthetic depth map value for the pixel and the middle value nought-point-five (0.5). In other words, the strength is equal to maximum strength × 2 × |aesthetic map value − 0.5|. As a result, strong processing may be applied to the front-most parts of the front-most objects in the aesthetic depth map, while little processing is applied to objects close to the middle values of the aesthetic depth map. This difference in processing aesthetically improves the image (e.g., image 500) by increasing the perceived separation between the objects in the scene captured in the image 500.
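A minimal sketch of how the aesthetic depth map could drive the per-pixel processing strength described above, assuming pre-computed sharpened and blurred versions of the image; the simple blend towards one target image is an illustrative simplification of step 450, not the patented method itself.

```python
import numpy as np

def apply_depth_guided_enhancement(image, sharpened, blurred,
                                   aesthetic_depth, max_strength=1.0):
    """Blend towards the sharpened image in the foreground (values above 0.5)
    and towards the blurred image in the background (values below 0.5).
    The per-pixel blend weight is max_strength * 2 * |aesthetic depth - 0.5|."""
    strength = max_strength * 2.0 * np.abs(aesthetic_depth - 0.5)
    strength = strength[..., None]                      # broadcast over colour
    foreground = aesthetic_depth[..., None] >= 0.5
    target = np.where(foreground, sharpened, blurred)   # per-pixel target image
    return (1.0 - strength) * image + strength * target
```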
The method 600 of generating an aesthetic depth map, as executed at step 440, will now be described in more detail with reference to Fig. 6. The method 600 may be implemented as one or more software code modules of the application program 133 resident on the hard disk drive 110 and being controlled in its execution by the processor 105.
The method 600 continues at determining step 610, where it is determined which of the objects in the scene captured by the image received at step 410 constitute the foreground of the scene. The determination is made at step 610 based on the importance of the objects, by taking a threshold on the importance value. For the example image 500 in Fig. 5, the objects comprising the two people 510 and 530 both have a high importance value and are considered foreground, while the objects comprising the ground 520, background object 540 and sky 550 have low importance values and are considered as background.
At generating step 620, an initial transformation is performed to generate an initial aesthetic depth map of the scene captured by the image. The initial aesthetic depth map may be stored within the memory 106. The initial aesthetic depth map of the scene captured by the image may be used for determining an initial depth range for one or more selected image regions representing one or more objects in the scene. A depth midpoint for the scene captured by the image is determined based on the depth ranges of the foreground and background objects.
If possible, the depth midpoint for the scene should lie between the depth values of the foreground and background objects. However, since factors other than depth are used to determine object importance and in turn whether an object is in the foreground or background, in some cases it may not be possible to place the depth midpoint strictly between the foreground and background. In such cases a midpoint may be selected which minimises the number of foreground pixels behind the midpoint and background pixels in front of the midpoint. Based on the depth midpoint, a curve is generated which maps and reverses the initial depth values for selected image regions into the range zero (0) to one (1), such that the furthest away point maps to a revised depth value of zero (0), the midpoint maps to a revised depth value of nought-point-five (0.5) and the closest point maps to a revised depth value of one (1). For the example image 500 in Fig. 5, it is not possible to find a midpoint which puts the foreground objects 510 and 530 in the foreground without at least some of the ground 520 being also included in the foreground. Instead, a point just behind person 510 would be selected as the midpoint.
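The initial transformation of step 620 might be sketched as follows, assuming larger input depth values mean further from the camera and using a piecewise-linear curve as a simplification of the mapping described above; the names and the small clamping constants are illustrative.

```python
import numpy as np

def initial_aesthetic_depth(depth, midpoint):
    """Map and reverse scene depth into [0, 1]: the furthest point maps to 0,
    the midpoint to 0.5 and the closest point to 1."""
    near, far = float(depth.min()), float(depth.max())
    out = np.empty(depth.shape, dtype=float)
    behind = depth >= midpoint
    # Background half: furthest point -> 0, midpoint -> 0.5.
    out[behind] = 0.5 * (far - depth[behind]) / max(far - midpoint, 1e-6)
    # Foreground half: midpoint -> 0.5, closest point -> 1.
    out[~behind] = 0.5 + 0.5 * (midpoint - depth[~behind]) / max(midpoint - near, 1e-6)
    return out
```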
The method 600 continues at determining step 630, where layers are determined for the foreground objects under execution of the processor 105. A layer is a group of objects which have their depth values adjusted together. In one arrangement, objects which have a similar appearance, original depth range and estimated importance are selected and grouped into a single layer. Objects of different importance, appearance and original depth ranges are placed in separate layers. The result of step 630 is a set of foreground layers which partition the foreground objects.
For the example image 500 in Fig. 5, the person 530 may be composed of multiple regions representing multiple objects, as for example the image pixels constituting the arm are disconnected from the image pixels constituting the head. However, each of the objects of person 530 has a similar initial depth value and should also have a similar importance value. In accordance with the method 600, the objects of person 530 are grouped into a single layer representing person 530. Similarly, person 510 may have been segmented into multiple objects; however, the objects of person 510 form a single layer.
The method 600 continues at adjusting step 640, where the initial depth range of each layer is adjusted under execution of the processor 105 to a revised depth range for each layer. As described below, the revised depth ranges may be used to modify a relative depth difference from at least a part of a first object to a second object in the scene.

A number of ranges within the foreground region of nought-point-five (0.5) to one (1), one for each foreground layer, are determined at step 640. A space may also be included in between the ranges. Each foreground layer is then linearly mapped into one range, with the layers sorted by the average importance of the objects in that layer, such that the most important objects have the highest aesthetic depth values. For the example image 500 of Fig. 5, the layer corresponding to person 510 has the highest average importance value, and is mapped into the highest depth range, for example nought-point-eight (0.8) to one (1.0). The layer corresponding to person 530 has a lower importance value and is mapped into a lower range, for example, nought-point-five (0.5) to nought-point-seven (0.7).

Then at adjusting step 650, the depth values of the background objects are adjusted so as to fit inside the range zero (0) to nought-point-five (0.5). For the example image 500 in Fig. 5, some areas of the ground 520 are initially mapped into the foreground. The aesthetic depth values are then adjusted so that the values for the ground 520 are entirely within the range zero (0) to nought-point-five (0.5). The aesthetic depth values of the other background objects are also adjusted to keep the same relative relation to the ground 520.
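A hedged sketch of steps 640 and 650: each foreground layer is linearly remapped into its own sub-range of [0.5, 1], ordered by average importance, and the background values are rescaled into [0, 0.5]. The gap size and the equal-width allocation of sub-ranges are assumptions made for illustration only.

```python
import numpy as np

def adjust_layer_depths(aesthetic, layers, importances, gap=0.05):
    """aesthetic: HxW initial aesthetic depth map; layers: list of HxW boolean
    masks, one per foreground layer; importances: average importance per layer.
    Returns the adjusted aesthetic depth map."""
    out = aesthetic.copy()
    # Step 650: rescale everything outside the foreground layers into [0, 0.5].
    bg = ~np.any(layers, axis=0)
    if bg.any() and out[bg].max() > 0.5:
        span = out[bg].max() - out[bg].min() + 1e-6
        out[bg] = 0.5 * (out[bg] - out[bg].min()) / span
    # Step 640: split [0.5, 1] into one sub-range per layer, with gaps between,
    # the most important layer taking the highest range.
    order = np.argsort(importances)              # least important first
    n = len(layers)
    width = (0.5 - gap * (n - 1)) / n if n else 0.0
    for rank, idx in enumerate(order):
        lo = 0.5 + rank * (width + gap)
        vals = out[layers[idx]]
        span = vals.max() - vals.min() + 1e-6
        out[layers[idx]] = lo + width * (vals - vals.min()) / span
    return out
```

With two layers and a gap of 0.05, this produces ranges comparable to the worked example above: roughly 0.5 to 0.725 for the less important layer and 0.775 to 1.0 for the more important one.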
A method 200 of enhancing an image will now be described with reference to Fig. 2. The method 200 may be implemented as one or more software code modules of the application program 133 resident on the hard disk drive 110 and being controlled in its execution by the processor 105. The method 200 will be described by way of example where the memory 106 of the computer system 100 contains image data representing two images each capturing the same scene from different locations and one of the two images is to be enhanced. Each of the images is captured by a different camera.
The method 200 begins at determining step 210, where a stereo matching process is applied to the two images, under execution of the processor 105, to determine stereo disparity between the two images. Any suitable stereo matching process may be applied to the images at step 210. Step 210 results in a disparity map, representing the distance from each pixel of the image to be enhanced to the best matching (over a local region) pixel in the other image. The disparity map may be stored in the memory 106 under execution of the processor 105. From the disparity and a known, assumed or estimated distance between the cameras used to capture each of the images, the depth of each pixel in either image may be determined. The depth of each pixel is stored in a depth map configured within memory 106 under execution of the processor 105.
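For reference, the disparity-to-depth conversion for a rectified stereo pair follows the usual pinhole relation depth = focal length × baseline / disparity; the short sketch below assumes disparity in pixels and is illustrative only.

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m, min_disp=1e-3):
    """Convert a disparity map (pixels) to depth (metres) given the focal
    length in pixels and the camera baseline in metres; tiny disparities are
    clamped to avoid division by zero."""
    return focal_length_px * baseline_m / np.maximum(disparity, min_disp)
```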
At the next refining step 220, the depth map determined at step 210 is refined. Regions which are only visible in one image due to being occluded in the other image have their depth estimated by other means, such as matching the depth of nearby unoccluded pixels with similar appearance. Noise present in the depth map, particularly for smooth regions where disparity may be hard to estimate, is smoothed out. These and any other errors in the depth map are corrected to remove any artifacts, as artifacts in the depth map may cause visible distortion in the enhanced image if carried through to the aesthetic depth map. Some or all of the corrections may be performed by an existing suitable stereo matching method.
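As a hedged illustration of step 220 only, occluded pixels could be filled from the nearest valid depth and the result lightly smoothed; a real refinement would also be guided by image appearance, and the SciPy-based filters and parameters below are assumptions.

```python
import numpy as np
from scipy import ndimage

def refine_depth(depth, valid_mask, smooth_sigma=1.5):
    """Fill pixels flagged invalid (e.g. occluded in the other view) with the
    nearest valid depth, then apply mild Gaussian smoothing to reduce noise."""
    # For every pixel, find the indices of the nearest valid pixel.
    _, (iy, ix) = ndimage.distance_transform_edt(~valid_mask, return_indices=True)
    filled = depth[iy, ix].astype(float)
    return ndimage.gaussian_filter(filled, sigma=smooth_sigma)
```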
The method 200 continues at generating step 230, where the aesthetic depth map is generated for the image to be enhanced, under execution of the processor 105. The aesthetic depth map is generated at step 230 as described above for steps 420, 430 and 440, in which objects in the image to be enhanced are segmented, the importance of the objects is determined and the aesthetic depth map is generated for the image.
At displaying step 240, one or both of the images are displayed on the display 114 to a user of the method 200. The aesthetic depth map may also be displayed on the display 114.
The method 200 continues at receiving step 250, where a desired level of enhancement is received under execution of the processor 105. A desired level of enhancement is obtained from the user through the user interface, such as by the user moving a slider, or by selecting from a menu then typing a number in a box, or by choosing from a set of predetermined options with descriptions such as “small”, “medium” and “strong”.
At determining step 260, image adjustment processes to be applied to the image to be enhanced are determined, under execution of the processor 105, along with relative strengths of the adjustment processes. A fixed selection of image adjustment processes including one or more processes may be used, or the image adjustment processes may be specified by the user. In one arrangement, a method may be used to automatically determine the most suitable processes to apply to the image to be enhanced based on the content of the image. Such a method could identify visual features of the image or identify semantic content in the image and apply filters known to be suited to images with those visual features or semantic content.
The method 200 continues at enhancing step 270, where the image is aesthetically enhanced, based on the aesthetic depth map generated at step 230, the desired level of enhancement received at step 250 and the image adjustment processes determined at step 260.
Step 270 corresponds to step 450 described above. As described above, image adjustment processes such as sharpening, unsharp-masking, hue shifting, or increasing brightness or saturation may be applied at step 270 to objects selected as being in the predetermined nought-point-five (0.5) to one (1) range of the aesthetic depth map. The image adjustment processes are applied to selected objects as represented within the image to modify a perceptual attribute of the objects, making the objects stand out more and bringing the objects perceptually closer.
The method 200 concludes at displaying step 280, where the enhanced image is displayed to the user on the display 114, under execution of the processor 105. The user may then review the image and select to store the enhanced image (e.g., in memory 106), or apply and compare different levels of enhancement.
As described above, the described methods may alternatively be implemented on a digital camera or similar device. A method 300 of enhancing an image will now be described with reference to Fig. 3. The method 300 will be described by way of example where the method 300 is implemented using the camera 127 which is in the form of a digital camera.
For the purposes of the present description, the camera 127 will be considered to have the same configuration as the computer module 101, as described above with reference to Figs. 1A and 1B, including the processor 105 and the memory 106 described above. However, for the camera 127, the video display 114 is a liquid crystal display (LCD) panel or the like. The camera 127 also includes user input devices which for the camera 127 are typically formed by keys, a keypad or like controls. In some implementations, the user input devices may include a touch sensitive panel physically associated with the display 114 to collectively form a touchscreen for the camera 127. The camera 127 also includes a lens (not shown), focus control (not shown) and image sensor (not shown).

The camera 127 may capture both an image and an associated depth map through a number of means. For example, a direct depth sensor may be attached to the camera 127. Alternatively, multiple images may be captured in quick succession with different focus brackets allowing depth from defocus to be applied.
The method 300 is implemented as one or more software code modules of the application program 133 resident in the memory 106 of the camera 127 and being controlled in its execution by the processor 105 of the camera 127. The image captured by the camera 127 is enhanced in accordance with the method 300 using depth information.
The method 300 begins at receiving step 310, where a camera mode in which the image enhancement is active is received by the camera 127 under execution of the processor 105 of the camera 127. For example, the user may select the enhancement mode using the user input devices for the camera 127.
Then at capturing step 320, an image of a scene is captured under execution of the processor 105 of the camera 127. Also at step 320, a depth map of the scene is captured or estimated under execution of the processor 105 using any one of the methods described above.

The method 300 continues at determining step 330, where the aesthetic depth map is generated for the image captured at step 320. The aesthetic depth map is generated at step 330 as described above for steps 420, 430 and 440, in which objects in the image captured at step 320 are segmented, the importance of the objects is determined and the aesthetic depth map is generated for the image captured at step 320.

At enhancing step 340, the image is aesthetically enhanced based on the aesthetic depth map generated at step 330. Step 340 corresponds to step 450 as described above. As described above, image adjustment processes such as sharpening, unsharp-masking, hue shifting, or increasing brightness or saturation may be applied at step 340 to objects selected as being in the predetermined nought-point-five (0.5) to one (1) range of the aesthetic depth map. The image adjustment processes are applied to selected objects as represented within the image to modify a perceptual attribute of the objects, making the objects stand out more and bringing the objects perceptually closer or forward in the image.
The method 300 continues at displaying step 350, where the enhanced image is displayed to the user on the display 114 for review.

The method 300 concludes at storing step 360, where both the original image captured at step 320 and the enhanced image are stored on the memory 106 in the camera 127. Alternatively, just the enhanced image may be stored, or the user may, after reviewing the image at step 350, choose to store just one of the enhanced or original captured image.
Industrial Applicability
The arrangements described are applicable to the computer and data processing industries and particularly for image processing.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.
In the context of this specification, the word “comprising” means “including principally but not necessarily solely” or “having” or “including”, and not “consisting only of”. Variations of the word “comprising”, such as “comprise” and “comprises”, have correspondingly varied meanings.

Claims (18)

1. A method of modifying a perceptual attribute for at least a part of an image, said method comprising the steps of: selecting a region of the image representing a first object in a scene captured by the image; determining, from a depth map of the scene, an initial depth range for the selected image region; adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
2. The method according to claim 1, wherein the depth map is captured.
3. The method according to claim 1, wherein the depth map is estimated.
4. The method according to claim 1, wherein the image process is sharpening, unsharp-masking, brightness adjusting, saturation and/or hue shifting.
5. The method according to claim 1, further comprising: determining importance of the objects; and bringing forward more important objects and moving backward less important objects based on the determined importance.
6. The method according to claim 5, wherein importance is determined based on one or more of salience, centrality, brightness, ground detection, sharpness and face detection.
7. The method according to claim 5, wherein the first object is mapped into a predetermined depth range based on the importance of the first object.
8. The method according to claim 5, wherein most important objects have highest depth values.
9. The method according to claim 1, further comprising determining layers for the first object and the second object.
10. The method according to claim 1, wherein regions having a similar appearance, depth range and importance are grouped into a single layer.
11. The method according to claim 1, wherein regions having a different appearance, depth range and importance are grouped into separate layers.
12. The method according to claim 1, further comprising applying stereo matching to the image to determine stereo disparity.
13. The method according to claim 1, wherein errors in the depth map are corrected to remove artifacts.
14. The method according to claim 1, further comprising determining a level of enhancement to be applied to the image.
15. The method according to claim 1, wherein each region is composed of multiple objects having a similar initial depth range.
16. A system for modifying a perceptual attribute for at least a part of an image, said system comprising: a memory storing data and a computer program; a processor coupled to the memory for executing the computer program, the computer program comprising instructions for: selecting a region of the image representing a first object in a scene captured by the image; determining, from a depth map of the scene, an initial depth range for the selected image region; adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
17. An apparatus for modifying a perceptual attribute for at least a part of an image, said apparatus comprising: means for selecting a region of the image representing a first object in a scene captured by the image; means for determining, from a depth map of the scene, an initial depth range for the selected image region; means for adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and means for applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
18. A computer readable medium having a computer program stored on the medium for modifying a perceptual attribute for at least a part of an image, said program comprising: code for selecting a region of the image representing a first object in a scene captured by the image; code for determining, from a depth map of the scene, an initial depth range for the selected image region; code for adjusting the initial depth range for the selected image region to a revised depth range to modify a relative depth difference from at least a part of the first object to a second object in the scene; and code for applying an image process to the selected image region in accordance with the revised depth range to modify a perceptual attribute of the selected object as represented within the image.
AU2015271981A 2015-12-18 2015-12-21 Method, system and apparatus for modifying a perceptual attribute for at least a part of an image Abandoned AU2015271981A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2015271981A AU2015271981A1 (en) 2015-12-21 2015-12-21 Method, system and apparatus for modifying a perceptual attribute for at least a part of an image
US15/381,466 US10198794B2 (en) 2015-12-18 2016-12-16 System and method for adjusting perceived depth of an image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2015271981A AU2015271981A1 (en) 2015-12-21 2015-12-21 Method, system and apparatus for modifying a perceptual attribute for at least a part of an image

Publications (1)

Publication Number Publication Date
AU2015271981A1 true AU2015271981A1 (en) 2017-07-06

Family

ID=59249026

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2015271981A Abandoned AU2015271981A1 (en) 2015-12-18 2015-12-21 Method, system and apparatus for modifying a perceptual attribute for at least a part of an image

Country Status (1)

Country Link
AU (1) AU2015271981A1 (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221909A (en) * 2021-05-12 2021-08-06 佛山育脉科技有限公司 Image processing method, image processing apparatus, and computer-readable storage medium
CN113221909B (en) * 2021-05-12 2023-01-31 佛山育脉科技有限公司 Image processing method, image processing apparatus, and computer-readable storage medium
CN115174774A (en) * 2022-06-29 2022-10-11 上海飞机制造有限公司 Depth image compression method, device, equipment and storage medium
CN115174774B (en) * 2022-06-29 2024-01-26 上海飞机制造有限公司 Depth image compression method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US9589319B2 (en) Method, system and apparatus for forming a high resolution depth map
US9311901B2 (en) Variable blend width compositing
US9591237B2 (en) Automated generation of panning shots
CN104375797B (en) Information processing method and electronic equipment
US10148895B2 (en) Generating a combined infrared/visible light image having an enhanced transition between different types of image information
KR20160021607A (en) Method and device to display background image
US20150077591A1 (en) Information processing device and information processing method
EP2866196A1 (en) An apparatus, a method and a computer program for image segmentation
US10198794B2 (en) System and method for adjusting perceived depth of an image
Kim et al. 2D-to-3D conversion by using visual attention analysis
US11368661B2 (en) Image synthesis method, apparatus and device for free-viewpoint
US11449968B2 (en) System and method for synthetic depth-of-field effect rendering for videos
CN109219833B (en) Enhancing edges in an image using depth information
US11127141B2 (en) Image processing apparatus, image processing method, and a non-transitory computer readable storage medium
US9143754B2 (en) Systems and methods for modifying stereoscopic images
US20140098246A1 (en) Method, Apparatus and Computer-Readable Recording Medium for Refocusing Photographed Image
AU2015271981A1 (en) Method, system and apparatus for modifying a perceptual attribute for at least a part of an image
AU2016273984A1 (en) Modifying a perceptual attribute of an image using an inaccurate depth map
CN105893578B (en) A kind of method and device of photo selection
AU2015258346A1 (en) Method and system of transitioning between images
AU2016273979A1 (en) System and method for adjusting perceived depth of an image
US20170168687A1 (en) Image processing method and apparatus for operating in low-power mode
US11176728B2 (en) Adaptive depth-guided non-photorealistic rendering method and device
AU2011200830B2 (en) Method, apparatus and system for modifying quality of an image
AU2015271935A1 (en) Measure of image region visual information

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application