AU2016273984A1 - Modifying a perceptual attribute of an image using an inaccurate depth map


Info

Publication number
AU2016273984A1
Authority
AU
Australia
Prior art keywords
image
regions
artefacts
region
depth map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2016273984A
Inventor
Matthew Raphael Arnison
Nicolas Pierre Marie Frederic Bonnier
Peter Jan Pakulski
Stuart William Perry
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Priority to AU2016273984A
Publication of AU2016273984A1
Legal status: Abandoned

Landscapes

  • Image Processing (AREA)

Abstract

A method of modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image. A visibility measure is determined for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map. The image artefacts are caused by inaccurate depth values of the corresponding depth map. A first and second image adjustment process is selected from the set of image adjustment processes based on respective determined visibility measures. A perceptual attribute of the image is modified by locally applying the first and second selected image adjustment processes to different respective regions of the image.

Description

MODIFYING A PERCEPTUAL ATTRIBUTE OF AN IMAGE USING AN
INACCURATE DEPTH MAP
TECHNICAL FIELD
The present invention relates generally to the field of image processing, and, in particular, to the field of processing images according to how images are perceived by humans.
BACKGROUND
Processing an image with reference to depth information for the scene the image is capturing is used in a variety of applications.
One field in which depth information in the form of depth-maps is used to process images is Depth Image Based Rendering (DIBR), where depth values are used to offset pixels in an image to generate a novel view of the scene. A common purpose is to generate a second view for stereo 3D display applications. The quality of the depth map is important in such applications, and research has been done into filtering or modifying depth maps to make the end result more pleasing, for example into the compression requirements for depth-maps to avoid introducing artefacts, and into the perceptual distortions arising from DIBR techniques when the depth-map contains errors or artefacts.
Modifying perceptual aspects of images in an image-dependent manner is implemented in some cameras, for example cameras which detect faces and switch to a “Portrait mode” in which the camera prefers a wide aperture setting to produce an out-of-focus background.
With the growing availability of depth-maps, other aspects of images are being modified using the depth-map information. Some of these aspects are perceptual in nature. There are techniques that use depth maps for enhancing the perceived depth of objects in an image. However, depth maps may have errors and these techniques do not consider those errors or compensate for them.
Image cropping and retargeting methods exist which use depth maps to improve an estimate of scene geometry and saliency in order to perform seam carving and retarget an image for a different aspect ratio. Inaccurate values in the depth map will result in errors in the estimated saliency data which in turn will result in output retargeted images with visible distortions across salient objects.
There exists a need for an improved modification of perceptual aspects of an image when using inaccurate depth information.
SUMMARY
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.
According to one aspect of the present disclosure, there is provided a method of modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image, said method comprising: determining a visibility measure for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map, said image artefacts being caused by inaccurate depth values of the corresponding depth map; selecting a first and second image adjustment process from the set of image adjustment processes based on respective determined visibility measures; and modifying a perceptual attribute of the image by locally applying the first and second selected image adjustment processes to different respective regions of the image.
According to another aspect of the present disclosure, there is provided an apparatus for modifying a perceptual attribute of an image, the apparatus comprising: a processor; and a memory storing a computer executable software program for directing the processor to perform a method comprising the steps of: determining a visibility measure for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map, said image artefacts being caused by inaccurate depth values of the corresponding depth map; selecting a first and second image adjustment process from the set of image adjustment processes based on respective determined visibility measures; and modifying a perceptual attribute of the image by locally applying the first and second selected image adjustment processes to different respective regions of the image.
According to still another aspect of the present disclosure, there is provided a non-transitory computer readable medium storing a computer executable software program for directing a processor to perform a method for modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image, said method comprising: determining a visibility measure for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map, said image artefacts being caused by inaccurate depth values of the corresponding depth map; selecting a first and second image adjustment process from the set of image adjustment processes based on respective determined visibility measures; and modifying a perceptual attribute of the image by locally applying the first and second selected image adjustment processes to different respective regions of the image.
According to another aspect of the present disclosure, there is provided a method of modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image, said method comprising: generating a first modified image by applying a first image adjustment process to the image and a second modified image by applying a second image adjustment process to the image, said image adjustment processes being selected from a set of image adjustment processes and being applied to content of the image to modify a perceptual attribute of the image in accordance with depth values in the depth map; determining a visibility measure for image artefacts produced in each of a plurality of regions of the first and second modified images by the corresponding image adjustment processes, said image artefacts being caused by inaccurate depth values of the corresponding depth map; and selecting one of the first and second modified images based on respective determined visibility measures.
According to another aspect of the present disclosure, there is provided an apparatus for implementing any one of the aforementioned methods.
According to another aspect of the present disclosure there is provided a computer program product including a computer readable medium having recorded thereon a computer program for implementing any one of the methods described above.
Other aspects of the invention are also disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS
One or more embodiments of the invention will now be described with reference to the following drawings, in which:
Figs. 1A and 1B are schematic block diagrams of a general purpose computer on which described arrangements may be practised;
Fig. 2 illustrates the enhancement of an image to increase the perceived sense of depth, also known as rittai-kan;
Fig. 3 illustrates problems associated with modifying a perceptual attribute of an image using an inaccurate depth map, using two different image-modification processes;
Fig. 4 is a schematic flow diagram illustrating a method of modifying a perceptual attribute of an image using an inaccurate depth map;
Fig. 5 is a schematic flow diagram illustrating a method of detecting artefacts in a modified image, as used in the method of Fig. 4;
Fig. 6 is a schematic flow diagram illustrating a further method of detecting artefacts in a modified image, as used in the method of Fig. 4;
Fig. 7 is a schematic flow diagram illustrating a method of combining images according to given quality scores, as used in the method of Fig. 4;
Fig. 8 is a schematic flow diagram illustrating a method of combining regions of different images according to the quality scores for each region, as used in the method of Fig. 4;
Fig. 9 illustrates the result of combining different regions of two images prepared with different image-processes to modify a perceptual attribute, each having artefacts caused by an inaccurate depth map, to form an image with a modified perceptual attribute but having reduced artefacts;
Fig. 10 is a schematic flow diagram illustrating a method of modifying a perceptual attribute of an image using an inaccurate depth map; and
Fig. 11 is a schematic flow diagram illustrating a method of predicting artefacts in an image to be modified, as used in the method of Fig. 10.
DETAILED DESCRIPTION INCLUDING BEST MODE
Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
It is to be noted that the discussions contained in the "Background" section and the section above relating to prior art arrangements relate to discussions of documents or devices which may form public knowledge through their respective publication and/or use. Such discussions should not be interpreted as a representation by the present inventor(s) or the patent applicant that such documents or devices in any way form part of the common general knowledge in the art.
Fig. 3 illustrates, at 300, problems associated with processing an image 310 using an inaccurate depth map 320. The image 310 consists of three regions shown in a separate frame 330 for reference. There is a first region marked A, 335, corresponding to a person wearing brightly coloured clothes, a second region marked B, 337, corresponding to the ground covered in pale yellow grass, and a third region marked C, 332, corresponding to a blue sky. The image 310 correspondingly shows the person 315, the ground 317, and the sky 312. Additionally, the sky has a diffuse cloud 311.
There is also an estimated depth map 320 corresponding to the image 310, in which lighter regions correspond to small distances from the camera to an object in the scene, and darker regions correspond to larger distances. The depth map 320 has some inaccuracies, i.e.:
• A region towards the back of the person incorrectly shows very large distance 323. Such an error may be caused by disparity-based depth-estimation methods such as depth-from-stereo where, due to sharp discontinuities around the person, stereo depth information is not available in occluded regions around the sides of the person. Such errors may also be caused when using slow-acquisition methods, such as laser range scanning, if the person is moving.
• Two regions in the sky 328 incorrectly have very low distance and very high distance respectively. Such errors may be caused by disparity-based depth-estimation methods or defocus-based depth-estimation methods, as the sky has no texture to correlate between images or at different focus levels. Such errors may also be caused by structured-light methods, as the sky is well outside the working range of typical structured-light methods.
• Finally, two regions of the ground 326 incorrectly have very low distance and very high distance respectively. Such errors may result from disparity-based depth-estimation methods when a texture is repeating and causes false correlations between the stereo views. Such errors may also be caused by structured-light depth-estimation methods when a surface has very low reflectivity or has strong specular reflection.
It will be appreciated by people skilled in the art, that other methods of generating depth-maps exist, and these various depth-estimation methods produce depth artefacts under different conditions.
Where the inaccuracies in the depth map can be reduced, a better depth map may be produced. However, the resulting depth-map may still have remaining inaccuracies. While knowledge that specific regions of a depth map have inaccuracies can be used to mark those regions with a low-confidence rating, that information in and of itself cannot be used to improve the depth map. One method of improving the accuracy of a depth map is to adjust the depth map based on a comparison between the image and the depth map. For example, an edge in the image between areas with different colours may correspond to the boundary between an object in the foreground and the background, and therefore the image edge may correspond to a change in depth. If there is an inaccuracy in the depth map such that the depth is constant across an image edge, then in this situation the depth map accuracy may be improved by introducing a change in the depth at the location of the edge in the image.
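One common way to realise this kind of image-guided depth refinement is a joint (cross) bilateral filter, in which depth values are averaged only across pixels whose image intensities are similar, so that depth discontinuities are pulled towards image edges. The Python sketch below illustrates that idea only; it is not the specific refinement used in the described arrangements, and the greyscale guide image, window radius and filter parameters are illustrative assumptions.

```python
import numpy as np

def joint_bilateral_depth(depth, guide, radius=5, sigma_s=3.0, sigma_r=0.1):
    """Smooth a depth map, but only across pixels with similar guide-image values."""
    h, w = depth.shape
    out = np.zeros_like(depth)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))   # spatial weights
    pad_d = np.pad(depth, radius, mode='edge')
    pad_g = np.pad(guide, radius, mode='edge')
    for y in range(h):
        for x in range(w):
            d_win = pad_d[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            g_win = pad_g[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range weights: pixels with similar guide values contribute more.
            range_w = np.exp(-((g_win - guide[y, x]) ** 2) / (2 * sigma_r ** 2))
            weights = spatial * range_w
            out[y, x] = np.sum(weights * d_win) / np.sum(weights)
    return out
```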
However, this process of depth map refinement is not always reliable. Discontinuities in a depth map that do not align with edges in the image may be correct, whereby the depth-map is supplying depth information that cannot be perceived from the image alone. Sometimes, however, the depth values in the depth-map are incorrect, and it is not known which side of the discontinuity has the correct value for the flat region. Smooth depth values across image edges may be correct when corresponding to multi-coloured sections of a single object, but sometimes such values are incorrect resulting from a mis-detection of depth values between objects. When probability-based image-based constraints are used to filter or improve an inaccurate depth map, errors cannot be assumed to be entirely removed, and the result is a depth map with unknown errors, and which is therefore still inaccurate.
Different image processes applied to an image, using an inaccurate depth map, result in different image artefacts, and the exact nature of the image artefacts depends on the image content, the depth-map inaccuracies, and the process applied.
The image 310 is processed 340 by altering the colour saturation, otherwise known as chroma, according to the depth-map values 320. This is done by increasing the saturation in the foreground and decreasing the saturation in the background, where the level of increase is scaled by the distance such that very close objects get the largest increase while distant objects get the largest decrease (a code sketch of this adjustment appears after the list below). The overall process of adjusting the saturation is effective, in that a significant change is made to the image saturation. The resulting image 360 is affected in a number of ways:
• The modified image of the person 365 becomes more saturated, as people have coloured skin-tones sensitive to saturation and the person is close. The presence of the depth inaccuracy 323 results in a corresponding image artefact in the person 363, such that this region of the person has significantly lower saturation than the rest of the person. The process is effective, but the image artefact is highly visible. Furthermore, as people are very sensitive to changes in skin-tone, a suitable skin-tone detector at a later stage would be able to weight this artefact as being even more visible.
• The modified image of the sky 362 becomes less saturated, being blue and far away. The modified image of the cloud 361 is not affected, as the cloud 361 had little colour to begin with. The low-distance and high-distance depth inaccuracies 328 result in corresponding image artefacts in the sky 368, becoming regions of over-saturation and under-saturation respectively. The process is only moderately effective, and the image artefacts are highly visible.
• The modified image of the ground 367 changes little, as the ground 367 has both low saturation originally and is at an intermediate distance. The low-distance and high-distance depth inaccuracies 326 result in corresponding image artefacts in the ground 366, becoming regions of over-saturation and under-saturation respectively, but they have very low visibility. The process is not effective, but the image artefacts have low visibility.
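The depth-modulated saturation adjustment 340 can be sketched as follows. This is a minimal illustration only: it assumes a float RGB image in [0, 1], a distance map normalised to [0, 1] (0 = near, 1 = far), an HSV working space and a gain that varies linearly with depth, none of which is mandated by the described arrangements.

```python
import numpy as np
from skimage import color

def adjust_saturation_by_depth(image, depth, strength=0.5):
    """Increase saturation for near content, decrease it for far content."""
    hsv = color.rgb2hsv(image)
    # Per-pixel gain: >1 for near pixels (depth near 0), <1 for far pixels (depth near 1).
    gain = 1.0 + strength * (1.0 - 2.0 * depth)
    hsv[..., 1] = np.clip(hsv[..., 1] * gain, 0.0, 1.0)
    return color.hsv2rgb(hsv)
```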
Alternatively, or in addition, the image 310 is processed 350 by altering the luminance contrast according to the depth-map values 320. This is done by increasing the contrast of details in the foreground and decreasing the contrast of details in the background, where the level of increase or decrease is scaled by the distance, such that very close objects get the largest increase while distant objects get the largest decrease. One method of contrast change is to scale pixel lightness values according to a sigmoid function, as is commonly used in image manipulation software. Another method of contrast change is to apply an unsharp mask with a radius depending on the depth value, where for close objects the unsharp mask is added to increase the apparent contrast, and for distant objects the unsharp mask is subtracted to decrease the apparent contrast. Such an operation is commonly known as “smart sharpen”, “smart blur” or “lens blur” (a code sketch of this adjustment appears after the list below). The resulting image 370 is affected in different ways:
• The modified image of the person 375 gains more contrast, especially as the person is surrounded by strong edges, but the image of the person does not have strong edges internally. The presence of the depth inaccuracy 323 results in a corresponding image artefact 373 in the person. The process is medium-level effective, and the image artefact has low visibility.
• The modified image of the sky 372 is not affected strongly, as the sky 372 is largely featureless. The cloud 374, however, becomes slightly less visible. The visibility of a cloud in the sky is affected by the amount of difference in brightness between the sky and the cloud at the edge of the cloud, so reduced contrast lowers its visibility. However, in this case the cloud has a diffuse edge, so the effect of changing the contrast is small and difficult to perceive. The low-distance and high-distance depth inaccuracies 328 do not result in any visible image artefacts at their corresponding locations 378. The process is medium-level effective, and the image artefact has no visibility.
• The modified image of the ground 377 is strongly affected, as the ground 377 has strong texture. The low-distance and high-distance depth inaccuracies 326 result in corresponding image artefacts 376 of high-contrast and low-contrast regions respectively, and these would be highly visible to a viewer of the image. The process is highly effective, and the image artefacts have high visibility.
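A corresponding sketch of the depth-modulated contrast adjustment 350 is given below. For brevity it uses an unsharp mask with a fixed radius and only scales the sign and amount of the added detail with depth, whereas the text above also contemplates varying the mask radius with depth; the image and depth conventions and the parameter values are the same illustrative assumptions as in the previous sketch.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def adjust_contrast_by_depth(image, depth, sigma=3.0, strength=0.8):
    """Boost detail contrast for near content, suppress it for far content."""
    blurred = np.stack([gaussian_filter(image[..., c], sigma) for c in range(3)], axis=-1)
    detail = image - blurred                 # high-frequency detail (the unsharp mask)
    amount = strength * (1.0 - 2.0 * depth)  # positive for near pixels, negative for far pixels
    return np.clip(image + amount[..., None] * detail, 0.0, 1.0)
```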
It will be appreciated by people skilled in the art that other image artefacts will be produced when using different processes, dependent on the image content, the inaccuracies in the depth-map, and the process.
Figs. 1A and 1B depict a general-purpose computer system 100, upon which the various arrangements described below can be practiced.
As seen in Fig. 1A, the computer system 100 includes: a computer module 101; input devices such as a keyboard 102, a mouse pointer device 103, a scanner 126, a camera 127, and a microphone 180; and output devices including a printer 115, a display device 114 and loudspeakers 117. An external Modulator-Demodulator (Modem) transceiver device 116 may be used by the computer module 101 for communicating to and from a communications network 120 via a connection 121. The communications network 120 may be a wide-area network (WAN), such as the Internet, a cellular telecommunications network, or a private WAN. Where the connection 121 is a telephone line, the modem 116 may be a traditional “dial-up” modem. Alternatively, where the connection 121 is a high capacity (e.g., cable) connection, the modem 116 may be a broadband modem. A wireless modem may also be used for wireless connection to the communications network 120.
The computer module 101 typically includes at least one processor unit 105, and a memory unit 106. For example, the memory unit 106 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM). The computer module 101 also includes a number of input/output (I/O) interfaces including: an audio-video interface 107 that couples to the video display 114, loudspeakers 117 and microphone 180; an I/O interface 113 that couples to the keyboard 102, mouse 103, scanner 126, camera 127 and optionally a joystick or other human interface device (not illustrated); and an interface 108 for the external modem 116 and printer 115. In some implementations, the modem 116 may be incorporated within the computer module 101, for example within the interface 108. The computer module 101 also has a local network interface 111, which permits coupling of the computer system 100 via a connection 123 to a local-area communications network 122, known as a Local Area Network (LAN). As illustrated in Fig. 1A, the local communications network 122 may also couple to the wide network 120 via a connection 124, which would typically include a so-called “firewall” device or device of similar functionality. The local network interface 111 may comprise an Ethernet circuit card, a Bluetooth® wireless arrangement or an IEEE 802.11 wireless arrangement; however, numerous other types of interfaces may be practiced for the interface 111.
The I/O interfaces 108 and 113 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 109 are provided and typically include a hard disk drive (HDD) 110. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 112 is typically provided to act as a non-volatile source of data. Portable memory devices, such as optical disks (e.g., CD-ROM, DVD, Blu-ray Disc™), USB-RAM, portable external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the system 100.
The components 105 to 113 of the computer module 101 typically communicate via an interconnected bus 104 and in a manner that results in a conventional mode of operation of the computer system 100 known to those in the relevant art. For example, the processor 105 is coupled to the system bus 104 using a connection 118. Likewise, the memory 106 and optical disk drive 112 are coupled to the system bus 104 by connections 119. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun Sparcstations, Apple Mac™ or like computer systems.
Processes to be described may be implemented using the computer system 100 wherein the processes, to be described, may be implemented as one or more software application programs 133 executable within the computer system 100. In particular, the steps of the current method are effected by instructions 131 (see Fig. 1B) in the software 133 that are carried out within the computer system 100. The software instructions 131 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules perform the current methods and a second part and the corresponding code modules manage a user interface between the first part and the user.
The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 100 from the computer readable medium, and then executed by the computer system 100. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 100 preferably effects an advantageous apparatus for implementing the described processes.
The software 133 is typically stored in the HDD 110 or the memory 106. The software is loaded into the computer system 100 from a computer readable medium, and executed by the computer system 100. Thus, for example, the software 133 may be stored on an optically readable disk storage medium (e.g., CD-ROM) 125 that is read by the optical disk drive 112. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer system 100 preferably effects an apparatus for implementing the described processes.
In some instances, the application programs 133 may be supplied to the user encoded on one or more CD-ROMs 125 and read via the corresponding drive 112, or alternatively may be read by the user from the networks 120 or 122. Still further, the software can also be loaded into the computer system 100 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 100 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray™ Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 101. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computer module 101 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The second part of the application programs 133 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 114. Through manipulation of typically the keyboard 102 and the mouse 103, a user of the computer system 100 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 117 and user voice commands input via the microphone 180.
Fig. 1B is a detailed schematic block diagram of the processor 105 and a “memory” 134. The memory 134 represents a logical aggregation of all the memory modules (including the HDD 110 and semiconductor memory 106) that can be accessed by the computer module 101 in Fig. 1A.
When the computer module 101 is initially powered up, a power-on self-test (POST) program 150 executes. The POST program 150 is typically stored in a ROM 149 of the semiconductor memory 106 of Fig. 1A. A hardware device such as the ROM 149 storing software is sometimes referred to as firmware. The POST program 150 examines hardware within the computer module 101 to ensure proper functioning and typically checks the processor 105, the memory 134 (109, 106), and a basic input-output systems software (BIOS) module 151, also typically stored in the ROM 149, for correct operation. Once the POST program 150 has run successfully, the BIOS 151 activates the hard disk drive 110 of Fig. 1A. Activation of the hard disk drive 110 causes a bootstrap loader program 152 that is resident on the hard disk drive 110 to execute via the processor 105. This loads an operating system 153 into the RAM memory 106, upon which the operating system 153 commences operation. The operating system 153 is a system level application, executable by the processor 105, to fulfil various high level functions, including processor management, memory management, device management, storage management, software application interface, and generic user interface.
The operating system 153 manages the memory 134 (109, 106) to ensure that each process or application running on the computer module 101 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the system 100 of Fig. 1A must be used properly so that each process can run effectively. Accordingly, the aggregated memory 134 is not intended to illustrate how particular segments of memory are allocated (unless otherwise stated), but rather to provide a general view of the memory accessible by the computer system 100 and how such is used.
As shown in Fig. 1B, the processor 105 includes a number of functional modules including a control unit 139, an arithmetic logic unit (ALU) 140, and a local or internal memory 148, sometimes called a cache memory. The cache memory 148 typically includes a number of storage registers 144 - 146 in a register section. One or more internal busses 141 functionally interconnect these functional modules. The processor 105 typically also has one or more interfaces 142 for communicating with external devices via the system bus 104, using a connection 118. The memory 134 is coupled to the bus 104 using a connection 119.
The application program 133 includes a sequence of instructions 131 that may include conditional branch and loop instructions. The program 133 may also include data 132 which is used in execution of the program 133. The instructions 131 and the data 132 are stored in memory locations 128, 129, 130 and 135, 136, 137, respectively. Depending upon the relative size of the instructions 131 and the memory locations 128-130, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 130. Alternately, an instruction may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 128 and 129.
In general, the processor 105 is given a set of instructions which are executed therein. The processor 105 waits for a subsequent input, to which the processor 105 reacts by executing another set of instructions. Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 102, 103, data received from an external source across one of the networks 120, 122, data retrieved from one of the storage devices 106, 109 or data retrieved from a storage medium 125 inserted into the corresponding reader 112, all depicted in Fig. 1A. The execution of a set of the instructions may in some cases result in output of data. Execution may also involve storing data or variables to the memory 134.
The disclosed arrangements use input variables 154, which are stored in the memory 134 in corresponding memory locations 155, 156, 157. The arrangements produce output variables 161, which are stored in the memory 134 in corresponding memory locations 162, 163, 164. Intermediate variables 158 may be stored in memory locations 159, 160, 166 and 167.
Referring to the processor 105 of Fig. 1B, the registers 144, 145, 146, the arithmetic logic unit (ALU) 140, and the control unit 139 work together to perform sequences of micro-operations needed to perform “fetch, decode, and execute” cycles for every instruction in the instruction set making up the program 133. Each fetch, decode, and execute cycle comprises: a fetch operation, which fetches or reads an instruction 131 from a memory location 128, 129, 130; a decode operation in which the control unit 139 determines which instruction has been fetched; and an execute operation in which the control unit 139 and/or the ALU 140 execute the instruction.
Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 139 stores or writes a value to a memory location 132.
Each step or sub-process in the processes of Figs. 4 to 11 is associated with one or more segments of the program 133 and is performed by the register section 144, 145, 146, the ALU 140, and the control unit 139 in the processor 105 working together to perform the fetch, decode, and execute cycles for every instruction in the instruction set for the noted segments of the program 133.
The current method may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the required functions or sub functions. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
When images are altered using different processes, such as saturation alteration 340 and contrast alteration 350, inaccuracies in the depth map cause image artefacts in the results. Different processes produce different image artefacts, and artefacts of different visible severity.
Where a process is intended to achieve a particular perceptual effect in the image, as will be discussed below, the same perceptual effect may usually be achieved using different processes. Having different processes with a common goal allows trading off the processes to achieve fewer artefacts in the resulting image. Applying a mix of different processes requires a method of evaluating the alternative artefacts that would be caused, and of managing such a trade-off.
One such perceptual effect is a sense of depth, similar to the Japanese concept of rittai-kan (立体感 in Japanese). When working with a single 2D image, rittai-kan is altered by the enhancement of monocular depth cues. For example, differentially altering the saturation in an image to make the background less saturated has an effect perceptually similar to aerial perspective. Aerial perspective (also known as distance haze) makes distant objects in the scene appear to be more distant, as if the object was so far away that the air and any dust between the observer and the objects has affected their appearance. When applied in a manner invoking aerial perspective, a human observer does not perceive the modification (a lowering of contrast), but instead perceives the rittai-kan of a scene, or a property of the air between them and a distant object. Differentially altering the contrast has a similar effect, also invoking aspects of aerial perspective. Making the distant parts of an image lighter and “washing-out” the distant objects is a third way of invoking aerial perspective, and is a third different process with different mechanics, invoking the same atmospheric effect, and therefore the same perceptual effect. Conversely, increasing the visibility of detail and texture in foreground elements gives the perceptual effect of nearness, invoking the sense where detail on an object can usually only be seen when the object is close. Thus, increasing foreground contrast is an entirely different way to achieve an increase in the perceptual aspect of rittai-kan.
In Fig. 2, an image 210 is shown, together with two enhanced versions of that image, designated 220 and 230. Both image 220 and image 230 have increased levels of rittai-kan. In image 220, the brightness and contrast of the flower in the foreground 225 have been increased relative to the original flower 215. The brightness and contrast of the background 222 have been decreased relative to the original background 212. The image 220 gives a greater sense of the foreground being closer to the viewer than the background. In the second image 230, artificial effects have been used. The background is blacked out 232 behind an introduced border 237 which is drawn behind the foreground flower 235. A drop-shadow 238 is also drawn behind the flower 235. In this image 230, there is also a greater sense of the foreground being closer to the viewer than the background.
Another perceptual aspect of an image is the degree to which the subject of the image attracts visual attention, away from distractors. For example, differentially altering the saturation in an image to make the foreground more saturated and the background less-saturated will make the foreground attract more attention, and reduce the effect of any background distractors. Differentially altering the contrast will also make the foreground attract more attention by boosting the detail which can be seen, and reduce the effect of background distractors by reducing their detail.
In both examples above, at least two different processes exist which may be used to affect the perception of the same perceptual aspect of an image, by different means affecting content in an image.
It will be appreciated by people skilled in the art that other perceptual aspects of images exist which are modified by at least two different processes, at least one of which relies on a depth map to differentiate foreground and background. A method 400 for modifying the sense of depth in an image with an inaccurate depth map is shown in Fig. 4. The available input information is the image 410, the corresponding depth map 420, and information 440 regarding the regions of which the image is composed.
In one arrangement the region information 440 is supplied by a user directly by drawing one or more outlines around objects in the scene using, for example, a stylus or finger on a touchscreen, or a mouse on a computer, or a gesture in a gesture interface, to trace the outline of a region of the image depicting the object.
In another arrangement, the user supplies the region information using a general purpose computer and software to designate specific sections of an image as belonging to different regions. In another arrangement the region information is calculated from the image content, such as using face detection to segment faces in the scene and selecting regions of the image depicting the detected faces for the purposes of processing. In another arrangement, the region information is a set of superpixels formed from neighbouring pixels with similar colour and variance. In yet another arrangement the regions are calculated based on both the image content and the depth map, by using the depth information to seed a graph-cut method. In yet another arrangement the regions are defined by the inaccurate depth map which has been low-pass filtered to produce a blurred version, such that the regions overlap and merge smoothly into each other, representing the depth layers such as the foreground and background of the scene in a graded manner.
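As one illustration, the last arrangement above (regions defined by a low-pass filtered inaccurate depth map) might be sketched as follows, producing smoothly overlapping foreground and background weights rather than hard segment labels. The Gaussian radius and the two-layer split are illustrative assumptions, not requirements of the described arrangements.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def graded_depth_regions(depth, sigma=15.0):
    """Return (foreground_weight, background_weight), each per-pixel in [0, 1]."""
    smooth = gaussian_filter(depth, sigma)                  # blurred depth layers
    span = smooth.max() - smooth.min() + 1e-12
    background = (smooth - smooth.min()) / span             # 1 where the scene is far
    foreground = 1.0 - background                           # 1 where the scene is near
    return foreground, background
```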
The process 400 begins 430, with a saturation process 450 being applied to the image 410, modulated by the inaccurate depth map 420, as described previously. The result of the saturation process 450 is a first modified image 455 such as, for example, image 360.
Next, the visibility of artefacts introduced into the modified image 455, as a result of the reliance on the inaccurate depth map 420, is detected 470 by comparing the modified image 455 with the original image 410. The region information 440 is provided, and the output is a list of quality scores 475 for each region. The quality scores represent an indication of the balance between aspects of the final images and artefacts introduced. The quality scores are described in more detail below. The method of detecting artefacts 470 is expanded upon in two different arrangements 500, 600, in Figs. 5 and 6 respectively. In one arrangement the effectiveness of the changes is also detected, and the effectiveness for each region forms part of the quality score.
Effectiveness, which can also be called a process effectiveness measure, is evaluated directly as the strength of the desired change made to the image. For example, the measure of effectiveness of a change for a particular region is a mean of the differences between the modified image 455 and the original image 410 in each of the Red, Green, and Blue channels in that region. Alternatively, a measure of the effectiveness is a mean of the differences between the modified image 455 and the original image 410 in each of the Y, Cb, and Cr channels in a YCbCr image, or a mean of the differences in Luma, Chroma, and Hue in a CIELCH image. Alternatively, the mean square error is used instead of the mean of the differences.
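A minimal sketch of this direct effectiveness measure, assuming float images and a boolean mask per region (the colour space is whatever the images are already in, RGB here), is given below.

```python
import numpy as np

def region_effectiveness(original, modified, region_mask):
    """Mean absolute per-channel difference over the pixels of one region."""
    diff = np.abs(modified.astype(np.float64) - original.astype(np.float64))
    return diff[region_mask].mean()   # averaged over pixels and channels in the region
```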
In another arrangement, effectiveness is evaluated using a separate method which evaluates the absolute level of an aspect of the image before and after the change. For example, a measure of contrast of the spectral energy of the image is measured as the sum of the absolute values of the Fourier transform with the DC component removed. A differential change in contrast is measured as the increase in spectral energy of the foreground and a decrease in spectral energy in the background as separate regions. Another means of measuring the differential change in contrast relies on depth information and measures an increase in close regions of the image as per the inaccurate depth map 420, and a corresponding decrease in contrast in far regions of the image. Other examples include a level of the saturation, which is measured as the mean of the Chroma values across the image; surface texture, which is measured as the mean of the outputs of an edge-detector (such as the Sobel operator) across an image; and foreground brightness, which is measured as the mean luma of the image.
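The spectral-energy contrast measure described above might be sketched as follows for a greyscale image; the differential change in contrast is then the increase of this value in foreground regions together with its decrease in background regions.

```python
import numpy as np

def spectral_energy(gray):
    """Sum of Fourier coefficient magnitudes with the DC component removed."""
    spectrum = np.fft.fft2(gray)
    spectrum[0, 0] = 0.0           # remove the DC (mean brightness) component
    return np.abs(spectrum).sum()

# Example differential measure (foreground/background crops are assumed inputs):
# delta = (spectral_energy(mod_fg) - spectral_energy(orig_fg)) \
#       - (spectral_energy(mod_bg) - spectral_energy(orig_bg))
```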
In yet another arrangement, effectiveness is evaluated by measuring the change in an image quality metric. A measure of the change in contrast is the Multi-Scale Structural Similarity Index Metric, known as MS-SSIM. Another measure of the change in contrast is the Visual Information Fidelity metric. A measure of a change in saturation is a Mean-Square-Error (MSE) change in the chroma values across an image. A measure of the change in chromostereopsis is a change in hue towards red in the foreground of the image, and a change in hue towards blue in the background of the image.
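As a hedged illustration of metric-based effectiveness: full MS-SSIM and Visual Information Fidelity implementations are available in several libraries, but the sketch below substitutes single-scale SSIM from scikit-image as a simpler stand-in for the contrast measure, together with the chroma MSE measure mentioned above. Greyscale and chroma planes are assumed to be float arrays in [0, 1].

```python
import numpy as np
from skimage.metrics import structural_similarity

def contrast_change(original_gray, modified_gray):
    """Grows as structural (contrast/detail) differences between the images grow."""
    return 1.0 - structural_similarity(original_gray, modified_gray, data_range=1.0)

def saturation_change(original_chroma, modified_chroma):
    """Mean-square change in the chroma plane."""
    return float(np.mean((modified_chroma - original_chroma) ** 2))
```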
The process also begins 430, with a contrast adjustment process 460 being applied to the image 410, modulated by the inaccurate depth map 420. This process is similar to step 450 described above, but with a different process being applied. The result of the contrast adjustment process 460 is a second modified image 465 such as, for example, image 370.
The artefacts introduced in the second modified image 465 are detected 480, by comparing the modified image 465 with the original image 410. The region information 440 is given, and the output is a list of scores 485 for each region. The method of detecting artefacts 480 is expanded upon in two different arrangements 500, 600, in Figs 5 and 6 respectively.
The two processes saturation 450, and contrast 460, and the two detection steps 470 and 480, may be performed in parallel by the processor unit 105, or may be performed sequentially by the processor unit 105.
The first modified image 455 and the quality scores 475 for each region in the first modified image 455, as well as the second modified image 465 and the quality scores 485 for each region in the second modified image 465, are provided to the image combination step 490. Two different arrangements of an image combination method 700, 800, are expanded upon in Figs. 7 and 8 respectively.
The result of the image combination step 490 is a modified image 495 having reduced artefacts. The process 400 is completed at 499.
Fig 5 shows a schematic flow diagram 500 of one method of detecting image artefacts in at least one region of a modified image 470. An example of an image artefact is a new image feature, such as an edge, in a region of the modified image which was substantially flat in the same region of the original image. Another example of an image artefact is a change in saliency in a region in the modified image compared with the same region in the original image. The method in Fig 5 is in particular concerned with the introduction of undesired image edges as one example of an image artefact.
The original image 510 is processed to find the visible edges 540, resulting in a map 545 of the locations of those edges. Similarly, the modified image 515 is processed to find the visible edges 530, resulting in a map 535 of the locations of those edges. In one arrangement, the visible edges are found by passing a Sobel filter over the image. In another arrangement, the visible edges are found by processing the image with a low-pass filter and the output is a set of gradient values; alternatively, the visible edges are detected only in the luminance channel or in the chroma channel, and the choice of image channel in which to do the edge detection depends on the process used to modify the image, for example saturation 455 or contrast 460. In another arrangement a colour-difference metric such as Delta E is used to compare neighbouring locations to detect visible changes in the image regardless of what process was used to generate the changes.
The two maps of edges 535 and 545 are compared 550 to produce a map of the visible edges which were introduced into the modified image 515 but do not correspond to edges existing in the original image 510. In one arrangement a subtraction is used to remove edges existing in the original image from map 535 so that only introduced edges remain. In another arrangement, edge values from the modified image are set to zero at the locations of the edges in the original image.
In one arrangement, the modified image 515 is also analysed to calculate the saliency 520, resulting in a saliency map of the modified image 525. This map identifies which content in the modified image is visually important, and will be affected by the content of the original image 510 and any artefacts introduced in processing the image, such as, for example, modifying the image saturation 450. The introduced visible edges 555 may be weighted by the saliency map 560, to produce a map of introduced salient image edges 565.
The introduced visible edges 555, or optionally the saliency-weighted edges 565, represent modified image changes corresponding to flat regions of the scene. As the correctness of the depth-map is unknown, it is possible that the introduced visible edges are indeed correct, but this is less common, and therefore the introduced visible edges are used as an indicator of image artefacts, and hence of the quality of the result.
The region information 518 is used to sum the energy 570 of the visible edges 565 in each region, producing the per-region quality scores 575 which form a visibility measure of the image artefacts in each region. The energy of the visible edges in a region is the root mean square of the pixel values in the map of introduced salient visible edges within the region. In one arrangement each quality score is a single value. In another arrangement a measure of effectiveness is calculated and associated with each quality score.
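Putting the steps of Fig. 5 together, a minimal sketch of the edge-based per-region quality scores might look like the following. The Sobel edge detector, the fixed edge threshold, boolean region masks and greyscale float inputs are illustrative assumptions rather than requirements of the method.

```python
import numpy as np
from skimage.filters import sobel

def region_artefact_scores(original_gray, modified_gray, region_masks,
                           saliency=None, edge_threshold=0.05):
    """Per-region RMS energy of edges introduced by the modification."""
    edges_orig = sobel(original_gray)
    edges_mod = sobel(modified_gray)
    introduced = edges_mod.copy()
    introduced[edges_orig > edge_threshold] = 0.0   # keep only newly introduced edges
    if saliency is not None:
        introduced = introduced * saliency          # optional saliency weighting
    return [np.sqrt(np.mean(introduced[mask] ** 2)) for mask in region_masks]
```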
Fig 6 shows a schematic flow diagram 600 of one method of detecting artefacts in each region of a modified image 470. This method is concerned with undesired changes in salience where a perceptual effect is intended to affect how an image is perceived but is not intended to change the location in the image which attracts attention. For example, an image includes a person in front of a fence. A salience map of the original image shows the person is the centre of attention for the image. After processing by increasing contrast to produce a modified image, the appearance of the fence becomes stronger and a salience map of the modified image shows the saliency changing to spread the centre of attention across the image to also include at least a part of the fence. In another example, an image shows a performer in front of a crowd. A saliency map of the original image shows the performer as the centre of attention. After processing the image by increasing saturation, one member of the crowd wearing a strongly coloured shirt becomes more prominent in the modified image, attracting more attention and distracting from the performer. In yet another example, an image shows a low-contrast sign in front of a high-contrast crowd of people. A saliency map of the original image shows the sign area as the focus of attention by its difference from the remainder of the surrounding image. After processing by differentially increasing contrast, the relative contrast of the sign is increased while the contrast of the crowd is decreased, causing the sign to become less visibly attentive, and a saliency map of the modified image shows no focus of attention.
The saliency-based method of artefact detection 600 begins with the calculation 620 of a saliency map 625 of the original image 610. It will be appreciated by persons skilled in the art that a number of methods for estimating salience and determining salient regions in an image are known. Depending on the exact application and computational limitations of an implementation of method 600, saliency can be determined using a highly accurate saliency estimation method or a computationally fast saliency estimation method. In another arrangement the choice of appropriate saliency algorithm is conditional on a feature of the image, such as the detection of a face in the image content, image metadata, such as flash being used to capture the image, or a property of the image file, such as the colourspace or resolution. A salience map 635 is calculated 630 using the same method as 620, from the modified image 615, although in some circumstances the salience map 625 for the original image 610 and the saliency map 635 for the modified image 615 may be produced using different saliency determination methods.
The salience map of the original image 625 and the salience map of the modified image 635 are passed to the saliency difference calculation step 640, producing a map of salience differences 645. In one arrangement the calculation 640 is a subtraction operation of the two saliency maps 625, 635. In another arrangement the two maps are first scaled to equal mean values to reduce the difference values associated with scaled/increased desired salience, and increase the values associated with local changes, and then a subtraction operation is applied to the two maps. In another arrangement the two saliency maps 625, 635 are first scaled to equal median values, and then a subtraction operation is applied to the two maps.
The region information 618 is used to sum 650 the saliency differences 645 in each region, producing the per-region quality scores 655, being an estimate of the introduced image artefacts. In one arrangement each region’s quality score is a single value. In another arrangement a measure of effectiveness is calculated and associated with each region’s quality score.
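A corresponding sketch of the saliency-based scoring of Fig. 6 is given below. The saliency estimator itself is deliberately left open (any function returning a non-negative per-pixel map will do), and the equal-mean scaling follows one of the arrangements described above; boolean region masks are again an illustrative assumption.

```python
import numpy as np

def saliency_difference_scores(saliency_orig, saliency_mod, region_masks):
    """Per-region sum of saliency differences between original and modified images."""
    # Scale the modified map so both maps have the same mean saliency,
    # reducing differences caused by a uniform, intended increase in salience.
    scale = saliency_orig.mean() / (saliency_mod.mean() + 1e-12)
    diff = np.abs(saliency_mod * scale - saliency_orig)
    return [diff[mask].sum() for mask in region_masks]
```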
Fig 7 shows a schematic flow diagram 700 of one method of combining images according to the relative values of the region quality scores 490. This method is concerned with choosing an output image from a set of alternative modified images.
The per-region quality scores 715 of a first modified image 710 are combined together 730, producing a total quality score 735 for the first image 710. In one arrangement the per-region quality scores 715 are combined by summation to produce a total quality score 735. In another arrangement, the per-region quality scores 715 are first normalised by the number of pixels in each corresponding region, before summation over the regions, to produce a total quality score 735, thus giving each region equal weighting regardless of size. In another arrangement there is only one region consisting of the entire image, and correspondingly only a single score, which becomes the total quality score 735 for modified image 1 710.
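A minimal sketch of this combination step, assuming one scalar score and one boolean mask per region, is:

```python
import numpy as np

def total_quality_score(region_scores, region_masks, normalise_by_size=True):
    """Combine per-region scores into a single total score for one modified image."""
    if normalise_by_size:
        # Each region contributes equally regardless of its size in pixels.
        return sum(score / mask.sum() for score, mask in zip(region_scores, region_masks))
    return float(np.sum(region_scores))
```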
In one arrangement each quality score is a single value. In another arrangement each quality score comprises a number of values together, and they are correspondingly summed or normalised as above.
The per-region quality scores 725 of a second modified image 720 are similarly combined 740 producing a total quality score 745 for the second modified image 720.
Next the modified image, either image 1 710 or image 2 720, is chosen 750 according to which has the better total quality score (735 or 745). The chosen image is output 755.
In one arrangement each quality score is a single value representing the introduction of edges, as possible image artefacts, and the best score is the one with the lowest value. In another arrangement each quality score is a single value representing the change in salience, as a possible image artefact, and the best score is the one with the lowest value. In yet another arrangement each score comprises two values, the first being an edge introduction or salience change as above, and the second being an effectiveness score. The effectiveness score is a measure of the change in the image as described when the quality scores are generated 470, and the best score is the one with the highest value. Where the quality score comprises two values, the choice of image is correspondingly a trade-off between the two values. In one arrangement the best score is the one with the fewest image artefacts. In another arrangement the best score is the one with the greatest effectiveness. In yet another arrangement the two scores are normalised according to predetermined ranges of values, and the best score is the one where the sum of the normalised values is greatest. In yet another arrangement the two scores are normalised as above and the best score is the one where the weighted sum of the normalised values is greatest, where the weighting favours one factor over another, for example 80% weighting for an absence of artefacts and 20% weighting for an effectiveness of the process.
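As an illustration of the two-value trade-off, the sketch below normalises an artefact value (lower is better) and an effectiveness value (higher is better) into predetermined ranges and compares their weighted sums; the ranges and the 80%/20% weighting are the example values mentioned above, not prescribed values.

```python
def choose_image(score1, score2,
                 artefact_range=(0.0, 1.0), effect_range=(0.0, 1.0),
                 w_artefact=0.8, w_effect=0.2):
    """Return 1 or 2 depending on which (artefact, effectiveness) score pair is better."""
    def combined(score):
        artefact, effect = score
        a = (artefact - artefact_range[0]) / (artefact_range[1] - artefact_range[0])
        e = (effect - effect_range[0]) / (effect_range[1] - effect_range[0])
        # Higher is better: reward effectiveness, penalise visible artefacts.
        return w_artefact * (1.0 - a) + w_effect * e
    return 1 if combined(score1) >= combined(score2) else 2
```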
Fig. 8 shows a schematic flow diagram 800 of another method of combining processed images according to the relative values of the region quality scores 490. This method is concerned with choosing output regions from a set of alternative modified regions and assembling a final image from the chosen output regions. In this way a perceptual attribute of the image, such as the sense of depth, is modified by locally applying different image adjustment processes in different regions of the image. A region 845 is selected 840 from the available regions in the image as contained in the region information 830. The score for the region in modified image 1 810 is obtained from the per-region quality scores 1 815. The score for the region in modified image 2 820 is obtained from the per-region quality scores 2 825. These scores are compared 850, producing a choice of image for that region 855. The comparison of the scores for each region is performed as for the comparison for an entire image, as described for step 750.
The region information 830 is used to extract 860 the image region from either modified image 1 810 or from modified image 2 820, according to the image choice for the region 855. The chosen region is added to a pool of image regions 865. If all the regions have not yet been processed 880, the process returns to the selection step 840 to select and extract the next region.
If all the regions have been extracted 880, control passes to an image assembly step 870, in which all of the regions from the pool of image regions 865 are combined to form the output image 875. In the image assembly step 870, the spatial location of each image region within the output image is the same as the spatial location of the image regions in the original image.
In one arrangement each image region is held separately in memory until the assembly step. In another arrangement each image region is composited into the final image as each region is chosen, and the final step 870 is composed solely of outputting 875 the completed image which comprises the pool of image regions 865. In another arrangement the choice of image for a region 855 is a proportion value indicating a weighted combination of data from the first modified image 810 and the second modified image 820, and extraction step 860 includes multiplying the image regions according to the respective proportion values. In one arrangement the regions are processed in their entirety as in the flow diagram 800, while in another arrangement the process 800 is performed for each pixel of an image and invokes the corresponding regions for the pixel’s location.
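By way of illustration only, a possible sketch of the extraction step 860 and assembly step 870, including the proportion-value blending arrangement, is given below; the function and parameter names, the representation of regions as boolean masks, and the use of the numpy library are illustrative assumptions.

import numpy as np

def assemble_output(image1, image2, region_masks, region_choices):
    # Composite an output image (875) from per-region choices (855).
    # region_masks: boolean H x W masks, one per region.
    # region_choices: per-region proportion in [0, 1]; 0 takes the region
    # entirely from modified image 1, 1 entirely from modified image 2, and
    # intermediate values blend the two (the weighted-combination arrangement).
    output = np.zeros_like(image1, dtype=float)
    for mask, p in zip(region_masks, region_choices):
        blended = (1.0 - p) * image1.astype(float) + p * image2.astype(float)
        output[mask] = blended[mask]
    return output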
In another arrangement, the per-region quality scores are calculated for a different set of score regions for each modified image, and the process choice and compositing is applied to a set of choice regions covering the original image, where the sets of score regions and choice regions may be different. The process choice for a choice region from the second set is evaluated according to the overlap between the choice region and the overlapping score regions from the set of score regions for each modified image, with the score from each score region being weighted by the size of the overlap of each score region with the choice region. In one arrangement, the choice regions are based on a pre-determined template. An example is an artistic effect where a perceptual attribute is only modified in the centre of the image, and the centre region and outer region form the set of choice regions, or alternatively a perceptual attribute is modified in horizontal bands across the image and the bands form the set of choice regions.
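By way of illustration only, one possible implementation of the overlap-weighted evaluation of a choice region is sketched below, assuming regions are represented as boolean masks; the names used are illustrative assumptions.

import numpy as np

def choice_region_score(choice_mask, score_masks, score_values):
    # Evaluate the score of a choice region from a different set of score
    # regions: each score region's value is weighted by the number of pixels
    # it shares with the choice region.
    weighted_sum = 0.0
    total_overlap = 0.0
    for mask, value in zip(score_masks, score_values):
        overlap = float(np.logical_and(choice_mask, mask).sum())
        weighted_sum += overlap * value
        total_overlap += overlap
    return weighted_sum / total_overlap if total_overlap > 0 else 0.0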
It will be appreciated by those skilled in the art that an image composed of processed regions of images as produced in Figs 7 and 8, can equally be produced by combining pre-processed regions, or by selectively processing regions of a single image.
Fig 9 illustrates 900 the combination 950 according to arrangements, of two processed images 920 and 930, to produce a final image 960. The region information is shown 940 to provide reference for how the final image 960 was assembled, and the original image 910 is provided for reference.
With reference to Fig 3 as described previously, the effectiveness of different processes may be evaluated, as can the presence of introduced image artefacts. For example for the sense of depth:
Table (1) - Artefacts per region of processed images 360, 920 and 370, 930.
Different properties may be optimised for, as described for step 750. Those skilled in the art will appreciate that other properties besides the presence of artefacts or effectiveness may be considered, such as naturalness, or a measure of aesthetic quality. Fig 9 illustrates the minimisation of visible image artefacts. With reference to the region information 940 and the original image 910, the region 945 corresponding to the person 915 is selected from image 2 930, the region 942 corresponding to the sky 912 is selected from image 2 930, and the region 947 corresponding to the ground 917 is selected from image 1 920.
The region in the output image 960 of the person 965 has only a small image artefact 973. The region is taken from image 2 930, where the corresponding region has that small image artefact 933, whereas the artefact 923 in image 1 920 is more visible.
The region in output image 960 of the sky 962 has no visible image artefacts. The regions 978 corresponding to the depth artefacts 938 are as smooth as the corresponding regions 938 in image 2 930, whereas the corresponding regions 928 of image 1 920 have highly visible image artefacts.
Finally, the region in the output image 960 of the ground 967 has only minimal artefacts 976.
The region is taken from image 1 920, where the corresponding region has minimal artefacts 926, whereas the artefacts 936 in image 2 930 are highly visible.
Thus the output image 960 has few artefacts but has successfully increased the sense of depth compared to the original 910 through combining the processes.
Fig 10 shows a flow diagram 1000 of a second method for modifying the sense of depth in an image with an inaccurate depth map. The available input information is the image 1010, the corresponding depth map 1018, and region information 1015 describing the regions of which the image is composed.
Initially, prediction steps 1020 and 1030 produce predicted quality scores 1025 and 1035 respectively, corresponding to two processes Process 1 1060 and Process 2 1070, say saturation and contrast respectively. A flowchart 1100 for the prediction step is outlined below with reference to Fig 11. In one arrangement the prediction steps 1020 and 1030 are generalised and produce quality scores with the same scaling. In another arrangement the prediction steps 1020 and 1030 are adapted to their respective processes, and the quality scores 1025 and 1035 are normalised so as to be comparable.
The process then begins 1040 with the selection 1042 of a region 1045, from the available regions as per the region information 1015.
The quality scores 1025 and 1035 of the selected region 1045 are compared 1050. A decision is made 1055 on which score is better. If quality score 1 is better for the selected region, then control passes to Process 1 1060. Process 1 is, for example, a differential increase in saturation, requiring as inputs the image 1010, the depth-map 1018, and the selected region 1045 on which to apply the process. If quality score 2 is better for the selected region, then correspondingly Process 2 1070, such as a differential increase in contrast, is called, and also requires the image 1010, the depth-map 1018, and the selected region 1045 on which to apply the process.
Whichever process is called, the resulting region is added to a pool of processed image regions 1075. Control then passes to decision step 1080, which determines whether all of the regions of the image have been processed. If all the regions have not yet been processed 1080, control returns to the selection step 1042, at which point a new region is selected. If all the regions have now been processed 1080, control passes to an assembly step 1090, in which all of the regions from the pool of processed image regions 1075 are combined to form the output image 1095. The process is then complete 1099.
In one arrangement, each processed region is held separately in memory until the assembly step. In another arrangement each region is composited into the final image as each region is processed and the final step 1090 is composed solely of outputting 1095 the completed image which comprises the pool of image regions 1075. In another arrangement, a single copy of the image is recycled through the process, and selected regions 1045 are processed in place according to the selected image process for the region.
In one arrangement the process choice for the selected region 1045 is a proportion value indicating a combination of the processes 1060 and 1070 to apply, and adding the result to the pool 1075 involves a relative blending of the processed regions, or applying the processes sequentially to the same region, in accordance with the proportion value.
In one arrangement the regions are processed in their entirety as in the flow diagram 1000, while in another arrangement the process 1000 is performed for each pixel of an image and invokes the corresponding regions for the pixel’s location.
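By way of illustration only, the following sketch outlines one possible per-region selection loop corresponding to flow diagram 1000, assuming the predicted per-region scores 1025 and 1035 are sums of introduced edges so that the lower score is the better one; the callables process1 and process2, the boolean region masks, and the other names are illustrative assumptions.

import numpy as np

def modify_by_region(image, depth_map, region_masks, scores1, scores2,
                     process1, process2):
    # Per-region process selection as in flow diagram 1000. process1 and
    # process2 are callables taking (image, depth_map, mask) and returning a
    # full-size image modified only within the mask. The lower predicted
    # score is treated as the better one.
    output = image.astype(float).copy()
    for mask, s1, s2 in zip(region_masks, scores1, scores2):
        process = process1 if s1 <= s2 else process2
        processed = process(image, depth_map, mask)
        output[mask] = processed[mask]
    return output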
Fig 11 shows a flow diagram 1100 corresponding to the process 1020, for predicting the per-region quality scores 1195.
Prior information 1130 about the process to be applied, is used to transform or extract 1170 the image information which will be affected 1175, from the original image 1110. For example, if the process is a change in saturation, the chroma channel of an image is extracted. If the process is a change in contrast, the luma channel is extracted. If the process is an alteration of the colour temperature, the hue channel is extracted. If the process is an edge enhancement operation, the image gradients are extracted from the luma channel. If the process is a sharpening affecting all colour channels independently, the image information extracted 1175 is the full original image 1110.
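By way of illustration only, one possible implementation of the transform/extract step 1170 is sketched below; the process labels, function names, and the use of the OpenCV (cv2) and numpy libraries are assumptions made for illustration.

import numpy as np
import cv2  # assumed available for colour-space conversions

def extract_affected_information(image_bgr, process_name):
    # Extract the image information (1175) that a given process will affect.
    # The process names below are illustrative labels only.
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB).astype(float)
    if process_name == "saturation":
        # Chroma magnitude from the a/b opponent channels.
        return np.hypot(lab[..., 1] - 128.0, lab[..., 2] - 128.0)
    if process_name == "contrast":
        return lab[..., 0]  # luma (lightness) channel
    if process_name == "colour_temperature":
        hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV).astype(float)
        return hsv[..., 0]  # hue channel
    if process_name == "edge_enhancement":
        gx = cv2.Sobel(lab[..., 0], cv2.CV_64F, 1, 0)
        gy = cv2.Sobel(lab[..., 0], cv2.CV_64F, 0, 1)
        return np.hypot(gx, gy)  # gradients of the luma channel
    # e.g. per-channel sharpening: the full image is affected.
    return image_bgr.astype(float)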
The original image 1110 is passed to an edge-detection step 1150 producing a map 1155 of the locations of visible edges in the original image. Similarly, the depth map 1120 is passed to an edge-detection step 1160 producing a map 1165 of the visible edges in the depth map. The map of original image edges 1155 and the transformed/extracted image information 1175 and the map of the depth-map edges 1165 are all passed to the introduced-edge-detection process 1180.
The introduced-edge-detection process 1180 identifies depth map edges 1165 which will be visible when a process affecting the relevant image information 1175 is applied to the image, at locations not corresponding to the original image edges 1155. This is done by subtracting the strength of the original image edges 1155 from the strength of the depth-map edges 1165, and multiplying the resulting values by the relevant image information 1175 to produce a map of introduced edges 1185. Alternatively, depth edge values 1165 are set to zero at the locations of the original image edges 1155, and the remaining depth edge values are then multiplied by the relevant image information 1175 to produce the map of introduced edges 1185.
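By way of illustration only, a minimal sketch of the introduced-edge-detection process 1180 follows, using gradient magnitude as the edge-strength measure; the choice of Sobel gradients, the function names, and the use of the OpenCV (cv2) and numpy libraries are illustrative assumptions.

import numpy as np
import cv2  # assumed available for gradient computation

def edge_strength(channel):
    # Gradient-magnitude edge strength of a single-channel image.
    gx = cv2.Sobel(channel.astype(np.float64), cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(channel.astype(np.float64), cv2.CV_64F, 0, 1)
    return np.hypot(gx, gy)

def introduced_edge_map(image_luma, depth_map, affected_info):
    # Depth-map edge strength (1165) not already explained by an original
    # image edge (1155) is kept, then weighted by the image information the
    # process will affect (1175), giving the map of introduced edges (1185).
    image_edges = edge_strength(image_luma)
    depth_edges = edge_strength(depth_map)
    unexplained = np.clip(depth_edges - image_edges, 0.0, None)
    return unexplained * affected_info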
In one arrangement, the original image 1110 is also analysed to calculate a saliency map which is multiplied by the transformed image information 1175 weighting the output 1185 by the salience of the introduced edges.
The map of introduced edges 1185 is summed 1190 across each region of the image, using the region information 1140, producing the per-region quality scores 1195. The process 1020 is complete.
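By way of illustration only, the per-region summation step 1190 might be sketched as follows, assuming regions are represented as boolean masks; the names used are illustrative assumptions.

import numpy as np

def per_region_scores(introduced_edges, region_masks):
    # Sum the introduced-edge map (1185) over each region, using the region
    # information (1140), to produce the per-region quality scores (1195).
    # Lower sums indicate fewer predicted artefacts.
    return [float(np.asarray(introduced_edges)[mask].sum()) for mask in region_masks]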
One example of a camera implemented arrangement is described below.
Firstly, a user captures a 2D image of a 3D scene using an RGBZ camera. The RGBZ camera captures a 2D RGB image along with a depth map of the 3D scene. The depth map has inaccuracies due to limitations in the depth map estimation. An image processing system embedded in the camera applies a saturation process and a contrast process to the captured 2D image to create modified images with an enhanced sense of depth. Because of the inaccuracies in the depth map, the modified images have a mixture of visible artefacts and invisible artefacts, which detract from the perceived quality of the modified images.
The system generates a set of regions of the original image, using image content segmentation. The system uses the original image, the modified images, the depth map and the set of regions to estimate a visibility measure for the image artefacts introduced in each region in the modified images for each process, and an effectiveness measure for the strength of change in each region between the original image and the modified images for each process. The visibility measure is weighted by the saliency of the introduced image artefacts. The visibility measure and the effectiveness measure together form a per-region quality score for each region.
The system then determines which process to apply to each region in the original image. This is done by comparing the per-region quality score between the modified images corresponding to each different process for each region, and choosing the modified image in each region which has the lowest image artefact visibility measure, and the highest effectiveness measure. The chosen modified image regions are then composited into a single output image. The output image has an increased sense of perceived depth compared to the original image, and the process choice method causes a reduction in visible artefacts caused by inaccuracies in the depth map compared with the images modified by each process.
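By way of illustration only, one possible reading of this per-region choice, in which the processes achieving the lowest visibility measure are retained and the most effective of those is selected, is sketched below; the array layout, function name, and use of the numpy library are illustrative assumptions.

import numpy as np

def choose_process_per_region(visibility, effectiveness):
    # visibility: (n_regions, n_processes) array, lower is better.
    # effectiveness: array of the same shape, higher is better.
    # For each region, keep only the processes achieving the minimum
    # visibility, then pick the one with the greatest effectiveness.
    visibility = np.asarray(visibility, dtype=float)
    effectiveness = np.asarray(effectiveness, dtype=float)
    best_visibility = visibility.min(axis=1, keepdims=True)
    candidates = np.where(visibility == best_visibility, effectiveness, -np.inf)
    return candidates.argmax(axis=1)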
The arrangements described are applicable to the computer and data processing industries and particularly for image processing and image editing.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.
In the context of this specification, the word “comprising” means “including principally but not necessarily solely” or “having” or “including”, and not “consisting only of”. Variations of the word “comprising”, such as “comprise” and “comprises”, have correspondingly varied meanings.

Claims (17)

1. A method of modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image, said method comprising: determining a visibility measure for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map, said image artefacts being caused by inaccurate depth values of the corresponding depth map; selecting a first and second image adjustment process from the set of image adjustment processes based on respective determined visibility measures; and modifying a perceptual attribute of the image by locally applying the first and second selected image adjustment processes to different respective regions of the image.
2. A method according to claim 1, wherein the different respective regions correspond to distinct depth layers in the depth map.
3. A method according to claim 1, wherein the different respective regions correspond to distinct image content regions.
4. A method according to claim 1, wherein the first and second image adjustment processes are combined and applied to the at least one region of the image.
5. A method according to claim 4, wherein the combination of the first and second image adjustment processes is weighted according to a proportion value.
6. A method according to claim 1, wherein the respective determined visibility measures are used to weight more visible image artefacts higher than less visible artefacts.
7. A method according to claim 1, wherein the respective determined visibility measures are used to weight more salient image artefacts higher than less salient artefacts.
8. A method according to claim 1, wherein an image artefact is defined as at least one new image feature located in an image region which was substantially flat before application of an image process to the image region.
9. A method according to claim 1, wherein an image artefact is defined as a local change in saliency of an image region as a result of an application of an image process to the image region.
10. A method according to claim 1, wherein each region of the image is one of a single image pixel, a plurality of image pixels, and a superpixel.
11. A method according to claim 1, wherein each of the plurality of regions of the image and the different respective regions of the image correspond and overlap.
12. A method according to claim 1, wherein each of the plurality of regions of the image are determined by a superpixel segmentation process, and the first and second image adjustment processes are applied to different respective regions which are determined by an image content analysis process or a user indication, and a predetermined template.
13. A method according to claim 1, wherein image content information associated with the location of each image artefact in the image is used to determine the corresponding visibility measure.
14. A method according to claim 1, wherein the selection of the first and second image adjustment processes are further based on respective process effectiveness measures.
15. An apparatus for modifying a perceptual attribute of an image, the apparatus comprising: a processor; and a memory storing a computer executable software program for directing the processor to perform a method comprising the steps of: determining a visibility measure for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map, said image artefacts being caused by inaccurate depth values of the corresponding depth map; selecting a first and second image adjustment process from the set of image adjustment processes based on respective determined visibility measures; and modifying a perceptual attribute of the image by locally applying the first and second selected image adjustment processes to different respective regions of the image.
16. A non-transitory computer readable medium storing a computer executable software program for directing a processor to perform a method for modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image, said method comprising: determining a visibility measure for image artefacts produced in each of a plurality of regions of the image by each of a set of image adjustment processes when applied to content of the image in accordance with depth values in the depth map, said image artefacts being caused by inaccurate depth values of the corresponding depth map; selecting a first and second image adjustment process from the set of image adjustment processes based on respective determined visibility measures; and modifying a perceptual attribute of the image by locally applying the first and second selected image adjustment processes to different respective regions of the image.
17. A method of modifying a perceptual attribute of an image using a corresponding depth map of a scene captured by the image, said method comprising: generating a first modified image by applying a first image adjustment process to the image and a second modified image by applying a second image adjustment process to the image, said image adjustment processes being selected from a set of image adjustment processes and being applied to content of the image to modify a perceptual attribute of the image in accordance with depth values in the depth map; determining a visibility measure for image artefacts produced in each of a plurality of regions of the first and second modified images by the corresponding image adjustment processes, said image artefacts being caused by inaccurate depth values of the corresponding depth map; and selecting one of the first and second modified images based on respective determined visibility measures. CANON KABUSHIKI KAISHA Patent Attorneys for the Applicant/Nominated Person SPRUSON & FERGUSON
AU2016273984A 2016-12-16 2016-12-16 Modifying a perceptual attribute of an image using an inaccurate depth map Abandoned AU2016273984A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2016273984A AU2016273984A1 (en) 2016-12-16 2016-12-16 Modifying a perceptual attribute of an image using an inaccurate depth map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2016273984A AU2016273984A1 (en) 2016-12-16 2016-12-16 Modifying a perceptual attribute of an image using an inaccurate depth map

Publications (1)

Publication Number Publication Date
AU2016273984A1 true AU2016273984A1 (en) 2018-07-05

Family

ID=62748589

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2016273984A Abandoned AU2016273984A1 (en) 2016-12-16 2016-12-16 Modifying a perceptual attribute of an image using an inaccurate depth map

Country Status (1)

Country Link
AU (1) AU2016273984A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325693A (en) * 2020-02-24 2020-06-23 西安交通大学 Large-scale panoramic viewpoint synthesis method based on single-viewpoint RGB-D image
CN111325693B (en) * 2020-02-24 2022-07-12 西安交通大学 Large-scale panoramic viewpoint synthesis method based on single viewpoint RGB-D image
CN111428602A (en) * 2020-03-18 2020-07-17 浙江科技学院 Convolutional neural network edge-assisted enhanced binocular saliency image detection method
US11688090B2 (en) 2021-03-16 2023-06-27 Toyota Research Institute, Inc. Shared median-scaling metric for multi-camera self-supervised depth evaluation

Similar Documents

Publication Publication Date Title
US9311901B2 (en) Variable blend width compositing
US7218792B2 (en) Stylized imaging using variable controlled illumination
US7206449B2 (en) Detecting silhouette edges in images
US9292928B2 (en) Depth constrained superpixel-based depth map refinement
US7359562B2 (en) Enhancing low quality videos of illuminated scenes
US9661239B2 (en) System and method for online processing of video images in real time
EP2987134B1 (en) Generation of ghost-free high dynamic range images
US7102638B2 (en) Reducing texture details in images
US7103227B2 (en) Enhancing low quality images of naturally illuminated scenes
US7295720B2 (en) Non-photorealistic camera
US8644638B2 (en) Automatic localized adjustment of image shadows and highlights
US10148895B2 (en) Generating a combined infrared/visible light image having an enhanced transition between different types of image information
KR100846513B1 (en) Method and apparatus for processing an image
US20130251260A1 (en) Method and system for segmenting an image
US10198794B2 (en) System and method for adjusting perceived depth of an image
CN108377374B (en) Method and system for generating depth information related to an image
Riaz et al. Single image dehazing via reliability guided fusion
US20150227779A1 (en) Image processing device, image processing method, and program
Abebe et al. Towards an automatic correction of over-exposure in photographs: Application to tone-mapping
US9672447B2 (en) Segmentation based image transform
AU2016273984A1 (en) Modifying a perceptual attribute of an image using an inaccurate depth map
Han et al. Automatic illumination and color compensation using mean shift and sigma filter
JP2013182330A (en) Image processor and image processing method
AU2016273979A1 (en) System and method for adjusting perceived depth of an image
AU2015271981A1 (en) Method, system and apparatus for modifying a perceptual attribute for at least a part of an image

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application