AU2015202282A1 - Camera parameter optimisation for depth from defocus - Google Patents


Info

Publication number
AU2015202282A1
Authority
AU
Australia
Prior art keywords
depth
scene
step size
image
determined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2015202282A
Inventor
Matthew Raphael Arnison
David Peter Morgan-Mar
Peter Jan Pakulski
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2015202282A priority Critical patent/AU2015202282A1/en
Publication of AU2015202282A1 publication Critical patent/AU2015202282A1/en
Abandoned legal-status Critical Current


Abstract

CAMERA PARAMETER OPTIMISATION FOR DEPTH FROM DEFOCUS

A method, apparatus and system for selecting a parameter step size are disclosed. The method selects a parameter step size for capturing at least two images of a scene using an image capture device. The method comprises determining, for at least one parameter step size and at least one depth, a plurality of depth sensitivity measurements (520) for the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image capture device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a non-zero radius. The method further comprises selecting a deterministic spectral scene function associated with the scene and determining depth accuracy (540) from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function (530). The method further comprises selecting, based on the determined depth accuracy, the parameter step size (560) from at least one parameter step size for capturing the at least two images of the scene.

(Fig. 4 flowchart: Start; Set up camera aimed at scene; Set focus, zoom, aperture; Select focus step size; Capture first image of scene; Change focus by focus step size; Capture second image of scene; Determine depth map.)

Description

CAMERA PARAMETER OPTIMISATION FOR DEPTH FROM DEFOCUS
TECHNICAL FIELD
[0001] The present invention relates to camera control and digital image processing. In particular, the present invention relates to a method, apparatus and system for selecting a parameter step size for determining the distance to objects in a scene from images of the scene. The present invention also relates to a computer program product including a computer readable medium having recorded thereon a computer program for selecting a parameter step size for determining the distance to objects in a scene from images of the scene.
BACKGROUND
[0002] In many applications of image capture, it can be advantageous to determine the distance from the image capture device to objects within the field of view of the image capture device. A collection of such distances to objects in an imaged scene is sometimes referred to as a depth map. A depth map of an imaged scene may be represented as an image, which may be of a different pixel resolution to the image of the scene itself, in which the distance to objects corresponding to each pixel of the depth map is represented by a greyscale or colour value.
[0003] A depth map can be useful in the fields of photography and video, as it enables several desirable post-capture image processing capabilities. For example, a depth map can be used to segment foreground and background objects to allow manual post-processing, or the automated application of creative visual effects. A depth map can also be used to apply depth-related visual effects, such as simulating the aesthetically pleasing graduated blur of a high-quality lens using a smaller and less expensive lens.
[0004] In such applications, the accuracy of the depth map is very important, especially for objects near the best focus point, because such objects are usually the main subject of the scene. It is also important that the depth is accurate over a working range of distances in the scene which includes most or all of the objects in the scene. Such accuracy helps to isolate objects at different distances within the scene. Errors in the depth map may create visual artefacts which are highly visible in the processed image.
[0005] Depth estimation may be performed by depth from defocus (DFD) using a single camera by capturing two images with different focus or aperture settings and analysing the relative blur between the images. The two images are known as a focus bracket. DFD is a flexible method because DFD uses a single standard camera without special hardware modifications. The same camera can be used for image or video capture and also for depth capture.
[0006] The accuracy and working range of DFD depend strongly on the camera parameters used to capture focus bracket images and the parameters of the scene being captured. A need exists to select the camera parameters to capture images such that depth accuracy and working range for the scene may be maximised. Preferably, this selection should be performed on the camera just before image capture, so it is important that the selection is efficient.
SUMMARY
[0007] It is an object of the present disclosure to substantially overcome, or at least ameliorate, at least one disadvantage of present arrangements.
[0008] A first aspect of the present disclosure provides a method of selecting a focus step size for capturing at least two images of a scene using an image capture device, said method comprising the steps of: determining, for at least one focus step size and at least one depth, a plurality of depth sensitivity measurements for the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image capture device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a non-zero radius; selecting a deterministic spectral scene function associated with the scene; determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and selecting, based on the determined depth accuracy, the focus step size from at least one focus step size for capturing the at least two images of the scene.
[0009] A further aspect of the present disclosure provides a method of selecting a parameter step size for capturing at least two images of a scene using an image capture device, said method comprising the steps of: determining, for at least one parameter step size and at least one depth, a plurality of depth sensitivity measurements for the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image capture device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a non-zero radius; selecting a deterministic spectral scene function associated with the scene; determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and selecting, based on the determined depth accuracy, the parameter step size from at least one parameter step size for capturing the at least two images of the scene.
[0010] In one implementation, the selected parameter step size may be a focus step size.
[0011] In another implementation the selected parameter step size may be a zoom step size. In another implementation, the selected parameter step size may be an aperture step size.
[0012] In yet another implementation, the selected parameter step size may be a step size in at least one of focus, aperture and zoom.
[0013] In a specific example, the parameter step size is selected based upon the depth accuracy at a target depth. In one implementation, the target depth corresponds to a main subject of the scene determined from operation of an auto-focus system of the image capture device. In another implementation the target depth corresponds to a depth determined by the image capture device from a captured image using single image depth from defocus.
[0014] In another example, the parameter step size is selected based upon a working range of the determined depth accuracy.
[0015] In a specific implementation, the deterministic scene function uses a natural scene spectrum function.
[0016] In another implementation, the deterministic scene function is selected according to a mode of the image capture device.
[0017] In a further implementation, the deterministic scene function is selected based upon the scene of a captured image.
[0018] In a further implementation, the optical transfer function with a non-zero blur radius at the given depth is determined using a Stokseth approximation.
[0019] In another implementation, the optical transfer function with a non-zero blur radius at the given depth is determined using a Gaussian approximation.
[0020] In a further implementation, the optical transfer function with a non-zero blur radius at the given depth is determined using Fourier optics.
[0021] In another implementation, the optical transfer function with a non-zero blur radius at the given depth is determined using a Gaussian beam waist having a non-zero blur radius.
[0022] In one implementation, the depth sensitivity measurements are determined using a complex optical transfer function.
[0023] In another implementation, the depth sensitivity measurements are determined using a spatially varying optical transfer function.
[0024] Other aspects are also disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0025] One or more embodiments of the invention will now be described with reference to the following drawings, in which:
[0026] Figs. 1A and 1B show a schematic block diagram of an electronic device on which arrangements described may be practised;
[0027] Fig. 2 shows a schematic representation of the geometry of a lens forming two different images at two different focal planes;
[0028] Fig. 3A shows a first plot of example results of a method of parameter step size selection;
[0029] Fig. 3B shows a second plot of example results of a method of parameter step size selection;
[0030] Fig. 4 shows a schematic flow diagram illustrating image capture and depth estimation including parameter step size selection;
[0031] Fig. 5 shows a schematic flow diagram illustrating selecting a focus step size;
[0032] Fig. 6 shows a schematic flow diagram illustrating image capture and depth estimation including parameter step size selection; and
[0033] Fig. 7 shows a schematic flow diagram showing image capture and depth estimation including parameter step size selection.
DETAILED DESCRIPTION INCLUDING BEST MODE
[0034] The present disclosure is directed to providing methods of selecting image capture parameters (e.g., camera parameters) for capturing two images of a scene using a single image capture device with different capture parameters, where the image capture device is substantially located at the same position. The present disclosure is also directed to extracting a depth map from the captured images. The methods seek to select the parameters to improve or maximise the accuracy and working range of the depth map.
[0035] The present disclosure is also directed towards a method of capturing two images with an image capture device, with different parameters, such as focus or aperture settings, in order to determine the depth of objects in a scene using depth from defocus (DFD). The method improves the accuracy of DFD by selecting a parameter step size between the two images determined likely to give the greatest depth accuracy for DFD.
[0036] The two captured images may be referred to as a focus bracket. Test values are selected for the parameter change effected by the parameter step size for the focus bracket and for a target or given depth. Using the test values, the depth sensitivity of an optical transfer function (OTF) of the image capture device (camera) is evaluated at a range of spatial frequencies for each image in the focus bracket. In some implementations, the point spread function (PSF) associated with the OTF at the best focus target depth has a non-zero radius. A scene spectral weighting is applied to the depth sensitivity. The depth accuracy is determined by determining the Cramer-Rao lower bound (CRLB) using a deterministic scene function and the depth sensitivity. If more parameter step sizes or more target depths need to be tested, then the process repeats with the additional test values. After the depth accuracy has been determined for all test values, the parameter step size is selected from the tested values as the capture value based upon the depth accuracy.
[0037] The capture value is used to capture two images in a focus bracket using the image capture device. After the images are captured, the depth of objects in the scene is determined from the two images.
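By way of illustration only, the selection loop described in [0036] and [0037] can be sketched as follows. The helper functions depth_sensitivity, scene_spectrum and crlb_accuracy are hypothetical placeholders for the calculations described in the text, not part of the patented implementation.

```python
# Illustrative sketch of the parameter step size selection loop described above.
# depth_sensitivity(), scene_spectrum() and crlb_accuracy() are hypothetical
# placeholders supplied by the caller.

def select_step_size(step_sizes, target_depths, spatial_freqs,
                     depth_sensitivity, scene_spectrum, crlb_accuracy):
    best = None
    for step in step_sizes:                       # candidate parameter step sizes
        for depth in target_depths:               # candidate target depths
            # Depth sensitivity of the OTF at each spatial frequency
            sens = [depth_sensitivity(q, depth, step) for q in spatial_freqs]
            # Weight by the deterministic spectral scene function
            weights = [scene_spectrum(q) for q in spatial_freqs]
            # Depth accuracy from the Cramer-Rao lower bound
            accuracy = crlb_accuracy(sens, weights)
            if best is None or accuracy < best[0]:   # smaller bound = better accuracy
                best = (accuracy, step, depth)
    return best[1]                                 # step size giving the best accuracy
```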
[0038] In the context of the present application, the best focus depth refers to a particular depth between a lens of an image capture device (e.g., camera) and an object (or plane) of a scene at which the object is at sharpest focus in an image captured by the image capture device. The best focus depth varies according to parameters of the image capture device which affect image blur, such as focus, aperture, zoom, and the like.
[0039] One method of selecting parameters for DFD is to analyse the changes in the optical transfer function (OTF) of the camera device which depend on defocus. The derivative of the OTF with respect to defocus can be determined for specific spatial frequencies and used to select the focus step size between images captured in a focus bracket for DFD. However, the defocus OTF is an oscillatory function with respect to both spatial frequency and defocus. Considering only a single spatial frequency at a time makes it unlikely that the focus step size maximising overall accuracy is selected. In addition, the depth accuracy cannot be determined, which means that the accuracy cannot be compared with performance targets, making it difficult to trade depth accuracy against other goals for image capture such as image quality.
[0040] Another method of selecting parameter step size for DFD is to analyse the depth sensitivity of the point spread function (PSF) peak ratio across the focus bracket. Using such a method, a formula can be obtained for selecting a focus step size. However, such a method cannot be used to determine the depth accuracy. In addition, such methods do not consider several camera and scene parameters which have a significant effect on DFD accuracy, such as object texture, pixel spacing and sensor noise. In addition, such methods are based on a specific DFD algorithm which uses limited information from the image signal, such that the appropriate camera parameters for a different DFD algorithm may not be selected.
[0041] Another method of selecting camera parameters for DFD is to determine the Cramer-Rao lower bound (CRLB) on the variance of depth estimation for a range of parameters and select the parameters which give improved depth accuracy. The CRLB can be derived using a stochastic scene function and the parameters can be described using a focus blur ratio which is the ratio of the geometric blur circle between the first image and the second image. However, the stochastic scene function creates many cross-terms in the CRLB, making the CRLB complicated and time consuming to determine. In addition, the focus blur ratio has a singularity at best focus, which makes parameter selection unstable for the depth accuracy of the main subject of the image.
[0042] Another method of determining the CRLB for DFD uses a purely real OTF and a geometric blur circle radius to derive a formula for the CRLB. However, lenses often have aberrations which have a significant effect on depth accuracy. A CRLB based on a purely real OTF cannot be used to estimate the effect of common asymmetric lens aberrations such as coma and astigmatism. This means that the parameters most likely to improve depth accuracy may not be selected such that the depth accuracy may be reduced. In addition, the geometric blur circle radius is zero at a point of best focus depth, which makes results of parameter selection unstable for the depth accuracy of the main subject of the image.
[0043] The arrangements presently disclosed may be implemented on a variety of hardware platforms, including in an image capture device such as a camera, or on a general purpose computer (PC), or in a cloud computing implementation. Figs. 1A and 1B collectively form a schematic block diagram of an electronic device 101 including embedded components, upon which the methods to be described may be practised. The electronic device 101 in the example of Fig. 1A is a camera device. However the device 101 may be any type of image capture device capable of capturing still and/or video images, such as a camera, a video camera, or a device having a camera or video camera as a component and the like, in which processing resources may be limited. Further, the methods to be described may also be performed on higher-level devices such as desktop computers, server computers, and other such devices with significantly larger processing resources.
[0044] As seen in Fig. 1A, the camera device 101 comprises an embedded controller 102. Accordingly, the camera device 101 may be referred to as an “embedded device.” In the present example, the controller 102 has a processing unit (or processor) 105 which is bidirectionally coupled to an internal storage module 109. The storage module 109, also known as a memory, may be formed from non-volatile semiconductor read only memory (ROM) 160 and semiconductor random access memory (RAM) 170, as seen in Fig. 1B. The RAM 170 may be volatile, non-volatile or a combination of volatile and non-volatile memory.
[0045] The camera device 101 includes a display controller 107, which is connected to a video display 114, such as a liquid crystal display (LCD) panel or the like. The display controller 107 is configured for displaying graphical images on the video display 114 in accordance with instructions received from the embedded controller 102, to which the display controller 107 is connected.
[0046] The camera device 101 also includes user input devices 113 which are typically formed by keys, a keypad or like controls. In some implementations, the user input devices 113 may include a touch sensitive panel physically associated with the display 114 to collectively form a touch-screen. Such a touch-screen may thus operate as one form of graphical user interface (GUI) as opposed to a prompt or menu driven GUI typically used with keypad-display combinations. Other forms of user input devices may also be used, such as a microphone (not illustrated) for voice commands or a joystick/thumb wheel (not illustrated) for ease of navigation about menus.
[0047] As seen in Fig. 1A, the camera device 101 also comprises a portable memory interface 106, which is coupled to the processor 105 via a connection 119. The portable memory interface 106 allows a complementary portable memory device 125 to be coupled to the camera device 101 to act as a source or destination of data or to supplement the internal storage module 109. Examples of such interfaces permit coupling with portable memory devices such as Universal Serial Bus (USB) memory devices, Secure Digital (SD) cards, Personal Computer Memory Card International Association (PCMCIA) cards, optical disks and magnetic disks.
[0048] The camera device 101 also has a communications interface 108 to permit coupling of the device 101 to a computer or communications network 120 via a connection 121. The connection 121 may be wired or wireless. For example, the connection 121 may be radio frequency or optical. An example of a wired connection includes Ethernet. Further, an example of wireless connection includes Bluetooth™ type local interconnection, Wi-Fi (including protocols based on the standards of the IEEE 802.11 family), Infrared Data Association (IrDa) and the like. The camera device 101 may, for example, be connected to the cloud server computer 190 via the network 120 if the connection 121 is a wireless connection. Alternatively, the camera device may be connected to a personal computer (PC) 195 via a wired connection (not shown). In some instances the camera device 101 may be a component of another electronic device, such as the PC 195, a tablet, a notebook, a smartphone, and the like.
[0049] Typically, the camera device 101 is configured to perform some special function. The embedded controller 102, possibly in conjunction with further special function components 110, is provided to perform that special function. If the electronic device 101 is a camera device, the components 110 may include at least a lens, focus control and an image sensor. In the example shown in Fig. 1A, the components 110 include at least a lens 180, a lens focus motor 182, a lens auto-focus system 183, and an image sensor 189. The lens 180 may be fixed to the device 101 or may be interchangeable. In implementations where the lens 180 is interchangeable, the lens 180 includes a lens memory 184. The lens memory 184 is similar to the memory 109 and stores information specific to the lens 180, such as a look up table of OTF data, or a look up table of parameter step sizes to use for DFD, and the like. The image sensor 189 may be any type of image sensor suitable for use in an image capture device.
[0050] The lens motor 182 and the lens auto-focus system 183 may be controlled by execution of the processor 105 to set parameters of the camera device 101 such as focus, aperture and zoom. Parameters of the camera device 101 which affect blur of a captured image, for example focus, aperture and zoom, are hereafter referred to as camera parameters.
[0051] The special function components 110 are connected to the embedded controller 102. As another example, the camera device 101 may be a mobile telephone handset with an inbuilt camera. In this instance, the components 110 would represent those components required for communications in a cellular telephone environment. The special function components 110 may also represent a number of encoders and decoders of a type including Joint Photographic Experts Group (JPEG), (Moving Picture Experts Group) MPEG, MPEG-1 Audio Layer 3 (MP3), and the like (not shown).
[0052] The methods described hereinafter may be implemented using the embedded controller 102, where the processes of Figs. 4 to 7 may be implemented as one or more software application programs 133 executable within the embedded controller 102. The camera device 101 of Fig. 1A implements the methods described herein. In particular, with reference to
Fig. 1B, the steps of the described methods are effected by instructions in the software 133 that are carried out within the controller 102. The software instructions may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules perform the described methods and a second part and the corresponding code modules manage a user interface between the first part and the user.
[0053] The software 133 of the embedded controller 102 is typically stored in the non-volatile ROM 160 of the internal storage module 109. The software 133 stored in the ROM 160 can be updated when required from a computer readable medium. The software 133 can be loaded into and executed by the processor 105. In some instances, the processor 105 may execute software instructions that are located in RAM 170. Software instructions may be loaded into the RAM 170 by the processor 105 initiating a copy of one or more code modules from ROM 160 into RAM 170. Alternatively, the software instructions of one or more code modules may be pre-installed in a non-volatile region of RAM 170 by a manufacturer. After one or more code modules have been located in RAM 170, the processor 105 may execute software instructions of the one or more code modules.
[0054] The application program 133 is typically pre-installed and stored in the ROM 160 by a manufacturer, prior to distribution of the camera device 101. However, in some instances, the application programs 133 may be supplied to the user encoded on one or more CD-ROM (not shown) and read via the portable memory interface 106 of Fig. 1A prior to storage in the internal storage module 109 or in the portable memory 125. In another alternative, the software application program 133 may be read by the processor 105 from the network 120, or loaded into the controller 102 or the portable storage medium 125 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that participates in providing instructions and/or data to the controller 102 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, flash memory, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the device 101. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the device 101 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like. A computer readable medium having such software or computer program recorded on it is a computer program product.
[0055] The second part of the application programs 133 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 114 of Fig. 1A. Through manipulation of the user input device 113 (e.g., the keypad), a user of the device 101 and the application programs 133 may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s).
Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via loudspeakers (not illustrated) and user voice commands input via the microphone (not illustrated).
[0056] Fig. 1B illustrates in detail the embedded controller 102 having the processor 105 for executing the application programs 133 and the internal storage 109. The internal storage 109 comprises read only memory (ROM) 160 and random access memory (RAM) 170. The processor 105 is able to execute the application programs 133 stored in one or both of the connected memories 160 and 170. When the camera device 101 is initially powered up, a system program resident in the ROM 160 is executed. The application program 133 permanently stored in the ROM 160 is sometimes referred to as “firmware”. Execution of the firmware by the processor 105 may fulfil various functions, including processor management, memory management, device management, storage management and user interface.
[0057] The processor 105 typically includes a number of functional modules including a control unit (CU) 151, an arithmetic logic unit (ALU) 152, a digital signal processor (DSP) 153 and a local or internal memory comprising a set of registers 154 which typically contain atomic data elements 156, 157, along with internal buffer or cache memory 155. One or more internal buses 159 interconnect these functional modules. The processor 105 typically also has one or more interfaces 158 for communicating with external devices via system bus 181, using a connection 161.
[0058] The application program 133 includes a sequence of instructions 162 through 163 that may include conditional branch and loop instructions. The program 133 may also include data, which is used in execution of the program 133. This data may be stored as part of the instruction or in a separate location 164 within the ROM 160 or RAM 170.
[0059] In general, the processor 105 is given a set of instructions, which are executed therein. This set of instructions may be organised into blocks, which perform specific tasks or handle specific events that occur in the camera device 101. Typically, the application program 133 waits for events and subsequently executes the block of code associated with that event. Events may be triggered in response to input from a user, via the user input devices 113 of Fig. 1A, as detected by the processor 105. Events may also be triggered in response to other sensors and interfaces in the camera device 101.
[0060] The execution of a set of the instructions may require numeric variables to be read and modified. Such numeric variables are stored in the RAM 170. The disclosed method uses input variables 171 that are stored in known locations 172, 173 in the memory 170. The input variables 171 are processed to produce output variables 177 that are stored in known locations 178, 179 in the memory 170. Intermediate variables 174 may be stored in additional memory locations in locations 175, 176 of the memory 170. Alternatively, some intermediate variables may only exist in the registers 154 of the processor 105.
[0061] The execution of a sequence of instructions is achieved in the processor 105 by repeated application of a fetch-execute cycle. The control unit 151 of the processor 105 maintains a register called the program counter, which contains the address in ROM 160 or RAM 170 of the next instruction to be executed. At the start of the fetch-execute cycle, the contents of the memory address indexed by the program counter are loaded into the control unit 151. The instruction thus loaded controls the subsequent operation of the processor 105, causing for example, data to be loaded from ROM memory 160 into processor registers 154, the contents of a register to be arithmetically combined with the contents of another register, the contents of a register to be written to the location stored in another register and so on. At the end of the fetch-execute cycle the program counter is updated to point to the next instruction in the system program code. Depending on the instruction just executed this may involve incrementing the address contained in the program counter or loading the program counter with a new address in order to achieve a branch operation.
[0062] Each step or sub-process in the processes of the methods described below is associated with one or more segments of the application program 133, and is performed by repeated execution of a fetch-execute cycle in the processor 105 or similar programmatic operation of other independent processor blocks in the camera device 101.
[0063] The described methods may alternatively be implemented in whole or part in dedicated hardware such as one or more integrated circuits performing the functions or sub functions to be described. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
[0064] For example, the camera device 101 may effect an apparatus or means to implement the arrangements described herein, including parameter step size selection algorithmic processes to be described in hardware or firmware, in order to capture pairs of images with different camera parameters, and to process the captured images to provide a depth map for various purposes. Such purposes may include artificially blurring the background of portrait photos to achieve a pleasing aesthetic, or attaching depth information as image metadata to enable various postprocessing operations. The camera device 101 hardware can select the capture parameters, then capture multiple images of a scene using the parameters. The captured images are suitable for application of the DFD processing. Processing occurs in the camera device 101's embedded controller 102. Results of the processing may be retained in the memory 109 of the camera device 101, written to a memory card such as the portable storage medium 125, or other memory storage device connectable to the camera device 101. Alternatively, results of the processing may be uploaded to the cloud computing server 190 or the PC 195 for later retrieval by the user.
[0065] In another example, the PC 195 or the like may execute the arrangements described herein for selecting a parameter step size in software to enable calibration of the camera device 101 or the lens 180 for DFD during design or manufacturing. For example, the device 101 or the lens 180 may be used to capture images of a test chart, and the PC 195 may be used to measure performance of the device 101 including the optical transfer function (OTF) and the depth sensitivity of the OTF. The camera device 101 or lens 180 performance measurements may be used to select the parameter step size for various expected image capture device, lens and scene conditions to generate a look up table. The look up table of selected parameters may be retained in the memory 109, or the lens memory 184, or written to a memory card or other memory storage device such as the portable storage medium 125. At a later time, the selected parameter step size may be read from the look up table in the memory 109 and used to capture two or more images for processing in the device 101 using DFD to generate a depth map. In alternative implementations, the arrangements for selecting a parameter step size for the camera device may be implemented on the cloud computer 190, and the results transmitted to the camera device 101 for storage in the memory 109. Each of the camera device 101, the PC 195 and the cloud computing server 190 effects an apparatus or means for selecting a parameter step size.
[0066] In another example, the parameter step size is selected on the desktop PC 195 or in the camera device 101, and the device 101 then captures two images in a focus bracket. The two images are then uploaded to the cloud computing server 190 where DFD is used to generate a depth map. In another example, the device 101 captures a series of images or video in which the focus changes between images or frames. After the series of images or video is captured, the focus changes are analysed by the camera device 101 under execution of the processor 105 to select which frames are likely to give improved depth accuracy. The selected frames are then processed by the device 101 under execution of the processor 105 using DFD to generate a depth map.
[0067] The arrangements described hereafter are generally directed to selecting a focus step size. However, this is an example only. The arrangements described may also be used to select a step size for any camera capture parameter to capture two or more images. The varying parameter step sizes may include at least one of: focus step size, zoom step size, aperture step size, or any other camera setting that influences the amount of blur in the captured image. For some parameters, such as zoom in particular but also focus and potentially other parameters, the magnification of the captured images may be different. In such implementations one or more of the images may be scaled to bring the images substantially into registration before applying the DFD algorithm to determine a depth map.
[0068] The described methods use the geometry and optics of image capture devices such as the camera device 101. Most scenes that are captured using an image capture device, such as the camera device 101, contain multiple objects, which are located at various distances from the lens 180 of the device 101. Commonly, the electronic device 101 is focused on an object of interest in a scene. The object of interest is referred to as the subject of the scene.
[0069] Fig. 2 is a schematic diagram showing the geometrical relationships between the lens 180 of the camera device 101 and objects in a scene to be captured. The image sensor 189 is positioned at an image distance z_i 225 behind the lens 180, which corresponds to an object distance z_o 235 to an object plane 250 in the scene. Any parts of the scene in the object plane 250 are at best focus in the image captured by the image sensor 189. A subject 240 at a different distance from the lens 180, the subject distance z_os 255, is at best focus at an image subject plane 220 which is at an image subject distance z_is 245 behind the lens 180. The subject 240 is blurred on the image sensor 189 by a blur radius which is related to an image defocus distance z_d 215 between the image subject plane 220 and the image sensor 189. The distances between the image planes and the corresponding object planes in the scene, including the object distance z_o 235 and the image distance z_i 225, are approximately determined by the thin lens law according to Equation (1), as follows:

    1/z_o + 1/z_i = 1/f    (1)

where f is the focal length of the lens 180.
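As a brief illustration of Equation (1), the following helper (not part of the patent; the example values are assumptions) converts an object distance into the corresponding in-focus image distance:

```python
def image_distance(focal_length, object_distance):
    """Thin lens law (Equation (1)): 1/z_o + 1/z_i = 1/f, solved for z_i."""
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)

# Example (assumed values): a 50 mm lens focused on a subject 2.55 m away
# gives an image distance of approximately 51 mm behind the lens.
zi = image_distance(0.050, 2.55)   # ~0.051 m
```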
[0070] The principle of estimating the blur difference in depth from defocus can be explained using a convolution model of noise-free image formation. The first image g_1(x, y) of a scene f(x, y) with spatial co-ordinates (x, y) can be modelled using Equation (2), as follows:

    g_1(x, y) = f(x, y) ⊗ h(x, y; z_d, p_1)    (2)

where ⊗ denotes convolution and h(x, y; z_d, p_1) is the defocus PSF for an object with image defocus distance z_d captured with a camera (image capture device) parameter setting p_1. The parameter setting p_1 may be any setting which changes the rate of change of the PSF with respect to changes in image defocus distance. Example camera parameter settings with this property include the lens focal length f, the lens focus distance z_i, and a lens aperture diameter A_v.
[0071] A second image g_2(x, y) of the scene f(x, y) can be modelled using Equation (3), as follows:

    g_2(x, y) = f(x, y) ⊗ h(x, y; z_d, p_2)    (3)

where ⊗ denotes convolution and h(x, y; z_d, p_2) is the defocus PSF for an object captured with camera (image capture device) parameter p_2. The first and second images g_1 and g_2 are called a focus bracket.
[0072] The spatial extent of the PSF determines the amount of blur for the subject 240 within the captured image. The amount of blur can be characterised using a blur radius. A Gaussian function may be used to approximate the PSF for the camera device 101 near best focus, in which case the blur radius can be defined as the standard deviation of the Gaussian function. Generally the blur radius is smallest at best focus, and increases monotonically with defocus. However, due to diffraction, the blur radius never reaches zero at best focus.
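A minimal sketch of such a Gaussian approximation is given below. The specific functional form of the blur radius (a diffraction-limited floor sigma_0 combined with geometric growth k·z_d) is an assumption for illustration only; it simply reflects the property described above that the blur radius never reaches zero at best focus.

```python
import numpy as np

def gaussian_blur_radius(z_d, sigma_0, k):
    """Assumed blur radius model: diffraction floor sigma_0 plus geometric growth k*z_d."""
    return np.sqrt(sigma_0**2 + (k * z_d)**2)

def gaussian_psf(x, y, sigma):
    """2D Gaussian PSF with standard deviation sigma (the blur radius)."""
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
```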
[0073] Equations (2) and (3) can be expressed in the spatial frequency domain in accordance with Equations (4), as follows:

    G_1(u, v) = H(u, v; z_d, p_1) F(u, v)
    G_2(u, v) = H(u, v; z_d, p_2) F(u, v)    (4)

where capital letters denote Fourier transforms of the corresponding lower case functions in the spatial domain, G_1 and G_2 are the image spectra, H is the optical transfer function (OTF), F is the scene spectrum, and (u, v) are co-ordinates in the spatial frequency domain. In general, the OTF is a complex function. Complex values in the OTF occur when there are odd or antisymmetric aberrations of the lens 180 such as coma or astigmatism. If the only aberration of the lens 180 is defocus, then the OTF is purely real, but OTF values may be positive or negative depending on the spatial frequency and the amount of defocus.
[0074] By assuming that the OTF and the scene spectra are non-zero and that there is no imaging noise, the ratio of the image spectra can be determined using Equation (5), as follows:

    G_21(u, v) = G_2(u, v) / G_1(u, v) = H(u, v; z_d, p_2) / H(u, v; z_d, p_1) = H_21(u, v; z_d, p_1, p_2)    (5)

where G_21 is called a spectral ratio, and H_21 is defined to be the relative OTF. The spectral ratio G_21 is determined from a pair of captured images by selecting a corresponding tile in both images, performing a Fourier transform on the tiles, and then taking the ratio. The relative OTF depends on the unknown object defocus z_d and the known parameters p_1 and p_2 (and the associated parameter step size) of the captured images g_1 and g_2, and can be determined using calibrated tables of the OTF for each of the parameters and subject defocus. The OTF can be calibrated using a theoretical model for the lens, using Fourier optics or an approximation based on a Bessel function or a Gaussian function. Alternatively, the OTF can be calibrated using images captured of a test chart with a range of camera parameters and subject distances.
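A minimal sketch of forming the spectral ratio G_21 from corresponding tiles of the two captured images is shown below, assuming NumPy and tiles that have already been extracted and windowed. The small constant eps guarding against division by near-zero samples is an implementation detail assumed for this sketch, not specified in the patent.

```python
import numpy as np

def spectral_ratio(tile1, tile2, eps=1e-8):
    """Spectral ratio G_21 = G_2 / G_1 of corresponding image tiles (Equation (5))."""
    G1 = np.fft.fft2(tile1)
    G2 = np.fft.fft2(tile2)
    return G2 / (G1 + eps)
```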
[0075] The unknown object defocus z_d can be determined from the captured images by finding the value of subject defocus z_d which gives a best fit between the measured spectral ratio G_21 and the relative OTF H_21. Using the determined subject defocus z_d, the known parameters p_1 can be used with the thin lens equation (Equation (1)) to determine the subject distance z_os.
Such a process is known as depth from defocus (DFD) and can be repeated for multiple regions in the image to generate a depth map of the scene.
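A minimal sketch of the fitting step is given below, assuming a calibrated lookup relative_otf(z_d) of relative OTFs over candidate defocus values. The grid search and weighted least-squares misfit shown here are one simple way to find the best fit and are assumptions of this sketch rather than the patented method.

```python
import numpy as np

def estimate_defocus(G21, z_candidates, relative_otf, weights=None):
    """Return the candidate defocus whose relative OTF best matches the spectral ratio."""
    if weights is None:
        weights = np.ones_like(G21, dtype=float)
    best_z, best_err = None, np.inf
    for z_d in z_candidates:
        H21 = relative_otf(z_d)                        # calibrated relative OTF at this defocus
        err = np.sum(weights * np.abs(G21 - H21)**2)   # weighted least-squares misfit
        if err < best_err:
            best_z, best_err = z_d, err
    return best_z
```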
[0076] In practice, there will be imaging noise in the captured images and zeroes in the OTFs and scene spectrum, which will affect the accuracy of the estimate of the subject defocus z_d. Such effects can be reduced by applying a weighting function to the spectral ratio G_21 when fitting the relative OTF, for example by identifying noisy spectral samples where the phase of G_21 exceeds a predetermined tolerance, and reducing the weight of those samples.
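A minimal sketch of such phase-based weighting is shown below; the tolerance value and the binary down-weighting scheme are assumptions for illustration only.

```python
import numpy as np

def phase_weights(G21, phase_tolerance=0.5, low_weight=0.1):
    """Down-weight spectral samples whose phase exceeds a predetermined tolerance (radians)."""
    noisy = np.abs(np.angle(G21)) > phase_tolerance
    return np.where(noisy, low_weight, 1.0)
```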
[0077] The depth accuracy of DFD depends on the capture conditions, including the parameters (and corresponding parameter step size) used to capture the focus bracket, the subject distance, the image capture device performance including the OTF and imaging noise, and the subject texture. The depth accuracy also depends on the DFD algorithm and the DFD algorithm parameters. One algorithm parameter is tile size. To improve the depth accuracy, a method is needed to select the camera parameters for capturing the focus bracket most likely to maximise depth accuracy and/or working range. One method of selecting the camera parameters includes determining the depth accuracy for multiple camera device and algorithm parameters, and selecting the parameters most likely to maximise accuracy. Such an approach requires a method to determine the depth accuracy.
[0078] The Cramer-Rao lower bound (CRLB) is a method for determining the accuracy of a statistical estimation. The CRLB predicts the most accurate performance of an unbiased estimator, based on the sensitivity of the observed data to the parameter being estimated. The CRLB for DFD depends on the (image capture) parameters of the device 101 and the algorithm parameters, but is independent of the algorithm. The CRLB is defined by Equation (6), as follows:

    var(ẑ_d) ≥ 1 / E[ (∂ ln p(x; z_d) / ∂z_d)² ]    (6)

where var is the variance, ẑ_d is an estimate of the subject defocus parameter, E is an expectation value and p(x; z_d) is the probability distribution function (PDF) of the observation vector x given the subject defocus parameter value z_d.
[0079] For DFD, the observed parameters are the spectral samples in G_1 and G_2, which are complex. The PDF for complex observations with zero-mean additive Gaussian noise is defined in accordance with Equation (7), as follows:

    p(x; z_d) = ∏_{n=1..N} (1 / (π σ_n)) exp( −|x_n − μ_n|² / σ_n )    (7)

where x_n are the N observed complex values, μ_n is the mean of the nth observation and σ_n is the variance of the nth observation. The imaging noise can be approximated to be constant for the image tile, giving a variance σ. Using the image formation model described above, the CRLB for DFD can be expressed in accordance with Equation (8), as follows:
    var(ẑ_d) ≥ σ / ( 2 Σ_{m=1..M} F(q_m) F̄(q_m) Σ_{n=1..N} H_n′(q_m; z_d, p_n) H̄_n′(q_m; z_d, p_n) )    (8)

where F(q_m) is a deterministic scene spectrum at M spatial frequencies q_m = (u_m, v_m), where the spatial frequencies correspond to the spectral samples in a Fourier transform of an image tile used for depth estimation, H_n(q_m; z_d, p_n) is the OTF for captured image n of N with camera parameters p_n of a subject at image defocus z_d, ( )′ = ∂( )/∂z_d denotes the partial derivative with respect to z_d, and the overbar ( ‾ ) denotes the complex conjugate.
[0080] Equation (8) is valid for an OTF with complex values. If the OTF is purely real, then the CRLB for DFD can be simplified in accordance with Equation (9), as follows:
    var(ẑ_d) ≥ σ / ( 2 Σ_{m=1..M} F(q_m)² Σ_{n=1..N} [H_n′(q_m; z_d, p_n)]² )    (9)

where H_n′(q_m; z_d, p_n) = ∂H_n/∂z_d is the first order partial derivative of the OTF of the nth captured image with respect to z_d.
[0081] In general, the scene spectrum F is not known. However, a typical spectrum for natural scenes can be assumed, for example a spectrum with a 1/q falloff, where q = √(u² + v²). The partial derivatives of the OTF are the depth sensitivity values. The partial derivatives of the OTF can be determined by analytical differentiation if an analytical approximation for the OTF is used, such as a Bessel function or a Gaussian function. Alternatively, if the OTF is determined numerically using Fourier optics theory, or if the OTF is measured using captured images, then the partial derivatives of the OTF can be determined numerically using finite differences.
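A minimal sketch of evaluating a bound of the form of Equation (9) numerically is shown below, assuming real-valued OTF samples tabulated over a grid of defocus and spatial frequency values, a 1/q natural scene spectrum, and finite differences for the depth sensitivity. The function and parameter names are assumptions of this sketch.

```python
import numpy as np

def crlb_dfd(otf1, otf2, u, v, noise_var, dz):
    """Variance bound for DFD with a purely real OTF (Equation (9) above).

    otf1, otf2 : OTF samples for the two captured images, shape (num_z, num_freq),
                 tabulated over defocus values spaced dz apart.
    u, v       : spatial frequency coordinates of the num_freq samples.
    noise_var  : imaging noise variance for the tile.
    dz         : defocus spacing used for the finite-difference derivative.
    """
    q = np.sqrt(u**2 + v**2)
    F = np.zeros_like(q)                       # assumed 1/q natural scene spectrum
    nonzero = q > 0
    F[nonzero] = 1.0 / q[nonzero]
    # Depth sensitivity: finite-difference derivative of the OTF with respect to defocus
    dH1 = np.gradient(otf1, dz, axis=0)
    dH2 = np.gradient(otf2, dz, axis=0)
    fisher = 2.0 * np.sum(F**2 * (dH1**2 + dH2**2), axis=1) / noise_var
    return 1.0 / fisher                        # variance bound at each tabulated defocus value
```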
[0082] The arrangements described herein improve upon some existing methods by selecting a parameter step size for camera parameters p_1 and p_2 that are used to capture the focus bracket images, in order to improve or maximise the depth accuracy of DFD when applied to the captured images. One method of selecting the camera parameters is to determine the CRLB for DFD for a range of camera parameters and subject distances, and select the parameters which give a suitable depth accuracy. This selection can be performed by the camera device 101 just before the images are captured, using an estimate of the subject distance, for example, where the subject distance is estimated from the camera auto-focus (AF) system 183. Alternatively, parameter step size selection can be performed during design or manufacturing of the camera device 101, and the results stored as a look up table in the memory 109 of the camera device 101. The selected parameters may be retrieved and used to control the camera device 101 during image capture.
[0083] One aspect of depth accuracy is the range of distances over which the depth map has an acceptable level of accuracy, called the working range. The working range of DFD depends on the camera device 101, scene and algorithm parameters. The CRLB for DFD can be used to select the parameters of the device 101 to improve the working range by determining the depth accuracy over a range of distances and then selecting the camera parameters, including the corresponding parameter step size, which give the greatest average accuracy over the preferred range of distances, or which achieve a minimum level of accuracy over the preferred range of distances, or which achieve a minimum or predetermined level of accuracy over the widest range of distances.
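A minimal sketch of extracting a working range from a CRLB curve is given below, assuming the 3-sigma values have been evaluated on a grid of defocus values and that a maximum acceptable 3-sigma is given; it also assumes the acceptable region is contiguous.

```python
import numpy as np

def working_range(z_values, three_sigma, max_three_sigma):
    """Return the span of defocus values over which the accuracy target is met."""
    ok = three_sigma <= max_three_sigma
    if not np.any(ok):
        return None
    return z_values[ok].min(), z_values[ok].max()
```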
[0084] Figs. 3A and 3B show plots of example results obtained during camera parameter selection using arrangements described herein. The CRLB in the plots has been converted to 3-sigma (3σ_z), which is 3 times the square root of the CRLB variance var(ẑ_d). This CRLB was determined for the following camera, scene and algorithm parameters:

Table 1

[0085] The OTF used in the example of Figs. 3A and 3B was determined using an approximation by Stokseth, based on Bessel functions. The PSF is determined from the OTF by performing a Fourier transform. The blur radius of the Stokseth OTF is always non-zero, including at best focus.
[0086] The CRLB for DFD was determined using Equation (9) and plotted for a range of image defocus values z_d and a focus step size Δz_d = 30 μm in a plot 310 of Fig. 3A. The 3-sigma is lowest near best focus, and increases rapidly with increasing defocus. The curve of the plot 310 is symmetric around best focus apart from a small positive offset of 15 μm. The positive offset is caused by the offset focus bracket, in which the first image is captured at z_d and the second image is captured at z_d + Δz_d. This offset focus bracket is based on the principle that the first image should be captured with the main subject at best focus, so that the first image can be used as a high quality standard image. The second image is offset as the second image is used together with the first image to generate a depth map. The 3-sigma of the plot 310 has two minima, one on either side of best focus. A first minimum in the 3-sigma is caused by a maximum in the OTF derivative with respect to defocus corresponding to the first captured image, and the second minimum in the 3-sigma is caused by a maximum in the OTF derivative with respect to defocus corresponding to the second captured image.
[0087] The CRLB for DFD is plotted for a range of focus step sizes Δz_d at best focus (image defocus z_d = 0 μm) in a plot 320 in Fig. 3B. A selected focus step size 330 is at the minimum 3-sigma value 3σ_z = 1.3 μm, giving Δz_d = 30 μm as the focus step size which can be used to achieve the best depth accuracy for the camera, scene and algorithm parameters of Table 1 for a subject at best focus. Using such a method to select the focus step size at best focus is practical because the PSF has a non-zero blur radius at best focus. If the PSF had a zero blur radius at best focus, then selection of the focus step size would be unstable and the focus step size likely to maximise depth accuracy may not be selected.
[0088] The 3-sigma can be converted to object space by dividing the image space 3-sigma by the axial image magnification, being the square of the transverse image magnification. For the camera and scene parameters listed in Table 1, the axial image magnification is (1/50)², and the 3-sigma in object space at best focus is 3.3 mm.
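As a worked check of this conversion using the values quoted above:

    3σ (object space) = 3σ (image space) ÷ axial magnification = 1.3 μm ÷ (1/50)² = 1.3 μm × 2500 ≈ 3.3 mm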
[0089] The CRLB results in the plot 310 of Fig. 3A can also be used to determine a working range. For example, if the maximum acceptable 3-sigma is 2 μm, then the working range is −0.10 mm < z_d < 0.13 mm. By determining the CRLB for a range of image defocus values and a range of camera parameters, the working range can be estimated for each set of camera parameters, and the camera parameters (and corresponding parameter step size) which achieve the widest working range can be selected.
[0090] Fig. 4 shows a method 400 for image capture including camera parameter step size selection for DFD. The method 400 may be implemented as one or more software application programs 133 executable within the embedded controller 102, controlled by execution of the processor 105.
[0091] The method 400 starts at setting step 410, where the user sets up the camera device 101 and aims the camera device 101 at the scene to be captured. Under execution of the processor 105, the method 400 progresses to setting step 420, in which parameters such as the focus, zoom and aperture of the camera device 101 are set. The focus, zoom and aperture of the camera device 101 may be set automatically by the camera device 101 based on the scene. For example, execution of the AF system 183 in the camera device 101 by the processor 105 may detect the distance of the main subject, and control the lens focus motor 182 to set the lens 180 to focus on the main subject. Alternatively, the focus, aperture and zoom of the camera device may be set by manipulation of inputs of the camera device by the user, for example by using manual focus, exposure or zoom control, and detected by execution of the processor 105. The camera parameters of focus, aperture and zoom are recorded by the camera device 101 as camera parameters p_1 on a memory of the camera device, such as the memory 109.
[0092] Under execution of the processor 105, the method 400 progresses from step 420 to selecting step 430, in which the focus step size is selected for the camera device 101, scene and algorithm parameters and used to generate and record a set of camera parameters p_2 to be used for capturing the second image. In the example of step 430, the processor 105 effectively executes a method of selecting a camera parameter step size, in this example a focus step size, for capturing two images for DFD. A method 500 of selecting a focus step size for depth from defocus, as executed at step 430, is described below with reference to Fig. 5.
[0093] The method 400 progresses under execution of the processor 105 to capturing step 440, in which a first image of the scene is captured by the camera using camera parameters p_1. The method 400 progresses under execution of the processor 105 from step 440 to changing step 450. At step 450 camera parameters p_2 are set, including a focus change by the focus step size selected in step 430. In changing the focus by the selected step size, step 450 operates to change the focus parameter by the selected parameter step size. The method 400 progresses under execution of the processor 105 from step 450 to capturing step 460. In step 460, under execution of the processor 105, a second image of the scene is captured by the camera device 101 using camera parameters p_2. The method 400 progresses under execution of the processor 105 from step 460 to step 470. At determining step 470, under execution of the processor 105, a depth map is determined using DFD from the first captured image and the second captured image and using information from the camera parameters p_1 and p_2.
[0094] In an alternative implementation, in step 430, under execution of the processor 105, a step size in a combination of parameters, such as at least one of focus step size, aperture step size, and zoom step size, is selected for the camera device 101, scene and algorithm parameters, and the changed camera parameters are recorded in the memory 109 as camera parameters p_2. In another alternative implementation, under execution of the processor 105, additional changes in focus, zoom or aperture are selected in execution of step 430 and recorded in the memory 109 as additional camera parameters p_3, p_4 and so on. In such implementations, a corresponding number of images are captured using each set of parameters, and the captured images are used by DFD to generate a depth map.
[0095] In another alternative implementation, target depths with depth accuracy that fails to meet a performance target are identified, under execution of the processor 105, from amongst the expected depths in the scene during camera parameter selection for a focus bracket with two images. In such implementations, under execution of the processor 105, the target depths that fail to meet the performance target are used to select a third set of camera parameters p_3 which provide improved depth accuracy for those target depths when three images are captured and used with DFD to determine a depth map.
[0096] The DFD algorithm executed under the processor 105 in step 470 applies Gabor filter kernels to both captured images to determine a spectral ratio for each pixel in the first captured image. The phase of the spectral ratio is used to apply a weighting to the spectral ratio to reduce the effect of imaging noise, and a fit of the spectral ratio to a parabolic approximation of the relative OTF is used to estimate the image defocus z_d for each pixel in the first captured image. The thin lens equation (Equation (1)) is used to convert the image defocus z_d into a subject distance z_os in object space and a depth map is generated.
[0097] In alternative implementations, the DFD algorithm executed on the processor 105 uses blur matching to determine a depth map. In blur matching, the second captured image is convolved with multiple Gaussian PSF kernels with different blur radii, creating a set of reblurred images. For each pixel in the first captured image, under execution of the processor 105, a blur radius is selected which gives the best match between the first captured image and the reblurred image. The blur radius at each pixel is used to estimate an image defocus z_d for each pixel, and a depth map is generated. In other implementations, another DFD algorithm may be used, such as a DFD algorithm selected from existing DFD algorithms.
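A minimal sketch of the blur-matching alternative is shown below, assuming SciPy's Gaussian filter for the reblurring kernels and a per-tile (rather than per-pixel) match for brevity; the candidate radii and mean-squared-difference criterion are assumptions for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def match_blur_radius(tile1, tile2, candidate_radii):
    """Find the Gaussian blur radius that best maps the second tile onto the first."""
    best_r, best_err = None, np.inf
    for r in candidate_radii:
        reblurred = gaussian_filter(tile2, sigma=r)   # reblur the second image tile
        err = np.mean((tile1 - reblurred)**2)         # mean squared difference to the first tile
        if err < best_err:
            best_r, best_err = r, err
    return best_r
```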
[0098] The depth accuracy predicted by the CRLB is independent of the specific DFD algorithm. However, the depth accuracy for a specific algorithm and the depth accuracy according to the CRLB will generally have the same trends with respect to camera, scene and algorithm parameters. In this way the CRLB can be used to predict the appropriate camera parameters without knowing the specific DFD algorithm. This may be advantageous where the DFD algorithm may not be known at the time when the focus step size is selected. For example, the camera device 101 may, under execution of the processor 105, capture a focus bracket for DFD for later processing by a DFD algorithm on a cloud computing server, such as the cloud computing server 190. In this case, the DFD algorithm applied to the focus bracket under execution of the processor 105 may change between uses of the camera device 101 and the camera device 101 may not have access to which DFD algorithm will be applied to the focus bracket at capture time.
[0099] Alternatively, the relative accuracy of a DFD algorithm compared with the CRLB can be determined during design or manufacture of the camera device 101 by calibration. The relative accuracy can be stored as an accuracy factor in the camera memory 109. The CRLB depth accuracy can then be scaled by the accuracy factor during execution of camera parameter selection if the DFD algorithm to be used is known by the camera device 101. Such a method can be used when the depth accuracy is required to be selected to meet a specific performance goal.
[00100] The method 500 of selecting a focus step, as executed at step 430, will now be described with reference to Fig. 5. The method of Fig. 5 may be implemented as one or more submodules of the application 133 of the embedded controller 102. Execution of the method 500 is controlled under execution of the processor 105.
[00101] The method 500 starts at selecting step 510, at which, under execution of the processor 105, a test value is selected by execution of the application 133. In selecting the test value, at least one focus step size and at least one depth, being a target depth, are selected. The focus step size may be selected from a range of possible step sizes. The range of focus step sizes may be predetermined, for example the focus step sizes may be an integer multiple of a step size of the lens focus motor 182, up to a predetermined limit on the total focus step size. The focus step size selected in execution of step 510 is used to generate a set of test camera parameters p2' based on applying the focus step to camera parameters p1, as described hereafter. The target depth may be set using the main subject distance, as may be determined by execution of a camera AF system on the processor 105. The focus step size and the target depth may be selected relative to a given focus depth, such as a best focus depth.
[00102] In other implementations, the target depth may be selected from a range of target depths according to a likely range for the scene. For example, if the camera device 101 is in portrait shooting mode then the range of depths may be smaller than if the camera device 101 is in outdoor shooting mode. If the selection of focus step size is based on maximising average accuracy over a predetermined working range, then the range of depths may be the predetermined working range. The range of depths may be set using a range of depths on both the near and far side of the main subject. In alternative implementations, the range of depths may be set in other ways using the camera parameters. The selected target depth is converted to image space defocus.
[00103] Under execution of the processor 105, the method 500 progresses from step 510 to determining step 520. At step 520, under execution of the processor 105, a plurality of depth sensitivity measurements are determined for each of a plurality of spatial frequencies for each set of test camera parameters p1 and p2'. The plurality of depth sensitivity measurements are determined for the focus step size and the target depth of the test value. The depth sensitivity measurements are determined from the OTF associated with the camera device 101 and the derivatives of the OTF with respect to image defocus zd. If the OTF is complex, then the second order derivatives from within Equation (8) are determined. If the OTF is purely real, then the first order derivative from within Equation (9) is determined. In step 520, the processor 105 executes to determine a plurality of depth sensitivity measurements of the camera device 101 for each of the spatial frequencies of the OTF.
[00104] The OTF is determined using the Stokseth approximation and the OTF derivatives are determined numerically using finite difference. The PSF associated with the OTF at a given depth, in this instance the best focus depth, has a non-zero radius. In some implementations, the OTF is determined using a Gaussian PSF approximation, where a Gaussian beam waist is used which has a non-zero blur radius at best focus. In further alternative implementations, the OTF and the OTF derivatives are retrieved from a look up table in the camera memory 109 using the camera parameters as an index, where the look up table was populated during camera manufacture by calibrating the OTF and the OTF derivative using captured images of a test chart. In further alternative implementations, a radially symmetric OTF approximation is used, to reduce memory requirements and processing time in execution of the step 520.
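As an illustration of the Gaussian PSF approximation with a non-zero blur radius at best focus, and of numerical OTF derivatives by finite difference, a hedged Python sketch follows; the beam waist sigma0 and the defocus-to-blur factor k are assumed values, not calibrated camera data, and the Stokseth form is not reproduced here.

```python
import numpy as np

def gaussian_otf(u, v, zd, sigma0=0.6, k=1.2):
    # Gaussian-PSF OTF model: the blur radius grows with image defocus zd but
    # stays non-zero at best focus (a Gaussian beam waist). sigma0 (pixels) and
    # the defocus-to-blur factor k are illustrative, assumed values.
    sigma = np.sqrt(sigma0 ** 2 + (k * zd) ** 2)           # non-zero radius at zd = 0
    q2 = u ** 2 + v ** 2                                    # squared spatial frequency
    return np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * q2)

def otf_derivative(u, v, zd, dz=1e-3):
    # Numerical derivative of the OTF with respect to defocus by central finite difference.
    return (gaussian_otf(u, v, zd + dz) - gaussian_otf(u, v, zd - dz)) / (2.0 * dz)
```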
[00105] The OTF may be spatially varying across the image due to aberrations of the lens 180. If a complex OTF is used in execution of step 520 then the typical symmetric and antisymmetric aberrations of the lens 180 can be represented using a spatially varying function or look up table. The depth accuracy determined for different image positions can take account of the effect of spatially varying aberrations.
[00106] The range of spatial frequencies for which the depth sensitivity measurements are determined is set by the image sensor 189 pixel size and the DFD algorithm tile size. The image sensor 189 pixel size is used, by execution of the processor 105, to determine the Nyquist frequency, which is used as the maximum spatial frequency in the range of spatial frequencies. The algorithm tile size sets the number of spatial frequencies in the range. For example, if the tile size is 16x16, then the spatial frequencies are a 16x8 grid of evenly spaced samples with -uN ≤ u < uN and 0 ≤ v < vN, where uN and vN are the horizontal and vertical Nyquist frequencies. Because the observed spectral samples are based on the Fourier transform of a purely real intensity from the sensor, the vertical range is half the horizontal range to avoid including Hermitian copies. The DC value at (u, v) = (0, 0) is dropped because it is not sensitive to depth, giving a total of 127 spectral samples in the range of spatial frequencies.
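A small Python sketch of this sampling grid, assuming a square pixel pitch, is shown below; it reproduces the 16x8 grid with the DC sample dropped, giving 127 samples. The function name and the pixel_pitch parameter are illustrative assumptions.

```python
import numpy as np

def spectral_sample_grid(tile=16, pixel_pitch=1.0):
    # Frequencies in cycles per pixel-pitch unit; the horizontal axis covers both
    # signs up to the Nyquist frequency, the vertical axis only the non-negative
    # half to avoid Hermitian copies of the purely real sensor data.
    u = np.fft.fftshift(np.fft.fftfreq(tile, d=pixel_pitch))   # 16 samples, -uN .. just below uN
    v = np.fft.rfftfreq(tile, d=pixel_pitch)[: tile // 2]      # 8 samples, 0 .. just below vN
    uu, vv = np.meshgrid(u, v)
    keep = ~((uu == 0) & (vv == 0))                            # drop the DC sample
    return uu[keep], vv[keep]

uu, vv = spectral_sample_grid()
print(uu.size)   # 127 spectral samples for a 16x16 tile
```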
[00107] The method 500 progresses under execution of the processor from step 520 to applying step 530. At step 530, under execution of the processor 105, the application 133 weights the depth sensitivity measurements. In weighting the depth sensitivity measurements, the application 133 firstly determines scene spectral weightings for the range of spatial frequencies by selecting a deterministic spectral scene function associated with the scene. The scene spectrum function is selected to use a typical spectrum for a natural scene. Pseudorandom noise with a uniform distribution is used to create a test pattern the same size as the DFD algorithm tile size. The test pattern is Fourier transformed, a spectral weighting of 1/q is applied to simulate a natural scene spectrum, and the pattern is inverse Fourier transformed. The mean of the test pattern is set to the mean of the camera exposure, and the standard deviation is set to 20% of the exposure mean to simulate a high contrast scene texture. The mean of the camera exposure is retrieved from the camera exposure system. The test pattern is Fourier transformed to create a scene spectral weighting for each spatial frequency in the range of spatial frequencies. The application 133 executes on the processor 105 to multiply the determined depth sensitivities by the scene spectral weightings to create weighted depth sensitivities. In multiplying the depth sensitivities by the scene spectral weightings, the application 133 operates to weight the depth sensitivity measurements by the selected deterministic spectral scene function. The process of step 530 is repeated under execution of the processor 105 with multiple scene spectral weightings generated with different seeds for the pseudorandom noise, for example the process is repeated 100 times, generating an average of the weighted depth sensitivities.
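The following Python sketch illustrates one way the synthetic scene spectral weighting described above could be generated; the 1/q envelope, exposure mean and 20% contrast follow the text, while the function name and FFT conventions are assumptions.

```python
import numpy as np

def synthetic_scene_weighting(tile=16, exposure_mean=5000.0, contrast=0.2, seed=0):
    rng = np.random.default_rng(seed)
    pattern = rng.uniform(size=(tile, tile))               # uniform pseudorandom noise
    f = np.fft.fftfreq(tile)
    q = np.hypot(*np.meshgrid(f, f, indexing="ij"))        # radial spatial frequency
    q[0, 0] = 1.0                                          # avoid division by zero at DC
    spectrum = np.fft.fft2(pattern) / q                    # apply 1/q natural-scene envelope
    pattern = np.real(np.fft.ifft2(spectrum))
    pattern = (pattern - pattern.mean()) / pattern.std()   # normalise
    pattern = exposure_mean * (1.0 + contrast * pattern)   # mean = exposure, std = 20% of mean
    return np.fft.fft2(pattern)                            # scene spectral weighting per frequency
```

In practice the weightings from several seeds would be applied in turn and the weighted depth sensitivities averaged, as described above.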
[00108] Under execution of the processor 105, the method 500 progresses from step 530 to determining step 540. In step 540, under execution of the processor 105, depth accuracy is determined for the test value from the weighted depth sensitivities, or the average of the weighted depth sensitivities, using the CRLB: Equation (8) if the OTF is complex, or Equation (9) if the OTF is purely real. The sensor noise σn is set using a shot noise model based on the exposure mean. For example, if the exposure mean is five thousand (5000) electrons, then the sensor noise is σn = sqrt(5000) ≈ 70 electrons. The electron counts are converted from sensor digital number counts using a predetermined conversion rate for the camera sensor. The depth accuracy is stored in a results table associated with the test value on the memory 109 in execution of the step 540.
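Equations (8) and (9) are not reproduced above, so the hedged Python sketch below shows only the shot-noise model from the text together with a generic CRLB-style bound in which the Fisher information is a noise-normalised, scene-weighted sum of squared OTF derivatives. It is a plausible form for the purely real OTF case, not the patent's exact expression.

```python
import numpy as np

def crlb_depth_std(scene_weights, otf_dz, exposure_mean=5000.0):
    # Shot-noise model from the text: noise std = sqrt(exposure mean) electrons,
    # e.g. sqrt(5000) ~ 70 electrons.
    sigma_n = np.sqrt(exposure_mean)
    # Generic CRLB-style bound (an assumption, not the patent's Equation (9)):
    # Fisher information as a noise-normalised sum of scene-weighted squared OTF derivatives.
    fisher = np.sum(np.abs(scene_weights) ** 2 * otf_dz ** 2) / sigma_n ** 2
    return 1.0 / np.sqrt(fisher)    # lower bound on the standard deviation of zd
```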
[00109] Under execution of the processor 105, the method 500 progresses from step 540 to decision step 550. In step 550 the application 133 is executed on the processor 105 to make a decision whether to determine the depth accuracy for more test values. The decision to stop testing values is made if the depth accuracy has been determined for all of the set of test camera parameters and range of depth values. In other implementations, the decision to stop testing values is made in execution of the step 550 if the determined depth accuracy has reached a predetermined threshold for acceptable performance. If more test values are to be determined, the method 500 returns, under execution of the processor 105, to step 510. Otherwise, the method 500 progresses under execution of the processor 105 to selecting step 560.
[00110] In step 560, under execution of the processor 105, a focus step size is selected based upon the determined depth accuracy. Selection of the focus step size is based upon the depth accuracy at a target depth. For example, the test value having the focus step size with the greatest depth accuracy at the target depth may be selected. The minimum value of the depth accuracy bound in the stored results table is selected, corresponding to the greatest depth accuracy, and the corresponding test value is selected as the set of camera parameters p2 to be used for capturing the second image in step 460. In some implementations, the test value is selected to select the focus step size based upon the working range, for example the focus step size which provides the greatest working range of the test values. Alternatively, the test value is selected to select the focus step size based upon combined depth accuracy and working range of the test values.
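A trivial sketch of picking the test value with the smallest bound (greatest depth accuracy) from a results table might look as follows; the table layout and the numerical values are purely hypothetical.

```python
# Hypothetical results table: (focus step in motor counts, target depth in metres) -> CRLB bound.
results = {(2, 1.5): 0.031, (4, 1.5): 0.022, (8, 1.5): 0.027}
best_step = min(results, key=results.get)[0]    # smallest bound = greatest depth accuracy
```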
[00111] In an alternative implementation, the scene spectral weighting may be determined by the camera device 101 according to the scene in front of the camera device 101. Fig. 6 shows operation of a method 600 for DFD using such an implementation. The method of Fig. 6 may be implemented as one or more submodules of the application 133 of the embedded controller 102. Execution of the method 600 is controlled under execution of the processor 105.
[00112] The method 600 generally operates in a similar manner to the method 400 of Fig. 4.
The method 600 starts at setting step 610, where, similarly to step 410, the user sets up the camera device 101 and aims the camera device 101 at the scene to be captured. Under execution of the processor 105, the method 600 progresses to setting step 620, in which, similarly to step 420, the focus, zoom and aperture of the camera device 101 are set. The focus, zoom and aperture of the camera device 101 may be set automatically by the camera device 101 based on the scene. For example, the focus may be set by execution of the AF system 183 in the camera device 101 by the processor 105 or by manipulation of inputs of the camera device by the user. The camera device 101 parameters of focus, aperture and zoom are recorded by the camera device 101 as camera parameters p1 on the memory 109.
[00113] Under execution of the processor 105, the method 600 progresses from step 620 to capturing step 640 to capture a first image of the scene. In the method 600, the camera device 101 uses the first captured image to determine the scene spectral weighting (in contrast to Fig. 4). In the method 600, the step 640 of capturing the first image (similar to step 440) is executed before step 630 (similar to step 430) in which the focus step is selected.
[00114] Under execution of the processor 105, the method 600 progresses from step 640 to selecting step 630. Step 630 operates in a similar manner to step 430 of Fig. 4, as described by the method 500 of Fig. 5. However, in execution of the step 630, in contrast to the step 530 (of the method 500), the first captured image is used to determine the scene spectral weighting by taking a tile from the region of the main subject in the first captured image, and Fourier transforming the tile to create a captured image spectrum. The image spectrum may be noisy due to sensor noise, so the captured image spectrum is smoothed using a convolution with a Gaussian function. In some implementations the captured image spectrum is smoothed by averaging over the polar axis. Since the main subject is at best focus, the OTF is known from theory or calibration, and the captured image spectrum is divided by the best focus OTF to create a scene spectral weighting. The scene spectral weighting is applied by multiplying the depth sensitivities to create weighted depth sensitivities.
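A hedged Python sketch of deriving the scene spectral weighting from the first captured image is given below; the tile location (cx, cy), the smoothing width and the layout of the best-focus OTF array are assumptions, and the function name is illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scene_weighting_from_capture(image, cx, cy, best_focus_otf, tile=16, smooth=1.0):
    # Tile around the main subject, assumed to lie at pixel coordinates (cx, cy).
    patch = image[cy - tile // 2: cy + tile // 2, cx - tile // 2: cx + tile // 2]
    spectrum = np.abs(np.fft.fft2(patch))                   # captured image spectrum
    spectrum = gaussian_filter(spectrum, sigma=smooth)      # smooth to suppress sensor noise
    eps = 1e-6                                              # guard against division by a near-zero OTF
    return spectrum / np.maximum(np.abs(best_focus_otf), eps)
```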
[00115] In alternative implementations, in execution of the step 630 on the processor 105, multiple tiles are selected from the region of the main subject in the first captured image, and the tiles are used to create multiple scene spectral weightings and multiple weighted depth sensitivities.
[00116] In alternative implementations, in execution of the step 630 on the processor 105, the captured image spectrum is smoothed and the highest values (apart from the DC value) in the smoothed captured image spectrum are used to fit a natural scene spectrum function. The fitted natural scene spectrum function is selected as the scene spectral weighting. In the implementations described above, in step 630 the processor 105 executes to select the deterministic scene functions based upon the scene of a prior image, being the first captured image.
[00116] In alternative implementations, in execution of the step 630 on the processor 105, the scene spectral weighting is retrieved from a look up table stored in the camera memory. The index to the look up table is the current camera shooting mode. In such implementations step 630 operates to select the deterministic scene function according to a mode of the camera device 101, being the current shooting mode. In further alternative implementations, the index to the look up table is the scene type detected using scene analysis. For example, the look up table may have different stored scene spectral weightings for portrait mode, macro mode and landscape mode. The retrieved scene spectral weighting is then applied by multiplying the depth sensitivities to create weighted depth sensitivities. In such implementations, the application 133 executes to select the deterministic scene function based upon the scene of a captured image.

[00118] In execution of the method 600 on the processor 105, the first captured image can also be used to select a range of target depths during camera parameter selection. For example, a single image DFD algorithm can be used to determine the range of depths from the first captured image. Although the depth map determined from a single image will generally have lower accuracy than the depth map determined from multiple images, the range of depths determined using single image DFD can be used to determine a target depth or working range goal for the camera parameter selection.
[00119] In single image DFD the captured image is blurred using convolution with a Gaussian kernel with a standard deviation equal to a predetermined blur radius σ0, forming a reblurred image. An example value for σ0 is one (1) pixel. The gradient magnitude of the captured image is divided by the gradient magnitude of the reblurred image, forming a gradient magnitude ratio image. Edge locations in the captured image are detected, for example using Canny edge detection. For each edge location, the gradient magnitude ratio image is used to estimate the blur radius in the captured image using Equation (10):
σ1(x, y) = σ0 / √(R(x, y)² − 1)     (10)

where σ1(x, y) is an image containing the pixelwise estimated blur radius, σ0 is the predetermined reblur radius and R(x, y) is the gradient magnitude ratio image. The result is a sparse depth map, where the depth is expressed in the form of blur radius amounts.
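A Python sketch of this single image DFD procedure might look as follows, assuming Equation (10) takes the gradient-ratio form shown above; the Canny edge detector defaults and the guard against ratios at or below one are implementation assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_gradient_magnitude
from skimage.feature import canny

def single_image_blur_map(image, sigma0=1.0):
    reblurred = gaussian_filter(image, sigma=sigma0)             # reblur with predetermined radius
    g1 = gaussian_gradient_magnitude(image, sigma=1.0)
    g2 = gaussian_gradient_magnitude(reblurred, sigma=1.0)
    ratio = g1 / np.maximum(g2, 1e-6)                            # gradient magnitude ratio R(x, y)
    edges = canny(image / image.max())                           # edge locations
    blur = np.zeros_like(image, dtype=float)
    valid = edges & (ratio > 1.0)                                # Equation (10) requires R > 1
    blur[valid] = sigma0 / np.sqrt(ratio[valid] ** 2 - 1.0)      # estimated blur radius
    return blur                                                  # sparse map of blur radii
```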
[00120] The method 600 progresses under execution of the processor 105 from step 630 to changing step 650. At step 650, similarly to step 450, camera parameters p2 are set, including a focus change by the focus step size selected in step 630. In changing the focus by the selected step size, step 650 operates to change the focus parameter by the selected parameter step size. The method 600 progresses under execution of the processor 105 from step 650 to capturing step 660. In step 660, under execution of the processor 105, a second image of the scene is captured by the camera device 101 using camera parameters p2. The method 600 progresses under execution of the processor 105 from step 660 to determining step 670, in which (similarly to step 470) a depth map is determined using DFD from the first captured image and the second captured image and using information from the camera parameters p1 and p2.

[00121] In a third implementation, camera parameter selection is performed during design or manufacture of the camera device 101. Fig. 7 shows operation of a method 700 for DFD using such an implementation. The method of Fig. 7 may be implemented as one or more submodules of the application 133 of the embedded controller 102. Execution of the method 700 is controlled under execution of the processor 105.
[00122] The method 700 generally operates in a similar manner to the method 400 of Fig. 4.
The method 700 starts at setting step 710, where (similarly to step 410) the user sets up the camera device 101 and aims the camera device 101 at the scene to be captured. Under execution of the processor 105, the method 700 progresses to setting step 720, in which (similarly to step 420) the focus, zoom and aperture of the camera device 101 are set. The focus, zoom and aperture of the camera device 101 may be set automatically by the camera device 101 based on the scene. For example, execution of the AF unit in the camera device 101 by the processor 105 may detect the distance of the main subject, and control the lens focus motor 182 to set the lens 180 to focus the main subject. Alternatively, the focus, aperture and zoom of the camera device may be set by manipulation of inputs of the camera device by the user, for example by using manual focus, exposure or zoom control, and detected by execution of the processor 105. The camera parameters of focus, aperture and zoom are recorded by the camera device 101 as camera parameters p1 on the memory 109.
[00123] Under execution of the processor 105, the method 700 progresses from step 720 to retrieving step 730. The method 700 differs from the method 400 in step 730. In execution of step 730, the focus step size is retrieved from a look up table stored in a camera or lens memory, such as the memory 109 or the lens memory 184. In such implementations, the steps of the method 500 of Fig. 5 are carried out during design or manufacture of the camera device 101 for a set of test values based on multiple sets of camera, scene and algorithm parameters which represent the likely usage conditions for the camera. A focus step size determined likely to give the greatest depth accuracy is stored in the look up table in the appropriate camera memory for each set of camera, scene and algorithm parameters, for later use according to the method 700 when the camera device 101 is used to capture images and determine depth maps. Alternatively, the camera device 101 may have interchangeable lenses, in which case the focus step size selection is carried out for each lens, and a look up table of focus step sizes is stored in the lens memory 184.
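A look up table retrieval of this kind might be sketched as follows; the key layout (f-number, focal length in mm, subject distance in m) and the stored values are purely hypothetical.

```python
# Hypothetical lens-memory look up table of precomputed focus step sizes,
# indexed by nominal camera and scene parameters.
focus_step_lut = {
    (2.8, 50, 2.0): 6,
    (5.6, 50, 2.0): 12,
}
step = focus_step_lut.get((2.8, 50, 2.0), 8)   # fall back to a default step size
```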
[00124] In another implementation, nominal camera and scene parameters including the aperture, focal length and main subject distance are selected during calibration of the camera device 101, and camera parameters are selected for DFD using the steps of the method 500 of Fig. 5 for the nominal parameters to create a nominal set of camera parameters. Empirical rules for scaling the depth accuracy according to the aperture, focal length and main subject distance are established during camera design. During image capture, the nominal parameters are scaled under execution of the processor 105 to the actual capture parameters using the empirical rules for scaling depth accuracy.
[00125] The method 700 progresses under execution of the processor 105 to capturing step 740, in which a first image of the scene is captured by the camera using camera parameters p1. The method 700 progresses under execution of the processor 105 from step 740 to changing step 750. At step 750, similarly to step 450, camera parameters p2 are set, including a focus change by the focus step size retrieved in step 730. In changing the focus by the retrieved step size, step 750 operates to change the focus parameter by the retrieved parameter step size. The method 700 progresses under execution of the processor 105 from step 750 to capturing step 760. In step 760, under execution of the processor 105, a second image of the scene is captured by the camera device 101 using camera parameters p2. The method 700 progresses under execution of the processor 105 from step 760 to determining step 770, in which (similarly to step 470) a depth map is determined using DFD from the first captured image and the second captured image and using information from the camera parameters p1 and p2.
[00126] The methods disclosed are normally executed just before a desired image is captured through assertive user action. The user points the camera device 101 at a scene to be photographed. The processor 105 executes a method, such as the method 500, to select a parameter step size, and typically captures two reference images (or a focus bracket), such as the first image and the second image of steps 440 and 460 of the method 400. The processor 105 executes to determine a depth map using the two captured reference images (step 470). Upon receiving an assertive instruction from the user to capture an image, for example by user manipulation of the input device 113 (typically depressing an image capture button), the processor 105 executes to capture the desired image of the scene and processes the captured image to apply blur to the captured image according to the determined depth map.
[00127] In an example use case of the disclosed methods of camera parameter step size selection for DFD, the depth map estimated using DFD is used, under execution of the processor 105, to enhance a portrait image. The user sets portrait shooting mode on the camera device 101 and points the camera device 101 at a person to be photographed. The camera device 101 identifies the main subject using face detection and performs AF. The application 133 is executed on the processor 105 such that a camera parameter step size is selected for the greatest depth accuracy for the main subject, and a focus bracket is captured. The focus bracket is used to determine a depth map under execution of the processor 105 using DFD. All of this occurs prior to the user of the camera device 101 depressing a capture button to capture a desired image of the person. Because the parameter step size of the focus bracket was selected for the main subject, the depth map can be used by the camera device 101 to make a segmentation of the subject from the background. The camera device 101, under execution of the processor 105, applies additional blur to the background in order to emphasise the appearance of the main subject in the image.
[00128] In another example use case, a depth map determined using DFD is used to enhance a photograph of a group of people. The people in the group are at a range of different distances in the scene. The camera device 101 detects several people in the photograph by execution of a face detection process on the processor 105, and executes AF on the processor 105 to focus on the person closest to the centre of the image. Execution of AF on the camera device 101 is also used to generate a low resolution depth map using a subset of pixels on the image sensor 189 which are masked and can be used for phase detection AF. Because there are multiple people in the photograph, the application 133 executes to select a camera parameter step size for the greatest depth accuracy over a working range which includes the distances of the faces in the photograph image, and a focus bracket is captured. Because the parameter step size of the focus bracket was selected for working range, the depth map can be used by the camera device 101 to make a segmentation of multiple subjects from the background. All of this occurs prior to the user of the camera device 101 depressing a capture button to capture a desired image of the group of people. The camera device 101, under execution of the processor 105, applies additional blur to the background in order to emphasise the appearance of the subjects in the image.
[00129] The arrangements described are applicable to the image processing industries and particularly for the photograph and video processing industries.
[00130] The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.

Claims (22)

1. A method of selecting a focus step size for capturing at least two images of a scene using an image capture device, said method comprising the steps of: determining, for at least one focus step size and at least one depth, a plurality of depth sensitivity measurements for the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a nonzero radius; selecting a deterministic spectral scene function associated with the scene; determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and selecting, based on the determined depth accuracy, the focus step size from at least one focus step size for capturing the at least two images of the scene.
2. A method of selecting a parameter step size for capturing at least two images of a scene using an image capture device, said method comprising the steps of: determining, for at least one parameter step size and at least one depth, a plurality of depth sensitivity measurements for the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a nonzero radius; selecting a deterministic spectral scene function associated with the scene; determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and selecting, based on the determined depth accuracy, the parameter step size from at least one parameter step size for capturing the at least two images of the scene.
3. The method according to claim 2, wherein the selected parameter step size is a zoom step size.
4. The method according to claim 2, wherein the selected parameter step size is an aperture step size.
5. The method of claim 2, wherein the selected parameter step size is a step size of at least one of focus step size, aperture step size and zoom step size.
6. The method according claim 2, wherein the parameter step size is selected based upon the depth accuracy at a target depth.
7. The method according to claim 6, wherein the target depth corresponds to a main subject of the scene determined from operation of an auto-focus system of the image capture device.
8. The method according to claim 6, wherein the target depth corresponds to a depth determined by the image device from a captured image using single image depth from defocus.
9. The method according claim 2, wherein the parameter step size is selected based upon a working range of the determined depth accuracy.
10. The method according to claim 2, wherein the deterministic scene function uses a natural scene spectrum function.
11. The method according to claim 2, wherein the deterministic scene function is selected according to a mode of the image capture device.
12. The method according to claim 2, wherein the deterministic scene function is selected based upon the scene of a prior captured image.
13. The method according to claim 2, wherein the optical transfer function with a non-zero blur radius at the given depth is determined using a Stokseth approximation.
14. The method according to claim 2, wherein the optical transfer function with a non-zero blur radius at the given depth is determined using a Gaussian approximation.
15. The method according to claim 2, wherein the optical transfer function with a non-zero blur radius at the given depth is determined using Fourier optics.
16. The method according to claim 2, wherein the optical transfer function with a non-zero blur radius at the given depth is determined using a Gaussian beam waist having a non-zero blur radius.
17. The method according to claim 2, wherein the depth sensitivity measurements are determined using a complex optical transfer function.
18. The method according to claim 2, wherein the depth sensitivity measurements are determined using a spatially varying optical transfer function.
19. A method of capturing an image using an image capture device, comprising: directing the image capture device at a scene, the image capture device selecting a parameter step size by execution of the method according to claim 2, the image capture device capturing two reference images according to the selected focus step size and determining a depth map from the two captured reference images, upon receiving an instruction from the user to capture an image, the image capture device capturing an image of a scene and processing the captured image to apply blur to the captured image according to the determined depth map.
20. A computer readable medium having a computer program stored thereon for selecting a parameter step size for capturing at least two images of a scene using an image capture device, said program comprising: code for determining, for at least one parameter step size and at least one depth, a plurality of depth sensitivity measurements of the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a non-zero radius; code for selecting a deterministic spectral scene function associated with the scene; code for determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and code for selecting, based on the determined depth accuracy, the parameter step size from at least one parameter step size for capturing the at least two images of the scene.
21. An apparatus for selecting a parameter step size for capturing at least two images of a scene using an image capture device, said apparatus comprising: means for determining, for at least one parameter step size and at least one depth, a plurality of depth sensitivity measurements of the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a non-zero radius; means for selecting a deterministic spectral scene function associated with the scene; means for determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and means for selecting, based on the determined depth accuracy, the parameter step size from at least one parameter step size for capturing the at least two images of the scene.
22. A system for selecting a parameter step size for capturing at least two images of a scene using an image capture device, said system comprising: a memory for storing data and a computer program; a processor coupled to the memory for executing said computer program, said computer program comprising instructions for: determining, for at least one parameter step size and at least one depth, a plurality of depth sensitivity measurements of the image capture device for each of a plurality of spatial frequencies from an optical transfer function associated with the image device, wherein a point spread function, associated with the optical transfer function at a given focus depth, has a non-zero radius; selecting a deterministic spectral scene function associated with the scene; determining depth accuracy from the plurality of determined depth sensitivity measurements, said depth accuracy being determined by weighting the depth sensitivity measurements with the selected deterministic spectral scene function; and selecting, based on the determined depth accuracy, the parameter step size from at least one parameter step size for capturing the at least two images of the scene.
AU2015202282A 2015-05-01 2015-05-01 Camera parameter optimisation for depth from defocus Abandoned AU2015202282A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2015202282A AU2015202282A1 (en) 2015-05-01 2015-05-01 Camera parameter optimisation for depth from defocus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2015202282A AU2015202282A1 (en) 2015-05-01 2015-05-01 Camera parameter optimisation for depth from defocus

Publications (1)

Publication Number Publication Date
AU2015202282A1 true AU2015202282A1 (en) 2016-11-17

Family

ID=57351157

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2015202282A Abandoned AU2015202282A1 (en) 2015-05-01 2015-05-01 Camera parameter optimisation for depth from defocus

Country Status (1)

Country Link
AU (1) AU2015202282A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111462076A (en) * 2020-03-31 2020-07-28 湖南国科智瞳科技有限公司 Method and system for detecting fuzzy area of full-slice digital pathological image


Similar Documents

Publication Publication Date Title
US10026183B2 (en) Method, system and apparatus for determining distance to an object in a scene
US10019810B2 (en) Method, system and apparatus for determining a depth value of a pixel
KR101686926B1 (en) Image blurring method and apparatus, and electronic device
JP5362087B2 (en) Method for determining distance information, method for determining distance map, computer apparatus, imaging system, and computer program
US10547786B2 (en) Image processing for turbulence compensation
US8989517B2 (en) Bokeh amplification
JP5470959B2 (en) Multi-frame reconstruction method, system and software
Delbracio et al. Removing camera shake via weighted fourier burst accumulation
JP5988068B2 (en) System and method for performing depth estimation by utilizing an adaptive kernel
US9639948B2 (en) Motion blur compensation for depth from defocus
AU2014280872A1 (en) Multiscale depth estimation using depth from defocus
JP2017520050A (en) Local adaptive histogram flattening
US20180174326A1 (en) Method, System and Apparatus for Determining Alignment Data
CN106249508A (en) Atomatic focusing method and system, filming apparatus
WO2017151222A9 (en) Irregular-region based automatic image correction
CN109661815A (en) There are the robust disparity estimations in the case where the significant Strength Changes of camera array
EP3234908A1 (en) Method, apparatus and computer program product for blur estimation
WO2019104670A1 (en) Method and apparatus for determining depth value
Choi et al. A method for fast multi-exposure image fusion
AU2015202282A1 (en) Camera parameter optimisation for depth from defocus
Park et al. High dynamic range image acquisition using multiple images with different apertures
Tezaur et al. A system for estimating optics blur psfs from test chart images
Pan et al. Fractional directional derivative and identification of blur parameters of motion-blurred image
WO2016011876A1 (en) Method and device for photographing motion track of object
US10319079B2 (en) Noise estimation using bracketed image capture

Legal Events

Date Code Title Description
MK1 Application lapsed section 142(2)(a) - no request for examination in relevant period