GB2603115A - Imaging device - Google Patents

Imaging device

Info

Publication number
GB2603115A
Authority
GB
United Kingdom
Prior art keywords
image
images
single exposure
composite
image capture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2100567.3A
Other versions
GB202100567D0 (en)
Inventor
Helweg-Larsen Timothy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Expodo Ltd
Original Assignee
Expodo Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Expodo Ltd filed Critical Expodo Ltd
Priority to GB2100567.3A
Publication of GB202100567D0
Priority to PCT/IB2022/050359
Publication of GB2603115A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70Circuitry for compensating brightness variation in the scene
    • GPHYSICS
    • G03PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03BAPPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B15/00Special procedures for taking photographs; Apparatus therefor
    • GPHYSICS
    • G03PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03BAPPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B17/00Details of cameras or camera bodies; Accessories therefor
    • GPHYSICS
    • G03PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03BAPPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B7/00Control of exposure by setting shutters, diaphragms or filters, separately or conjointly
    • G03B7/08Control effected solely on the basis of the response, to the intensity of the light received by the camera, of a built-in light-sensitive device
    • G03B7/091Digital circuits
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/414Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/414Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/41407Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H04N21/42206User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
    • H04N21/42222Additional components integrated in the remote control device, e.g. timer, speaker, sensors for detecting position, direction or movement of the remote control, microphone or battery charging device
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223Cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/426Internal components of the client ; Characteristics thereof
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/8146Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
    • H04N21/8153Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics comprising still images, e.g. texture, background image
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/63Control of cameras or camera modules by using electronic viewfinders
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70Circuitry for compensating brightness variation in the scene
    • H04N23/73Circuitry for compensating brightness variation in the scene by influencing the exposure time
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70Circuitry for compensating brightness variation in the scene
    • H04N23/75Circuitry for compensating brightness variation in the scene by influencing optical camera components
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80Camera processing pipelines; Components thereof
    • GPHYSICS
    • G03PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03BAPPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B30/00Camera modules comprising integrated lens units and imaging units, specially adapted for being embedded in other devices, e.g. mobile phones or vehicles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H04N21/42206User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
    • H04N21/42224Touch pad or touch panel provided on the remote control

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • General Engineering & Computer Science (AREA)
  • Studio Devices (AREA)
  • Exposure Control For Cameras (AREA)

Abstract

A method for operating a computational imaging device to display, in near-real-time, a scene video comprising a series of composite images each formed from a series of single exposure images, the device comprising: an image capture device, capturing multiple single exposure images; combiner, combining a series of single exposure images forming the composite image; display, for a series of composite images as a video frame series; and presentation timer, timing composite image presentation. The method comprises: a) capturing series of single exposure images; b) combining N number of substantially most recent single exposure images, making the composite image, wherein single exposure images were captured within a total exposure time duration, DT, (composite images depict the scene over the total exposure time duration, DT), with DT substantially constant between consecutive combining operations; c) calculating plural video frame presentation start times, TV, using presentation timer, wherein TV < TR + DT, TR being the moment, real-time, when light recorded by one of the single exposure images first reaches the imaging device, to present composite images in near-real-time, and individual video frames depict the scene over a duration, DT, greater than a presentation duration, DV, of the individual frame; d) presenting a series of composite images as frames, at each of presentation start times, TV.

Description

IMAGING DEVICE
Technical Field
The present invention relates to methods of operating computational imaging devices and computational imaging devices.
Background
A computational imaging device is a device for creating images of a scene that may be stored locally, transmitted to another location, or both. The images may be individual still images or sequences of images constituting videos or movies.
The term "computational imaging device", as used herein, therefore refers to any lo device that can be used to capture, using one or more image capture devices, a plurality of images of a scene (i.e. that is within a field of view of the image capture device) and combine these captured images to form a composite image or sequence of composite images forming a video or movie, and includes photographic still cameras that have electronic image sensors, digital cameras, is video or movie cameras, camcorders, telescopes, microscopes, endoscopes, virtual cameras (used within virtual environments), computing devices with built-in cameras (including smadphones, smart watches, tablets, laptops, desktops, head mounted displays).
Figure 12 illustrates schematically a conventional Image Capture Device (30) comprising an enclosure (11), a lens (12), an optical diaphragm (13) defining an aperture (14), a shutter (electronic, computational, or physical) (15) and an image sensor (16). The lens (12) focuses the light reflected or emitted from objects that make up a scene that is within a field of view of the image capture device onto the image sensor (16). The shutter (15) controls the length of time that light can enter the image capture device (30). The size of the aperture (14) defined by the optical diaphragm (13) regulates the amount of light that passes through the lens (12) and controls the depth of field, which is the distance between the nearest and farthest objects in a scene that appear to be in focus. Reducing the size of the aperture (14) increases the depth of field, and conversely increasing the size of the aperture (14) decreases the depth of field. Depending upon the type of image capture device, the image sensor (16) will be a light-sensitive device such as an electronic image sensor (e.g. a charge-coupled device (CCD) image sensor, an active-pixel sensor (APS)/CMOS image sensor, or a quantum dot sensor).
A conventional computational imaging device comprises one or more image capture devices (30), a combiner (40), and a display (60).
The one or more image capture devices (30) capture a plurality of images, which are combined by the combiner (40) and presented to the user using the display (60).
Desirable Attributes of a Computational Imaging Device
When a human is operating a computational imaging device it is desirable to have a video presentation of the Scene which is, at the same time: a) live; b) composed of frames that, from time to time, contain information gathered over an Extended Exposure Duration greater than the frame presentation duration (1/frame presentation rate); and c) an accurate depiction of the scene as it will be outputted, saved, or transmitted, or, in the case of a Preview, an accurate depiction of the qualities of the image (such as motion blur, image noise, and depth of field) as they will apply to a final outputted, saved, or transmitted image.
Conventional devices and methods do not combine all three of these attributes.
LIVE
Displaying a live video on either an image capture device or a computational imaging device is useful because it gives the user feedback in near-real-time, such that they can see and efficiently respond to changes in any of:
* scene composition;
* the effect of primary camera settings, including aperture, exposure time, and ISO;
* the effect of combining methods (for example mean or median) and of combining method settings (for example the number N of frames combined).
For a video to be perceived by the user as live (i.e. in Near-Real-Time), the presentation of its frames should have two linked attributes: a high frame rate and low latency.
* A high frame rate is generally greater than 10 fps, and preferably greater than 24 fps, so as to help create for the user an illusion of continuity.
* Low latency is a minimal delay between Real-Time and the time at which a frame starts to be presented, generally less than 1/10th of a second and preferably less than 1/24th of a second (sketched in code below).
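These numeric criteria can be expressed as a short check. This is an illustrative sketch only, using the thresholds quoted above; the function names are assumptions, not part of the patent.

```python
# Hedged sketch: tests a frame stream against the "live" criteria quoted above.
# Thresholds come from the text; the function names are illustrative only.

def is_perceived_live(frame_rate_fps: float, latency_s: float) -> bool:
    """Minimum criteria: frame rate above 10 fps, latency below 1/10th of a second."""
    return frame_rate_fps > 10.0 and latency_s < 1.0 / 10.0

def is_preferred_live(frame_rate_fps: float, latency_s: float) -> bool:
    """Preferred criteria: frame rate above 24 fps, latency below 1/24th of a second."""
    return frame_rate_fps > 24.0 and latency_s < 1.0 / 24.0

print(is_perceived_live(30.0, 0.05))   # True
print(is_preferred_live(15.0, 0.05))   # False: frame rate too low
```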
Extended Exposure Duration
In conventional image capture devices and their displays, exposure duration is often equal to frame presentation duration. While a short frame presentation duration is desirable to enable a high frame rate, low latency, and the perception of being 'live', a short exposure duration has the problem of limiting the amount of light captured. And the quantity of light captured is (under conditions of low scene luminance) positively correlated with image quality.
Smartphone and wearable image capture devices are further limited in their light gathering ability, and therefore in their image quality potential, by having relatively small image sensors (as compared to DSLR [Digital Single Lens Reflex] cameras). A smartphone image sensor typically has approximately 1/30th the light-sensitive surface area of a DSLR sensor. Conventional computational imaging devices often use relatively small image capture devices to enable a small, compact, or flat form factor. Smaller image capture devices are typically also cheaper than larger equivalents.
These relatively smaller image sensors capture relatively less light and therefore produce relatively poorer quality images than large sensors such as those found on DSLR cameras.
This light gathering deficiency can be countered by capturing light from multiple image sensors and/or by extending the amount of time over which the image sensor(s) are exposed to light. Extending the exposure time duration, however, leads to two problems: 1) the photosensitive elements of the image sensor can reach a maximum capacity and then lose their sensitivity to further light; and 2) movement of the scene and the image capture device relative to one another can be captured as motion blur in the image. In many use cases, motion blur represents a reduction in image quality.
Both problems, over-exposure and motion blur, can be reduced by taking not one long exposure but multiple short exposures and then combining them to form a composite image.
Thus it is desirable that a computational imaging device be capable of extended total exposure durations (the sum of a number of single exposures), preferably at least longer than the frame presentation duration (i.e. longer than 1/frame rate). Capturing light over such extended durations allows more light, and thus more scene information, to be gathered. This information can be used to enhance an image in a number of ways. Extra light information may be used, for example, to reduce noise, improve colour accuracy, improve edge definition, or to capture and display changes in a scene over time. This process of combining multiple images to form a composite image can be used either for a still image or in a repeated fashion for each individual frame of a video.
Accurate View or Preview
An accurate view or accurate preview of a scene being captured or imaged is one where the image displayed to the user is a good likeness of the image that is being outputted or could be outputted by the device (still image or video frame), including a good likeness of the visual effects of the camera's variable parameters, such as total exposure duration, aperture, ISO, and any combining methods (such as a mean stack or a median stack) that may be applied.
The benefit of an accurate preview is that it accurately informs the user of the effect of changes in their environment (e.g. lighting, composition, movement, facial expressions) and of changes they or the device have made to camera parameters. In positioning the camera and/or adjusting device settings to achieve the user's goal or meet their preferences, a live preview provides a feedback loop that is short, having only two steps: (1) adjust parameter, (2) review result in near-real-time (and repeat from step 1).
If a preview is inaccurate and if the user does want to accurately know the effect of a change to the environment or a change to a camera parameter, they need to add extra steps to achieve a feedback loop. Typically the steps of this feedback loop are: (1) adjust parameter, (2) shoot test image, (3) switch from camera mode to review mode, (4) review test image, (5) switch from review mode to camera mode, (and repeat from step 1).
Adding these extra steps breaks what would otherwise have been an uninterrupted flow of live feedback and therefore retards the user in reaching their goal.
A variant on this way of compensating for the live view not being accurate is Learned User Experience. In this variant, the user uses some version of an extended feedback loop to build for themselves a mental model of what the final image or video will look like based on the circumstances (lighting, composition, etc.) of the scene and the settings of the device they are using. By performing this feedback loop many times and learning from it, the user can subsequently operate the device without needing to use a long and slow feedback loop because they can, with the aid of their mental model, predict or anticipate, with some confidence, the output. To build such a mental model usually requires a considerable amount of experience of going around a feedback loop. Viewed another way, we can see that if the feedback loop were shorter or even live, a user would be able to complete more feedback loops (of practice and experience) in a shorter time and thus build a good mental model in a shorter period of time. Having a good mental model is another way of describing a user who is skilled in operating an image capture device or computational imaging device.
Summary of Problems faced by conventional devices
Conventional image capture devices and conventional computational imaging devices are not able to present a view of the scene that is at the same time live and accurate and that depicts an extended exposure duration.
Consequently, there are various problems that arise from the way in which conventional image capture devices and conventional computational imaging devices provide feedback to the user, and these in turn impact the still image or video that is captured. In particular, the fact that the video feedback is either not live or does not accurately depict the effects of Extended Duration Exposures makes the device unintuitive. This unintuitiveness both hampers a user in operating the device and retards their learning (of the device, of the effects of adjusting the device settings, and of the device's automatic setting adjustments in response to different scene conditions).
DEFINITIONS
The term Computational Imaging Device (10), as used herein, is defined above. The term Image Capture Device (30), as used herein, is defined above.
The term Single Exposure Image as used herein, is an image captured by an Image Capture Device having a single continuous exposure to the light emanating from said scene. The photosensitive sites of an electronic Image Sensor are read once.
The term Multiple Exposure, as used herein, is an Image captured by an Image Capture Device having been exposed to light emanating from a scene (or multiple scenes) a plurality of times, such that the scene is blocked from view of said Image Capture Device one or more times, such that there are a plurality of periods of exposure to light onto the photosensitive surface of the image capture device. As with a Single Exposure Image, the photosensitive sites of an electronic Image Sensor, are read only once.
The term Composite Image, as used herein, is an image formed by combining a series of Single Exposure Images and/or Multiple Exposure Images, and may include images from one or multiple Image Capture Devices (30).
The term Scene, as used herein, is a scene within the field of view of the Image Capture Device or Devices (30) or Computational Imaging Device (10).
The term Combining Method, as used herein, is a method of utilising data from a substantially sequential series of images of a scene, the images being captured by one or multiple image capture devices and including any of original captured images, pre-processed images, and previously combined images, and forming from these a composite image. Combining Methods include two types, referred to here as Simple Combining Methods and Complex Combining Methods.
SIMPLE COMBINING METHODS
Simple Combining Methods apply an algorithm across each image of a series, wherein the images are aligned such that there is a registration point. In a basic example the registration may be a corner of each image being aligned with the same corner of the other images; for images having the same width and height in pixels, it may then be said that any pixel position on one image may be found on all the other images. And so in this simple method, the pixels at the same position in all images in the series are computed together to arrive at a final pixel value for the composite image.
An algorithm may be as simple as finding the pixel with the brightest luminance value and selecting that, this being known as a 'maximum' or 'lighten' method; or taking the mean of the pixel values and using that result as the value of the pixel in the final composite image; or taking the median value and using that as the pixel value of the final composite image.
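A minimal sketch of these Simple Combining Methods is given below, assuming the N single exposure images have already been aligned and share the same width, height, and channel layout. The function name and dtype handling are illustrative assumptions, not the patent's own implementation.

```python
# Hedged sketch of per-pixel Simple Combining Methods (mean, median, maximum,
# minimum) over a set of aligned, equally sized single exposure images.
import numpy as np

def combine_simple(images: list[np.ndarray], method: str = "mean") -> np.ndarray:
    """Combine aligned single exposure images pixel-by-pixel."""
    stack = np.stack(images, axis=0).astype(np.float32)   # shape: (N, H, W, C)
    if method == "mean":
        combined = stack.mean(axis=0)
    elif method == "median":
        combined = np.median(stack, axis=0)
    elif method == "maximum":          # the 'maximum' / 'lighten' method
        combined = stack.max(axis=0)
    elif method == "minimum":
        combined = stack.min(axis=0)
    else:
        raise ValueError(f"unknown combining method: {method}")
    return np.clip(combined, 0, 255).astype(np.uint8)
```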
COMPLEX COMBINING METHODS
Complex Combining Methods may use information across more than one pixel, an area of the image, the whole image, more than one image, and other sources, such as other image capture devices, machine learning, databases, or other information sources (live, stored, or calculated) in order to arrive at the pixel values of a final composite image. Nonetheless, information contained in the images is an input to the combining process.
ALIGNING
Aligning of images prior to, or while, combining them may be developed further by aligning other than by registering the corners of the image frames: either by identifying common scene elements across images and aligning images to those scene elements, or by using homographic transformation such that each original Single Exposure Image is distorted so that it shares with the other Images a 'common (virtual) view point of the scene'. Homographic transformation may additionally be applied to moving elements in a scene.
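One possible way to align a single exposure image to a reference image by homographic transformation is sketched below using OpenCV feature matching. This is an illustrative implementation under the assumption of 8-bit BGR inputs, not the patent's specific alignment method.

```python
# Hedged sketch: estimate a homography between two images from ORB feature
# matches and warp one image onto the viewpoint of the other.
import cv2
import numpy as np

def align_to_reference(image: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Warp `image` onto the (virtual) viewpoint of `reference`."""
    gray_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray_ref = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)

    orb = cv2.ORB_create()
    kp_img, des_img = orb.detectAndCompute(gray_img, None)
    kp_ref, des_ref = orb.detectAndCompute(gray_ref, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_img, des_ref), key=lambda m: m.distance)

    src = np.float32([kp_img[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = reference.shape[:2]
    return cv2.warpPerspective(image, H, (w, h))
```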
NOVEL IMAGE PROCESSING & DISPLAY PIPELINE
The real-time image processing pipeline of this invention uses a set of N Single Exposure Images as Input Images to combine and produce a Composite Image. Each Composite Image is produced from a different set of Input Images. The Combining Set is a substantially sequential sub-set of a substantially sequential series of images captured by one or more image capture devices. The images constituting the Combining Set change from one combining operation to the next. Following a combining operation, the oldest image in the Combining Set is removed and a newly captured Single Exposure Image is added. The images of the Combining Set are then combined to form a Composite Image, which is then displayed as a video frame. On completion of the combining operation, the Combining Set is again updated as above, with an old image removed and a new image added. This process is repeated on an ongoing basis and, crucially, instances of the operation are performed in overlapping time periods, such that before one is finished, another begins (see Figure 13 for an illustrative example). This operation is performed in real time so that composite images are displayed as soon as they have been produced. The method of the present invention is novel in taking the extra step of calculating a plurality of Video Frame Presentation Start Times, TV, using said Presentation Timer (50), wherein TV < TR + DT, wherein TR is the moment in Real-Time when light recorded by one of said Single Exposure Images first reaches said Image Capture Device (30), and wherein Real-Time is the actual time at which a process or event occurs, such as to present said Composite Images in Near-Real-Time, and such that individual ones of said Video Frames depict said Scene over a duration, DT, greater than a Presentation Duration, DV, of said individual Video Frame.
In this way the scene being displayed in the video is updated at a video frame rate that is de-coupled from the total exposure duration of the video frames being displayed. The invention is thereby able to achieve simultaneously the three desirable attributes that: a) the Composite Images formed depict the scene over an extended duration; b) the video frames accurately depict these Composite Images; and c) the video frames are presented live (the meaning of Live as used herein is near-real-time, having a high frame rate and a low latency such as to appear continuous).
In this way the present invention reduces the steps required for user feedback and has access to the benefits described earlier.
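The sliding-window pipeline described above can be sketched as follows. This is a simplified, sequential sketch under stated assumptions: `capture_single_exposure`, `combine_simple`, and `present_frame` are hypothetical stand-ins for the Image Capture Device (30), Combiner (40), and Display (60), and the overlapping scheduling of combining operations described in the text is not modelled.

```python
# Hedged sketch of the sliding-window pipeline: keep the N most recent single
# exposure images (the Combining Set), combine them into a composite, and
# present the composite as a video frame in near-real-time.
import time
from collections import deque

def run_pipeline(capture_single_exposure, combine_simple, present_frame,
                 n: int, total_exposure_s: float) -> None:
    single_exposure_s = total_exposure_s / n        # DS ≈ DT / N (ignoring gaps)
    window = deque(maxlen=n)                        # the Combining Set
    while True:
        t_r = time.monotonic()                      # TR: capture start of newest image
        window.append(capture_single_exposure(single_exposure_s))
        if len(window) == n:
            composite = combine_simple(list(window))   # depicts the scene over DT
            t_v = time.monotonic()                     # TV: frame presentation start
            # Illustrative check of the constraint TV < TR + DT from the text
            # (here TR is taken as the capture start of the newest image).
            assert t_v < t_r + total_exposure_s
            present_frame(composite)
```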
A variation on the Image Processing Pipeline described above is that an Existing Composite Image is modified using a combining method to form a New Composite Image. The inputs used to create the New Composite Image include the most recently formed Existing Composite Image, a newly captured Single Exposure Image (whose information is to be 'added'), and the oldest image of the N (some or all of whose information is to be 'removed' as part of the combining process). In this variation the full and original set of images in the Combining Set is not combined, but a similar or identical Composite Image is formed, and that Composite Image still has as inputs the same member images of the Combining Set. The difference is that the member images are not processed in their original form in the combining operation. This variation will in some cases have a reduced requirement for data storage and for computational processing, enabling the same or a similar Composite Image result to be achieved in less time.
In either approach, the method is repeated so as to create a series of composite images, each of which has as its original source inputs N captured single exposure images.
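For the special case of a Mean Combining Method, the variation above can be illustrated as an incremental (rolling) update: the existing composite is adjusted using only the newly captured image and the oldest image leaving the Combining Set, rather than re-combining all N member images. This is an assumption about one way such an update could be done, shown for clarity only.

```python
# Hedged sketch: O(1) rolling-mean update of an existing composite image.
import numpy as np

def update_mean_composite(existing: np.ndarray,
                          newest: np.ndarray,
                          oldest: np.ndarray,
                          n: int) -> np.ndarray:
    """Update a mean composite over a window of N images without re-stacking."""
    delta = (newest.astype(np.float32) - oldest.astype(np.float32)) / n
    updated = existing.astype(np.float32) + delta
    return np.clip(updated, 0, 255).astype(np.uint8)
```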
The apparatus of the present invention is novel by containing a Presentation Timer (50) for timing the presentation of said Composite Images on said Display (60).
The term Total Exposure Time Duration, DT, as used herein, is a time duration within which one or more Single Exposure Images are captured with said Image Capture Device. There may be a time gap between the end of capture of one Single Exposure Image and the beginning of capture of a subsequent Single Exposure Image. If there are gaps, the combined exposure times of the Single Exposure Images will be less than the Total Exposure Time Duration, but may approximate it if the gaps are short relative to a Single Exposure Duration.
The term Video Frame Presentation Duration, as used herein, is approximately equal to 1/Video Frame Presentation Rate.
The term Video Frame Presentation Rate, as used herein, is the number of video frames presented on said display within a given time period.
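For reference, the timing quantities defined in this section can be summarised as below. This is a paraphrase of the definitions in the text, not equations taken from the patent; the symbol D_S for the Single Exposure Duration is an assumed notation, and the second relation assumes negligible gaps between consecutive single exposures.

```latex
% R   : Video Frame Presentation Rate
% D_V : Video Frame Presentation Duration
% D_T : Total Exposure Time Duration
% D_S : Single Exposure Duration (notation assumed here)
% T_R : Real-Time at which recorded light first reaches the device
% T_V : Video Frame Presentation Start Time
\[
  D_V \approx \tfrac{1}{R}, \qquad
  D_T \approx N \, D_S, \qquad
  T_V < T_R + D_T, \qquad
  D_T > D_V .
\]
```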
SUMMARY OF PROPOSED SOLUTION
[DESCRIPTION OF CLAIM 1] -SEE FIG 1,2
To overcome the problems identified above, the present invention proposes a method for operating a Computational Imaging Device (10) to display a Video of a Scene, which produces Composite Images formed from Single Exposure Images captured over an extended duration capture window, and at the same time displays a video depiction of the scene that is both live and accurately depicts the Composite Images formed.
The Computational Imaging Device (10) comprises an Image Capture Device (30) for capturing multiple Single Exposure Images of the Scene, a Combiner (40) for combining a series of Single Exposure Images to form a Composite Image, a Display (60) for displaying a series of the Composite Images as a series of Video Frames, and a Presentation Timer (50) for timing the presentation of the Composite Images on the Display (60); the method comprising steps a, b, c, d:
(a) capturing a series of said Single Exposure Images using said Image Capture Device (30);
(b) combining N number of said Single Exposure Images to make said Composite Image using said Combiner (40), wherein said Single Exposure Images were captured over a Total Exposure Time Duration, DT, such that said Composite Images depict said Scene over said Total Exposure Time Duration, DT;
(c) calculating a multiple of Video Frame Presentation Start Times, TV, using said Presentation Timer (50), wherein TV < TR + DT, wherein TR is Real-Time, being the moment in time when light recorded by one of said Single Exposure Images first reaches said Image Capture Device (30), such as to present said Composite Images in Near-Real-Time, and such that individual ones of said Video Frames depict said Scene over a duration, DT, greater than a Presentation Duration, DV, of said individual Video Frame, this greater duration being an Extended Exposure Duration as discussed in the Background section;
(d) presenting to a user a series of said Composite Images as said Video Frames, using said Display (60), at each of said calculated Video Frame Presentation Start Times, TV.
[PREFERRED EMBODIMENT OF CLAIM 1] - ACHIEVING SMOOTH PREVIEW
In a preferred embodiment, the delay between Real-Time, TR, and the Video Frame Presentation Start Time, TV, is sufficiently short that video frames are presented in Near-Real-Time and the delay is not noticed by a user, therefore creating the visual appearance of a live display.
In a preferred embodiment, the Video Frame Presentation Rate is sufficiently fast as to create for the user the visual effect of appearing continuous; for instance a frame rate equal to or faster than 24, 30, or 60 frames per second, these being standard frame rates, any of which contributes to the appearance of continuity.
[EXTENDED EXPOSURE DURATION]
In a preferred embodiment, the combining method has the effect of depicting motion blur in a scene where and when subjects are moving relative to the image capture device within the time period of the Total Exposure Duration capture window. In this case, the individual video frames may each present the scene over an extended duration and therefore depict any changes in the scene over that extended duration, including the appearance of motion. And yet the Video Frame Presentation Rate may be sufficiently high that the video of the scene appears to be continuous and live.
[ACHIEVING ALL 3 DESIRABLE ATTRIBUTES]
Thus the invention presented here is uniquely able to achieve simultaneously the three desirable attributes discussed earlier: each video frame accurately depicts the scene over an extended duration, while at the same time the moving image is refreshed in near-real-time, thus creating the appearance of it being live.
[ADDITIONAL PROCESSING OF A COMPOSITE IMAGE TO FORM A VIDEO FRAME]
There may be some processing of an image file between it first being combined to form a composite image and it being prepared to meet the requirements of a video frame in a video. These changes may include, for instance, resizing, down-sampling, reducing the number of pixels to be displayed, or adjusting for display parameters such as brightness or colour tone. However, it is still the case that the video frame presented has a strong likeness to the composite image formed from the set of Single Exposure Images.
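The kind of processing mentioned here could look like the sketch below, which down-samples a composite to the display resolution and applies a simple brightness adjustment. The function name, parameters, and choice of OpenCV are illustrative assumptions, not values or methods taken from the patent.

```python
# Hedged sketch: prepare a composite image for presentation as a video frame.
import cv2
import numpy as np

def composite_to_video_frame(composite: np.ndarray,
                             display_size: tuple[int, int],
                             brightness_gain: float = 1.0) -> np.ndarray:
    """Resize the composite to (width, height) and adjust its brightness."""
    frame = cv2.resize(composite, display_size, interpolation=cv2.INTER_AREA)
    frame = np.clip(frame.astype(np.float32) * brightness_gain, 0, 255)
    return frame.astype(np.uint8)
```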
Further, Video Frame Presentation End Times and/or Video Frame Presentation Durations may be calculated by said Presentation Timer, such that two or more of the Video Frame Presentation Start Time, Video Frame Presentation End Time, and Video Frame Presentation Duration (and possibly any small delay between presenting frames) are known or calculated.
Timings are updated by the Presentation Timer from time to time.
Video presentation times are calculated or known from record.
[DESCRIPTION OF CLAIM 2] -SEE FIG 3, 4
The Computational Imaging Device may further comprise a Total Exposure Time Duration Setter (21), which can be used by either the user or a computer to set the Total Exposure Time Duration.
After a composite image has been created and presented to the user, the user will be able to see the effect of the current value of Total Exposure Time Duration. The user may decide that they wish to change this value and they can do so with said Total Exposure Time Duration Setter. The same can equally be achieved by a computer rather than a human user.
The Total Exposure Time Duration Setter may be used for other reasons, including any of reducing motion blur of subjects in a scene or reducing camera shake from a moving Image Capture Device.
Thus a feedback loop may be created where a user views the displayed image, or a computer reads the image information and metadata, and chooses to make a change. This could happen even before the frame has been combined or presented. In both cases the user or computer may take into account other information beyond the image itself, including any of the device settings or environmental parameters, such as a moving subject that is outside the scene but about to come into it.
The purpose of setting the Total Exposure Time Duration is to choose the time duration that the final Composite Image will depict. There may be light emanating from a part of the scene, or in a pattern over the scene, during that duration that it is desirable to capture as an input to the combining method and therefore as an input to the composite image, such that the composite image depicts some part or aspect of the light emanating from the scene over the total duration.
An example of this is a user choosing to show motion blur in a final Composite Image. A user may wish to extend the total exposure time duration while capturing single exposure images using the image capture device steadied on a tripod or by another steadying method or apparatus. As subjects within the field of view of the Image Capture Device move during the Total Exposure Time Duration of the Single Exposure Images being captured, light emanating from or reflected by those moving subjects and arriving at the Image Sensor of the Image Capture Device will be recorded. Then using a Mean Combining Method, wherein pixel luminance values for pixels of the final composite image are calculated using, in this example, the mean of the corresponding pixels from each of the aligned series of Single Exposure Images, will result in the appearance of motion blur, and the length of the motion blur visible in the composite image will be greater when the Total Exposure Time Duration is greater.
[DESCRIPTION OF CLAIM 3] -SEE FIG 5,6
The Computational Imaging Device may further comprise a Number Setter (22), wherein the variable parameter Number of Substantially Sequential Single Exposure Images, N, of step (b) is set using said Number Setter (22).
The Number Setter enables the number of images to be combined to be varied. Varying the size of N is one variable that has an effect on the final composite image.
One example is when using a combining method (for example mean, median, maximum, or minimum): if N is small and DT is large, and if the video is played at a speed that matches real-world speed, then the Video Frame Presentation Duration of an individual frame will be larger than if N had been large.
There are occasions when it is desirable, for the purpose of combining images and for presenting images, that N be large and that the Exposure Duration of a Single Exposure Image be short. Here are examples illustrating why these characteristics are desirable.
(a) In very bright scene conditions, individual photosites of an Image Sensor (16) may become saturated, and therefore unresponsive, if the Single Exposure Time Duration becomes too long. By having a shorter Single Exposure Time Duration, each photosite can be read and re-set before it reaches capacity, therefore reading the scene luminance accurately across the whole image. These values can be combined across a plurality of Single Exposure Images, allowing more light information to be recorded more accurately, and allowing broadly the same amount of light to be captured and recorded as would have been achieved from a single long exposure duration image, but without the risk of over-exposing some or all of the photosites.
(b) A second benefit is that a large value of N enables a smoother video frame. When an image is combined from only a small number N of Single Exposure Images, the gap in time between one image and the next becomes visibly evident; for example, with a subject moving across the scene (the angle of view being viewed), the time gap between the images being captured can become visible in the final composite images. If more images are captured, i.e. if N is a larger number, then there are more gaps, but the image is smoothed out; and if a further processing step is taken as part of the combining method to smooth the visual effect created by these gaps, it may be easier to achieve a smooth effect and a more accurate depiction of the scene when the number N is large. This is because each additional Single Exposure Image adds temporal scene information that can be used to infer the position and brightness of scene elements during the time gaps.
(c) A third benefit of N being large is that, for a given value of DT, the single exposure duration will be small, which in turn reduces the minimum time between combining operations. For example, if the single exposure duration were 1 second, then you would need to wait 1 second before having a new individual exposure to feed into the combining method and process a composite image; whereas if frames were being shot with an exposure duration of 1/30th of a second, you would only need to wait approximately 1/30th of a second before you would have a new image that could be combined into a new Composite Image. That consequently means that the delay between Real-Time and Video Frame Presentation time is reduced from 1 second to 1/30th of a second. This in turn means that the picture can be updated in near-real-time such that the user does not perceive a delay in the depicted video frames of the scene on the display.
These are three benefits of a short Single Exposure Duration. The Single Exposure Duration may be set directly, or it may be set indirectly as a consequence of the Total Exposure Time Duration and the number N of Single Exposures that are going to be combined. The value of N, while relevant to step (b) of claim 1, may be set at any time using said Number Setter (22).
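The arithmetic behind benefit (c) can be worked through in a few lines. The numbers reproduce the 1 second versus 1/30th of a second example above; the helper function and the assumption of negligible inter-frame gaps are illustrative.

```python
# Hedged sketch: for a fixed total exposure duration DT, a larger N gives a
# shorter single exposure duration DS and a shorter wait before the composite
# can be refreshed.
def single_exposure_duration(total_exposure_s: float, n: int) -> float:
    return total_exposure_s / n   # DS = DT / N (ignoring inter-frame gaps)

for n in (1, 30):
    ds = single_exposure_duration(1.0, n)     # DT = 1 second
    print(f"N={n:2d}: DS = {ds:.4f} s, composite refresh interval ~= {ds:.4f} s")
# N= 1: DS = 1.0000 s  -> update latency of roughly 1 second
# N=30: DS = 0.0333 s  -> update latency of roughly 1/30th of a second
```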
[DESCRIPTION OF CLAIM 4] -SEE FIG 5,6
The Computational Imaging Device may further comprise a Video Frame Rate Setter (23), wherein a Video Presentation Frame Rate is set using said Video Frame Rate Setter (23).
The term Video Presentation Frame Rate, as used herein, is the number of video frames presented on said display within a given time period. Said frames are representative of (have a good likeness to) said composite images. Adjusting the Video Frame Presentation Rate will vary the file size of the video if exported, outputted, or stored, and the processing load when outputted, stored, or displayed, and may have an effect on the smoothness of the video as perceived by the user.
A higher frame rate may have an impact on the user's perception of the smoothness of the video; a higher frame rate creates the appearance of continuity more fully and for a wider subset of the population. Having a very low video frame rate will reduce processing load, storage, and power consumption. It may also be desirable to set the video frame rate to align with the requirements of a display, of other video footage, or of the display's supporting processing, storage, and other characteristics, such that it is compatible with these.
[DESCRIPTION OF CLAIM 5] -SEE FIG 5,6
The Computational Imaging Device may further comprise a Single Exposure Setter (24), wherein one or more of a set of Variable Parameters of said Single Exposure Image is set using said Single Exposure Setter (24), wherein said Variable Parameters include but are not limited to one or more of: single exposure duration, aperture size, light sensitivity of the image sensor, electrical amplification of the image signal, digital amplification of the image signal, Neutral Density Filter presence, Optical Strength of the Neutral Density Filter, luminance of a controlled light source including flash, duration of a controlled light source including flash, focal length, focal distance, sensor selection, lens selection, image stabilisation, image alignment, and the 3D orientation, 3D position, and 3D velocity of said Image Capture Device (30) and/or subcomponents of the same, including but not limited to the image sensor, lens (12), and lens elements (37). By setting or otherwise controlling one or more variable parameters of the apparatus as they pertain to capturing a single exposure image, the Computational Imaging Device, on its own or combined with the user, can change these parameters based on feedback from having captured images.
In the case of the Computational Imaging Device, it can read the single exposure images, the combined images, and other inputs, and choose to make changes to the apparatus as they pertain to the variable parameters of a single exposure.
And so too can the user. Individual frames of composite images are viewable by said user and, based on the information of these frames plus the environment around them (the field of view of the Scene and wider) and any other knowledge, skill, and readout of the apparatus or another apparatus, they can choose to make changes to said Single Exposure Parameters.
One example is of a user who wishes to create the effect of background blur in a still or moving image: after having an image presented on a display, the user may look at it, judge that they want to achieve a greater amount of background blur, and consequently set the aperture to be larger so as to achieve a shallower depth of field and a greater background blur.
The term Single Exposure Duration as used herein is the exposure time of a Single Exposure Image captured wherein there is no obstruction in the apparatus such as a closed shutter inhibiting light from reaching an Image Sensor (16).
Aperture size being the size of the aperture of the image capture device.
Neutral Density Filter (17) presence: being the physical presence of a neutral density filter between the scene and the Image Sensor (16) of the Image Capture Device (30), the Neutral Density Filter (17) being of fixed or variable transmittance of light.
Optical strength of Neutral Density Filter (17) being a measure of the transmission of light through said Neutral Density Filter (17), being fixed or variable. And the optical strength being a default strength or the strength set.
Luminance of controlled light source including flash. Also including a lamp on said Computational Imaging Device (10) or any light source connected by wired or wireless fashion and controlled from said Computational Imaging Device (10). Said light source being visible or invisible including Infra Red and Ultraviolet wavelengths.
Duration of the light source: noting that the integral of luminance over time gives the total photon emission, and is related to the scene luminance in the case of a controlled light source that is pointed at the scene and is close enough to the Image Capture Device for its light to be reflected and received.
Image Sensor (16) selection, in an instance where the Computational Imaging Device (10) or one of its Image Capture Devices (30) has more than one sensor and is capable of sensor selection.
Lens (12) selection, in a Computational Imaging Device (10) or one of its Image Capture Devices (30) having more than one lens (12) or lens set and being capable of selecting between these.
Image stabilisation, including manipulating the position of the Image Sensor (16), lens (12), or lens elements (37).
Image alignment, including alignment of one image to another.
3D orientation, 3D position, and 3D velocity of said Image Capture Device and/or subcomponents of the same, including but not limited to the image sensor, lens, and lens elements, whether manipulated by the apparatus, the user, handheld, a tripod, a gimbal, or another moveable device.
[DESCRIPTION OF CLAIM 6] -SEE FIG 5,6
The Computational Imaging Device (10) may further comprise a Combining Method Setter (25), wherein one or more of a plurality of Combining Methods is set using said Combining Method Setter (25), wherein said Combining Methods include, but are not limited to, those known in the art as Image Stacking Algorithms, including but not limited to: Mean Stacking, Median Stacking, Maximum Stacking, Minimum Stacking, Focus Stacking, and High Dynamic Range,
wherein said Combiner (40) uses one or more of said Combining Methods, in any permutation or combination.
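One way a Combining Method Setter (25) could drive the Combiner (40) is as a registry mapping a setting name to a combining function, as sketched below. The class, its method names, and the restriction to the simple stacking methods are assumptions for illustration; Focus Stacking and High Dynamic Range would need their own implementations.

```python
# Hedged sketch of a Combining Method Setter driving simple stacking methods.
import numpy as np

def _stack(images) -> np.ndarray:
    return np.stack(images, axis=0).astype(np.float32)

class CombiningMethodSetter:
    """Maps a setting name to a combining function used by the Combiner (40)."""

    def __init__(self):
        self._methods = {
            "mean":    lambda imgs: _stack(imgs).mean(axis=0),
            "median":  lambda imgs: np.median(_stack(imgs), axis=0),
            "maximum": lambda imgs: _stack(imgs).max(axis=0),
            "minimum": lambda imgs: _stack(imgs).min(axis=0),
        }
        self.current = "mean"

    def set_method(self, name: str) -> None:
        if name not in self._methods:
            raise ValueError(f"unsupported combining method: {name}")
        self.current = name

    def combine(self, images) -> np.ndarray:
        combined = self._methods[self.current](images)
        return np.clip(combined, 0, 255).astype(np.uint8)
```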
[DESCRIPTION OF CLAIM 7] -SEE FIG 7
The Computational Imaging Device may provide for any of the before-mentioned Settings to be set by a User, either using the apparatus or indirectly using a connected apparatus, and by using any of: physical controls, a touch screen, voice commands, gestures, or thought.
[DESCRIPTION OF CLAIM 8] -SEE FIG 8
The Computational Imaging Device wherein any of said Settings are set by an Input from one or more of an External Device Connection (32), a Processor (33), and an Image Sensor (16), of said Computational Imaging Device (10).
Examples include but are not limited to: a scene luminance reading, a motion sensor reading such as from an accelerometer, a timer reaching a predetermined time, or a proximity sensor judging the distance of a target subject to have increased by a predetermined amount. Any of said inputs could feed into changes in the parameters to be set, either directly or via processing, resulting in a processed setting change using said Processor (33).
[DESCRIPTION OF CLAIM 9] -SEE FIG 9, 10
The Computational Imaging Device may further comprise a Single Long Exposure Trigger (35), wherein after creating a Composite Image, a Single Long Exposure is triggered using said Single Long Exposure Trigger (35), and consequently: a Single Exposure Duration, Ds, is set to substantially equal DT, using said Single Exposure Setter (24); then one or more Single Exposure Images are captured using said Image Capture Device (30); then said Single Exposure Duration, Ds, is set to substantially equal DT/N using said Single Exposure Setter (24).
An example application of the Single Long Exposure Trigger is to produce the technical effect of having a live preview video formed of Composite Images depicting motion blur formed from a large Total Exposure Time Duration of a scene with moving subjects, but then being able to capture a single long exposure.
The Computational Imaging Device presents to the user a video on a display where the frames are composite images providing a near-real-time accurate representation of long-exposure.
The Computational Imaging Device is capable of capturing a single long-exposure Image, and the user requires a video preview screen that gives them full feedback of the visual effect of long exposure, be that motion blur, camera shake, or image sharpness, in the situation where the scene and or camera are moving.
And so in this way the user is able to switch between two modes of operation. One mode uses short single exposure durations, substantially equal to DT/N, which are combined and displayed as composite images as video frames. The second mode uses long exposure durations, substantially equal to DT; thus the composite image presents a good likeness of the visual effects of motion blur that would be present in the single long exposure, as long as the scene is the same and the moving subjects and camera position are the same. This arrangement of switching between two modes, using the single long exposure trigger as the switch between modes, allows the user to benefit from near-real-time feedback while preparing to take a high quality long exposure capture. They can then capture a high quality Single Exposure Image, confident in the knowledge that it will be an accurate depiction of what was displayed on the video preview. This arrangement reduces the number of steps in a feedback loop that the user would otherwise need to go through.
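A minimal sketch of this mode switching, assuming a hypothetical camera control API with set_exposure() and capture() methods (these names are illustrative and are not defined in the specification):

```python
def preview_then_long_exposure(camera, combiner, display, dt_seconds, n):
    """Preview with composites of N short exposures, then take one long exposure.

    Mode 1: Ds = DT / N, combine the N most recent frames, display as a video frame.
    Mode 2 (after the Single Long Exposure Trigger): Ds = DT, capture once.
    """
    ds = dt_seconds / n                       # short Single Exposure Duration
    camera.set_exposure(ds)
    frames = [camera.capture() for _ in range(n)]
    display.show(combiner.combine(frames))    # composite depicts the scene over DT

    camera.set_exposure(dt_seconds)           # trigger fired: Ds set to ~DT
    long_exposure = camera.capture()

    camera.set_exposure(ds)                   # return to preview mode
    return long_exposure
```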
[DESCRIPTION OF CLAIM 10]
The Computational Imaging Device may further comprise one or more of a Supplementary Image Capture Device (31), wherein a Supplementary Image is captured using said Supplementary Image Capture Device (31).
One example application is to capture a wide field of view using a plurality of narrower field of view Image Capture Devices (30). Further, there may be advantages in having cameras with different parameter values, such as different sensitivity to wavelength or to light, or able to capture at different resolutions, or being smaller, more compact, or remotely sited, or pointing at a different scene or a different aspect of the same scene. By using a plurality of Image Capture Devices (30) with different properties and being able to switch between them, the user can prioritise the type and quality of image information they wish to collect.
[DESCRIPTION OF CLAIM 11]
A further embodiment wherein said Image Capture Device (30) and said Supplementary Image Capture Device (31) share a substantially similar View Point, and each captures a series of said Single Exposure Images and a series of Supplementary Single Exposure Images respectively, such that individual images captured by each of said image capture devices, at the same time, share a substantially similar View Point. One example of this embodiment is to have a smartphone digitally connected and physically mounted to a DSLR camera, the two Image Capture Devices sharing a substantially similar viewpoint. The smartphone has the primary Image Capture Device and is used to capture Single Exposure Images, combine them to form Composite Images, and display these as video frames. In this way the smartphone can display the effects of extended exposure durations, in this instance using a combining method of averaging images to show motion blur. The connected DSLR has a Neutral Density Filter applied to the lens to reduce the amount of light entering the camera, and it is set to capture an extended exposure duration, e.g. of 10 seconds. The smartphone video display previews the effect of a 10-second exposure. The user composes the image and at a time of their choosing activates said Single Long Exposure Trigger (35), but in this case the trigger causes a Single Long Exposure to be captured by the DSLR.
Said DSLR has a larger Sensor, making combining operations slower; it also has less on-board computing power, again making combining operations slower; and it has an ND Filter, the on-board ability to capture a very long exposure, and larger photosites, which aid in capturing a single long exposure. These characteristics make the DSLR well suited to capturing a high quality Extended Duration Single Exposure Image. By complementing the DSLR with the smartphone, the two Image Capture Devices (30, 31) form parts of the whole Computational Imaging Device (10) and enable both a live preview that is accurate and a very high quality long exposure, all with a user experience that involves live feedback and is therefore highly intuitive.
[DESCRIPTION OF CLAIM 12]
A further embodiment wherein said Single Exposure Image is said Supplementary Image. One example of this embodiment may be a smartphone digitally connected to and mounted on a DSLR camera, together being a Computational Imaging Device (10). The Computational Imaging Device (10) may operate in such a way that only one Image Capture Device (30) is used to capture, preview, and store images, but the user may select which of the two Image Capture Devices is to be used at any one time. Thus the user may, for example, select the DSLR, being the Supplementary Image Capture Device, to capture Supplementary Images; these being the only images captured, they are the Single Exposure Images that are combined to form composite images, which are both displayed as video frames and form the final outputted result for which the user is operating the Computational Imaging Device.
[DESCRIPTION OF CLAIM 13]
A further embodiment wherein said Composite Image is a Supplementary Composite Image formed by combining a series of said Supplementary Images.
[DESCRIPTION OF CLAIM 14]
The Computational Imaging Device may further comprise one or more of a Supplementary Combiner (41); wherein said Combiner (40) is said Supplementary Combiner (41).
One example of this embodiment is of a Computational Imaging Device (10) having both a primary and a supplementary Image Capture Device and Combiner, where the Combiner (40) produces Composite Images to be displayed to the user, and the Supplementary Combiner (41) produces Supplementary Composite Images from higher quality Supplementary Single Exposure Images from a Supplementary Image Capture Device (31) with a larger Image Sensor (16); the Supplementary Composite Images are produced not in real time, but to a higher quality.
[DESCRIPTION OF CLAIM 15] -SEE FIG 11
The Computational Imaging Device may further comprise an Outputter (61), wherein an Outputted Image is outputted using said Outputter (61) to a Connected Image Receiver (62), wherein said Connected Image Receiver (62) is not said connected Display (60), and wherein said Connected Image Receiver (62) may be any of, but not limited to: a Presentation Device (63), a Storage Device (64), a Transmission Device (65), a Processing Device (66).
[DESCRIPTION OF CLAIM 16]
A further embodiment wherein said Outputted Image is a Still Outputted Image, being any of: said Single Exposure Images, said Composite Images, said Supplementary Images, or said Supplementary Composite Images.
[DESCRIPTION OF CLAIM 17]
A further embodiment wherein said Outputted Image is a Moving Outputted Image, being a series of any of: said Single Exposure Images, said Composite Images, said Supplementary Images, said Supplementary Composite Images.
[DESCRIPTION OF CLAIM 18]
The Computational Imaging Device may further comprise an Optical Image Stabiliser (36), wherein the position and or orientation, of said Image Capture Device (30) and or its subcomponents changes during and or between captured images, wherein during step (a) of Claim 1, said Image Capture Device (30) and or one or more sub-components of the same, including but not limited to a Lens Element (37) and an Image Sensor (16), are translated and or rotated by said Optical Image Stabiliser (36).
Said Optical Image Stabiliser (36) may be used to stabilise the image projected by the Lens Elements (37) of an Image Capture Device (30) onto the Image Sensor (16) of said device during a Single Exposure, such that the Single Exposure Image is a sharper depiction of the scene, having countered some of the motion of the device that would otherwise lead to Camera Shake blur in the image.
Said Optical Image Stabiliser (36) may additionally or alternatively be used to stabilise said device over a plurality of Single Exposure Images, such that they more closely share a common viewpoint.
[DESCRIPTION OF CLAIM 19-ALIGNER]
The Computational Imaging Device may further comprise an Aligner (39), wherein prior to step (b) of claim 1, each of said N Single Exposure Images is aligned with the others by applying a Homography Transform using said Aligner (39), such that each image shares a common Virtual Viewpoint.
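One possible sketch of such an alignment step, using feature matching and an estimated homography; the Aligner (39) is not limited to this approach, and the helper below is illustrative only.

```python
import cv2
import numpy as np

def align_to_reference(image, reference):
    """Warp `image` onto the viewpoint of `reference` using an estimated homography.

    A minimal feature-based sketch; an actual aligner could instead rely on
    motion-sensor data, depth maps, or dense optical flow.
    """
    g1 = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)

    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(g1, None)
    k2, d2 = orb.detectAndCompute(g2, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]

    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = reference.shape[:2]
    return cv2.warpPerspective(image, H, (w, h))
```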
[DESCRIPTION OF CLAIM 20-VIRTUAL TRIPOD]
The Computational Imaging Device may further comprise a Virtual Tripod (70), itself comprising a Processor (33), any of a Gimbal (79), said Optical Image Stabiliser (36), and said Aligner (39), and any of an Accelerometer (72) and a Ground Truther (71). The Ground Truther comprises an Optical Distance Measuring Device (73), including but not limited to a LIDAR (74), a Holographic LIDAR (75), a Digital Holographic Camera (77), a Stereoscopic Image Capture Device (76), a Structured Light Device (80), or a Time of Flight Camera (81).
Said Ground Truther (71) may be used to perform the tasks of stabilising an Image Capture Device (30) or its sub components, and or to align two or more of a series of sequentially captured images to more closely share a Common Virtual Viewpoint. The outputs of said Ground Truther (71) are processed using said Processor (33), to produce one or more values of: relative position, relative orientation, relative velocity, relative acceleration, absolute position, absolute orientation, absolute velocity, absolute acceleration, of any of said Ground Truther (71), said Virtual Tripod (70), said Image Capture Device (30) or sub-component thereof, including Lens Elements (37) and image sensor (16).
When said Image Capture Device (30) is pointed forwards, that is, oriented to have a Viewing Direction that is substantially parallel to the plane of the horizon, being +/-15 degrees to said Plane, said Ground Truther has a Viewing Direction that is orthogonal to that of said Image Capture Device, +/-45 degrees.
In a preferred embodiment said Ground Truther (71) has a combination of Viewing Direction and Angle Of View such that its field of view is largely filled with the ground. In a further embodiment, said Ground Truther, when used in an Indoor Environment, has a field of view that largely sees some combination of ground, walls, and ceilings. Said Ground Truther (71) passes said processed values to either or both of said Optical Image Stabiliser (36) and said Aligner (39).
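A small illustrative check of these orientation constraints, using basic vector geometry and an assumed convention that gravity points along -z (this convention and the helper names are not part of the specification):

```python
import numpy as np

def angle_deg(a, b):
    """Angle in degrees between two 3D direction vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def orientation_ok(camera_dir, truther_dir, gravity=(0, 0, -1)):
    """Check that the camera looks roughly forwards and the Ground Truther looks
    roughly orthogonally to the camera (e.g. down), as described above."""
    horizontal = abs(90.0 - angle_deg(camera_dir, gravity)) <= 15.0     # +/-15 deg of the horizon plane
    orthogonal = abs(90.0 - angle_deg(camera_dir, truther_dir)) <= 45.0  # orthogonal +/-45 deg
    return horizontal and orthogonal
```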
A Virtual Tripod (70) can be used as part of a system for increasing image quality. It can do so in two ways.
It can allow for more accurate real-time optical image stabilisation adjustments to the Lens Elements (37) and or Image Sensor (16), and or with a Gimbal (79).
The other way it can increase accuracy is by enabling more accurate homographic transformation of images in a series to share or closely share a Common Virtual Viewpoint such that they can be combined to form a single high quality image, such as with de-noising using, for example, a Mean Combining Method.
In seeking to produce a higher quality image from a series of Single Exposure Images, the images need to be accurately aligned to a common viewpoint. Conventionally this is achieved best by using a tripod. Thus the Image Capture Device has little to no movement between shots and each shot is accurately aligned with the others. They all share a common viewpoint. This is true for static subjects in the scene, not for moving subjects which are discussed later.
If the Image Capture Device (30) is handheld or worn rather than tripod mounted, then there will be some movement between a sequence of shots even if they are taken in very quick succession. This sequence of shots with similar but different viewpoints may still be combined to form a high quality image, such as with the Mean Combining Method. However, the images must first be aligned with accuracy. The greater the accuracy of alignment, the higher the potential image quality of the final image.
A set of images captured from slightly different viewpoints can be homographically transformed such that their transformed selves all share a Common Virtual Viewpoint. This homographic transformation is simplest when the scene is essentially flat, such as with high altitude drone footage of a flat landscape. If the scene is more three dimensional, such as with a statue in the foreground, buildings in the middle distance, and mountains in the far background, then it is helpful to have a detailed depth map of the scene that has been imaged. With this depth map, each image can be segmented by depth, homographic transformations can be performed on each segment, and the transformed segments can then be combined to form one high quality image.
If some or all subjects in the scene are moving relative to the ground and or relative to one another, then the Image Capture Device (30) or its user needs to decide whether this movement will be depicted in the combined image as motion blur, or whether moving subjects will have the appearance of their motion reversed, such that not only is the scene overall homographically transformed to share a common virtual viewpoint, but individual moving subjects or moving parts of subjects are adjusted so that those subjects are better aligned frame to frame, enabling a sharper depiction of them in the final composite image.
To solve for both depth and movement it is necessary to know the camera's position relative to both the static and moving subjects of the scene at each frame. The position of the Image Capture Device (30) can be determined using triangulation, trilateration, or a hybrid of the two techniques. To use these methods the camera needs to identify points that it can reference over time. Reference points include corners, lines and surfaces, preferably sharp corners, straight lines, and flat surfaces. These reference points will preferably be spread over wide angles to improve accuracy, and will preferably be many. In the case of using a lighting source to aid the depth mapping (e.g. a Near Infra-Red flash for LIDAR, Time of Flight, or structured light), it is helpful if the reference points are in close proximity to the light source (due to the inverse square law), so that reflected light from said reference points reaches the depth mapping sensor with sufficient intensity to be identified with accuracy.
Conventional Computational Imaging Devices (10) are limited in their ability to see reference points that strongly meet these criteria because they are generally limited to looking forwards. That is, the Image Capture Device (30) and any optical depth mapping or Optical Distance Measuring Device (73) is generally pointed forwards, forwards being orthogonal to down (where down is the positive direction of gravity). Cameras are generally pointed forward when they are operated by humans because humans have eyes in the front of their heads. However, a forward looking viewpoint will see a scene that is less than ideal for accurately finding reference points. When outside, the upper half of the scene will generally be filled with sky, with little or no sharp static corners, lines or flat surfaces. The lower middle quarter to lower half of the scene will generally contain scene elements that are receding into the distance, which, especially in low light conditions, makes them difficult to see in visible spectrum light and difficult to illuminate with visible or infra-red light from the device. Note that scene illumination power will be limited because the device is battery operated, as with a smartphone or other handheld or wearable device.
If however the optical depth mapping or Optical Distance Measuring Device (73) is pointed either straight down or at up to a 45 degree angle, then generally at least half of the field of view will contain a higher concentration of potential reference points that meet the desirable criteria. The field of view is likely to contain more corners, edges and surfaces in close proximity to the device than would be the case if forward looking. In the case of the device being handheld or wearable (by a human generally less than 2m tall), the downward facing depth sensor will generally be within 2m of the ground directly underneath it, and even with a wide field of view its view will mostly be filled with static objects that are less than 5m distant. Therefore it is desirable to have a depth sensor that looks down.
In one preferred embodiment the Computational Imaging Device (10) would have one Optical Distance Measuring Device (73) that shares the same field of view as the Image Capture Device (30) and one that faces down. The two Optical Distance Measuring Devices (73) would work together, using data from both to determine the position of the camera relative to the ground and all objects that are static relative to the ground, and to determine the position of the camera relative to moving subjects in the scene whose images are being captured.
In a further embodiment, the Computational Imaging Device (10) would have one Optical Distance Measuring Device (73) whose field of view covered that of the Image Capture Device(s) (30) and extended further to look at either the ground or an area of wall or ceiling.
[DESCRIPTION OF CLAIM 21]
A further embodiment wherein said Combining Method incorporates any of: Additive Stacking, Mean Stacking, Median Stacking, such that said Composite Image achieves a higher signal to noise ratio, and greater level of detail, compared to that of an individual of said Single Exposure Images from which said Composite Image is formed.
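As background, and not taken from the specification: for Mean Stacking of N aligned frames with independent, zero-mean noise of standard deviation sigma, the signal is preserved while the noise standard deviation falls to sigma divided by the square root of N, so that

```latex
\mathrm{SNR}_{\text{composite}} \approx \sqrt{N}\,\cdot\,\mathrm{SNR}_{\text{single}}
```

which is the sense in which the Composite Image can achieve a higher signal to noise ratio than any individual Single Exposure Image.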
[DESCRIPTION OF CLAIM 22]
A further embodiment wherein said Combining Method incorporates Median Stacking, wherein when one or more subjects in said Scene are moving relative to said Virtual Viewpoint during said Total Exposure Time Duration, DT, such that when said Subject moves across said Scene it is present in any given part of said Scene for a period less than a Threshold period (for example <= DT/2), said subject will not be depicted in said final Composite Image.
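A toy numerical illustration of this effect (illustrative only): for a pixel crossed by a moving subject in fewer than half of the N frames, the median retains the background value, whereas a mean would leave a ghost.

```python
import numpy as np

# Ten aligned frames of a single pixel: background value 100, with a moving
# subject (value 255) passing through that pixel in only 3 of the 10 frames.
pixel_over_time = np.array([100, 100, 255, 255, 255, 100, 100, 100, 100, 100])

print(np.median(pixel_over_time))  # 100.0 -> the transient subject is rejected
print(pixel_over_time.mean())      # 146.5 -> mean stacking would leave a ghost
```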
[DESCRIPTION OF CLAIM 23]
The Computational Imaging Device (10) may further comprise a plurality of Image Capture Devices (30), capturing a plurality of series of substantially sequential Single Exposure Images such that images may be combined both over time and from multiple Image Capture Devices (30), resulting in a greater quantity of input data (from more light) to the Combining Process, such that in a preferred embodiment said plurality of Image Capture Devices (30) share a common direction of view and, when used with de-noising combining methods, including but not limited to the Median Combining Method or Mean Combining Method, a Composite Image with lower noise level may be produced than from an equivalent combining operation that used only one Image Capture Device (30).
In a further embodiment where said plurality of Image Capture Devices (30) have different but overlapping fields of view, a panorama combining method may be employed to achieve a composite image having a field of view of the scene that is wider than from an equivalent combining operation that used only one Image Capture Device (30).
[DESCRIPTION OF CLAIM 24]
A Computer Readable Medium storing computer interpretable instructions which when interpreted by a Programmable Computer cause said Programmable Computer to perform a method in accordance with any preceding claim.
[ Further Aspects of the Invention]
[ Automatic Setting of White Balance]
[BACKGROUND]
Conventional cameras find it challenging to accurately know the white balance of an image. This can be a problem in correctly reproducing colour tint, for instance correctly representing a white wall at midday as white, or correctly recording different skin tones in different lighting conditions.
A conventional method for calibrating white balance is photographing a standardised White Card or Colour Checker Card, and using the part of the image that is of the card to calibrate white balance and colour tints. This method creates an extra cost, of the cards, and the inconvenience of having to carry the cards and take a test shot at the beginning of a photography session or whenever the lighting changes colour, for instance when moving from outdoor natural lighting to indoor artificial lighting, or over the transition period outside between 'Golden Hour', when the evening sky is more orange, and 'Blue Hour', when the evening sky is more blue.
To overcome the problems identified above, the present invention proposes a method for operating a Computational Imaging Device (10) to automatically determine the white balance of a scene.
The Computational Imaging Device (10) comprising a Colour Calibrator (90) comprising an Image Capture Device (30) for capturing multiple Single Exposure Images of the Scene, a Subject Recogniser (91) for recognising subjects, and a Colour Sampler (92); the method comprising steps (a) to (f) of: a. capturing an image of a scene using said Image Capture Device (30), b. recognising a Subject within said scene using said Subject Recogniser (91), c. matching said subject against a known subject from a Database of subjects whose colour profile is known, d. sampling the colour recorded in said captured Image from said identified subject using said Colour Sampler (92), e. measuring the difference in colour tone between said Sampled Colour and said Known Colour,
f. setting the White Balance of said Image based on said colour difference reading. A minimal illustrative sketch of steps (d) to (f) is given after the examples of known subjects below.
Wherein known subjects may include learned subjects, for example: a highway advertising billboard having a known pattern, text and colouring; a known car, known to be painted a known RAL colour;
A known wall inside a known house whose tone is known under controlled lighting.
A known person whose teeth, the whites of their eyes, or any other part of their body, clothing or apparel are a known colour under known lighting conditions.
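A minimal sketch of steps (d) to (f), assuming a simple per-channel (von Kries-style) correction; real implementations may instead work in a device-independent colour space, and the function names here are hypothetical.

```python
import numpy as np

def white_balance_gains(sampled_rgb, known_rgb):
    """Per-channel gains mapping the sampled colour of a recognised subject
    back to its known reference colour (a simple diagonal correction)."""
    sampled = np.asarray(sampled_rgb, dtype=np.float32)
    known = np.asarray(known_rgb, dtype=np.float32)
    return known / sampled                      # gain per channel

def apply_white_balance(image, gains):
    """Apply the per-channel gains to an RGB image (float values, 0-255)."""
    return np.clip(image * gains, 0, 255)

# Example: a subject known to be neutral grey (128,128,128) is sampled as
# (140, 128, 110) under warm indoor lighting; the gains reduce red and boost blue.
gains = white_balance_gains([140, 128, 110], [128, 128, 128])
```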
The Method may further comprise a Calibration Set Up procedure, the method comprising steps 1, 2, 3: 1) capturing an image of a known subject under controlled or semi-controlled lighting conditions.
For example capturing an image of a wall in a room in a house, at night, where the only or the main source of lighting is controlled by the image capture device, such as a flash.
2) Identifying one or more subjects in the scene.
For example, capturing an image of a wall or of a person at night, in a room where the main source of lighting is controlled by the image capture device, such as a flash, and identifying individual teeth, or the whites of each eye.
The Computational Imaging Device (10) may further comprise a User Facing Camera (93).
The method may further comprise pointing said User Facing Camera (93) in the direction of said user, and capturing a Self Portrait Image ('Selfie') with said User Facing Camera (93), wherein said Selfie includes the user of the Image Capture Device.
Recognising one or more features of the user's face, including but not limited to the whites of their eyes and their teeth, and performing steps (a) to (f) above. The Computational Imaging Device (10) may further and additionally comprise a Supplementary Image Capture Device (31) other than said User Facing Camera.
The white balance calibration may be performed using one or more images captured from said User Facing Camera and applied to images captured using said Supplementary Image Capture Device (31).
The method may further incorporate a Lighting Model, where light sources are identified and their influence on the colour of objects in 3D space is calculated, and adjustments are made to the colour of a scene being photographed by said Further Image Capture Device, which is oriented differently to said User Facing Camera.
[ Wide Spectrum Image and Depth Sensor (100) ]
[BACKGROUND]
Conventional depth cameras, including Time Of Flight cameras, structured light cameras, lidar cameras and holographic cameras, generally use an Infra-Red light source to illuminate a scene and record the reflected light from the scene. In this way depth values are ascertained.
Conventional visible spectrum image sensors have RGB pixels to perceive colour. To apply depth values to a colour image captured with a visible spectrum image sensor, the depth values of the Time of Flight camera have to be transformed to the viewpoint of the colour image sensor. This operation takes processing power and time and potentially introduces inaccuracies.
[INVENTION] To overcome the problems identified above, the present invention proposes a Wide Spectrum Image and Depth Sensor (100), comprising an Infra-Red light source (109) and an Image Sensor (16), said Image Sensor (16) having photosites sensitive to Infra-Red wavelengths distributed either continuously or in a regular mosaic pattern across the light sensitive surface of said Image Sensor (16), and having photosites that are sensitive to visible spectrum light, distributed either continuously or in a regular mosaic pattern across the light sensitive surface of the same said Image Sensor (16).
In this way, photosites sensitive to Infra-Red record the reflected Infra-Red light, which is used to calculate depth (z axis data), and photosites sensitive to visible spectrum light (either monochrome, three wavelength ranges of red, green and blue, or more than three wavelength ranges) record the reflected visible light, which is used to calculate the visible spectrum image.
Visible spectrum and z axis data are recorded for the full scene viewed by the sensor.
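As an illustration of how such a sensor's output might be separated in software, assuming, purely for example, a repeating 2x2 mosaic of R, G, B and Infra-Red photosites (the specification allows other distributions, and this layout is an assumption):

```python
import numpy as np

def split_rgb_ir(raw, pattern=("R", "G", "B", "IR")):
    """Separate a raw mosaic frame into visible-light planes and an IR plane.

    Assumes a repeating 2x2 pattern of R, G, B and IR photosites. The IR plane
    would feed the depth calculation and the visible planes the colour image,
    with no viewpoint re-projection needed because all photosites share one sensor.
    """
    offsets = {pattern[0]: (0, 0), pattern[1]: (0, 1),
               pattern[2]: (1, 0), pattern[3]: (1, 1)}
    return {name: raw[dy::2, dx::2].astype(np.float32)
            for name, (dy, dx) in offsets.items()}
```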
[ Light-Porous Image Sensor (110) ] [BACKGROUND]
Conventional Image Sensors (16) use a two dimensional array of photosites.
The ability to record light is limited to the Cross Sectional Area of the Image Sensor (16).
[INVENTION] The present invention proposes a Light Porous Image Sensor (110) comprising an image sensor formed of a plurality of stacked layers of photosites. One or more of said Photosite Layers has a pattern of Holes.
The term Hole as used herein is a volume of air, of gas, or of another material partially or fully transparent to a range of light wavelengths.
Within said Holes, and or between said Photosite Layers, and or above the topmost of said Photosite Layers, is an array of micro lenses and or a metamaterial micro lens array. Some of the individual micro lenses focus light through Holes in one or more Photosite Layers and onto a lower of said Photosite Layers.
Further, each of said Holes may be of smaller diameter than the micro lens that focusses light through it.
Further, some micro lenses may focus light through multiple Photosite layers.
Further, some micro lenses may be beam steerers, changing the direction of light such that it travels down at an angle such that it passes through a hole whose centre is not vertically aligned with the centre of the micro lens that is focussing light through it.
Further, there may be micro lens arrays between Photosite Layers.
Further, light having been focused through a hole may then diverge and land on a photosite on a Photosite Layer.
By using a plurality of Photosite Layers in this way, said Light-Porous Image Sensor (110) may have a total surface area of photosites that is larger than its cross sectional area.
[ Wide Spectrum Spectrographic Image and Depth Sensor] [INVENTION] The present invention proposes a Wide Spectrum Spectrographic Image and Depth Sensor (101) comprising a grid array of pixels, said pixels preferably arranged in either a square or hexagonal pattern. Each pixel comprises a stacked series of photosites and a Hole next to the stack through which light may pass and hit the sides of one or more photosites in the stack. Said Gap being either a volume of air, of gas, or of a material partially or fully transparent to a range of light wavelengths.
Within said Gap, and above the top surface of said photosites is a device for splitting light according to wavelength, the light of different wavelengths being directed at different angles to different layers of the photosite stack.
In this way a pixel can record light brightness (amplitude) levels at each of a range of wavelengths, thus building up a spectrogram for each pixel.
The spectrogram contains within it the RGB information to provide the colour of the image at that pixel point.
Further, the spectrogram may be wide enough to incorporate Infra-Red light. The sensor may therefore be used as the image sensor of said Wide Spectrum Image and Depth Sensor (100).
In this way the image sensor may be able to record simultaneously, and for each pixel in an imaged scene, a visible light image, depth data, and spectroscopic data.
Further, the depth data and spectrogram data may be used in conjunction with other methods to enhance image quality. This includes using depth and or spectrogram data as a means of tagging points or areas of the Scene based on their 3D position and spectrographic signature (indicative of material composition, e.g. wood, aluminium, water), such that said points or areas may be better aligned from one image to the next in a series of images to be combined.
[ Computational Imaging Device on a Chip] [INVENTION] Said Computational Imaging Device (10) may further comprise a Stacked Image Sensor, comprising an Image Sensor Layer (103) stacked on top of a Logic Layer (104), the Logic Layer (104) being capable of performing one or both of two functions: Homographic Transformation, and combining a plurality of images to form a composite image.
In a preferred embodiment, said image sensor is said Wide Spectrum Image and Depth Sensor (100). The visible light image data and depth data are transferred over a short distance from said Sensor Layer (103) to said Logic Layer (104). The Logic Layer can perform the Virtual Tripod (70) operation on a series of images, aligning them; Image Segments are then combined to form a Composite Image. Said Composite Image may then be outputted from the Stacked Image Sensor.
With sufficient processing power on the Logic Layer (104), this operation may be performed multiple times per second, in Near-Real-Time. One family of combining methods, including averaging, enables the image sensor to output higher quality images by combining images over time and reducing noise.
The invention provides a computational imaging camera on a chip (106). By combining Single Exposure Images over time as per Claim 1, the Sensor may output high quality, low noise, full colour video in low light conditions.
[ Collaborative Image Sensor (111) ] A Collaborative Image Sensor (111) comprising said Computational Imaging Device on a Chip (106), additionally comprising a plurality of connections, preferably on two or more side edges, such that two or more of said Chips (106) may be connected together, either in close packed formation, or with an extended wired or wireless connection between them. In a preferred embodiment the chip is of a hexagonal shape, having a hexagonal image sensor, which can capture a larger portion of the projected image of a lens than can a rectangular sensor. Close packing hexagonal camera modules (comprising a hexagonal Chip (106) and a lens) provides a higher efficiency in gathering light per cross sectional area of incident light as compared to close packing square camera modules (comprising a square Chip (106) and a lens).
Said Collaborative Image Sensors (111) can work as a Team, sharing image data, and or depth data, and or spectroscopic data. This teamwork can be used to output a single still image or a series of video frames, each benefiting from the extra data of the other Chips (106) in the array. Outputted images could be composite images of higher image quality, having gathered light from an array of Chip (106) cameras that all look at the same scene from similar fields of view. These could for instance provide a video feed for a full colour night vision or low light set of glasses or a helmet. An array of chip cameras could also be used to output images of extended field of view, including 360 degree panoramas.
[ Auto Adjust N number of Single Exposure Images to be Combined] The Computational Imaging Device may further comprise an N Limiter, the method comprising noting how much movement there is between the first and the last of N images and comparing this to a reference value. When the movement becomes greater than the reference value, the number N is reduced. In this way the combiner will only combine images that are sufficiently aligned (either with or without applying an aligning method) as to create a high quality image, being one with limited blur from camera movement. A sketch of such a limiter is given after the example below.
One example is of a set of full colour night vision goggles, comprising a display and a plurality of image capture devices, wherein a series of single exposure images from each of a plurality of image capture devices are combined to form a less noisy image. No work is done to align the images based on movement of the device itself. Thus when the user stands still, a large number of short exposure images will be approximately aligned, both across said plurality of Image Sensors and over time, and when the user is moving fast, only one or two sequential images over time are approximately aligned. The device can monitor movement: when the user is moving, it does not combine images over time but only between cameras; when the user is moving slowly, it combines a few images over time; and when the user is very still, it combines many images over time. Thus when the user is still they are presented with a higher quality video feed formed of more images captured over a longer Total Exposure Time Duration.
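A minimal sketch of such an N Limiter, assuming a hypothetical movement_between() helper that returns a scalar movement estimate (e.g. mean optical-flow magnitude in pixels) between two frames; the helper and parameter names are illustrative only.

```python
def limit_n(frames, movement_between, max_movement_px, n_max):
    """Reduce the number of frames combined when inter-frame movement is large.

    Shrinks the window of most recent frames until the movement between the
    first and last frame of the window falls below the reference value.
    """
    n = min(n_max, len(frames))
    while n > 1 and movement_between(frames[-n], frames[-1]) > max_movement_px:
        n -= 1          # reduce N until first and last frames are sufficiently aligned
    return frames[-n:]  # the N most recent, sufficiently aligned frames
```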
[ Wide Eyed Holographic Camera (107)] The invention proposed here is of a digital holographic camera comprising: a coherent light source, a beam splitter, a multi-beam combiner for combining more than two beams, a plurality of apertures, and an image sensor.
Conventional holographic digital cameras have a beam combiner that combines a reference beam from the coherent light source and a beam gathered from an aperture that received coherent light reflected from the Scene.
A problem with conventional Holographic cameras is that resolving power is linked to aperture size. Conventionally, achieving high resolving power requires a lens that is large in 3 dimensions.
The invention here allows a digital holographic camera to have a synthetic aperture diameter equal to the spacing between the plurality of apertures.
In one example embodiment, a compact digital holographic camera uses a compact image sensor as found on current smartphones. The device has two lens units, each of approximately 10mm diameter, positioned at the top left and the top right of a smartphone (approx. 50mm apart), and a beam combiner positioned equidistant between these. The beam combiner combines the two incoming beams from the two apertures and the reference beam, and presents the combination of beams to the image sensor. Thus the image sensor records a holographic interference pattern with information gathered from a virtual aperture that is many times wider than either of the individual apertures, thus improving the resolving power of the Holographic image.
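For context, and not taken from the specification: under the Rayleigh criterion the smallest resolvable angle for an aperture of diameter D at wavelength lambda is approximately

```latex
\theta_{\min} \approx 1.22\,\frac{\lambda}{D}
```

so increasing the effective aperture from roughly 10mm to a roughly 50mm synthetic aperture reduces the minimum resolvable angle, i.e. improves resolving power, by roughly a factor of five.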
Further, said Wide Eyed Holographic Camera (107) may have as its Image Sensor (16) any of a Wide Spectrum Image and Depth Sensor (101), a Wide Spectrum Spectrographic Image and Depth Sensor (102), or a Computational Imaging Device on a Chip (106).
[ Wide Eyed Time of Flight Camera (108) ] The invention proposed here is of a Wide Eyed Time of Flight Camera (108) comprising: a multi-beam combiner for combining two or more beams, a plurality of apertures, an Infra-Red light source, and an image sensor.
The Wide Eyed Time of Flight Camera (108) has a synthetic aperture diameter equal to the spacing between the plurality of apertures.
In one example embodiment, a compact Time of Flight Camera uses a compact image sensor as found on current smartphones. The device has two lens units, each of approximately 10mm diameter, positioned at the top left and the top right of a smartphone (approx. 50mm apart), and a beam combiner positioned equidistant between these. The beam combiner combines the two incoming beams from the two apertures and presents the combination of beams to the image sensor. Thus the image sensor records an image from a virtual aperture that is many times wider than either of the individual apertures, thus improving the resolving power of the Time of Flight Camera.
Further, said Wide Eyed Time of Flight Camera (108) may have as its Image Sensor (16) any of a Wide Spectrum Image and Depth Sensor (101), a Wide Spectrum Spectrographic Image and Depth Sensor (102), or a Computational Imaging Device on a Chip (106).

Claims (1)

  1. CLAIMS1. A method for operating a Computational Imaging Device (10), to display, in Near-Real-Time, a Video of a Scene, comprising a series of Composite Images each formed from a series of Single Exposure Images, Said Computational Imaging Device (10) comprising: an Image Capture Device (30), for capturing multiple said Single Exposure Images of said Scene; io a Combiner (40), for combining a series of said Single Exposure Images to form said Composite Image; a Display (60), for displaying a series of said Composite Images as a series of Video Frames; a Presentation Timer (50) for timing the presentation of said Composite Images on said Display (60); the method comprising steps a, b, c, d, where: a) capturing a series of said Single Exposure Images using said Image Capture Device (30); b) combining N number of substantially most recent said Single Exposure Images, to make said Composite Image using said Combiner (40), wherein said Single Exposure Images were captured within a Total Exposure Time Duration, DT, such that said Composite Images depict said Scene over said Total Exposure Time Duration, DT; wherein DT is substantially constant between consecutive combining operations, c) calculating a plurality of Video Frame Presentation Start Times, Tv, using said Presentation Timer (50), wherein Tv < TR ± DT, wherein TR is the moment in Real-Time when light recorded by one of said Single Exposure Images, first reaches said Image Capture Device (30), wherein Real-Time is the actual time at which a process or event occurs, such as to present said Composite Images in Near-Real-Time, and such that individual of said Video Frames depict said Scene over a duration, Dr, greater than a Presentation Duration, Dv, of said individual Video Frame, d) presenting to a User, a series of said Composite Images as said Video Frames, using said Display (60) at each of said calculated Video Frame Presentation Start Times, Tv, 2. The method of claim 1, and wherein the apparatus further comprises a Total Exposure Time Duration Setter (21), and wherein the method further comprises after creating one of said Composite Images: setting variable parameter of said Total Exposure Time Duration, Dr, using said Total Exposure Time Duration Setter (21).3. The method of claim 1, and wherein the apparatus further comprises a Number Setter (22), and wherein the method further comprises setting said variable parameter, Number of Substantially Sequential Single Exposure Images, N, of step (b) of claim 1 using said Number Setter (22).4. The method of claim 1 and wherein the apparatus further comprises a Video Frame Rate Setter (23), and wherein the method further comprises setting a variable parameter, Video Presentation Frame Rate, using said Video Presentation Frame Rate Setter (23).5. 
The method of claim 1, and wherein the apparatus further comprises a Single Exposure Parameter Setter (24), and wherein the method further comprises setting one or more variable parameters of said Single Exposure Image, using said Single Exposure Parameter Setter (24), wherein said Variable Parameters include one or more of: single exposure duration, aperture size, light sensitivity of image sensor, electrical amplification of image signal, digital amplification of image signal, Neutral Density Filter presence Optical Strength of Neutral Density Filter, luminance of controlled light source including flash, duration of controlled light source including flash, focal length, foal distance, sensor selection, lens selection, image stabilisation, image alignment, 3D orientation, 3D position, 3D velocity of one or more of: said Image Capture Device (30), or subcomponents of said Image Capture Device (30), including: image sensor, lens, and lens elements, 6. The method of claim 1, and wherein the apparatus further comprises a Combining Method Setter (25), and wherein the method further comprises setting one or more Combining Methods using said Combining Method Setter (25), wherein said Combining Methods include those known in the art as Image Stacking Algorithms, including one or more of: Additive Stacking, Mean Stacking, Median Stacking, Maximum Stacking, Minimum Stacking, Focus Stacking, and High Dynamic Range.wherein said Combiner (40) uses one or more of said Combining Methods, in any permutation or combination.7. The method of claims 2-6; wherein any of said variable parameters, or Combining Methods are set by a User.8. The method of claims 2-6; and wherein the apparatus further comprises any of an External Device Connection (32), a Processor (33), and a Sensor (34), and wherein the method further comprises setting any of said variable parameters, or Combining Methods using an Input from one or more of said External Device Connection (32), said Processor (33), and said Sensor (34).9. The method of any preceding claim, and wherein the apparatus further comprises a Single Long Exposure Trigger (35), and wherein the method further comprises, after creating a composite image, triggering a Single Long Exposure using said Single Long Exposure Trigger (35), and consequently: setting a Single Exposure Duration, Ds, to substantially equal Dr, using said Single Exposure Parameter Setter (24), then capturing one or more Single Exposure Images using said Image Capture Device (30), 10. The method of any preceding claim, and wherein the apparatus further comprises a Supplementary Image Capture Device (31), and wherein the method further comprises capturing a Supplementary Image using said Supplementary Image Capture Device (31).11. The method of claim 10, wherein said Image Capture Device (30) and said Supplementary Image Capture Device (31) are in close proximity to one another and share a substantially similar direction of view, being within 15 degrees of one another and each captures a series of said Single Exposure Images and a series of Supplementary Single Exposure Images respectively, such that individual images captured by each of said image capture devices, at the same time, share a substantially similar View Point.12. The method of claim 10, wherein said Single Exposure Image is said Supplementary Image.13. The method of claim12, wherein said Composite Image is a Supplementary Composite Image formed by combining a series of said Supplementary Images.14. 
The method of any preceding claim, and wherein the apparatus further comprises one or more of a Supplementary Combiner (41); wherein said Combiner (40) is said Supplementary Combiner (41).15. The method of claim 1, and wherein the apparatus further comprises: an outputter (61) for creating an outputtable image, and a Connected Image Receiver (62) for receiving an outputtable image, and wherein the apparatus further comprises any of a Presentation Device (63), a Storage Device (64), a Transmission Device (65), a Processing Device (66) wherein said Connected Image Receiver (62) is not said connected Display, wherein said Connected Image Receiver (62) may be any of: said Presentation Device (63), said Storage Device (64), said Transmission Device (65), said Processing Device (66), and wherein the method further comprises outputting said outputtable image using said Outputter (61) to said Connected Image Receiver (62), 16. The method of claim 15, wherein said Outputted Image is a Still Outputted Image, being any of: said Single Exposure Images, said Composite Images, said Supplementary Images, io said Supplementary Composite Images.17. The method of claim 15, wherein said Outputted Image is a Moving Outputted Image, being a series of any of: said Single Exposure Images, said Composite Images.said Supplementary Images, said Supplementary Composite Images.18. The method of any preceding claim, and wherein the apparatus further comprises an Optical Image Stabiliser (36), and wherein the apparatus further comprises any of a Lens Element (37) and an Image Sensor (38), and wherein the method further comprises, changing any of the position, and orientation, of any of: said Image Capture Device (30), and any of its sub-components, including any of: said Lens Element (37) and said Image Sensor (38) using said Optical Image Stabiliser (36), 19. The method of any preceding claim, and wherein the apparatus further comprises an Aligner (39), and wherein the method further comprises, after step (a) of claim 1 and before step (c) of claim 1, aligning each of said N Single Exposure Images by applying a Homography Transform using said Aligner (39), such that each image shares a common Virtual Viewpoint.20. 
The method of claim 18 or 19, and wherein the apparatus further comprises a Virtual Tripod (70), comprising a Tripod Processor (78), and any of; an Accelerometer (72) and a Ground Truther (71), said Ground Truther, comprising one or more of an Optical Distance Measuring Device (73), wherein said Optical Distance Measuring Device (73) includes any of a LIDAR (74), a Holographic LIDAR (75), a Stereoscopic Image Capture Device (76), a Digital Holographic Camera (77), wherein the method further comprises processing outputs of said Ground Truther (71) using said Tripod Processor (78), to produce one or more values of; relative position, relative orientation, relative velocity, relative acceleration, absolute position, absolute orientation, absolute velocity, absolute acceleration, of any of said Ground Truther (71) , said Virtual Tripod (70), said Image Capture Device (30), sub-component of said Image Capture Device (30), including said Lens Element (37) and said image sensor (38), wherein, that when said Image Capture Device (30) is oriented to have a Viewing Direction that is substantially parallel to the plane of the horizon, being +/-15 degrees to said Plane, said Ground Truther (71) has a Viewing Direction that is substantially orthogonal to that of said Image Capture Device (30), +1-45 degrees, and in one preferred embodiment said Ground Truther (71) having a field of view largely viewing the ground, and in a further embodiment, said Ground Truther (71), when used in an Indoor Environment, having a field of view that largely sees some combination of ground, walls, and ceilings, passing said processed values to either or both of said Optical Image Stabiliser (36) and said Aligner (39).21. The method of claims 18 to 20, wherein said Combining Method incorporates any of: Additive Stacking, Mean Stacking, Median Stacking, such as that said Composite Image achieves a higher Signal To Noise Ratio, and greater level of detail compared to that of an individual of said Single Exposure Image from which said Composite Image is formed.22. The method of claim 18 to 20, wherein said Combining Method incorporates Median Stacking, wherein one or more of a Subject in said Scene is moving relative to said Virtual Viewpoint, during said Total Exposure Time Duration, DT, Such that if said Subject moves across said scene such that it is present in any given part of said scene for a period less than a Threshold value, being normally Dr/2, said subject will not be depicted in said final Composite Image 23. 
The method of any preceding claim, and wherein the apparatus further comprises a plurality of Image Capture Devices (30), and wherein the method further comprises capturing a plurality of series of substantially sequential Single Exposure Images, such that images may be combined both over time and from multiple Image Capture Devices (30), resulting in a greater body of input data to the Combining Process, Such that In a preferred embodiment said plurality of Image Capture Devices (30) share a common direction of view and when used with de-noising Combining Methods including but not limited to Median Stacking or Mean Stacking, a Composite Image with lower noise level may be produced than from an equivalent combining operation that used only one Image Capture Device (30), and such that in a further embodiment where said plurality of Image Capture Device (30) have different but overlapping fields of view, a Panorama Combining Method may be employed to achieve a Composite Image having a field of view of the Scene that is wider than from an equivalent combining operation that used only one Image Capture Device (30).24. The method of any preceding claim, and wherein said computational imaging device (10) is hand held or wearable.25. A Computer Readable Medium storing computer interpretable instructions which when interpreted by a Programmable Computer cause said Programmable Computer to perform a method in accordance with any preceding claim.26. A Computational Imaging Device (10) comprising: an Image Capture Device (30), for capturing multiple said Single Exposure Images of said Scene; a Combiner (40), for combining a series of said Single Exposure Images to form said Composite Image: a Display (60), for displaying a series of said Composite Images as a series of Video Frames; a Presentation Timer (50) for timing the presentation of said Composite Images on said Display (60).20 25 30
GB2100567.3A 2021-01-15 2021-01-15 Imaging device Pending GB2603115A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB2100567.3A GB2603115A (en) 2021-01-15 2021-01-15 Imaging device
PCT/IB2022/050359 WO2022153270A1 (en) 2021-01-15 2022-01-17 Imaging device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2100567.3A GB2603115A (en) 2021-01-15 2021-01-15 Imaging device

Publications (2)

Publication Number Publication Date
GB202100567D0 GB202100567D0 (en) 2021-03-03
GB2603115A true GB2603115A (en) 2022-08-03

Family

ID=74678910

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2100567.3A Pending GB2603115A (en) 2021-01-15 2021-01-15 Imaging device

Country Status (2)

Country Link
GB (1) GB2603115A (en)
WO (1) WO2022153270A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130155264A1 (en) * 2011-12-15 2013-06-20 Apple Inc. Motion sensor based virtual tripod method for video stabilization
EP3748953A1 (en) * 2014-01-07 2020-12-09 ML Netherlands C.V. Adaptive camera control for reducing motion blur during real-time image capture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FABRIZIO LA, ROSA MARIA CELVISIA, VIRZì FILIPPO, BONACCORSO MARCO BRANCIFORTE, STMICROELECTRONICS: "Optical Image Stabilization (OIS)", 12 October 2017 (2017-10-12), pages 1 - 26, XP055831039, Retrieved from the Internet <URL:https://www.stmicroelectronics.com.cn/content/ccc/resource/technical/document/white_paper/c9/a6/fd/e4/e6/4e/48/60/ois_white_paper.pdf/files/ois_white_paper.pdf/jcr:content/translations/en.ois_white_paper.pdf> [retrieved on 20210809] *

Also Published As

Publication number Publication date
WO2022153270A1 (en) 2022-07-21
GB202100567D0 (en) 2021-03-03

Similar Documents

Publication Publication Date Title
US11375085B2 (en) Systems and methods for capturing digital images
KR102377728B1 (en) Image processing method, computer readable storage medium, and electronic device
US8558913B2 (en) Capture condition selection from brightness and motion
CN101888487B (en) High dynamic range video imaging system and image generating method
US11431950B2 (en) Photographic directional light reference for articulating devices
CN105120247B (en) A kind of white balance adjustment method and electronic equipment
WO2014103094A1 (en) Information processing device, information processing system, and information processing method
RU2565855C1 (en) Image capturing device, method of controlling said device and programme
WO2015184978A1 (en) Camera control method and device, and camera
JP2023509137A (en) Systems and methods for capturing and generating panoramic 3D images
JP2015144416A (en) Imaging apparatus and control method of the same
WO2019019904A1 (en) White balance processing method and apparatus, and terminal
WO2019047620A1 (en) Imaging device and imaging method
CN107431755A (en) Image processing equipment, picture pick-up device, image processing method, program and storage medium
CN108347505A (en) Mobile terminal with 3D imaging functions and image generating method
WO2021051304A1 (en) Shutter speed adjustment and safe shutter calibration methods, portable device and unmanned aerial vehicle
JP5183441B2 (en) Imaging device
WO2016202073A1 (en) Image processing method and apparatus
GB2603115A (en) Imaging device
WO2020084894A1 (en) Multi-camera system, control value calculation method and control device
WO2022245344A1 (en) Mobile device support for capture and synthesis of extreme low-light video
WO2023047645A1 (en) Information processing device, image processing method, and program
WO2024004584A1 (en) Information processing device, information processing method, program
WO2023106118A1 (en) Information processing device, information processing method, and program
WO2023176269A1 (en) Information processing device, information processing method, and program